Unexplained PLE drop
Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.
At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.
We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.
My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??
We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).
We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.
Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?
Thanks in advance!
memory page-life-expectancy
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.
At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.
We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.
My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??
We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).
We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.
Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?
Thanks in advance!
memory page-life-expectancy
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
1
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17
add a comment |
Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.
At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.
We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.
My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??
We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).
We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.
Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?
Thanks in advance!
memory page-life-expectancy
Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.
At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.
We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.
My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??
We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).
We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.
Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?
Thanks in advance!
memory page-life-expectancy
memory page-life-expectancy
asked Dec 23 '16 at 16:57
DoleyDoley
534
534
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 9 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
1
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17
add a comment |
1
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17
1
1
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17
add a comment |
1 Answer
1
active
oldest
votes
Try sp_whoisactive. It's free and easy to implement.
What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f159050%2funexplained-ple-drop%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Try sp_whoisactive. It's free and easy to implement.
What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.
add a comment |
Try sp_whoisactive. It's free and easy to implement.
What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.
add a comment |
Try sp_whoisactive. It's free and easy to implement.
What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.
Try sp_whoisactive. It's free and easy to implement.
What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.
edited Nov 13 '18 at 23:10
Community♦
1
1
answered Dec 23 '16 at 17:35
ShaulinatorShaulinator
2,4781622
2,4781622
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f159050%2funexplained-ple-drop%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.
– Kin
Dec 23 '16 at 19:15
What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?
– Randolph West
Dec 24 '16 at 9:21
No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).
– Doley
Jan 3 '17 at 14:17