Unexplained PLE drop












0















Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.



At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.



We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.



My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??



We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).



We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.



Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?



Thanks in advance!










share|improve this question














bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.











  • 1





    Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

    – Kin
    Dec 23 '16 at 19:15











  • What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

    – Randolph West
    Dec 24 '16 at 9:21













  • No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

    – Doley
    Jan 3 '17 at 14:17


















0















Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.



At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.



We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.



My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??



We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).



We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.



Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?



Thanks in advance!










share|improve this question














bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.











  • 1





    Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

    – Kin
    Dec 23 '16 at 19:15











  • What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

    – Randolph West
    Dec 24 '16 at 9:21













  • No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

    – Doley
    Jan 3 '17 at 14:17
















0












0








0








Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.



At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.



We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.



My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??



We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).



We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.



Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?



Thanks in advance!










share|improve this question














Our SQL server has 512 GB ram with 448 GB available to SQL 2012 (sp2, cu7) serving a typical OLTP workload. The server is only for SQL other than SIOS for geo clustering. There is not a big swing in target server memory so I don't think its related to OS pressure.



At irregular intervals, SQL will drop PLE by 10k - 40k seconds. Yesterday was a good example, PLE went from 84,675 to 45,796. I know 45k is still a good PLE and PLE will fluctuate, but I'd love to know what to tune.



We've setup logging to capture Buffer Cache usage by DB, and Buffer Cache usage per table for the 2 largest and most active DBs (runs every 10 minutes). For yesterday's example, we saw our most active DB lost 157 MB of cache and only a few table's buffer cache really changed, and at most that was a drop of 223 MB. Those size changes seem very insignificant to me to drop PLE by 40k. We've captured several examples of the drop in PLE and nothing jumps out as extreme/explainable based on the logging data we've captured.



My next thought was that we must be reading different data in from the same tables, but we don't see any corresponding spike in disk read counters. If we're flushing data out, and overall buffer size stays the same we must be reading data in??



We also have Sentry monitoring the server and there are no massive read queries that would explain what query ran and needed to read "all" that new data from disk (granted Sentry is only capturing queries with over 1 second duration so we're not getting everything).



We don't see any correlation with increased memory usage for compiled plans either, so I don't think the plan cache is the cause.



Are there any ideas on what would be dropping PLE or what other monitoring we can implement to track it down?



Thanks in advance!







memory page-life-expectancy






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Dec 23 '16 at 16:57









DoleyDoley

534




534





bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 9 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.










  • 1





    Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

    – Kin
    Dec 23 '16 at 19:15











  • What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

    – Randolph West
    Dec 24 '16 at 9:21













  • No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

    – Doley
    Jan 3 '17 at 14:17
















  • 1





    Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

    – Kin
    Dec 23 '16 at 19:15











  • What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

    – Randolph West
    Dec 24 '16 at 9:21













  • No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

    – Doley
    Jan 3 '17 at 14:17










1




1





Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

– Kin
Dec 23 '16 at 19:15





Any bulk inserts happening ? CPU power setting to high performance ? Check [sys.dm_os_ring_buffers ](sqlskills.com/blogs/jonathan/…) for any memory internal or external pressure. Check the plan cache for any large index scans that can trash your PLE values.

– Kin
Dec 23 '16 at 19:15













What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

– Randolph West
Dec 24 '16 at 9:21







What is the plan cache doing when the PLE drops? (I know you said what it's not doing.) Does it fluctuate at all?

– Randolph West
Dec 24 '16 at 9:21















No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

– Doley
Jan 3 '17 at 14:17







No bulk inserts are happening and the CPUs is set to high performance. When PLE drops, the plan cache doesn't really fluctuate at all (technically it grew by 43 mb).

– Doley
Jan 3 '17 at 14:17












1 Answer
1






active

oldest

votes


















0














Try sp_whoisactive. It's free and easy to implement.



What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.






share|improve this answer

























    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f159050%2funexplained-ple-drop%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    Try sp_whoisactive. It's free and easy to implement.



    What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.






    share|improve this answer






























      0














      Try sp_whoisactive. It's free and easy to implement.



      What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.






      share|improve this answer




























        0












        0








        0







        Try sp_whoisactive. It's free and easy to implement.



        What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.






        share|improve this answer















        Try sp_whoisactive. It's free and easy to implement.



        What do the dmv's and wait types point you towards? I would start looking at the various wait types and how much activity are in those wait types. See which ones are occurring just before and during the time frames where you see PLE drops while also looking at the other server metrics. From there, I'd break down what queries are the main contributors to those wait types.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Nov 13 '18 at 23:10









        Community

        1




        1










        answered Dec 23 '16 at 17:35









        ShaulinatorShaulinator

        2,4781622




        2,4781622






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f159050%2funexplained-ple-drop%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Ronny Ackermann

            Köttigit

            MySQL 8.0.15 starts normally but any connection hangs