Galera Cluster Setup - Primary and Secondary Site Scenario



























I'm very new to Galera Cluster and am exploring a potential setup with reasonable resiliency to node failure and network failure. Looking at the very bottom part of this documentation, the Weighted Quorum for a Primary and Secondary Site Scenario looks quite promising. For ease of reading, I've extracted the setup from the document as follows:




When configuring quorum weights for primary and secondary sites, use
the following pattern:



Primary Site:
node1: pc.weight = 2
node2: pc.weight = 2

Secondary Site:
node3: pc.weight = 1
node4: pc.weight = 1


Under this pattern, some nodes are located at the primary site while
others are at the secondary site. In the event that the secondary site
goes down or if network connectivity is lost between the sites, the
nodes at the primary site remain the Primary Component. Additionally,
either node1 or node2 can crash without the rest of the nodes becoming
non-primary components.
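For reference, pc.weight is set through the Galera provider options. A minimal sketch (host roles follow the pattern above; exact file locations vary by distribution):

```
# my.cnf fragment on a primary-site node (node1 or node2):
[mysqld]
wsrep_provider_options="pc.weight=2"

# pc.weight is dynamic, so it can also be changed at runtime:
#   SET GLOBAL wsrep_provider_options='pc.weight=2';
```

Note that wsrep_provider_options can hold several semicolon-separated options, so an existing value may need to be extended rather than replaced.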




But there seem to be two drawbacks:




  1. If two nodes fail and one of them happens to be on the primary site, the surviving weight is <= 50% of the total and the remaining two nodes become non-primary components.

  2. Although pc.weight is a dynamic option that can be changed while the server is running, failing over between the primary site and the secondary site requires modifying the configuration on all nodes, which is a bit troublesome.
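To make drawback 1 concrete, here is a small sketch of the (simplified) quorum rule: a surviving partition keeps Primary Component status only if its weight is strictly greater than half of the total weight of the previous membership. Node names and weights follow the documented pattern above.

```python
def has_quorum(surviving_weights, all_weights):
    """Simplified Galera quorum check: a partition stays primary only if
    its weight is STRICTLY more than half the previous total (ties lose)."""
    return sum(surviving_weights) > sum(all_weights) / 2

# Weights from the documented pattern: node1=2, node2=2, node3=1, node4=1.
all_nodes = {"node1": 2, "node2": 2, "node3": 1, "node4": 1}

# Drawback 1: node1 (primary site) and node3 (secondary site) both fail.
survivors = {"node2": 2, "node4": 1}
print(has_quorum(survivors.values(), all_nodes.values()))  # 3 of 6 -> False
```

With 3 of 6 weight units remaining, the survivors are not strictly above half, so they drop to a non-primary component.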


So I've come up with another idea: leave the weight at 1 for all nodes, and add a Galera Arbitrator (garbd) at the primary site. In this case:




  • The primary site remains the Primary Component on a network failure,
    just like the original setup.

  • The cluster still functions even if two nodes fail.

  • Failing over between the primary and secondary sites just requires moving the Galera Arbitrator.
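The arbitrator idea can be sketched as follows. garbd joins the cluster as a voting member that stores no data; a hypothetical configuration for a host at the primary site (the cluster name and node addresses are placeholders):

```
# Hypothetical garbd settings for a host at the primary site.
# group must match wsrep_cluster_name; address lists the cluster members.
group   = my_galera_cluster
address = gcomm://node1,node2,node3,node4

# Roughly equivalent command line:
#   garbd --group my_galera_cluster --address gcomm://node1,node2,node3,node4 --daemon
```

With the arbitrator, the total weight becomes 5 (four nodes plus garbd), so on a site split the primary site holds 3 of 5 votes, and any two node failures still leave 3 of 5 votes somewhere in the cluster.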


Is there anything wrong with my idea, or are there any practical difficulties? I'd appreciate it if you could share your thoughts.










      mysql mariadb high-availability galera multi-master






      asked Oct 6 '17 at 15:51









CLDev




          1 Answer
          "Weighting" was added late in the game, when they realized that a 2-datacenter setup was too vulnerable. (3 datacenters is resilient, and can use garbd in one of them.) The example you quote is resilient to any single server, datacenter, or network outage.



          As I read the last sentence of the quote, node1 or node2 died but the other three nodes are alive and talking to each other. That is, there is a Quorum, and the system is still reliable.



However, I agree that the sentence is ambiguous -- it can also be read as: after the network died, node1 or node2 died. This leaves three clumps: (node1), (node2), (node3,node4), each with a weight of 2. None should be considered "primary" because none has a quorum.
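That reading can be checked against the simplified quorum rule (a component needs strictly more than half of the previous total weight):

```python
# After the inter-site link fails AND node1 (or node2) also gets cut off,
# the last membership had total weight 6, and the surviving clumps are:
clumps = {"node1": 2, "node2": 2, "node3+node4": 1 + 1}
total = sum(clumps.values())  # 6

# A clump keeps Primary Component status only if its weight > total / 2.
for name, weight in clumps.items():
    print(name, "primary" if weight > total / 2 else "non-primary")
```

Every clump holds exactly 2 of 6 weight units, never strictly more than half, so all of them end up non-primary.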



You bring up garbd, yet it is not in the documented example? And where exactly would you put it?



          You should not be changing the configuration while the system is hobbled -- you should be fixing the broken components.



The main goal is to survive a single point of failure -- a single node, the network, or a data center. It would take a really large and complex system to survive two simultaneous failures. For example, I think it would require 5 datacenters to survive 2 network failures.



          So, focus on a single point of failure.











































                answered Oct 8 '17 at 16:25









Rick James



