pg_repack slows down PostgreSQL replication
I have a PostgreSQL 9.5 master server and a standby server. For replication I use repmgr
(WAL streaming). Typically the delay between master and standby is under 5 seconds:
$ psql -t -c "SELECT extract(epoch from now() - pg_last_xact_replay_timestamp());"
0.044554
Periodically pg_repack
is invoked on the master to optimize indexes and tables. Repacking tables generates a massive volume of WAL and significantly slows down replication, so the lag between master and standby can exceed one hour.
Is there a way to reduce this delay? Is it possible to replicate newly incoming data with higher priority than the repack changes?
postgresql replication postgresql-9.5 repmgr
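One way to keep the standby from falling far behind is to drive the repack table by table and pause whenever lag exceeds a threshold. Below is a minimal sketch of that pacing loop; it is not part of pg_repack itself. The get_lag callback is a stand-in for running the pg_last_xact_replay_timestamp() query above against the standby, and repack_one would shell out to something like pg_repack -t <table>:

```python
import time

def wait_for_replica(get_lag, max_lag=5.0, poll=10.0, sleep=time.sleep):
    """Block until replication lag (in seconds) drops below max_lag.

    get_lag: callable returning the current lag in seconds
    poll:    seconds to wait between checks
    """
    while True:
        lag = get_lag()
        if lag < max_lag:
            return lag
        sleep(poll)

def repack_paced(tables, repack_one, get_lag, max_lag=5.0, sleep=time.sleep):
    """Repack one table at a time, letting the standby catch up in between
    instead of flooding it with WAL for every table at once."""
    for table in tables:
        wait_for_replica(get_lag, max_lag, sleep=sleep)
        repack_one(table)  # e.g. subprocess.run(["pg_repack", "-t", table, dbname])
```

This does not reduce the total WAL generated, it only spreads it out, so the repack run as a whole takes longer.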
asked Jan 31 '17 at 9:16 by Tombart
1 Answer
I was about to ask a similar question. The problem is inherent to physical (block-level) streaming replication, which replicates the actual data writes to disk. With VACUUM FULL, reindexes, truncates/restores, and now pg_repack, the tables are rewritten on disk, producing a large volume of WAL that must be streamed to the other side.
So no, I don't believe you can do the "prioritization": the moment the rebuilt table is swapped in as the active table, new updates/writes on the master go to the rebuilt table, not the old one, and the replica needs that table available.
I've gotten into the habit of stopping replication, doing the major data changes (perhaps good practice to keep the old standby around as a "backup" until the changes are done), and then rebuilding the standby with a fresh pg_basebackup and a replication restart.
Hope this helps explain the situation you are in and how I've been solving it so far :)
That said, do go read: https://www.depesz.com/2013/06/21/bloat-removal-by-tuples-moving/
Depesz describes a mechanism that moves rows to the beginning of the table "on the fly", with the data available the whole time, using code similar to:
WITH x AS (
    DELETE FROM test WHERE id IN (999997, 999998, 999999) RETURNING *
)
INSERT INTO test
SELECT * FROM x;
This is run in batches, with VACUUM statements interleaved to reclaim the space. Done slowly, in a managed way, you should be able to "repack" without the replicas falling too far behind.
Just be careful of triggers on UPDATE/INSERT/DELETE!
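The batch loop around Depesz's DELETE ... RETURNING / INSERT trick could be driven like this. This is a hypothetical sketch: run_sql stands in for your database call (e.g. a psycopg2 cursor with autocommit, since VACUUM cannot run inside a transaction block), and the table name, id ranges, and batch size are made up:

```python
def id_batches(min_id, max_id, batch_size):
    """Yield (low, high) id ranges covering [min_id, max_id] inclusive."""
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1

def move_tuples(run_sql, min_id, max_id, batch_size=1000):
    """Move rows batch by batch, vacuuming between batches so freed
    space earlier in the table can be reused for the reinserted rows."""
    for low, high in id_batches(min_id, max_id, batch_size):
        run_sql(
            "WITH x AS ("
            "DELETE FROM test WHERE id BETWEEN %d AND %d RETURNING *"
            ") INSERT INTO test SELECT * FROM x;" % (low, high)
        )
        run_sql("VACUUM test;")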
answered Aug 14 '17 at 12:01, edited Aug 14 '17 at 12:13, by Hvisage