Reliability of mongodump
I understand that the recommended way to back up MongoDB data is to use file-system-level tools. However, the mongodump utility is promoted as a valid alternative for smaller instances when file system snapshots are not available for some reason. Many resources recommend running it with the --oplog option to enable point-in-time restores.
I tried to understand how --oplog could enable point-in-time snapshots without proper multi-version concurrency control in the database engine. I think it could only work in the following way:
- We assume that _id values in all collections are monotonically increasing
- Mongodump blocks all writes and records the last operation id in the oplog
- Mongodump starts dumping documents from all collections by navigating the _id index in decreasing order
- After the dump has started, mongodump unlocks writes to the collections
- We assume that all inserts from now on will have higher _id values and will appear only in the oplog dump, not in the collections dump.
This scheme quickly breaks if _id values are not monotonically increasing, or if write locks are not implemented and there is a race condition. Also, there is no support for secondary unique indexes.
From what I see, the following statements are true, meaning that mongodump cannot possibly enable point-in-time restores, so its specification is misleading when it comes to its snapshot-like capabilities:
- Even built-in _id value generators do not guarantee a total order
- mongodump does not issue any write locks
- mongodump does not traverse the _id index in descending order, so inserted records will appear both in the collection dump and in the oplog dump.
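The first point can be illustrated with a minimal sketch. It assumes the documented ObjectId layout of a 4-byte timestamp in seconds, followed by 5 bytes that are fixed per client process, followed by a 3-byte counter. The class name ObjectIdGenerator, the byte values, and the timestamp are hypothetical and chosen only for illustration; this is a toy model, not driver code.

```python
import struct


class ObjectIdGenerator:
    """Toy model of one client process producing ObjectId-like 12-byte values."""

    def __init__(self, process_bytes: bytes, counter_start: int) -> None:
        assert len(process_bytes) == 5
        self.process_bytes = process_bytes  # fixed for the lifetime of the process
        self.counter = counter_start        # 3-byte counter

    def next_id(self, unix_seconds: int) -> bytes:
        # Big-endian timestamp + per-process bytes + counter
        oid = (
            struct.pack(">I", unix_seconds)
            + self.process_bytes
            + self.counter.to_bytes(3, "big")
        )
        self.counter = (self.counter + 1) % (1 << 24)
        return oid


# Two independent clients (hypothetical per-process values).
client_a = ObjectIdGenerator(b"\xff\x00\x00\x00\x01", counter_start=10)
client_b = ObjectIdGenerator(b"\x01\x00\x00\x00\x02", counter_start=500)

same_second = 1523869800                # both inserts fall within this second
first = client_a.next_id(same_second)   # generated first
second = client_b.next_id(same_second)  # generated later

# The later id sorts before the earlier one, so _id order is not insertion order.
print(second < first)
```

Within one second, ordering across clients is decided by the per-process bytes rather than by time, so a scheme that relies on every new insert receiving a higher _id than all existing ones would not hold.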
Is my understanding correct? Is there any value in the --oplog option of the mongodump utility?
The following tickets seem to support my reasoning:
- https://jira.mongodb.org/browse/SERVER-24231
- https://jira.mongodb.org/browse/TOOLS-176
mongodb mongodump
"mongodump" exports data from "mongod" and also from "mongos". For which one ("mongod" or "mongos") do you want to know the reliability of "mongodump"?
– Md Haidar Ali Khan
Apr 18 '18 at 7:36
asked Apr 16 '18 at 9:10
Konstantin
1 Answer
Is my understanding correct? Is there any value in --oplog option of mongodump utility?
As the MongoDB Blog documentation says, --oplog ensures that mongodump creates a dump of the database that includes a partial oplog containing operations from the duration of the mongodump operation. This oplog produces an effective point-in-time snapshot of the state of a mongod instance. To restore to a specific point-in-time backup, use the output created with this option in conjunction with mongorestore --oplogReplay.
Without --oplog, if there are write operations during the dump operation, the dump will not reflect a single moment in time. Changes made to the database during the update process can affect the output of the backup.
- --oplog has no effect when running mongodump against a mongos instance to dump the entire contents of a sharded cluster. However, you can use --oplog to dump individual shards.
- --oplog only works against nodes that maintain an oplog. This includes all members of a replica set, as well as master nodes in master/slave replication deployments.
- --oplog does not dump the oplog collection.
I understand that the recommended way to backup MongoDB data is to use file system level tools. However, mongodump utility is promoted as a valid alternative for smaller instances, when file system snapshots are not available for some reason. Many resources recommend running it with --oplog option to enable point-in-time restores.
You can use the --oplog option if you are backing up all the databases:
mongodump --oplog --out /dump_location
Then, when restoring, run something like this:
mongorestore --oplogReplay /dump_location
This way mongodump does not require a write lock. Any writes made during the backup process are recorded in oplog.bson. While restoring, mongorestore replays this oplog to include those write operations.
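Why replaying the captured oplog can repair a dump that was taken without locks is easier to see in a toy model. The sketch below is not mongodump's implementation: it assumes only that oplog entries are idempotent (an insert behaves like an upsert, an update sets final field values, a delete of a missing document is a no-op) and that the captured oplog covers the whole dump window. The function names apply_op, simulate_dump and replay, and the in-memory dictionaries, are hypothetical.

```python
def apply_op(db, op):
    """Apply one idempotent oplog-style operation to a dict of _id -> document."""
    kind, doc_id, payload = op
    if kind == "i":            # insert, applied as an upsert
        db[doc_id] = dict(payload)
    elif kind == "u":          # update that sets final field values
        if doc_id in db:
            db[doc_id].update(payload)
    elif kind == "d":          # delete; a missing document is a no-op
        db.pop(doc_id, None)


def simulate_dump():
    """Interleave a lock-free collection scan with concurrent writes."""
    live = {1: {"v": "a"}, 2: {"v": "b"}, 3: {"v": "c"}}
    oplog, dump = [], {}

    def write(op):
        apply_op(live, op)     # the write hits the live database
        oplog.append(op)       # and is captured for the dump window

    dump[1] = dict(live[1])            # scan reads document 1
    write(("u", 1, {"v": "a2"}))       # document 1 changes after it was dumped
    write(("d", 3, None))              # document 3 vanishes before the scan reaches it
    dump[2] = dict(live[2])            # scan reads document 2
    write(("i", 4, {"v": "d"}))        # new document appears during the scan
    dump[4] = dict(live[4])            # scan reads it, so it is in dump AND oplog
    write(("u", 2, {"v": "b2"}))       # document 2 changes after it was dumped
    return live, dump, oplog


def replay(dump, oplog):
    """Restore the dump, then replay the captured operations in order."""
    restored = {doc_id: dict(doc) for doc_id, doc in dump.items()}
    for op in oplog:
        apply_op(restored, op)
    return restored


live, dump, oplog = simulate_dump()
restored = replay(dump, oplog)
print(dump == live)       # the raw dump matches no single moment in time
print(restored == live)   # after replay it matches the state at the end of the dump
```

In this model a document that appears both in the collection dump and in the oplog is harmless, because applying the same operation again leaves the same result. The consistent point reached is the end of the captured oplog, not the moment the dump started. The model leaves out indexes, sharding, and how the server orders concurrent operations.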
For your further ref here, here and here
Thanks for your answer! However, rewriting official documentation doesn't really help answering the actual question of whether what is said in the documentation is correct or can even be correct at all.
– Konstantin
Apr 18 '18 at 11:43
answered Apr 18 '18 at 8:26
Md Haidar Ali Khan