Multi-threaded replay of master activity for load/performance testing
Does anyone know of any existing tools/products that accomplish what I'm trying to do?
After searching around for a bit, the only thing I could find on this is someone else looking for the same thing: http://forums.mysql.com/read.php?24,192250,192250
The idea is that I'd like to capture all the traffic to my master and save it as a replay log to run against a snapshot of the entire database taken when the monitoring started. Binary logs won't serve what I want since they only include writes; I want read activity too, so I can realistically view the effects of proposed changes under "real" production traffic.
By real production traffic I mean everything from all applications that hit the database being modified. If there's some application-level change, tests running just that app don't account for other activity going on in the system at the time. Running all applications in a test environment isn't guaranteed to reproduce the same state of the database either. I could take a test snapshot as a starting point as I fire them up, but the applications don't have the ability to do an exact replay of their own activity.
I've used the tcpdump script from http://www.mysqlperformanceblog.com/2008/11/07/poor-mans-query-logging/ to monitor activity, but it doesn't tell me which queries are coming from which connections. Part of the playback I want is a multi-threaded approach that replays the activity from the same number of threads that were actually in use.
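For reference, the capture side of that approach boils down to something like this (a sketch, assuming the server listens on the default port 3306 on eth0; the exact filter and post-processing in the linked post differ):

# sniff client traffic to the MySQL port and save the decoded packets for later analysis;
# the client IP:port on each packet is what identifies the originating connection
tcpdump -s 65535 -x -nn -q -tttt -i eth0 port 3306 > mysql.tcp.txt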
I can't afford to turn on general query logging because my production master wouldn't be able to handle the performance hit.
The whole point of the snapshot part of the process is to have a golden starting-point database, ensuring everything is the same at the start of each test run.
mysql performance performance-testing
asked Sep 30 '11 at 19:14 by atxdba; edited Aug 5 '17 at 17:05 by RolandoMySQLDBA
Hmm, what kind of traffic does your production server get that it can't handle the 'performance' hit of the general_log? In an interesting case study, the worst difference from 'no logging' was on a CPU-bound server with 20 concurrent connections, and it was still able to handle 22k transactions/minute; it fared much better on an I/O-bound (ugh) server.
– Derek Downey
Sep 30 '11 at 20:40
Also, I don't want to be misunderstood... I am curious; this isn't meant as an attack.
– Derek Downey
Sep 30 '11 at 20:41
After rereading how I worded it, I feel I kind of lied. It would have been more appropriate to say, "I've always heard that was bad and didn't want to play around in production to get hard numbers."
– atxdba
Oct 4 '11 at 3:31
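For what it's worth, the general log can be switched on and off at runtime (MySQL 5.1 and later), so the hit could be measured over a short sample window without a restart; a minimal sketch:

# turn the general log on for a short sample window, then back off (no restart needed)
mysql -u... -p... -e "SET GLOBAL log_output = 'FILE'; SET GLOBAL general_log = 'ON';"
sleep 300   # sample five minutes of traffic
mysql -u... -p... -e "SET GLOBAL general_log = 'OFF';"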
2 Answers
I believe that pt-query-digest from the Percona Toolkit (http://www.percona.com/doc/percona-toolkit/2.0/pt-query-digest.html) is what you are looking for.
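For example, it can digest a tcpdump capture like the one described in the question and rank the captured statements by response time (a sketch; mysql.tcp.txt is assumed to be a text-format tcpdump capture of port 3306 traffic):

# summarize and rank the queries seen in the capture
pt-query-digest --type tcpdump mysql.tcp.txt > digest-report.txt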
answered Jan 21 '12 at 23:34 by malonso
I actually have a suggestion that will take some work, but it is doable.
THEORETICAL SCENARIO
- You have 3 databases: db1, db2, db3
- Production DB had binary logging enabled
To create enough traffic, you will need the following:
- a mysqldump of db1, db2, and db3 from the Production DB that is two weeks old (TwoWeekOldData.sql)
- Binary logs on the Production DB going back two or more weeks (i.e., covering everything since that dump)
- Four DB Servers running MySQL
- DBServers 1-3 are Slaves to Production DB
- DBServers 1-4 have Binary Logging Enabled
- DBServer1 has replicate-do-db=db1 and has only db1 data loaded (a sample my.cnf for this server is sketched after this list)
- DBServer2 has replicate-do-db=db2 and has only db2 data loaded
- DBServer3 has replicate-do-db=db3 and has only db3 data loaded
- DBServer4 has all data from TwoWeekOldData.sql loaded
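A minimal option-file fragment for DBServer1 might look like this (a sketch; the server-id value is a placeholder, and log-slave-updates is needed so that the statements replicated from Production land in this slave's own binary logs):

# /etc/my.cnf on DBServer1 (repeat with db2/db3 on DBServer2/DBServer3)
[mysqld]
server-id         = 11          # unique within the replication topology
log-bin           = mysql-bin   # binary logging enabled
log-slave-updates               # write replicated statements to this server's binary logs
replicate-do-db   = db1         # only apply db1 traffic on this slave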
Then you will need to do the following (the mysqlbinlog step is sketched after this list):
- Run mysqlbinlog against all binary logs on Slave1 and collect SQL into db1traffic.sql
- Run mysqlbinlog against all binary logs on Slave2 and collect SQL into db2traffic.sql
- Run mysqlbinlog against all binary logs on Slave3 and collect SQL into db3traffic.sql
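The mysqlbinlog step might look like this (a sketch; the datadir path and binlog base name are examples, so adjust them for your layout):

# on Slave1 (DBServer1): decode all of its binary logs into one replayable SQL file
mysqlbinlog /var/lib/mysql/mysql-bin.0* > db1traffic.sql
# repeat on Slave2 and Slave3 to produce db2traffic.sql and db3traffic.sql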
Run these three traffic SQL files in parallel, like this:
mysql -hdb4 -u... -p... -A -Ddb1 < db1traffic.sql &
mysql -hdb4 -u... -p... -A -Ddb2 < db2traffic.sql &
mysql -hdb4 -u... -p... -A -Ddb3 < db3traffic.sql &
You can use DBServer4 to measure the following (one way to collect these numbers is sketched after the list):
- server load
- query performance
- I/O utilization
- Whatever you want to measure
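One way to collect those numbers while the replay runs (a sketch; the intervals and file names are arbitrary):

# OS-level CPU and I/O, sampled every 10 seconds
iostat -x 10 > db4-iostat.log &
vmstat 10    > db4-vmstat.log &
# MySQL counters as relative (per-interval) values every 10 seconds
mysqladmin -hdb4 -u... -p... extended-status -r -i 10 > db4-status.log &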
As I mentioned at the outset, this takes some setup work, along with access to old data and all binary log entries since that old data's last backup.
I hope this gives you some guidance on producing real traffic rather than simulating ordinary load.
GIVE IT A TRY!!
answered Oct 26 '11 at 20:45 by RolandoMySQLDBA; edited Oct 27 '16 at 14:55 by Philᵀᴹ
FYI: this simulates only commands that update the db, since it only uses the binlogs; it doesn't simulate select statements, which are where a lot of load comes from.
– timetofly
Jan 3 '18 at 17:55