Large INSERTs performance optimization
I have 15 Amazon AWS EC2 t1.micro instances which simultaneously populate an Amazon RDS MySQL db.m2.xlarge database with data, using large INSERTs (40,000 rows per query).
The queries are sent continuously. The table is InnoDB with two INT columns, and there is an index on each column. CPU utilization of the RDS instance is about 30% while the data is being received.
With a single EC2 instance, the speed is orders of magnitude faster than when I run all 15 instances simultaneously. On top of that, the 15-instance group works slower and slower until the speed becomes totally unsatisfactory.
How can I optimize performance for this process?
UPDATE
My SHOW CREATE TABLE result is the following:
CREATE TABLE `UserData` (
`uid` int(11) NOT NULL,
`data` int(11) NOT NULL,
PRIMARY KEY (`uid`,`data`),
KEY `uid` (`uid`),
KEY `data` (`data`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
I need two indexes because I have to fetch data both by uid and by data value.
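For illustration, the two access patterns are presumably of this shape (the uid lookup can be served by the PRIMARY KEY prefix, the data lookup by the index on data):

SELECT data FROM UserData WHERE uid = 1;    -- all data values for one user
SELECT uid  FROM UserData WHERE data = 10;  -- all users with a given data value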
I insert data with INSERT INTO UserData (uid, data) VALUES (1,2),(1,3),(1,10),... with 40,000 (uid, data) pairs per statement.
The 15 parallel instances insert ~121,000,000 rows in 2 hours, but I am sure it can be much faster.
mysql innodb optimization insert amazon-rds
asked May 13 '13 at 0:09 by zavg
Please present the SHOW CREATE TABLE your_table output. In particular, is there an AUTO_INCREMENT column? Which PRIMARY KEY? What is the order in which rows are inserted? – Shlomi Noach, May 13 '13 at 5:10
@ShlomiNoach Thank you for the reply, I have updated the post. – zavg, May 13 '13 at 12:42
Are your INSERTs enclosed in a transaction? START TRANSACTION; INSERT ...; COMMIT; – Cristian Porta, May 13 '13 at 18:50
@cristian-porta No, but I use bulk inserts with 40,000 values in one query. – zavg, May 13 '13 at 19:04
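For clarity, Cristian Porta's suggestion amounts to grouping several bulk INSERTs under one explicit COMMIT, roughly like this (a minimal sketch; the value pairs are placeholders):

START TRANSACTION;
-- under autocommit each statement commits (and, with
-- innodb_flush_log_at_trx_commit = 1, flushes the redo log) on its own;
-- inside an explicit transaction there is a single flush at COMMIT
INSERT INTO UserData (uid, data) VALUES (1, 2), (1, 3), (1, 10);
INSERT INTO UserData (uid, data) VALUES (2, 5), (2, 7), (2, 9);
COMMIT;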
1 Answer
Two hints for you:
- The KEY uid is redundant, because it is covered by the PRIMARY KEY.
- 40,000 rows at a time might make for too large a transaction. Although these are very small rows (two INTs), this may cause the transaction to go to disk, depending on your settings. I usually go with around 1,000 rows at a time (I go as low as 100 and as high as 10,000).
Please try doing 40 * 1,000 and see if this works better for you than 1 * 40,000.
answered May 17 '13 at 19:54 by Shlomi Noach
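A minimal sketch of the batching suggested above, using the statement from the question (in practice each batch would carry 1,000 (uid, data) pairs; only three placeholder pairs are shown per batch):

-- send the same 40,000 pairs as 40 statements of 1,000 pairs each,
-- instead of one statement with 40,000 pairs
INSERT INTO UserData (uid, data) VALUES (1, 2), (1, 3), (1, 10);   -- batch 1 of 40
INSERT INTO UserData (uid, data) VALUES (1, 11), (1, 12), (1, 15); -- batch 2 of 40
-- ... and so on for the remaining batches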
OK, thank you for the advice. I will try your technique and measure the performance increase. Currently I am using these optimizations: 1) I have dropped all indexes while writing to the db, 2) I have set AUTOCOMMIT=0 and COMMIT after a very large sequence of INSERTs (about 1,500,000 rows), 3) I am using increased innodb_buffer_pool_size and bulk_insert_buffer_size, 4) I also changed the t1.micro instances to m1.small and the m2.xlarge to m1.xlarge for better I/O performance. – zavg, May 17 '13 at 21:09
Number 2) sounds like an anti-optimization to me. And what do you mean you've dropped all indexes while writing? Do you mean you're ALTERing the table before and after the INSERT? If so, don't do it. – Shlomi Noach, May 19 '13 at 17:00
I insert data into the table without indexes and add them after all the INSERTs are done; after that I don't write to this table. – zavg, May 19 '13 at 22:52
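For a one-time load, the load-then-index approach zavg describes would look roughly like this (a sketch; per the answer above, the redundant uid index is left out since the PRIMARY KEY already covers uid lookups):

-- during the bulk load, UserData carries only its PRIMARY KEY (uid, data);
-- the secondary index on data is built once, after all INSERTs have finished
ALTER TABLE UserData ADD INDEX `data` (`data`);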