SQL Server bulk transfer help needed





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}







1















We've got this very badly designed logging table that we want to add functionality to. The problem is that it's already a scalability nightmare and we want to fix the design before adding to it, but we only have the nightly upgrade window to do it in.



I've seen a lot of articles about the various bulk copy options with SQL Server, claiming "We could move 80M rows in 10 minutes!" but so far my testing doesn't get anywhere near that, and I'd like suggestions on how to improve on what I'm seeing.



Before the upgrade, there's always a full backup. I'm only interested in the end result and don't want a huge transaction log. I also don't want it to take too long and I don't want to blow out the disk space with transaction logs or temp files.



The table's been out there a while, so in our bigger customer dbs, it's already over 50 million rows. Each row is about 350-400 bytes. The columns are something like this



IdentityColID int, [type] int, [subtype] int, 
created datetime, author nvarchar(100), Message nvarchar(max)


The problems with the design are




  • The primary clustered key is (type, subtype, created, identitycolid), so it's an insert nightmare. Blocksplits all over the place. And even doing a SELECT COUNT(*) takes like 8 minutes.


  • There aren't good indexes to support the types of queries desired



I wanted to make a new table where the primary clustered index is the IdentityColId and add the indexes to support the type of necessary queries, and then copy the existing data over and drop the old table.



So far, I tried using BCP to get the data out, and importing with




  • BCP


  • BULK INSERT


  • INSERT INTO ... FROM OPENROWSET



The bcp export took about 25 minutes and the imports all took about 1.3 hour - about 1.5 hours. With Recovery Model Simple, the transaction log didn't grow but the cpu consumption was in the 60-65% range most of the time.



I tried just using T-SQL INSERT INTO NewTable SELECT * FROM OldTable, but even with Recovery Model Simple, the transaction log gets to 100 gig.



I tried using SSIS data import packages with the from/to model, and the net time was about an hour 20 minutes. With Recovery Model Simple, the transaction log stayed small.



Then I tried an SSIS Execute SQLTask package to effectively do the INSERT INTO NewTable... line within SSIS. That got the execution time down to about 1:15, but no matter what the recovery model, the transaction log ended up around 100 gig, though CPU consumption stays modest.



I'd like the end result to be one new table, so the suggestion from some of the articles I've read to parallelize into multiple result tables doesn't seem a profitable path. But so far, I just can't seem to approach those stats from the articles I've read.



Anyone have any suggestions on how I can goose this a bit?










share|improve this question
















bumped to the homepage by Community 11 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

    – Randi Vertongen
    Dec 21 '18 at 21:40








  • 5





    Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

    – ypercubeᵀᴹ
    Dec 21 '18 at 21:42








  • 2





    What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

    – Joe Obbish
    Dec 22 '18 at 0:15






  • 1





    I dont understand why you are copying the data. You can drop and add indexes on the existing table.

    – Sir Swears-a-lot
    Dec 24 '18 at 9:44











  • The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

    – user1664043
    Dec 26 '18 at 14:41


















1















We've got this very badly designed logging table that we want to add functionality to. The problem is that it's already a scalability nightmare and we want to fix the design before adding to it, but we only have the nightly upgrade window to do it in.



I've seen a lot of articles about the various bulk copy options with SQL Server, claiming "We could move 80M rows in 10 minutes!" but so far my testing doesn't get anywhere near that, and I'd like suggestions on how to improve on what I'm seeing.



Before the upgrade, there's always a full backup. I'm only interested in the end result and don't want a huge transaction log. I also don't want it to take too long and I don't want to blow out the disk space with transaction logs or temp files.



The table's been out there a while, so in our bigger customer dbs, it's already over 50 million rows. Each row is about 350-400 bytes. The columns are something like this



IdentityColID int, [type] int, [subtype] int, 
created datetime, author nvarchar(100), Message nvarchar(max)


The problems with the design are




  • The primary clustered key is (type, subtype, created, identitycolid), so it's an insert nightmare. Blocksplits all over the place. And even doing a SELECT COUNT(*) takes like 8 minutes.


  • There aren't good indexes to support the types of queries desired



I wanted to make a new table where the primary clustered index is the IdentityColId and add the indexes to support the type of necessary queries, and then copy the existing data over and drop the old table.



So far, I tried using BCP to get the data out, and importing with




  • BCP


  • BULK INSERT


  • INSERT INTO ... FROM OPENROWSET



The bcp export took about 25 minutes and the imports all took about 1.3 hour - about 1.5 hours. With Recovery Model Simple, the transaction log didn't grow but the cpu consumption was in the 60-65% range most of the time.



I tried just using T-SQL INSERT INTO NewTable SELECT * FROM OldTable, but even with Recovery Model Simple, the transaction log gets to 100 gig.



I tried using SSIS data import packages with the from/to model, and the net time was about an hour 20 minutes. With Recovery Model Simple, the transaction log stayed small.



Then I tried an SSIS Execute SQLTask package to effectively do the INSERT INTO NewTable... line within SSIS. That got the execution time down to about 1:15, but no matter what the recovery model, the transaction log ended up around 100 gig, though CPU consumption stays modest.



I'd like the end result to be one new table, so the suggestion from some of the articles I've read to parallelize into multiple result tables doesn't seem a profitable path. But so far, I just can't seem to approach those stats from the articles I've read.



Anyone have any suggestions on how I can goose this a bit?










share|improve this question
















bumped to the homepage by Community 11 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
















  • If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

    – Randi Vertongen
    Dec 21 '18 at 21:40








  • 5





    Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

    – ypercubeᵀᴹ
    Dec 21 '18 at 21:42








  • 2





    What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

    – Joe Obbish
    Dec 22 '18 at 0:15






  • 1





    I dont understand why you are copying the data. You can drop and add indexes on the existing table.

    – Sir Swears-a-lot
    Dec 24 '18 at 9:44











  • The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

    – user1664043
    Dec 26 '18 at 14:41














1












1








1


1






We've got this very badly designed logging table that we want to add functionality to. The problem is that it's already a scalability nightmare and we want to fix the design before adding to it, but we only have the nightly upgrade window to do it in.



I've seen a lot of articles about the various bulk copy options with SQL Server, claiming "We could move 80M rows in 10 minutes!" but so far my testing doesn't get anywhere near that, and I'd like suggestions on how to improve on what I'm seeing.



Before the upgrade, there's always a full backup. I'm only interested in the end result and don't want a huge transaction log. I also don't want it to take too long and I don't want to blow out the disk space with transaction logs or temp files.



The table's been out there a while, so in our bigger customer dbs, it's already over 50 million rows. Each row is about 350-400 bytes. The columns are something like this



IdentityColID int, [type] int, [subtype] int, 
created datetime, author nvarchar(100), Message nvarchar(max)


The problems with the design are




  • The primary clustered key is (type, subtype, created, identitycolid), so it's an insert nightmare. Blocksplits all over the place. And even doing a SELECT COUNT(*) takes like 8 minutes.


  • There aren't good indexes to support the types of queries desired



I wanted to make a new table where the primary clustered index is the IdentityColId and add the indexes to support the type of necessary queries, and then copy the existing data over and drop the old table.



So far, I tried using BCP to get the data out, and importing with




  • BCP


  • BULK INSERT


  • INSERT INTO ... FROM OPENROWSET



The bcp export took about 25 minutes and the imports all took about 1.3 hour - about 1.5 hours. With Recovery Model Simple, the transaction log didn't grow but the cpu consumption was in the 60-65% range most of the time.



I tried just using T-SQL INSERT INTO NewTable SELECT * FROM OldTable, but even with Recovery Model Simple, the transaction log gets to 100 gig.



I tried using SSIS data import packages with the from/to model, and the net time was about an hour 20 minutes. With Recovery Model Simple, the transaction log stayed small.



Then I tried an SSIS Execute SQLTask package to effectively do the INSERT INTO NewTable... line within SSIS. That got the execution time down to about 1:15, but no matter what the recovery model, the transaction log ended up around 100 gig, though CPU consumption stays modest.



I'd like the end result to be one new table, so the suggestion from some of the articles I've read to parallelize into multiple result tables doesn't seem a profitable path. But so far, I just can't seem to approach those stats from the articles I've read.



Anyone have any suggestions on how I can goose this a bit?










share|improve this question
















We've got this very badly designed logging table that we want to add functionality to. The problem is that it's already a scalability nightmare and we want to fix the design before adding to it, but we only have the nightly upgrade window to do it in.



I've seen a lot of articles about the various bulk copy options with SQL Server, claiming "We could move 80M rows in 10 minutes!" but so far my testing doesn't get anywhere near that, and I'd like suggestions on how to improve on what I'm seeing.



Before the upgrade, there's always a full backup. I'm only interested in the end result and don't want a huge transaction log. I also don't want it to take too long and I don't want to blow out the disk space with transaction logs or temp files.



The table's been out there a while, so in our bigger customer dbs, it's already over 50 million rows. Each row is about 350-400 bytes. The columns are something like this



IdentityColID int, [type] int, [subtype] int, 
created datetime, author nvarchar(100), Message nvarchar(max)


The problems with the design are




  • The primary clustered key is (type, subtype, created, identitycolid), so it's an insert nightmare. Blocksplits all over the place. And even doing a SELECT COUNT(*) takes like 8 minutes.


  • There aren't good indexes to support the types of queries desired



I wanted to make a new table where the primary clustered index is the IdentityColId and add the indexes to support the type of necessary queries, and then copy the existing data over and drop the old table.



So far, I tried using BCP to get the data out, and importing with




  • BCP


  • BULK INSERT


  • INSERT INTO ... FROM OPENROWSET



The bcp export took about 25 minutes and the imports all took about 1.3 hour - about 1.5 hours. With Recovery Model Simple, the transaction log didn't grow but the cpu consumption was in the 60-65% range most of the time.



I tried just using T-SQL INSERT INTO NewTable SELECT * FROM OldTable, but even with Recovery Model Simple, the transaction log gets to 100 gig.



I tried using SSIS data import packages with the from/to model, and the net time was about an hour 20 minutes. With Recovery Model Simple, the transaction log stayed small.



Then I tried an SSIS Execute SQLTask package to effectively do the INSERT INTO NewTable... line within SSIS. That got the execution time down to about 1:15, but no matter what the recovery model, the transaction log ended up around 100 gig, though CPU consumption stays modest.



I'd like the end result to be one new table, so the suggestion from some of the articles I've read to parallelize into multiple result tables doesn't seem a profitable path. But so far, I just can't seem to approach those stats from the articles I've read.



Anyone have any suggestions on how I can goose this a bit?







sql-server bulk-insert bulkcopy






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Dec 26 '18 at 9:37









marc_s

7,13853849




7,13853849










asked Dec 21 '18 at 21:29









user1664043user1664043

1061




1061





bumped to the homepage by Community 11 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 11 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

    – Randi Vertongen
    Dec 21 '18 at 21:40








  • 5





    Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

    – ypercubeᵀᴹ
    Dec 21 '18 at 21:42








  • 2





    What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

    – Joe Obbish
    Dec 22 '18 at 0:15






  • 1





    I dont understand why you are copying the data. You can drop and add indexes on the existing table.

    – Sir Swears-a-lot
    Dec 24 '18 at 9:44











  • The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

    – user1664043
    Dec 26 '18 at 14:41



















  • If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

    – Randi Vertongen
    Dec 21 '18 at 21:40








  • 5





    Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

    – ypercubeᵀᴹ
    Dec 21 '18 at 21:42








  • 2





    What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

    – Joe Obbish
    Dec 22 '18 at 0:15






  • 1





    I dont understand why you are copying the data. You can drop and add indexes on the existing table.

    – Sir Swears-a-lot
    Dec 24 '18 at 9:44











  • The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

    – user1664043
    Dec 26 '18 at 14:41

















If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

– Randi Vertongen
Dec 21 '18 at 21:40







If you are doing the insert in one batch, then it will be logged, and your log file will grow until the transaction is finished, even when the recovery model is simple. The lob data (messages column) could be slowing your inserts down, if you could do anything about that, that would help.

– Randi Vertongen
Dec 21 '18 at 21:40






5




5





Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

– ypercubeᵀᴹ
Dec 21 '18 at 21:42







Have you tried to use INSERT .. SELECT in batches (of say 1-10K rows eacj)? It would probbaly hep to add the indexes after the table has been populated.

– ypercubeᵀᴹ
Dec 21 '18 at 21:42






2




2





What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

– Joe Obbish
Dec 22 '18 at 0:15





What version of SQL Server? The table is 20 GB, correct? If rows are only 400 bytes then why is there an nvarchar(max) column? Do you want to do the fastest load possible while minimizing transaction log growth?

– Joe Obbish
Dec 22 '18 at 0:15




1




1





I dont understand why you are copying the data. You can drop and add indexes on the existing table.

– Sir Swears-a-lot
Dec 24 '18 at 9:44





I dont understand why you are copying the data. You can drop and add indexes on the existing table.

– Sir Swears-a-lot
Dec 24 '18 at 9:44













The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

– user1664043
Dec 26 '18 at 14:41





The nvarchar(max) is to allow people to type in whatever they want for audit notes, but on average people don't tend to type all that much. Yeah, I'm looking for fastest possible load while minimizing transaction log growth because of the maintenance window I have to work in.

– user1664043
Dec 26 '18 at 14:41










2 Answers
2






active

oldest

votes


















0














Did you change of the configurations when you tried SSIS? Using an Execute SQL Task with SQL statement is equivalent to running the same SQL in SSMS. Below is an overview of doing this process in SSIS. Of course, you'll want to change and test various settings to see what works best in your environment.




  • Add a Data Flow Task. In this, add an OLE DB Source component. I'd recommend using the "SQL Command" Data Access Mode and writing a SELECT statement that contains only the column you need.

  • Add either an OLE DB Destination or SQL Server Destination and connect the source to this. The SQL Server Destination tends to perform better, however is only available when importing to a database on a local server. If you use an OLE DB Destination, use the "Table or view - fast load" option. This will operate as a BULK INSERT internally. On the destination un-check the Check Constraints option to bypass this.

  • To avoid excessive transaction log use, change the Maximum Insert Commit Size (OLE DB Destination) to a value lower than the default. The equivelant to this on the SQL Server Destination in the MaxInsertCommitSize which you can view by right-clicking the component, selecting Show Advanced Editor, and going to the Component Properties page. The default of 0 means all records will be commit in a single transaction.

  • On the Data Flow Task, go to the Properties window and set AutoAdjustBufferSize to true. This will set the size of the buffer based on the DefaultBufferMaxRows property, which indicates how many rows can be held by the buffers.

  • You can adjust parallel execution as well. On the package, the MaxConcurrentExecutables property sets the max number of task that can be run. The default is -1, which is the number of processors where the package is running plus two.
    The Data Flow Task has an EngineThreads property, which controls how many threads can be used by it.






share|improve this answer































    0














    I can't give the credit to Sir Swears-A-lot because his response was in a comment, but he was right. My assumption that dropping the primary key and creating a new one would be effectively the same or more expensive than loading up an empty new table was incorrect.



    Setting recovery mode simple, dropping the primary key, recreating it (forcing the reshuffle), and adding all my new indexes afterwards clocked in at 40 minutes. That beat any bcp/SSIS method I'd tried by more than a half hour, so that was a big improvement.



    And the transaction log grew by 10 gig but didn't go nuts like some of my other experiments.






    share|improve this answer
























      Your Answer








      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "182"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225591%2fsql-server-bulk-transfer-help-needed%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      0














      Did you change of the configurations when you tried SSIS? Using an Execute SQL Task with SQL statement is equivalent to running the same SQL in SSMS. Below is an overview of doing this process in SSIS. Of course, you'll want to change and test various settings to see what works best in your environment.




      • Add a Data Flow Task. In this, add an OLE DB Source component. I'd recommend using the "SQL Command" Data Access Mode and writing a SELECT statement that contains only the column you need.

      • Add either an OLE DB Destination or SQL Server Destination and connect the source to this. The SQL Server Destination tends to perform better, however is only available when importing to a database on a local server. If you use an OLE DB Destination, use the "Table or view - fast load" option. This will operate as a BULK INSERT internally. On the destination un-check the Check Constraints option to bypass this.

      • To avoid excessive transaction log use, change the Maximum Insert Commit Size (OLE DB Destination) to a value lower than the default. The equivelant to this on the SQL Server Destination in the MaxInsertCommitSize which you can view by right-clicking the component, selecting Show Advanced Editor, and going to the Component Properties page. The default of 0 means all records will be commit in a single transaction.

      • On the Data Flow Task, go to the Properties window and set AutoAdjustBufferSize to true. This will set the size of the buffer based on the DefaultBufferMaxRows property, which indicates how many rows can be held by the buffers.

      • You can adjust parallel execution as well. On the package, the MaxConcurrentExecutables property sets the max number of task that can be run. The default is -1, which is the number of processors where the package is running plus two.
        The Data Flow Task has an EngineThreads property, which controls how many threads can be used by it.






      share|improve this answer




























        0














        Did you change of the configurations when you tried SSIS? Using an Execute SQL Task with SQL statement is equivalent to running the same SQL in SSMS. Below is an overview of doing this process in SSIS. Of course, you'll want to change and test various settings to see what works best in your environment.




        • Add a Data Flow Task. In this, add an OLE DB Source component. I'd recommend using the "SQL Command" Data Access Mode and writing a SELECT statement that contains only the column you need.

        • Add either an OLE DB Destination or SQL Server Destination and connect the source to this. The SQL Server Destination tends to perform better, however is only available when importing to a database on a local server. If you use an OLE DB Destination, use the "Table or view - fast load" option. This will operate as a BULK INSERT internally. On the destination un-check the Check Constraints option to bypass this.

        • To avoid excessive transaction log use, change the Maximum Insert Commit Size (OLE DB Destination) to a value lower than the default. The equivelant to this on the SQL Server Destination in the MaxInsertCommitSize which you can view by right-clicking the component, selecting Show Advanced Editor, and going to the Component Properties page. The default of 0 means all records will be commit in a single transaction.

        • On the Data Flow Task, go to the Properties window and set AutoAdjustBufferSize to true. This will set the size of the buffer based on the DefaultBufferMaxRows property, which indicates how many rows can be held by the buffers.

        • You can adjust parallel execution as well. On the package, the MaxConcurrentExecutables property sets the max number of task that can be run. The default is -1, which is the number of processors where the package is running plus two.
          The Data Flow Task has an EngineThreads property, which controls how many threads can be used by it.






        share|improve this answer


























          0












          0








          0







          Did you change of the configurations when you tried SSIS? Using an Execute SQL Task with SQL statement is equivalent to running the same SQL in SSMS. Below is an overview of doing this process in SSIS. Of course, you'll want to change and test various settings to see what works best in your environment.




          • Add a Data Flow Task. In this, add an OLE DB Source component. I'd recommend using the "SQL Command" Data Access Mode and writing a SELECT statement that contains only the column you need.

          • Add either an OLE DB Destination or SQL Server Destination and connect the source to this. The SQL Server Destination tends to perform better, however is only available when importing to a database on a local server. If you use an OLE DB Destination, use the "Table or view - fast load" option. This will operate as a BULK INSERT internally. On the destination un-check the Check Constraints option to bypass this.

          • To avoid excessive transaction log use, change the Maximum Insert Commit Size (OLE DB Destination) to a value lower than the default. The equivelant to this on the SQL Server Destination in the MaxInsertCommitSize which you can view by right-clicking the component, selecting Show Advanced Editor, and going to the Component Properties page. The default of 0 means all records will be commit in a single transaction.

          • On the Data Flow Task, go to the Properties window and set AutoAdjustBufferSize to true. This will set the size of the buffer based on the DefaultBufferMaxRows property, which indicates how many rows can be held by the buffers.

          • You can adjust parallel execution as well. On the package, the MaxConcurrentExecutables property sets the max number of task that can be run. The default is -1, which is the number of processors where the package is running plus two.
            The Data Flow Task has an EngineThreads property, which controls how many threads can be used by it.






          share|improve this answer













          Did you change of the configurations when you tried SSIS? Using an Execute SQL Task with SQL statement is equivalent to running the same SQL in SSMS. Below is an overview of doing this process in SSIS. Of course, you'll want to change and test various settings to see what works best in your environment.




          • Add a Data Flow Task. In this, add an OLE DB Source component. I'd recommend using the "SQL Command" Data Access Mode and writing a SELECT statement that contains only the column you need.

          • Add either an OLE DB Destination or SQL Server Destination and connect the source to this. The SQL Server Destination tends to perform better, however is only available when importing to a database on a local server. If you use an OLE DB Destination, use the "Table or view - fast load" option. This will operate as a BULK INSERT internally. On the destination un-check the Check Constraints option to bypass this.

          • To avoid excessive transaction log use, change the Maximum Insert Commit Size (OLE DB Destination) to a value lower than the default. The equivelant to this on the SQL Server Destination in the MaxInsertCommitSize which you can view by right-clicking the component, selecting Show Advanced Editor, and going to the Component Properties page. The default of 0 means all records will be commit in a single transaction.

          • On the Data Flow Task, go to the Properties window and set AutoAdjustBufferSize to true. This will set the size of the buffer based on the DefaultBufferMaxRows property, which indicates how many rows can be held by the buffers.

          • You can adjust parallel execution as well. On the package, the MaxConcurrentExecutables property sets the max number of task that can be run. The default is -1, which is the number of processors where the package is running plus two.
            The Data Flow Task has an EngineThreads property, which controls how many threads can be used by it.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Dec 26 '18 at 19:49









          userfl89userfl89

          4616




          4616

























              0














              I can't give the credit to Sir Swears-A-lot because his response was in a comment, but he was right. My assumption that dropping the primary key and creating a new one would be effectively the same or more expensive than loading up an empty new table was incorrect.



              Setting recovery mode simple, dropping the primary key, recreating it (forcing the reshuffle), and adding all my new indexes afterwards clocked in at 40 minutes. That beat any bcp/SSIS method I'd tried by more than a half hour, so that was a big improvement.



              And the transaction log grew by 10 gig but didn't go nuts like some of my other experiments.






              share|improve this answer




























                0














                I can't give the credit to Sir Swears-A-lot because his response was in a comment, but he was right. My assumption that dropping the primary key and creating a new one would be effectively the same or more expensive than loading up an empty new table was incorrect.



                Setting recovery mode simple, dropping the primary key, recreating it (forcing the reshuffle), and adding all my new indexes afterwards clocked in at 40 minutes. That beat any bcp/SSIS method I'd tried by more than a half hour, so that was a big improvement.



                And the transaction log grew by 10 gig but didn't go nuts like some of my other experiments.






                share|improve this answer


























                  0












                  0








                  0







                  I can't give the credit to Sir Swears-A-lot because his response was in a comment, but he was right. My assumption that dropping the primary key and creating a new one would be effectively the same or more expensive than loading up an empty new table was incorrect.



                  Setting recovery mode simple, dropping the primary key, recreating it (forcing the reshuffle), and adding all my new indexes afterwards clocked in at 40 minutes. That beat any bcp/SSIS method I'd tried by more than a half hour, so that was a big improvement.



                  And the transaction log grew by 10 gig but didn't go nuts like some of my other experiments.






                  share|improve this answer













                  I can't give the credit to Sir Swears-A-lot because his response was in a comment, but he was right. My assumption that dropping the primary key and creating a new one would be effectively the same or more expensive than loading up an empty new table was incorrect.



                  Setting recovery mode simple, dropping the primary key, recreating it (forcing the reshuffle), and adding all my new indexes afterwards clocked in at 40 minutes. That beat any bcp/SSIS method I'd tried by more than a half hour, so that was a big improvement.



                  And the transaction log grew by 10 gig but didn't go nuts like some of my other experiments.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Jan 4 at 20:59









                  user1664043user1664043

                  1061




                  1061






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Database Administrators Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f225591%2fsql-server-bulk-transfer-help-needed%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Liste der Baudenkmale in Friedland (Mecklenburg)

                      Single-Malt-Whisky

                      Czorneboh