|
dev
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
SqlCommand slow on INSERTOn each iteration it assigns data to values in the SqlParameter collection. The command text is an INSERT statement to a Sql Server database, run with an .ExecuteQuery I enclosed the loop in a SqlTransaction and commit it at the end. I timed the program and it inserts about 70 records a second...which I think is sort of slow...so I set up some Debug.WriteLines to show where the time was being spent. The DataReader loop to the DB2 table is instantaneous. Almost 0s spent getting each record. Same with assigning values. The slow step is the actual execution of the SqlCommand. However, I also ran a SQL Trace and monitored the execution of the statement on the server. It took 0s to execute. The SqlCommand itself is adding an extra 0.01s to 0.03s which can add up over the course of hundreds of thousands of records. So the only overhead is running .ExecuteQuery on the SqlCommand object(!) Is there anyway to reduce or minimize this overhead, or a setting that can affect performance. I mean if my external source and target are running at 0s - my code shouldn't be adding overhead to run a command! I have a Model A Ford that does not climb hills that well when towing my 65'
boat. What should I do? Ah, ADO.NET (or any of the data access interfaces) are not designed to do bulk inserts. While the 2.0 is faster than ever (with batch mode), it's still orders of magnitude slower than the fastest DAI INSERT loop. Using SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:tfmdnfk3ZIcV2GLeRVn-tw@speakeasy.net... > > I have a c# program that loops through a table on a DB2 database. > > On each iteration it assigns data to values in the SqlParameter > collection. The command text is an INSERT statement to a Sql Server > database, run with an .ExecuteQuery I enclosed the loop in a > SqlTransaction and commit it at the end. > > I timed the program and it inserts about 70 records a second...which I > think is sort of slow...so I set up some Debug.WriteLines to show where > the time was being spent. > > The DataReader loop to the DB2 table is instantaneous. Almost 0s spent > getting each record. Same with assigning values. > > The slow step is the actual execution of the SqlCommand. However, I also > ran a SQL Trace and monitored the execution of the statement on the > server. It took 0s to execute. The SqlCommand itself is adding an extra > 0.01s to 0.03s which can add up over the course of hundreds of thousands > of records. > > So the only overhead is running .ExecuteQuery on the SqlCommand object(!) > Is there anyway to reduce or minimize this overhead, or a setting that can > affect performance. > > I mean if my external source and target are running at 0s - my code > shouldn't be adding overhead to run a command! > I have a Model A Ford that does not climb hills that well when towing my 65' Put smaller tires on the Model A.> boat. What should I do? I'm not John, but thanks for the tip on SqlBulkCopy! Some of us don't get very many chances to peek over the trenches to find the new jewels in 2.0. -Mike Show quote "William (Bill) Vaughn" wrote: > I have a Model A Ford that does not climb hills that well when towing my 65' > boat. What should I do? > Ah, ADO.NET (or any of the data access interfaces) are not designed to do > bulk inserts. While the 2.0 is faster than ever (with batch mode), it's > still orders of magnitude slower than the fastest DAI INSERT loop. Using > SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. > > -- > ____________________________________ > William (Bill) Vaughn > Author, Mentor, Consultant > Microsoft MVP > INETA Speaker > www.betav.com/blog/billva > www.betav.com > Please reply only to the newsgroup so that others can benefit. > This posting is provided "AS IS" with no warranties, and confers no rights. > __________________________________ > > "John Bailo" <jaba***@texeme.com> wrote in message > news:tfmdnfk3ZIcV2GLeRVn-tw@speakeasy.net... > > > > I have a c# program that loops through a table on a DB2 database. > > > > On each iteration it assigns data to values in the SqlParameter > > collection. The command text is an INSERT statement to a Sql Server > > database, run with an .ExecuteQuery I enclosed the loop in a > > SqlTransaction and commit it at the end. > > > > I timed the program and it inserts about 70 records a second...which I > > think is sort of slow...so I set up some Debug.WriteLines to show where > > the time was being spent. > > > > The DataReader loop to the DB2 table is instantaneous. Almost 0s spent > > getting each record. Same with assigning values. > > > > The slow step is the actual execution of the SqlCommand. However, I also > > ran a SQL Trace and monitored the execution of the statement on the > > server. It took 0s to execute. The SqlCommand itself is adding an extra > > 0.01s to 0.03s which can add up over the course of hundreds of thousands > > of records. > > > > So the only overhead is running .ExecuteQuery on the SqlCommand object(!) > > Is there anyway to reduce or minimize this overhead, or a setting that can > > affect performance. > > > > I mean if my external source and target are running at 0s - my code > > shouldn't be adding overhead to run a command! > > > Ok, I'm curious.
Just why is BULK INSERT so fast? I mean, does it not go through the SQL DBMS itself and somehow write directly to the the .mdf file? What is the mechanism it uses? Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! William (Bill) Vaughn wrote: Show quote > I have a Model A Ford that does not climb hills that well when towing my 65' > boat. What should I do? > Ah, ADO.NET (or any of the data access interfaces) are not designed to do > bulk inserts. While the 2.0 is faster than ever (with batch mode), it's > still orders of magnitude slower than the fastest DAI INSERT loop. Using > SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. > Ok, I'm curious.
Just why is BULK INSERT so fast? I mean, does it not go through the SQL DBMS itself and somehow write directly to the the .mdf file? What is the mechanism it uses? Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! William (Bill) Vaughn wrote: Show quote > I have a Model A Ford that does not climb hills that well when towing my 65' > boat. What should I do? > Ah, ADO.NET (or any of the data access interfaces) are not designed to do > bulk inserts. While the 2.0 is faster than ever (with batch mode), it's > still orders of magnitude slower than the fastest DAI INSERT loop. Using > SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. > John Bailo wrote:
> Ok, I'm curious. Well, I'm not a SQL Server expert or anything, but I believe bulk> > Just why is BULK INSERT so fast? I mean, does it not go through the > SQL DBMS itself and somehow write directly to the the .mdf file? What > is the mechanism it uses? > inserts bypass the transaction log. That probably accounts for some of the difference anyway. Show quote > Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 > records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! > > John Bailo wrote:
> Ok, I'm curious. Well, I'm not a SQL Server expert or anything, but I believe bulk> > Just why is BULK INSERT so fast? I mean, does it not go through the > SQL DBMS itself and somehow write directly to the the .mdf file? What > is the mechanism it uses? > inserts bypass the transaction log. That probably accounts for some of the difference anyway. Show quote > Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 > records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! > > To bad I can't programmatically disable the transaction log or request
sql not to log my inserts... Brian Gideon wrote: Show quote > John Bailo wrote: > >>Ok, I'm curious. >> >>Just why is BULK INSERT so fast? I mean, does it not go through the >>SQL DBMS itself and somehow write directly to the the .mdf file? What >>is the mechanism it uses? >> > > > Well, I'm not a SQL Server expert or anything, but I believe bulk > inserts bypass the transaction log. That probably accounts for some of > the difference anyway. > > >>Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 >>records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! >> >> > > To bad I can't programmatically disable the transaction log or request
sql not to log my inserts... Brian Gideon wrote: Show quote > John Bailo wrote: > >>Ok, I'm curious. >> >>Just why is BULK INSERT so fast? I mean, does it not go through the >>SQL DBMS itself and somehow write directly to the the .mdf file? What >>is the mechanism it uses? >> > > > Well, I'm not a SQL Server expert or anything, but I believe bulk > inserts bypass the transaction log. That probably accounts for some of > the difference anyway. > > >>Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 >>records in 10-14s ( or about 450 to 500 recs per second ). Not too shabby! >> >> > > The bulk copy interface and functionality has been in place since the Sybase
versions... a long time. In the Model A days we had to use the BCP utility, but in SS7 and later we could use one of the SQL Server management "object" libraries (SQLDMO, SQLSMO) to call Bulk copy functionality. In SS 2000 (or earlier?) it appeared as TSQL functions too. Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp table (or a permanent work table). Generally, these tables don't have indexes or other constraints to slow down the import process. BCP also uses special TDS stream packets to move the data. It's THE way to go when moving data from/to the server. DTS leverages this technology by permitting you to setup a scripted BCP operation that can transform (edit) the data as it's moved from any data source with a provider or driver (.NET, OLE DB, ODBC, text, tight-string-with-two-cans). hth -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:mv2dnZtVKfa8_mLeRVn-pQ@speakeasy.net... > > Ok, I'm curious. > > Just why is BULK INSERT so fast? I mean, does it not go through the > SQL DBMS itself and somehow write directly to the the .mdf file? What is > the mechanism it uses? > > Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 > records in 10-14s ( or about 450 to 500 recs per second ). Not too > shabby! > > > William (Bill) Vaughn wrote: >> I have a Model A Ford that does not climb hills that well when towing my >> 65' boat. What should I do? >> Ah, ADO.NET (or any of the data access interfaces) are not designed to do >> bulk inserts. While the 2.0 is faster than ever (with batch mode), it's >> still orders of magnitude slower than the fastest DAI INSERT loop. Using >> SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. >> The bulk copy interface and functionality has been in place since the Sybase
versions... a long time. In the Model A days we had to use the BCP utility, but in SS7 and later we could use one of the SQL Server management "object" libraries (SQLDMO, SQLSMO) to call Bulk copy functionality. In SS 2000 (or earlier?) it appeared as TSQL functions too. Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp table (or a permanent work table). Generally, these tables don't have indexes or other constraints to slow down the import process. BCP also uses special TDS stream packets to move the data. It's THE way to go when moving data from/to the server. DTS leverages this technology by permitting you to setup a scripted BCP operation that can transform (edit) the data as it's moved from any data source with a provider or driver (.NET, OLE DB, ODBC, text, tight-string-with-two-cans). hth -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:mv2dnZtVKfa8_mLeRVn-pQ@speakeasy.net... > > Ok, I'm curious. > > Just why is BULK INSERT so fast? I mean, does it not go through the > SQL DBMS itself and somehow write directly to the the .mdf file? What is > the mechanism it uses? > > Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000 > records in 10-14s ( or about 450 to 500 recs per second ). Not too > shabby! > > > William (Bill) Vaughn wrote: >> I have a Model A Ford that does not climb hills that well when towing my >> 65' boat. What should I do? >> Ah, ADO.NET (or any of the data access interfaces) are not designed to do >> bulk inserts. While the 2.0 is faster than ever (with batch mode), it's >> still orders of magnitude slower than the fastest DAI INSERT loop. Using >> SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds. >> The thing is, that answer really doesn't address my question.
As I mentioned, when I run the Trace utility, the duration of each insert command is 0s. It's the wait time to return before the SqlCommand and after the SqlCommand. So, to me, if I were to string together a bunch of INSERT statements, that represent the changing parameters of the INSERT, it should run lightening fast. My question still stands: why is there so much overhead in the SqlCommand object? Or, is it the transmission time to send the command on the network? Or the conversion to TDS protocol? If either of those factors could be reduced, then I could send my INSERTs through almost instantly. William (Bill) Vaughn wrote: Show quote > The bulk copy interface and functionality has been in place since the Sybase > versions... a long time. In the Model A days we had to use the BCP utility, > but in SS7 and later we could use one of the SQL Server management "object" > libraries (SQLDMO, SQLSMO) to call Bulk copy functionality. In SS 2000 (or > earlier?) it appeared as TSQL functions too. > > Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp > table (or a permanent work table). Generally, these tables don't have > indexes or other constraints to slow down the import process. BCP also uses > special TDS stream packets to move the data. It's THE way to go when moving > data from/to the server. DTS leverages this technology by permitting you to > setup a scripted BCP operation that can transform (edit) the data as it's > moved from any data source with a provider or driver (.NET, OLE DB, ODBC, > text, tight-string-with-two-cans). > > hth > The thing is, that answer really doesn't address my question.
As I mentioned, when I run the Trace utility, the duration of each insert command is 0s. It's the wait time to return before the SqlCommand and after the SqlCommand. So, to me, if I were to string together a bunch of INSERT statements, that represent the changing parameters of the INSERT, it should run lightening fast. My question still stands: why is there so much overhead in the SqlCommand object? Or, is it the transmission time to send the command on the network? Or the conversion to TDS protocol? If either of those factors could be reduced, then I could send my INSERTs through almost instantly. William (Bill) Vaughn wrote: Show quote > The bulk copy interface and functionality has been in place since the Sybase > versions... a long time. In the Model A days we had to use the BCP utility, > but in SS7 and later we could use one of the SQL Server management "object" > libraries (SQLDMO, SQLSMO) to call Bulk copy functionality. In SS 2000 (or > earlier?) it appeared as TSQL functions too. > > Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp > table (or a permanent work table). Generally, these tables don't have > indexes or other constraints to slow down the import process. BCP also uses > special TDS stream packets to move the data. It's THE way to go when moving > data from/to the server. DTS leverages this technology by permitting you to > setup a scripted BCP operation that can transform (edit) the data as it's > moved from any data source with a provider or driver (.NET, OLE DB, ODBC, > text, tight-string-with-two-cans). > > hth > It's the fundamental difference in the mechanism. First, each INSERT
statement is sent as text to the server, not as raw data. The SQL Server compiler then needs to compile the INSERT statement(s) and generate a query plan. Nope, this does not take long, but it takes time. It then logs the operation to the TL (which can't be disabled) and then to the database. At that point the constraints are checked, the indexes are built and any RI checks are made. In the case of BCP, the protocol (which is proprietary and subject to change) opens a channel, sends the meta data (once), and the server starts an agent that simply writes the inbound data stream (binary) to the rows in the target table. It requires very little overhead--90% of which can't be disabled. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:Z4OdnVsa1eyyHmLenZ2dnUVZ_sCdnZ2d@speakeasy.net... > > The thing is, that answer really doesn't address my question. > > As I mentioned, when I run the Trace utility, the duration of each insert > command is 0s. > > It's the wait time to return before the SqlCommand and after the > SqlCommand. > > So, to me, if I were to string together a bunch of INSERT statements, that > represent the changing parameters of the INSERT, it should run lightening > fast. > > My question still stands: why is there so much overhead in the SqlCommand > object? Or, is it the transmission time to send the command on the > network? Or the conversion to TDS protocol? > > If either of those factors could be reduced, then I could send my INSERTs > through almost instantly. > > William (Bill) Vaughn wrote: >> The bulk copy interface and functionality has been in place since the >> Sybase versions... a long time. In the Model A days we had to use the BCP >> utility, but in SS7 and later we could use one of the SQL Server >> management "object" libraries (SQLDMO, SQLSMO) to call Bulk copy >> functionality. In SS 2000 (or earlier?) it appeared as TSQL functions >> too. >> >> Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp >> table (or a permanent work table). Generally, these tables don't have >> indexes or other constraints to slow down the import process. BCP also >> uses special TDS stream packets to move the data. It's THE way to go when >> moving data from/to the server. DTS leverages this technology by >> permitting you to setup a scripted BCP operation that can transform >> (edit) the data as it's moved from any data source with a provider or >> driver (.NET, OLE DB, ODBC, text, tight-string-with-two-cans). >> >> hth >> It's the fundamental difference in the mechanism. First, each INSERT
statement is sent as text to the server, not as raw data. The SQL Server compiler then needs to compile the INSERT statement(s) and generate a query plan. Nope, this does not take long, but it takes time. It then logs the operation to the TL (which can't be disabled) and then to the database. At that point the constraints are checked, the indexes are built and any RI checks are made. In the case of BCP, the protocol (which is proprietary and subject to change) opens a channel, sends the meta data (once), and the server starts an agent that simply writes the inbound data stream (binary) to the rows in the target table. It requires very little overhead--90% of which can't be disabled. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:Z4OdnVsa1eyyHmLenZ2dnUVZ_sCdnZ2d@speakeasy.net... > > The thing is, that answer really doesn't address my question. > > As I mentioned, when I run the Trace utility, the duration of each insert > command is 0s. > > It's the wait time to return before the SqlCommand and after the > SqlCommand. > > So, to me, if I were to string together a bunch of INSERT statements, that > represent the changing parameters of the INSERT, it should run lightening > fast. > > My question still stands: why is there so much overhead in the SqlCommand > object? Or, is it the transmission time to send the command on the > network? Or the conversion to TDS protocol? > > If either of those factors could be reduced, then I could send my INSERTs > through almost instantly. > > William (Bill) Vaughn wrote: >> The bulk copy interface and functionality has been in place since the >> Sybase versions... a long time. In the Model A days we had to use the BCP >> utility, but in SS7 and later we could use one of the SQL Server >> management "object" libraries (SQLDMO, SQLSMO) to call Bulk copy >> functionality. In SS 2000 (or earlier?) it appeared as TSQL functions >> too. >> >> Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp >> table (or a permanent work table). Generally, these tables don't have >> indexes or other constraints to slow down the import process. BCP also >> uses special TDS stream packets to move the data. It's THE way to go when >> moving data from/to the server. DTS leverages this technology by >> permitting you to setup a scripted BCP operation that can transform >> (edit) the data as it's moved from any data source with a provider or >> driver (.NET, OLE DB, ODBC, text, tight-string-with-two-cans). >> >> hth >> > It's the fundamental difference in the mechanism. First, each INSERT ....Which then causes me to question: Are you Prepare() ing the insert > statement is sent as text to the server, not as raw data. The SQL Server > compiler then needs to compile the INSERT statement(s) and generate a > query plan. statement? I know you mentioned you were using parameters, but I think the speed bonus does not occurr if the statement is not prepared (because, as Bill said, the insert statements have to be parsed each time). At least with Oracle it makes a huge difference in speed. > It's the fundamental difference in the mechanism. First, each INSERT ....Which then causes me to question: Are you Prepare() ing the insert > statement is sent as text to the server, not as raw data. The SQL Server > compiler then needs to compile the INSERT statement(s) and generate a > query plan. statement? I know you mentioned you were using parameters, but I think the speed bonus does not occurr if the statement is not prepared (because, as Bill said, the insert statements have to be parsed each time). At least with Oracle it makes a huge difference in speed. Yes, I issue the .Prepare() method before entering the loop.
Gabriel Magaña wrote: Show quote >>It's the fundamental difference in the mechanism. First, each INSERT >>statement is sent as text to the server, not as raw data. The SQL Server >>compiler then needs to compile the INSERT statement(s) and generate a >>query plan. > > > ...Which then causes me to question: Are you Prepare() ing the insert > statement? I know you mentioned you were using parameters, but I think the > speed bonus does not occurr if the statement is not prepared (because, as > Bill said, the insert statements have to be parsed each time). > > At least with Oracle it makes a huge difference in speed. > > Yes, I issue the .Prepare() method before entering the loop.
Gabriel Magaña wrote: Show quote >>It's the fundamental difference in the mechanism. First, each INSERT >>statement is sent as text to the server, not as raw data. The SQL Server >>compiler then needs to compile the INSERT statement(s) and generate a >>query plan. > > > ...Which then causes me to question: Are you Prepare() ing the insert > statement? I know you mentioned you were using parameters, but I think the > speed bonus does not occurr if the statement is not prepared (because, as > Bill said, the insert statements have to be parsed each time). > > At least with Oracle it makes a huge difference in speed. > > Shouldn't using a .Prepare() method (which I do) eliminate that overhead?
Also, I enclose loop in a SqlTransaction. Doesn't that mean that all the log entries are written when the transaction is committed? William (Bill) Vaughn wrote: Show quote > It's the fundamental difference in the mechanism. First, each INSERT > statement is sent as text to the server, not as raw data. The SQL Server > compiler then needs to compile the INSERT statement(s) and generate a query > plan. Nope, this does not take long, but it takes time. It then logs the > operation to the TL (which can't be disabled) and then to the database. At > that point the constraints are checked, the indexes are built and any RI > checks are made. > > In the case of BCP, the protocol (which is proprietary and subject to > change) opens a channel, sends the meta data (once), and the server starts > an agent that simply writes the inbound data stream (binary) to the rows in > the target table. It requires very little overhead--90% of which can't be > disabled. > > Shouldn't using a .Prepare() method (which I do) eliminate that overhead?
Also, I enclose loop in a SqlTransaction. Doesn't that mean that all the log entries are written when the transaction is committed? William (Bill) Vaughn wrote: Show quote > It's the fundamental difference in the mechanism. First, each INSERT > statement is sent as text to the server, not as raw data. The SQL Server > compiler then needs to compile the INSERT statement(s) and generate a query > plan. Nope, this does not take long, but it takes time. It then logs the > operation to the TL (which can't be disabled) and then to the database. At > that point the constraints are checked, the indexes are built and any RI > checks are made. > > In the case of BCP, the protocol (which is proprietary and subject to > change) opens a channel, sends the meta data (once), and the server starts > an agent that simply writes the inbound data stream (binary) to the rows in > the target table. It requires very little overhead--90% of which can't be > disabled. > > > Also, I enclose loop in a SqlTransaction. I might be mistaken for SQL Server (I'm certified as an Oracle DBA, but I > Doesn't that mean that all the log entries are written when the > transaction is committed? only know SQLServer superficially), but what happens in Oracle (somewhat simplified) is that the DB itself is updated during a transaction, and the log keeps the original data in it. That way when you commit a transaction there is very little cost in terms of IO, there is only a cost if you rollback the transaction (which is assumed to happen very few times). I would not be suprised at all if SQL Server worked this way too, since, if the assumption that the ratio of transaction commits to rollbacks is extremely high holds, transactions take up very little extra overhead. The only problem is when a transaction involves a great many operations: The log system gets bogged down since a very big part of the log is "vital" and could be needed in case of a rollback. For the stuff I do normally, I have noticed that committing transactions every could hundred record insertions is a good mid-point performance-wise. But that number is highly dependant on the server hardware and also on the size of the records being inserted. Inserting 100,000 records in a single transaction is just asking for trouble, in other words. > Also, I enclose loop in a SqlTransaction. I might be mistaken for SQL Server (I'm certified as an Oracle DBA, but I > Doesn't that mean that all the log entries are written when the > transaction is committed? only know SQLServer superficially), but what happens in Oracle (somewhat simplified) is that the DB itself is updated during a transaction, and the log keeps the original data in it. That way when you commit a transaction there is very little cost in terms of IO, there is only a cost if you rollback the transaction (which is assumed to happen very few times). I would not be suprised at all if SQL Server worked this way too, since, if the assumption that the ratio of transaction commits to rollbacks is extremely high holds, transactions take up very little extra overhead. The only problem is when a transaction involves a great many operations: The log system gets bogged down since a very big part of the log is "vital" and could be needed in case of a rollback. For the stuff I do normally, I have noticed that committing transactions every could hundred record insertions is a good mid-point performance-wise. But that number is highly dependant on the server hardware and also on the size of the records being inserted. Inserting 100,000 records in a single transaction is just asking for trouble, in other words. I guess I might start believing that the transaction log is the
additional overhead, except that I would expect that to be reflected in the duration reported in a the SQL trace. But it isn't! The duration for each INSERT statement is 0 -- or less than 1 millesecond! Wouldn't the time to write to the transaction log be reflected in the duration for each INSERT? Gabriel Magaña wrote: Show quote >>Also, I enclose loop in a SqlTransaction. >>Doesn't that mean that all the log entries are written when the >>transaction is committed? > > > I might be mistaken for SQL Server (I'm certified as an Oracle DBA, but I > only know SQLServer superficially), but what happens in Oracle (somewhat > simplified) is that the DB itself is updated during a transaction, and the > log keeps the original data in it. That way when you commit a transaction > there is very little cost in terms of IO, there is only a cost if you > rollback the transaction (which is assumed to happen very few times). I > would not be suprised at all if SQL Server worked this way too, since, if > the assumption that the ratio of transaction commits to rollbacks is > extremely high holds, transactions take up very little extra overhead. > > The only problem is when a transaction involves a great many operations: > The log system gets bogged down since a very big part of the log is "vital" > and could be needed in case of a rollback. For the stuff I do normally, I > have noticed that committing transactions every could hundred record > insertions is a good mid-point performance-wise. But that number is highly > dependant on the server hardware and also on the size of the records being > inserted. Inserting 100,000 records in a single transaction is just asking > for trouble, in other words. > > I guess I might start believing that the transaction log is the
additional overhead, except that I would expect that to be reflected in the duration reported in a the SQL trace. But it isn't! The duration for each INSERT statement is 0 -- or less than 1 millesecond! Wouldn't the time to write to the transaction log be reflected in the duration for each INSERT? Gabriel Magaña wrote: Show quote >>Also, I enclose loop in a SqlTransaction. >>Doesn't that mean that all the log entries are written when the >>transaction is committed? > > > I might be mistaken for SQL Server (I'm certified as an Oracle DBA, but I > only know SQLServer superficially), but what happens in Oracle (somewhat > simplified) is that the DB itself is updated during a transaction, and the > log keeps the original data in it. That way when you commit a transaction > there is very little cost in terms of IO, there is only a cost if you > rollback the transaction (which is assumed to happen very few times). I > would not be suprised at all if SQL Server worked this way too, since, if > the assumption that the ratio of transaction commits to rollbacks is > extremely high holds, transactions take up very little extra overhead. > > The only problem is when a transaction involves a great many operations: > The log system gets bogged down since a very big part of the log is "vital" > and could be needed in case of a rollback. For the stuff I do normally, I > have noticed that committing transactions every could hundred record > insertions is a good mid-point performance-wise. But that number is highly > dependant on the server hardware and also on the size of the records being > inserted. Inserting 100,000 records in a single transaction is just asking > for trouble, in other words. > > > I guess I might start believing that the transaction log is the additional Writing to the transaction log at insert time is not the only I/O done on > overhead, except that I would expect that to be reflected in the duration > reported in a the SQL trace. > But it isn't! The duration for each INSERT statement is 0 -- or less > than 1 millesecond! Wouldn't the time to write to the transaction log be > reflected in the duration for each INSERT? the transaction log. There is another I/O done when you commit the transaction, to tell the DB that the logs are ok to be recycled, since you have just commited the transaction (and hence, cannot rollback anymore). You might try splitting into several smaller transactions (with their own commits, of course) to see if you get any speed improvements. But to answer your question, I do not know exactly how SQL trace works, so I do not know if it includes updates to the log. Bypassing the log is dangerous though... You would not be able to rollback (so that means no transactions either), nor would you be able to recover from data curruption should the DB crash during the insert operation. You may as well use DBF files instead of SQL server, inserts to that are lightning fast. I have done this in the past where speed is absolutely of the essence and data loss is no big deal... But there are lots of downsides, I guess it's the same as using goto statements: Only people who fully understand what they are doing should be messing with it! > I guess I might start believing that the transaction log is the additional Writing to the transaction log at insert time is not the only I/O done on > overhead, except that I would expect that to be reflected in the duration > reported in a the SQL trace. > But it isn't! The duration for each INSERT statement is 0 -- or less > than 1 millesecond! Wouldn't the time to write to the transaction log be > reflected in the duration for each INSERT? the transaction log. There is another I/O done when you commit the transaction, to tell the DB that the logs are ok to be recycled, since you have just commited the transaction (and hence, cannot rollback anymore). You might try splitting into several smaller transactions (with their own commits, of course) to see if you get any speed improvements. But to answer your question, I do not know exactly how SQL trace works, so I do not know if it includes updates to the log. Bypassing the log is dangerous though... You would not be able to rollback (so that means no transactions either), nor would you be able to recover from data curruption should the DB crash during the insert operation. You may as well use DBF files instead of SQL server, inserts to that are lightning fast. I have done this in the past where speed is absolutely of the essence and data loss is no big deal... But there are lots of downsides, I guess it's the same as using goto statements: Only people who fully understand what they are doing should be messing with it! No. You still don't understand. BCP requires no compile overhead and uses
direct IO to the DBMS instead of "logical" writes. And no, we don't usually write to a permanent table so the Transaction Log issues are not a factor. When we're ready to post the data to the live tables we run a SP that edits the data and does a proper INSERT. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:15Sdnblm0qhOCGLenZ2dneKdnZydnZ2d@speakeasy.net... > > Shouldn't using a .Prepare() method (which I do) eliminate that overhead? > > Also, I enclose loop in a SqlTransaction. > > Doesn't that mean that all the log entries are written when the > transaction is committed? > > > > William (Bill) Vaughn wrote: >> It's the fundamental difference in the mechanism. First, each INSERT >> statement is sent as text to the server, not as raw data. The SQL Server >> compiler then needs to compile the INSERT statement(s) and generate a >> query plan. Nope, this does not take long, but it takes time. It then >> logs the operation to the TL (which can't be disabled) and then to the >> database. At that point the constraints are checked, the indexes are >> built and any RI checks are made. >> >> In the case of BCP, the protocol (which is proprietary and subject to >> change) opens a channel, sends the meta data (once), and the server >> starts an agent that simply writes the inbound data stream (binary) to >> the rows in the target table. It requires very little overhead--90% of >> which can't be disabled. >> No. You still don't understand. BCP requires no compile overhead and uses
direct IO to the DBMS instead of "logical" writes. And no, we don't usually write to a permanent table so the Transaction Log issues are not a factor. When we're ready to post the data to the live tables we run a SP that edits the data and does a proper INSERT. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:15Sdnblm0qhOCGLenZ2dneKdnZydnZ2d@speakeasy.net... > > Shouldn't using a .Prepare() method (which I do) eliminate that overhead? > > Also, I enclose loop in a SqlTransaction. > > Doesn't that mean that all the log entries are written when the > transaction is committed? > > > > William (Bill) Vaughn wrote: >> It's the fundamental difference in the mechanism. First, each INSERT >> statement is sent as text to the server, not as raw data. The SQL Server >> compiler then needs to compile the INSERT statement(s) and generate a >> query plan. Nope, this does not take long, but it takes time. It then >> logs the operation to the TL (which can't be disabled) and then to the >> database. At that point the constraints are checked, the indexes are >> built and any RI checks are made. >> >> In the case of BCP, the protocol (which is proprietary and subject to >> change) opens a channel, sends the meta data (once), and the server >> starts an agent that simply writes the inbound data stream (binary) to >> the rows in the target table. It requires very little overhead--90% of >> which can't be disabled. >> Would writing to a #temp table, or to a table variable (and then
transferring the data to a regular table with INSERT...SELECT) be faster than regular INSERTs to a table? William (Bill) Vaughn wrote: Show quote > No. You still don't understand. BCP requires no compile overhead and uses > direct IO to the DBMS instead of "logical" writes. And no, we don't usually > write to a permanent table so the Transaction Log issues are not a factor. > When we're ready to post the data to the live tables we run a SP that edits > the data and does a proper INSERT. > Would writing to a #temp table, or to a table variable (and then
transferring the data to a regular table with INSERT...SELECT) be faster than regular INSERTs to a table? William (Bill) Vaughn wrote: Show quote > No. You still don't understand. BCP requires no compile overhead and uses > direct IO to the DBMS instead of "logical" writes. And no, we don't usually > write to a permanent table so the Transaction Log issues are not a factor. > When we're ready to post the data to the live tables we run a SP that edits > the data and does a proper INSERT. > Basically, yes. We generally transfer to "temporary" tables in the database
(not to tempdb) and perform intelligent INSERT/UPDATE statements from there. It's far faster. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:B66dnatsivNZ45nZRVn-uw@speakeasy.net... > > Would writing to a #temp table, or to a table variable (and then > transferring the data to a regular table with INSERT...SELECT) be faster > than regular INSERTs to a table? > > > William (Bill) Vaughn wrote: >> No. You still don't understand. BCP requires no compile overhead and uses >> direct IO to the DBMS instead of "logical" writes. And no, we don't >> usually write to a permanent table so the Transaction Log issues are not >> a factor. When we're ready to post the data to the live tables we run a >> SP that edits the data and does a proper INSERT. >> > Basically, yes. We generally transfer to "temporary" tables in the database
(not to tempdb) and perform intelligent INSERT/UPDATE statements from there. It's far faster. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:B66dnatsivNZ45nZRVn-uw@speakeasy.net... > > Would writing to a #temp table, or to a table variable (and then > transferring the data to a regular table with INSERT...SELECT) be faster > than regular INSERTs to a table? > > > William (Bill) Vaughn wrote: >> No. You still don't understand. BCP requires no compile overhead and uses >> direct IO to the DBMS instead of "logical" writes. And no, we don't >> usually write to a permanent table so the Transaction Log issues are not >> a factor. When we're ready to post the data to the live tables we run a >> SP that edits the data and does a proper INSERT. >> > Hmmm...well, there's no reason I can't do that as an intermediate way in
my code. Do I have to create the temporary table on the same command that I run my INSERTS on? Or can I just create two SqlCommand's and run them on the same SqlConnection? Will the table remain in memory if I do that? And then I'll need to do the transfer from the temp table to the table after that. William (Bill) Vaughn wrote: Show quote > Basically, yes. We generally transfer to "temporary" tables in the database > (not to tempdb) and perform intelligent INSERT/UPDATE statements from there. > It's far faster. > Hmmm...well, there's no reason I can't do that as an intermediate way in
my code. Do I have to create the temporary table on the same command that I run my INSERTS on? Or can I just create two SqlCommand's and run them on the same SqlConnection? Will the table remain in memory if I do that? And then I'll need to do the transfer from the temp table to the table after that. William (Bill) Vaughn wrote: Show quote > Basically, yes. We generally transfer to "temporary" tables in the database > (not to tempdb) and perform intelligent INSERT/UPDATE statements from there. > It's far faster. > John, I'm not talking about using INSERTs from your client. I'm talking
about using an intelligent INSERT statement within the context of a stored procedure that runs on the server. It inserts rows from a "temporary" table on the server (which was filled with BulkCopy) to a permanent table. This permits the server-side business rules, triggers, RI and other logic to run. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:XYOdnQFdSYjeMZnZnZ2dnUVZ_tidnZ2d@speakeasy.net... > > Hmmm...well, there's no reason I can't do that as an intermediate way in > my code. > > Do I have to create the temporary table on the same command that I run my > INSERTS on? > > Or can I just create two SqlCommand's and run them on the same > SqlConnection? Will the table remain in memory if I do that? > > And then I'll need to do the transfer from the temp table to the table > after that. > > > > William (Bill) Vaughn wrote: >> Basically, yes. We generally transfer to "temporary" tables in the >> database (not to tempdb) and perform intelligent INSERT/UPDATE statements >> from there. It's far faster. >> > John, I'm not talking about using INSERTs from your client. I'm talking
about using an intelligent INSERT statement within the context of a stored procedure that runs on the server. It inserts rows from a "temporary" table on the server (which was filled with BulkCopy) to a permanent table. This permits the server-side business rules, triggers, RI and other logic to run. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ "John Bailo" <jaba***@texeme.com> wrote in message news:XYOdnQFdSYjeMZnZnZ2dnUVZ_tidnZ2d@speakeasy.net... > > Hmmm...well, there's no reason I can't do that as an intermediate way in > my code. > > Do I have to create the temporary table on the same command that I run my > INSERTS on? > > Or can I just create two SqlCommand's and run them on the same > SqlConnection? Will the table remain in memory if I do that? > > And then I'll need to do the transfer from the temp table to the table > after that. > > > > William (Bill) Vaughn wrote: >> Basically, yes. We generally transfer to "temporary" tables in the >> database (not to tempdb) and perform intelligent INSERT/UPDATE statements >> from there. It's far faster. >> > One thing I found that seems to make a little difference is using the
..Prepare() method on my SqlCommand. This seems to take a SqlCommand and turn it into a sproc, /on-the-fly/, for SqlCommands that need to be reused. John Bailo wrote: Show quote > > I have a c# program that loops through a table on a DB2 database. > > On each iteration it assigns data to values in the SqlParameter > collection. The command text is an INSERT statement to a Sql Server > database, run with an .ExecuteQuery I enclosed the loop in a > SqlTransaction and commit it at the end. > > I timed the program and it inserts about 70 records a second...which I > think is sort of slow...so I set up some Debug.WriteLines to show where > the time was being spent. > > The DataReader loop to the DB2 table is instantaneous. Almost 0s spent > getting each record. Same with assigning values. > > The slow step is the actual execution of the SqlCommand. However, I > also ran a SQL Trace and monitored the execution of the statement on the > server. It took 0s to execute. The SqlCommand itself is adding an > extra 0.01s to 0.03s which can add up over the course of hundreds of > thousands of records. > > So the only overhead is running .ExecuteQuery on the SqlCommand > object(!) Is there anyway to reduce or minimize this overhead, or a > setting that can affect performance. > > I mean if my external source and target are running at 0s - my code > shouldn't be adding overhead to run a command! One thing I found that seems to make a little difference is using the
..Prepare() method on my SqlCommand. This seems to take a SqlCommand and turn it into a sproc, /on-the-fly/, for SqlCommands that need to be reused. John Bailo wrote: Show quote > > I have a c# program that loops through a table on a DB2 database. > > On each iteration it assigns data to values in the SqlParameter > collection. The command text is an INSERT statement to a Sql Server > database, run with an .ExecuteQuery I enclosed the loop in a > SqlTransaction and commit it at the end. > > I timed the program and it inserts about 70 records a second...which I > think is sort of slow...so I set up some Debug.WriteLines to show where > the time was being spent. > > The DataReader loop to the DB2 table is instantaneous. Almost 0s spent > getting each record. Same with assigning values. > > The slow step is the actual execution of the SqlCommand. However, I > also ran a SQL Trace and monitored the execution of the statement on the > server. It took 0s to execute. The SqlCommand itself is adding an > extra 0.01s to 0.03s which can add up over the course of hundreds of > thousands of records. > > So the only overhead is running .ExecuteQuery on the SqlCommand > object(!) Is there anyway to reduce or minimize this overhead, or a > setting that can affect performance. > > I mean if my external source and target are running at 0s - my code > shouldn't be adding overhead to run a command! |
|||||||||||||||||||||||