|
dev
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
SqlDataReader performancetables. The query performes acceptable in the Query analyzer. dr = cmd.ExecuteReader(); //I retriev my data within a wile loop containing 3 primary sections wile(dr.Rea()) { 1) string keyvalue = dr[keyname].ToString() 2) //Compare with previous key and ignore duplicates 3) //Collect remainding collum data } On large resultsets 50k+ rows I see a terrible performance penalty on the first point. 95% of the time used in completing this procedure is spendt in point 1). Typical trace data: Execute query 1.6 seconds Collecting keys 1) 8.1 seconds Comparing 2) 0 seconds Collecting remaining data 3) 0.7 seconds. As you can see most of the time is spendt getting data from the DataReader the first time each row is accessed. To be able to optimize this I need to understand the correlation between the SQL statement, the cmd.ExecuteReader() and the retrieval of data from the DataReader. Even on small resultsets the time spendt on 1) is unacceptable. What affects the performance of the SqlDataReader? What effect has a transaction on this? Any input would be greatly appreciated!! Jan -- Do or die.. Hi Jan,
Here's something that might help http://sqljunkies.com/WebLog/amachanic/archive/2005/04/06/10462.aspx (this article applies to .NET 1.1, so I don't know if it will really help) Also, if you're only reading data, there is no need to start a transaction Yves. Jan wrote: Show quote > I've got a rater complex sql query that returns 30+ collums from 4 joined > tables. > The query performes acceptable in the Query analyzer. > > dr = cmd.ExecuteReader(); > //I retriev my data within a wile loop containing 3 primary sections > wile(dr.Rea()) > { > 1) > string keyvalue = dr[keyname].ToString() > 2) > //Compare with previous key and ignore duplicates > 3) > //Collect remainding collum data > > } > > On large resultsets 50k+ rows I see a terrible performance penalty on the > first point. 95% of the time used in completing this procedure is spendt in > point 1). > > Typical trace data: > Execute query 1.6 seconds > Collecting keys 1) 8.1 seconds > Comparing 2) 0 seconds > Collecting remaining data 3) 0.7 seconds. > > As you can see most of the time is spendt getting data from the DataReader > the first time each row is accessed. > > To be able to optimize this I need to understand the correlation between the > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > DataReader. > > Even on small resultsets the time spendt on 1) is unacceptable. > > What affects the performance of the SqlDataReader? > What effect has a transaction on this? > > Any input would be greatly appreciated!! > > Jan > Hi Jan,
You are probably loosing time on getting column index, instead do something like: int colIndex = dr.GetOrdinal(keyname); while .... -- Show quoteMiha Markic [MVP C#, INETA Country Leader for Slovenia] RightHand .NET consulting & development www.rthand.com Blog: http://cs.rthand.com/blogs/blog_with_righthand/ string keyValue = dr.GetString(colIndex); "Jan" <jango@newsgroup.nospam> wrote in message news:2A2A696A-FF04-4C49-91DE-851B065CFB82@microsoft.com... > I've got a rater complex sql query that returns 30+ collums from 4 joined > tables. > The query performes acceptable in the Query analyzer. > > dr = cmd.ExecuteReader(); > //I retriev my data within a wile loop containing 3 primary sections > wile(dr.Rea()) > { > 1) > string keyvalue = dr[keyname].ToString() > 2) > //Compare with previous key and ignore duplicates > 3) > //Collect remainding collum data > > } > > On large resultsets 50k+ rows I see a terrible performance penalty on the > first point. 95% of the time used in completing this procedure is spendt > in > point 1). > > Typical trace data: > Execute query 1.6 seconds > Collecting keys 1) 8.1 seconds > Comparing 2) 0 seconds > Collecting remaining data 3) 0.7 seconds. > > As you can see most of the time is spendt getting data from the DataReader > the first time each row is accessed. > > To be able to optimize this I need to understand the correlation between > the > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > DataReader. > > Even on small resultsets the time spendt on 1) is unacceptable. > > What affects the performance of the SqlDataReader? > What effect has a transaction on this? > > Any input would be greatly appreciated!! > > Jan > > -- > Do or die.. I would test
Dim dt as New DataTable dt.Load MyDataReader to see if it's faster. -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ Visit www.hitchhikerguides.net to get more information on my latest book: Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition) and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook) ----------------------------------------------------------------------------------------------------------------------- "Jan" <jango@newsgroup.nospam> wrote in message news:2A2A696A-FF04-4C49-91DE-851B065CFB82@microsoft.com... > I've got a rater complex sql query that returns 30+ collums from 4 joined > tables. > The query performes acceptable in the Query analyzer. > > dr = cmd.ExecuteReader(); > //I retriev my data within a wile loop containing 3 primary sections > wile(dr.Rea()) > { > 1) > string keyvalue = dr[keyname].ToString() > 2) > //Compare with previous key and ignore duplicates > 3) > //Collect remainding collum data > > } > > On large resultsets 50k+ rows I see a terrible performance penalty on the > first point. 95% of the time used in completing this procedure is spendt > in > point 1). > > Typical trace data: > Execute query 1.6 seconds > Collecting keys 1) 8.1 seconds > Comparing 2) 0 seconds > Collecting remaining data 3) 0.7 seconds. > > As you can see most of the time is spendt getting data from the DataReader > the first time each row is accessed. > > To be able to optimize this I need to understand the correlation between > the > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > DataReader. > > Even on small resultsets the time spendt on 1) is unacceptable. > > What affects the performance of the SqlDataReader? > What effect has a transaction on this? > > Any input would be greatly appreciated!! > > Jan > > -- > Do or die.. Try this instead:
dr = cmd.ExecuteReader(); string keyvalue = ""; int ordinal = dr.GetOrdinal(keyname); while(dr.Read()) { keyvalue = dr[ordinal].ToString(); ... } Try also keyvalue = (string)dr[ordinal]; // might not work and keyvalue = Convert.ToString(dr[ordinal]); // might be slower but don't know without transaction (not necessary for read operations in most cases) and in RELEASE mode without debugger attached (run without debugging in RELEASE mode). I'm not sure how the optimizer will treat your string keyvalue = dr[keyname].ToString(); but many times I find it helps to move the declaration of a string outside the loop and its also a good practice to know the index of your field in advance so you don't have to locate it each iteration... so its best to get the ordinal(s) before the loop begins. That helps a bit to. Beyond that, I'm not sure there's a whole lot else can be done to improve performance source code wise based on the snippet provided. Thanks, Shawn Show quote > I've got a rater complex sql query that returns 30+ collums from 4 joined > tables. > The query performes acceptable in the Query analyzer. > > dr = cmd.ExecuteReader(); > //I retriev my data within a wile loop containing 3 primary sections > wile(dr.Rea()) > { > 1) > string keyvalue = dr[keyname].ToString() > 2) > //Compare with previous key and ignore duplicates > 3) > //Collect remainding collum data > > } > > On large resultsets 50k+ rows I see a terrible performance penalty on the > first point. 95% of the time used in completing this procedure is spendt > in > point 1). > > Typical trace data: > Execute query 1.6 seconds > Collecting keys 1) 8.1 seconds > Comparing 2) 0 seconds > Collecting remaining data 3) 0.7 seconds. > > As you can see most of the time is spendt getting data from the DataReader > the first time each row is accessed. > > To be able to optimize this I need to understand the correlation between > the > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > DataReader. > > Even on small resultsets the time spendt on 1) is unacceptable. > > What affects the performance of the SqlDataReader? > What effect has a transaction on this? > > Any input would be greatly appreciated!! > > Jan > > -- > Do or die.. Hi Jan,
First of all, I'd like to confirm my understanding of your issue. According to your description, I understand that you want to know why it will take a long time to execute the statement "string keyvalue = dr[keyname].ToString()". If I misunderstand anything here, please don't hesitate to correct me. The first thing we have to do is to check which mode you have set when you do the test. There will be a performance issue when we use DEBUG mode. If we want to get more performance, we should change the mode to RELEASE. Second, as Shawn said, GetValue would call GetOrdinal first, if we're looping through a large rowset GetOrdinal might be called every single time. Instead, only call it once. Try this instead: dr = cmd.ExecuteReader(); string keyvalue = ""; int ordinal = dr.GetOrdinal(keyname); while(dr.Read()) { keyvalue = dr[ordinal].ToString(); ... } Last but not least, I'm afraid I don't think it will take a long time to execute the statement "string keyvalue = dr[keyname].ToString()". It seems like "dr.Read()" is time consuming. Would you mind telling us which version of ADO.net your program run on? I'll perform more research on it. If there is anything unclear, please feel free to reply me here. We will follow up. I'm glad to work with you. Sincerely, WenYuan I've implemented the suggested codechanges without much of a performance
benefit. To focus my optimization efforts I need to understand when the most severe performance penalties are to be expected. In my case: When I'm executing my three table joined query, or when Im trying to retrieve the data from the datareader? In my case it looks like the query is fairly quick, but the retrieval of data is a nightmare. If I'm not in a transaction, should I be impacted by locking issues? Would these issues manifest themselves when I'm trying to read the data? I'll try to compare with a query returning fewer collumns, and see how this effects performance. WenYuan: This version of the software is still running on .net framework 1.1. I've got an upgraded version though, and will try to make a comparison. -- Show quoteDo or die.. "WenYuan Wang" wrote: > Hi Jan, > > First of all, I'd like to confirm my understanding of your issue. > According to your description, I understand that you want to know why it > will take a long time to execute the statement "string keyvalue = > dr[keyname].ToString()". > If I misunderstand anything here, please don't hesitate to correct me. > > The first thing we have to do is to check which mode you have set when you > do the test. > There will be a performance issue when we use DEBUG mode. > If we want to get more performance, we should change the mode to RELEASE. > > Second, as Shawn said, GetValue would call GetOrdinal first, if we're > looping through a large rowset GetOrdinal might be called every single > time. Instead, only call it once. > Try this instead: > dr = cmd.ExecuteReader(); > string keyvalue = ""; > int ordinal = dr.GetOrdinal(keyname); > while(dr.Read()) > { > keyvalue = dr[ordinal].ToString(); > ... > } > > Last but not least, I'm afraid I don't think it will take a long time to > execute the statement "string keyvalue = dr[keyname].ToString()". It seems > like "dr.Read()" is time consuming. > Would you mind telling us which version of ADO.net your program run on? > I'll perform more research on it. > > If there is anything unclear, please feel free to reply me here. We will > follow up. > I'm glad to work with you. > Sincerely, > WenYuan > > WenYuan
To address some of your additional comments and questions: I'm not looking for the "string keyvalue = dr[keyname].ToString()" performance, I'm trying to understand general DataReader performance. I'm running in release mode in a production environment. ADO.NET from .net framework 1.1 Thanks for your reply Jan -- Show quoteDo or die.. "WenYuan Wang" wrote: > Hi Jan, > > First of all, I'd like to confirm my understanding of your issue. > According to your description, I understand that you want to know why it > will take a long time to execute the statement "string keyvalue = > dr[keyname].ToString()". > If I misunderstand anything here, please don't hesitate to correct me. > > The first thing we have to do is to check which mode you have set when you > do the test. > There will be a performance issue when we use DEBUG mode. > If we want to get more performance, we should change the mode to RELEASE. > > Second, as Shawn said, GetValue would call GetOrdinal first, if we're > looping through a large rowset GetOrdinal might be called every single > time. Instead, only call it once. > Try this instead: > dr = cmd.ExecuteReader(); > string keyvalue = ""; > int ordinal = dr.GetOrdinal(keyname); > while(dr.Read()) > { > keyvalue = dr[ordinal].ToString(); > ... > } > > Last but not least, I'm afraid I don't think it will take a long time to > execute the statement "string keyvalue = dr[keyname].ToString()". It seems > like "dr.Read()" is time consuming. > Would you mind telling us which version of ADO.net your program run on? > I'll perform more research on it. > > If there is anything unclear, please feel free to reply me here. We will > follow up. > I'm glad to work with you. > Sincerely, > WenYuan > > Hi Jan,
Thanks for your reply. What "cmd.ExecuteReader()" was to query the DataBase and return a cursor (not the whole data). Each time the data row will be returned from DataBase when the statement "dr.read()" is executed. For these reasons, as you know, the execute query will not take a long time(1.6 seconds in your case), but retrieval of data is time consuming (when you have huge amounts of data returned). I think the root cause of this issue is the Query String(returns 30+ collums from 4 joined tables). It should effect performance if you try to returning fewer columns. (Tip: SELECT statement should only return the columns/rows that are going to be used, possibly for display purposes or some business logic.) Some suggestions: Have you tried testing whether the time is spent in the data retrieval from the DataReader, or the ToString call? Is dr[keyname] already a string, or is it some other data type that's being converted with ToString? If it's the latter, I would try profiling what happens if you do this instead: object val = dr[keyname]; string keyvalue = val.ToString(); This will not help your performance, but will help isolate what the problem is. If dr[keyname] is a string value, try using dr.GetString(keyordinal) instead of dr[keyname], which is basically dr.GetValue(GetOrdinal(keyname)). In doing so you will avoid boxing the value into an object. If there is anything unclear, please feel free to reply me. I'm glad to work with you. Best Regards, Wen Yuan Wen Yuan
Thanks for the input so far. I already figured that the collumn count was a bit steep, and are in the process of optimizing the data model as well as the query. I'll do a test on the datatyping changes you suggested, but I'm pretty sure this is not where I'm loosing valuable seconds. Do you know of any other method that is more effective at moving large rowsets from the db server to the client. How about the sql xml methods? Do they deliver compareable performance (or better)? Best regards Jan -- Show quoteDo or die.. "WenYuan Wang" wrote: > Hi Jan, > Thanks for your reply. > > What "cmd.ExecuteReader()" was to query the DataBase and return a cursor > (not the whole data). > Each time the data row will be returned from DataBase when the statement > "dr.read()" is executed. > For these reasons, as you know, the execute query will not take a long > time(1.6 seconds in your case), but retrieval of data is time consuming > (when you have huge amounts of data returned). > > I think the root cause of this issue is the Query String(returns 30+ > collums from 4 joined tables). It should effect performance if you try to > returning fewer columns. > (Tip: SELECT statement should only return the columns/rows that are going > to be used, possibly for display purposes or some business logic.) > > Some suggestions: > Have you tried testing whether the time is spent in the data retrieval > from the DataReader, or the ToString call? > Is dr[keyname] already a string, or is it some other data type that's being > converted with ToString? > If it's the latter, I would try profiling what happens if you do this > instead: > > object val = dr[keyname]; > string keyvalue = val.ToString(); > This will not help your performance, but will help isolate what the problem > is. > > If dr[keyname] is a string value, try using dr.GetString(keyordinal) > instead of dr[keyname], which is basically dr.GetValue(GetOrdinal(keyname)). > In doing so you will avoid boxing the value into an object. > > If there is anything unclear, please feel free to reply me. > I'm glad to work with you. > Best Regards, > Wen Yuan > > XML and performance cannot be used in the same sentence. The other question
you need to ask is "Why?". Why move so many rows to the client? Shouldn't the data be managed in the Database Management System--that's what they're designed to do. Too many developers treat the DBMS as a file store and fetch much of the database to the client to process. Most serious DBMS engines can process data locally and return the compiled/computed/correlated results of these operations far more efficiently than moving bulk data to the client and crunching it there. As far as making the server-side query run faster, investigate the Query Analyzer that helps tune the query so it can run as efficiently as possible. All too often I've seen query problems boil down to proper query and index design (and maintenance) and knowing how the engine processes the queries. hth -- Show quote____________________________________ William (Bill) Vaughn Author, Mentor, Consultant Microsoft MVP INETA Speaker www.betav.com/blog/billva www.betav.com Please reply only to the newsgroup so that others can benefit. This posting is provided "AS IS" with no warranties, and confers no rights. __________________________________ Visit www.hitchhikerguides.net to get more information on my latest book: Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition) and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook) ----------------------------------------------------------------------------------------------------------------------- "Jan" <jango@newsgroup.nospam> wrote in message news:28EB7EC6-2914-4500-9659-BE482466D94B@microsoft.com... > Wen Yuan > > Thanks for the input so far. > > I already figured that the collumn count was a bit steep, and are in the > process of optimizing the data model as well as the query. > > I'll do a test on the datatyping changes you suggested, but I'm pretty > sure > this is not where I'm loosing valuable seconds. > > Do you know of any other method that is more effective at moving large > rowsets from the db server to the client. How about the sql xml methods? > Do > they deliver compareable performance (or better)? > > Best regards > > Jan > -- > Do or die.. > > > "WenYuan Wang" wrote: > >> Hi Jan, >> Thanks for your reply. >> >> What "cmd.ExecuteReader()" was to query the DataBase and return a cursor >> (not the whole data). >> Each time the data row will be returned from DataBase when the statement >> "dr.read()" is executed. >> For these reasons, as you know, the execute query will not take a long >> time(1.6 seconds in your case), but retrieval of data is time consuming >> (when you have huge amounts of data returned). >> >> I think the root cause of this issue is the Query String(returns 30+ >> collums from 4 joined tables). It should effect performance if you try to >> returning fewer columns. >> (Tip: SELECT statement should only return the columns/rows that are going >> to be used, possibly for display purposes or some business logic.) >> >> Some suggestions: >> Have you tried testing whether the time is spent in the data retrieval >> from the DataReader, or the ToString call? >> Is dr[keyname] already a string, or is it some other data type that's >> being >> converted with ToString? >> If it's the latter, I would try profiling what happens if you do this >> instead: >> >> object val = dr[keyname]; >> string keyvalue = val.ToString(); >> This will not help your performance, but will help isolate what the >> problem >> is. >> >> If dr[keyname] is a string value, try using dr.GetString(keyordinal) >> instead of dr[keyname], which is basically >> dr.GetValue(GetOrdinal(keyname)). >> In doing so you will avoid boxing the value into an object. >> >> If there is anything unclear, please feel free to reply me. >> I'm glad to work with you. >> Best Regards, >> Wen Yuan >> >> Hi Jan,
Sorry for the late reply. I agree with Bill. SQL XML method will not give you more performance. It will just reorganize the out-put string from the SQL database. In your special case, the most effective way to improve the performance is cutting the columns and rows that are not needed. As Bill said, we would like to recommend you use Stored Procedure if you just want to update rows or compute the results. In this way, SQL database can process the data locally and just return the data row which you really need. Also I think the root cause of your issue is the statement "dr.read()". It will take a long time to initialize each row from the database. For this reason, if you are developing a winform project, we would like to suggest you use Dataset instead of Datareader. Dataset will initialize all rows and columns when DataAdapter fill the data into. But something we should pay attention to is that this method is not optimal. The best way to resolve this issue is just to return the rows/columns that you really need. If there is anything unclear, please feel free to reply me here. We will follow up. I'm glad to work with you. Sincerely, WenYuan Bill and WenYuan
Thanks for the input guys. I have already removed unnecessary collumns from the query, but are still stuck with around 20 in the query. So it seems I'll have to move the data processing logic to the db tier and forfeit some of our database independancy.. Jan -- Show quoteDo or die.. "WenYuan Wang" wrote: > Hi Jan, > Sorry for the late reply. > > I agree with Bill. > SQL XML method will not give you more performance. It will just reorganize > the out-put string from the SQL database. > In your special case, the most effective way to improve the performance is > cutting the columns and rows that are not needed. > > As Bill said, we would like to recommend you use Stored Procedure if you > just want to update rows or compute the results. In this way, SQL database > can process the data locally and just return the data row which you really > need. > > Also I think the root cause of your issue is the statement "dr.read()". It > will take a long time to initialize each row from the database. > For this reason, if you are developing a winform project, we would like to > suggest you use Dataset instead of Datareader. Dataset will initialize all > rows and columns when DataAdapter fill the data into. > But something we should pay attention to is that this method is not optimal. > The best way to resolve this issue is just to return the rows/columns that > you really need. > > If there is anything unclear, please feel free to reply me here. We will > follow up. > I'm glad to work with you. > Sincerely, > WenYuan > > Hi Jan,
Thanks for your reply. We are sorry for the inconvenience that this caused. Working with a large amount of data, ADO.net performance will be curtailed. Some idea: If Database and Application are located on different machines, the network is also related to this issue. We would like to suggest you try to test the performance of your application on the machine which your database is located on. This method will help you to check whether this issue is related to the Network problem. If there is something I can help with, please feel free to reply me here. We will follow up. I'm glad to work with you. Sincerely, Wen Yuan Hi Jan
Just want to check whether there is anything we can help with. You can reply me here, and we will follow up. I'm glad to work with you. Have a good day. Best regards, Wen Yuan Are you running 2.0 or 1.1? I remember seeing a thread in this newsgroup
where someone ran the same ADO.NET code under 1.1, and it was much faster then the same code under 2.0. Someone from MS did respond, acknowledged that it was reproduceable, but that seemed to be the end of that. Show quote "Jan" <jango@newsgroup.nospam> wrote in message news:2A2A696A-FF04-4C49-91DE-851B065CFB82@microsoft.com... > I've got a rater complex sql query that returns 30+ collums from 4 joined > tables. > The query performes acceptable in the Query analyzer. > > dr = cmd.ExecuteReader(); > //I retriev my data within a wile loop containing 3 primary sections > wile(dr.Rea()) > { > 1) > string keyvalue = dr[keyname].ToString() > 2) > //Compare with previous key and ignore duplicates > 3) > //Collect remainding collum data > > } > > On large resultsets 50k+ rows I see a terrible performance penalty on the > first point. 95% of the time used in completing this procedure is spendt > in > point 1). > > Typical trace data: > Execute query 1.6 seconds > Collecting keys 1) 8.1 seconds > Comparing 2) 0 seconds > Collecting remaining data 3) 0.7 seconds. > > As you can see most of the time is spendt getting data from the DataReader > the first time each row is accessed. > > To be able to optimize this I need to understand the correlation between > the > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > DataReader. > > Even on small resultsets the time spendt on 1) is unacceptable. > > What affects the performance of the SqlDataReader? > What effect has a transaction on this? > > Any input would be greatly appreciated!! > > Jan > > -- > Do or die.. 1.1
Regards Jan -- Show quoteDo or die.. "Marina Levit [MVP]" wrote: > Are you running 2.0 or 1.1? I remember seeing a thread in this newsgroup > where someone ran the same ADO.NET code under 1.1, and it was much faster > then the same code under 2.0. Someone from MS did respond, acknowledged > that it was reproduceable, but that seemed to be the end of that. > > "Jan" <jango@newsgroup.nospam> wrote in message > news:2A2A696A-FF04-4C49-91DE-851B065CFB82@microsoft.com... > > I've got a rater complex sql query that returns 30+ collums from 4 joined > > tables. > > The query performes acceptable in the Query analyzer. > > > > dr = cmd.ExecuteReader(); > > //I retriev my data within a wile loop containing 3 primary sections > > wile(dr.Rea()) > > { > > 1) > > string keyvalue = dr[keyname].ToString() > > 2) > > //Compare with previous key and ignore duplicates > > 3) > > //Collect remainding collum data > > > > } > > > > On large resultsets 50k+ rows I see a terrible performance penalty on the > > first point. 95% of the time used in completing this procedure is spendt > > in > > point 1). > > > > Typical trace data: > > Execute query 1.6 seconds > > Collecting keys 1) 8.1 seconds > > Comparing 2) 0 seconds > > Collecting remaining data 3) 0.7 seconds. > > > > As you can see most of the time is spendt getting data from the DataReader > > the first time each row is accessed. > > > > To be able to optimize this I need to understand the correlation between > > the > > SQL statement, the cmd.ExecuteReader() and the retrieval of data from the > > DataReader. > > > > Even on small resultsets the time spendt on 1) is unacceptable. > > > > What affects the performance of the SqlDataReader? > > What effect has a transaction on this? > > > > Any input would be greatly appreciated!! > > > > Jan > > > > -- > > Do or die.. > > >
Other interesting topics
|
|||||||||||||||||||||||