Home All Groups Group Topic Archive Search About

datatable vs custom List<MyClass>

Author
25 Oct 2007 12:16 AM
paul
Let's say I want a read-only copy of a simple query that joins 2 tables. When
I compare a DataTable (adapter.fill) and a custom List + DataReader, I see
that the custom sol'n is about 6x faster and takes 5x less memory.  For my
test I read about 100K rows and then bind to a grid so the timing takes into
account both getting the data and binding the data. From profiling the app I
see one thing that really stands out - the internal DataTable routine to
build the (default?) index takes up most of the time - it clearly is doing
linear scans of a red-black tree and the hit count for traversing parts of
the tree ending up in the 100's of millions. Can I give the DataTable some
hint (via a property) to tell it not to perform this extra work? In the end I
may go w/ a custom sol'n but I first want to check if the built in classes
can solve my problem.

btw, this is .net 2.0, c#, and I'm just setting the datasource w/out any
other binding code.

thanks,
Paul.

Author
25 Oct 2007 8:26 AM
Miha Markic
How does your table look like?
Did you try pre-create the table without any key?

--
Miha Markic [MVP C#, INETA Country Leader for Slovenia]
RightHand .NET consulting & development www.rthand.com
Blog: http://cs.rthand.com/blogs/blog_with_righthand/

Show quote
"paul" <p***@discussions.microsoft.com> wrote in message
news:BDAA602B-4FB6-4207-BF71-894DD6BB46D0@microsoft.com...
> Let's say I want a read-only copy of a simple query that joins 2 tables.
> When
> I compare a DataTable (adapter.fill) and a custom List + DataReader, I see
> that the custom sol'n is about 6x faster and takes 5x less memory.  For my
> test I read about 100K rows and then bind to a grid so the timing takes
> into
> account both getting the data and binding the data. From profiling the app
> I
> see one thing that really stands out - the internal DataTable routine to
> build the (default?) index takes up most of the time - it clearly is doing
> linear scans of a red-black tree and the hit count for traversing parts of
> the tree ending up in the 100's of millions. Can I give the DataTable some
> hint (via a property) to tell it not to perform this extra work? In the
> end I
> may go w/ a custom sol'n but I first want to check if the built in classes
> can solve my problem.
>
> btw, this is .net 2.0, c#, and I'm just setting the datasource w/out any
> other binding code.
>
> thanks,
> Paul.
Author
25 Oct 2007 9:34 PM
paul
Well for the test, the select performs an inner join on two simple tables
that have no indexes. The tables have ints and bigints and the result set
obviously won't have any indexes as it will be tempdb w/ about 100K rows.
When I step in the debugger the data retrieval from sql server is pretty fast
- around 1 - 2 sec for both the DataTable and a custom list. BUT, when I step
over the code that sets the datasource, the DataTable does the extra work.
The call stack during the extra work is something like:

  IListSource.GetList
   dataTable.DefaultView 
    defaultView.SetIndex2 <- this can't be indexing on any specific column?
      UpdateIndex
        GetIndex
          Index.ctor
           InitRecords
             RBtree<>.Insert
               RBTree<>.Left/Right/Parent  <- called 100's of millions of
times?

The RBTree methods are called 100's of millions of times while this index is
set. I really don't want any indexes because I want the user to be able to
sort/group/filter by any combination of columns. I was wondering if there was
a way to give the DataTable some extra knowledge so that the list inside of
DataRowCollection is maintained more efficiently.

thanks,
Paul.

Show quote
"Miha Markic" wrote:

> How does your table look like?
> Did you try pre-create the table without any key?
>

AddThis Social Bookmark Button