
Serialization takes too long. Any suggestions?

Author
19 Jul 2006 3:28 PM
E
I have binary data files that I read from and write to a stream created
through serialization/deserialization. These data files are very large, and
it's taking too much time to deserialize the stream.

I have already implemented the SerializationInfo/StreamingContext
constructors and GetObjectData, and have also used OnDeserialization. But
it still takes too much time to deserialize the stream. Can anyone suggest
something else that can be done to speed up deserialization?

I am now at the point of breaking up the full serialization/deserialization
by not adding/getting some of the lower-level objects and instead
reading/writing those in additional binary flat files. If I do this, can I
simply perform this work from within the SerializationInfo constructor and
the GetObjectData method? Has anyone tried this? Would it be faster if I took
this out of the binary flat files and moved it all to SQL Server?


--
Ed Reyes

Author
19 Jul 2006 4:33 PM
Kevin Spencer
You'll have to explain more about these "files" and what they are serialized
from.

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

What You Seek Is What You Get.


Author
19 Jul 2006 6:24 PM
E
The binary file is a serialization. The serialization contains objects
containing objects, 5 levels deep. The various objects contain ArrayLists of
objects, Hashtables of objects and, of course, value fields. Serialization
and deserialization work fine. It is mainly deserialization that takes too
long; serialization is also a bit slow, but not terribly bad. A test file
is about 31 MB, and the files will assuredly get bigger. If I can somehow
speed up deserialization then I would have no problem. I am already using the
SerializationInfo constructor to deserialize and the GetObjectData method to
serialize, and I am using OnDeserialization to do anything else I may need
to do (which is very, very little) after deserialization. Can
something else be done to speed up deserialization?

What I am thinking of doing is serializing only a portion of the "object
graph" (I think that is the correct term). The part of the object graph I do
not serialize I am going to place into a different binary data file that is
not loaded and unloaded using serialization techniques. If I have to do
this, do you think it is OK to place the calls that store the
other data within the GetObjectData method used for serialization? I will
place the corresponding data-loading code in the OnDeserialization
method, but will the OnDeserialization methods be called only after all the
SerializationInfo constructors for the entire object graph have been called
and all the objects have been created?


--
Ed Reyes



Author
19 Jul 2006 10:14 PM
schneider
I would remove anything that is not needed. Maybe serialize to a file and
review the data.
Adjust scope (make things private/internal/friend when you can). Binary
serialization includes privates...
Avoid any duplicate data.

Another thing to watch for is any exceptions raised/caught during
serialization/deserialization; this can really slow it down.
Your transport protocol also makes a difference; some use compression.

That's all I can think of.

Schneider
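
As a rough illustration of "remove anything that is not needed": a field that can be recomputed from other data can be marked [NonSerialized] so the formatter skips it, and then rebuilt in OnDeserialization. The Measurement type and its fields below are hypothetical, invented only for this sketch. For a class that implements its own GetObjectData, the equivalent is simply not calling AddValue for the derived field.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type for illustration; not taken from the poster's code.
[Serializable]
public class Measurement : IDeserializationCallback
{
    // Real data: written to the stream.
    private double[] samples;

    // Derived data: skipped by the formatter, so it costs nothing in the file.
    [NonSerialized]
    private double cachedSum;

    public Measurement(double[] samples)
    {
        this.samples = samples;
        Recompute();
    }

    public double Sum
    {
        get { return cachedSum; }
    }

    private void Recompute()
    {
        double sum = 0;
        foreach (double s in samples)
        {
            sum += s;
        }
        cachedSum = sum;
    }

    // Invoked by the formatter once the entire object graph has been
    // deserialized, so 'samples' is fully populated at this point.
    void IDeserializationCallback.OnDeserialization(object sender)
    {
        Recompute();
    }
}
```

Whether this helps depends on how much of the file is derived or duplicate data, which is why reviewing the serialized output first is a sensible step.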



Author
20 Jul 2006 10:34 AM
Kevin Spencer
I'm not sure what to recommend. At 31 MB of data, serialization is going to
take some time. It will take less if the objects are less complex. For
example, it will take less time to serialize/deserialize a single 31 MB
Image than a Collection of 31 1 MB Images. I also noticed that you're using
ArrayLists and Hashtables. Using strongly-typed Collections would probably
improve the speed of serialization/deserialization, as it would not entail
as much use of reflection during the process. And Hashtables are
definitely going to slow down performance. An MSDN Magazine online article
(http://msdn.microsoft.com/msdnmag/issues/02/07/net/) states, regarding
binary serialization of Hashtables:

"Occasionally, you will design a type that requires complete control over
how it is serialized and deserialized. The System.Collections.Hashtable type
is just such a type. When serialized, a Hashtable object and all the objects
it references must be written to the stream. Upon deserialization, a new
Hashtable object must be constructed, all the objects managed by the
Hashtable object must be constructed, and all the object references must be
set correctly. The problem is that hash codes are not guaranteed to be the
same for the newly deserialized objects. So deserializing a Hashtable
requires that all the objects be deserialized first and then each of the
objects must be manually added to the Hashtable object using each object's
new hash code value."

That's about all I can think of.

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

What You Seek Is What You Get.
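
One way to combine the strongly-typed collection advice with the flat-file idea raised earlier in the thread is to write a typed dictionary by hand as a count followed by key/value pairs, using BinaryWriter and BinaryReader. This is only a sketch under assumptions: FlatDictionaryIo is a hypothetical helper name, and string keys with int values stand in for whatever the real objects hold. Nobody in the thread measured this approach, so it would need to be timed against the real data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; names are not from the thread.
public static class FlatDictionaryIo
{
    // Writes the entry count, then each key/value pair in turn.
    public static void Write(Stream stream, Dictionary<string, int> map)
    {
        BinaryWriter writer = new BinaryWriter(stream);
        writer.Write(map.Count);
        foreach (KeyValuePair<string, int> pair in map)
        {
            writer.Write(pair.Key);
            writer.Write(pair.Value);
        }
        writer.Flush(); // the caller keeps ownership of the stream
    }

    // Reads the count, pre-sizes the dictionary, then re-adds each pair.
    public static Dictionary<string, int> Read(Stream stream)
    {
        BinaryReader reader = new BinaryReader(stream);
        int count = reader.ReadInt32();
        Dictionary<string, int> map = new Dictionary<string, int>(count);
        for (int i = 0; i < count; i++)
        {
            string key = reader.ReadString();
            int value = reader.ReadInt32();
            map.Add(key, value);
        }
        return map;
    }
}

public static class FlatDictionaryDemo
{
    public static void Main()
    {
        Dictionary<string, int> original = new Dictionary<string, int>();
        original.Add("alpha", 1);
        original.Add("beta", 2);

        MemoryStream buffer = new MemoryStream();
        FlatDictionaryIo.Write(buffer, original);

        buffer.Position = 0; // rewind before reading back
        Dictionary<string, int> copy = FlatDictionaryIo.Read(buffer);
        Console.WriteLine(copy["beta"]);
    }
}
```

The entries still have to be re-added one by one on load, as the quoted article describes, but no type metadata or object-reference tracking is written to the file.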


Author
24 Jul 2006 9:27 AM
Wolfgang Gross

Two suggestions:

1. Try to split the data into multiple partitions that are
loaded/deserialized on demand.

2. Don't use serialization at all. I don't believe that serialization is
optimized for mass data storage. Change your
persistence implementation to something like db4o (www.db4o.com). It can
handle object hierarchies and searching in your data better.

my 0.02$,
Wolfgang
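
The first suggestion, partitions loaded on demand, might look roughly like the following. PartitionedStore is a hypothetical class, and the one-file-per-partition layout with a ".bin" extension and arrays of doubles is an assumption made only to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical class for illustration: one small file per partition,
// each read from disk only the first time it is requested.
public class PartitionedStore
{
    private readonly string directory;
    private readonly Dictionary<string, double[]> loaded =
        new Dictionary<string, double[]>();

    public PartitionedStore(string directory)
    {
        this.directory = directory;
    }

    // Writes one partition as a length prefix followed by its values.
    public void Save(string name, double[] values)
    {
        using (BinaryWriter writer = new BinaryWriter(File.Create(PathFor(name))))
        {
            writer.Write(values.Length);
            foreach (double v in values)
            {
                writer.Write(v);
            }
        }
    }

    // Returns the partition, reading its file only on first access.
    public double[] Get(string name)
    {
        double[] values;
        if (!loaded.TryGetValue(name, out values))
        {
            using (BinaryReader reader = new BinaryReader(File.OpenRead(PathFor(name))))
            {
                int count = reader.ReadInt32();
                values = new double[count];
                for (int i = 0; i < count; i++)
                {
                    values[i] = reader.ReadDouble();
                }
            }
            loaded[name] = values; // later calls are served from memory
        }
        return values;
    }

    private string PathFor(string name)
    {
        return Path.Combine(directory, name + ".bin");
    }
}
```

With this layout, startup cost is limited to whichever partitions are actually touched, at the price of managing several files instead of one.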