
What's an efficient way of inserting elements into an XML document?

Author
10 Oct 2005 5:30 PM
Nick Z.
Let's say you have the following XML document:

<?xml version="1.0" encoding="utf-16"?>
<log>
    <event>Some event #1</event>
</log>

How would I add another <event> element to this document?

Assume that the file size is already 200 KB or more.
Using the XmlDocument class is not very efficient, as far as I can see.
Using the XmlTextWriter, you can only append to the end of the file,
which breaks the XML structure.

How would I do this?

Thanks,
Nick Z.

Author
10 Oct 2005 5:44 PM
Cowboy (Gregory A. Beamer) - MVP
Despite not being "highly efficient", the XmlDocument, along with XPath to
find the proper node for the insert, is still the best method. There may be some
rare instances where using a reader and parsing lines is faster, but I cannot
envision an algorithm that will beat the XmlDocument, except perhaps matching
a known hash of a line or sticking to binary (which is far more complex).

If you can prebuild the XML snippets (nodes) for the insert, you could
manufacture an XSLT on the fly, but I do not think that would give you more
perf than the XmlDocument.
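
As a rough illustration of the XmlDocument-plus-XPath approach described above, here is a minimal sketch. The class and method names (LogInsertSketch, AppendEvent) are hypothetical, and it assumes the root element is <log> as in the original question. Note that it still loads and rewrites the whole file on every call.

```csharp
using System;
using System.Xml;

public static class LogInsertSketch
{
    // Loads the whole file, appends one <event> under <log>, and saves it back.
    public static void AppendEvent(string path, string message)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(path);

        // XPath lookup of the node to insert under (assumed to be the root <log>).
        XmlNode log = doc.SelectSingleNode("/log");
        if (log == null)
        {
            throw new InvalidOperationException("Root <log> element not found.");
        }

        XmlElement evt = doc.CreateElement("event");
        evt.InnerText = message; // special characters are escaped when saved
        log.AppendChild(evt);

        doc.Save(path);
    }
}
```

Usage would be something like LogInsertSketch.AppendEvent("log.xml", "Some event #2"), where "log.xml" is a placeholder path.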

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

***************************
Think Outside the Box!
***************************


Author
10 Oct 2005 5:59 PM
Nick Z.
But if the file is several megabytes, doesn't that mean that calling
XmlDocument.Load() will load all of that into memory each time you open
the document, then parse all of the data and build an in-memory
representation of the whole file? That seems incredibly inefficient in a
performance-critical situation.

Author
10 Oct 2005 6:49 PM
Jon Skeet [C# MVP]
Nick Z. <pace***@gmail.com> wrote:
> But if the file is several megabytes, doesn't that mean that calling
> XmlDocument.Load() will load all of that into memory each time you open
> the document, then parse all of the data and build an in-memory
> representation of the whole file? That seems incredibly inefficient in a
> performance-critical situation.

How often are you expecting to need to insert elements into the
document? Have you considered using a format other than XML? XML really
isn't terribly friendly when it comes to appending.

Alternatively, consider having a file which contains *multiple* XML
documents, and a class which can build them up into a single one. You
could then combine the "fragments" every so often to restore it to a
proper XML file.
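
One possible way to sketch the "combine the fragments" idea in C# is shown below. It assumes the fragment file is simply a run of sibling <event> elements with no root element and no XML declaration, and that the combined root should be called <log>. The class name FragmentLogSketch is hypothetical. It relies on XmlReaderSettings.ConformanceLevel, which requires .NET 2.0 or later.

```csharp
using System;
using System.Xml;

public static class FragmentLogSketch
{
    // Reads a root-less file of sibling elements and returns them
    // wrapped in a single <log> document.
    public static XmlDocument Combine(string fragmentPath)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        // Allow more than one top-level element in the input.
        settings.ConformanceLevel = ConformanceLevel.Fragment;

        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("log");
        doc.AppendChild(root);

        using (XmlReader reader = XmlReader.Create(fragmentPath, settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // ReadNode copies the element and leaves the reader
                    // positioned on whatever follows its end tag.
                    root.AppendChild(doc.ReadNode(reader));
                }
                else
                {
                    // Skip whitespace and anything else between elements.
                    reader.Read();
                }
            }
        }
        return doc;
    }
}
```

The returned document can then be saved with doc.Save(...) whenever a proper XML file is needed.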

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
10 Oct 2005 7:12 PM
ernesto bascón pantoja
I think XML is not the best option for writing your document.
Writing only XML elements to an XML-like file could be a good idea,
that is:

You can manually append <Event /> elements to the end of your
file, so your file would look like:

<Event>Some event #1</Event>
<Event>Some event #2</Event>
<Event>Some event #3</Event>

That way, you can write log entries to your file at high speed, and you can recreate
your XML structure in memory with something as simple as:

// fileContent holds the raw text of the log file, e.g. read with File.ReadAllText (System.IO)
XmlDocument doc = new XmlDocument();
doc.LoadXml("<Log>" + fileContent + "</Log>");

Regards,
ernesto

Author
10 Oct 2005 11:00 PM
Nick Z.
I wanted to keep the document as well-formed XML, but I guess it's
not that important. I'll have to write <event> elements directly.

Thanks,
Nick Z.

Author
11 Oct 2005 6:55 AM
Scott Coonce
Something I did long before .NET was to write a log file in pseudo-HTML. That
way, the log file could be viewed in a browser. The other advantage is that
it would be relatively easy to parse. My file looked something like this:

<table width=100%>
    <tr>
        <td>some data</td>
        <td>more data</td></tr>
    <tr>...

Although I never wrote the "</table>" tag, or even the <body></body> tags, the
file still displayed in IE5 (I think) without any problems.

Just another option,
Scott


Author
11 Oct 2005 6:49 PM
john conwell
When writing things like log files in XML, I leave off the document-level tags,
so what you have is a long list of <event>Some event #1</event> elements and
nothing else. This allows you to append to the end of the file very
efficiently without having to use any XML objects at all, just a FileStream
object.

Then, when you want to consume the log, just read the file into a string,
add an opening document-level tag at the beginning and a closing one at the
end, and then do what you will with it.
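
A minimal sketch of this append-then-wrap approach might look like the following. The class and method names are hypothetical, UTF-8 without a byte-order mark is an assumption (the original question used UTF-16), and SecurityElement.Escape is used here only as a convenient way to escape the message text, since no XML objects are involved in the write.

```csharp
using System;
using System.IO;
using System.Security;
using System.Text;

public static class EventAppenderSketch
{
    // Appends one <event> element to a root-less log file using only a FileStream.
    public static void Append(string path, string message)
    {
        // Escape <, >, &, ' and " so the message cannot break the markup.
        string line = "<event>" + SecurityElement.Escape(message) + "</event>"
                      + Environment.NewLine;
        byte[] bytes = Encoding.UTF8.GetBytes(line); // GetBytes emits no BOM

        using (FileStream fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            fs.Write(bytes, 0, bytes.Length);
        }
    }

    // Reads the file and wraps it in a document-level tag for consumption.
    public static string ReadAsDocument(string path)
    {
        return "<log>" + File.ReadAllText(path, Encoding.UTF8) + "</log>";
    }
}
```

The string returned by ReadAsDocument can be passed to XmlDocument.LoadXml. Each append touches only the end of the file, so its cost does not grow with the size of the log.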

Author
12 Oct 2005 12:02 AM
Nick Z.
john conwell wrote:
> When writing things like log files in XML, I leave off the document-level tags,
> so what you have is a long list of <event>Some event #1</event> elements and
> nothing else. This allows you to append to the end of the file very
> efficiently without having to use any XML objects at all, just a FileStream
> object.

That's almost how I implemented it. The only difference is that I used
an XmlTextWriter instantiated with a StreamWriter; the combination seems to
work well.
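
The exact code is not shown in the thread, but an XmlTextWriter over an appending StreamWriter could be sketched roughly as follows. The class name and the UTF-8 (no byte-order mark) encoding are assumptions, not necessarily what was actually used.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XmlTextWriterAppendSketch
{
    // Appends one <event> element to a root-less log file.
    public static void Append(string path, string message)
    {
        // Second argument true = append; UTF8Encoding(false) writes no BOM.
        using (StreamWriter sw = new StreamWriter(path, true, new UTF8Encoding(false)))
        {
            XmlTextWriter xw = new XmlTextWriter(sw);
            // WriteElementString escapes the message text for us.
            xw.WriteElementString("event", message);
            xw.Flush();
            sw.WriteLine(); // one element per line
        }
    }
}
```

Compared with writing the tags by hand, the writer takes care of escaping, at the cost of creating a small object per append.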

I'm just wondering, what's the best way to keep track of the file size?
I can think of several ways, but none sounds terribly efficient. You
could 'trim' the log file every time you are about to add an event, but
that requires opening it, locating the oldest <event> (the first in the
file in my case), deleting it, and closing the file. How would I
accomplish this efficiently?

Author
12 Oct 2005 6:07 AM
Jon Skeet [C# MVP]
Nick Z. <pace***@gmail.com> wrote:
> I'm just wondering, what's the best way to keep track of the file size?
> I can think of several ways, but none sounds terribly efficient. You
> could 'trim' the log file every time you are about to add an event, but
> that requires opening it, locating the oldest <event> (the first in the
> file in my case), deleting it, and closing the file. How would I
> accomplish this efficiently?

Well, just how often are you going to need to do this? If you only need
to trim once an hour (or even once every few minutes) I'd suggest
loading the whole thing into an XML document and then rewriting just
the nodes you still want.
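
For illustration, an occasional trim along these lines could be sketched as below. It assumes the root-less <event> file discussed earlier in the thread, UTF-8 encoding, and a policy of "keep the newest N events". The class name and the keep parameter are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class LogTrimSketch
{
    // Rewrites the log so that only the newest `keep` events remain.
    public static void Trim(string path, int keep)
    {
        // Wrap the root-less file so it can be loaded as one document.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<log>" + File.ReadAllText(path) + "</log>");

        XmlNodeList events = doc.DocumentElement.SelectNodes("event");
        int skip = Math.Max(0, events.Count - keep);

        // Overwrite the file with just the nodes we still want.
        using (StreamWriter sw = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            for (int i = skip; i < events.Count; i++)
            {
                sw.WriteLine(events[i].OuterXml);
            }
        }
    }
}
```

Because this reads and rewrites the whole file, it only makes sense when run infrequently, as suggested above.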

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
12 Oct 2005 7:04 AM
Nick Z.
I just thought of something.

I think I'm going to use two files, say 'OldLog' and 'CurrentLog'.
First have only 'CurrentLog' then once it fills up rename it to 'OldLog'
and create a new, empty 'CurrentLog'. Once that fills up, delete the old
'OldLog' and rename the 'CurrentLog' to 'OldLog', then create a new
'CurrentLog'... etc.

This seems like an easier approach than trimming every set period of
time/posts.
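
The two-file scheme described above could be sketched like this. The size threshold (maxBytes) and the class and method names are assumptions introduced for illustration; the thread does not say how "fills up" is measured.

```csharp
using System.IO;

public static class LogRotationSketch
{
    // Call before appending: if the current log has reached maxBytes,
    // it replaces the old log and a fresh, empty current log is created.
    public static void RotateIfFull(string currentPath, string oldPath, long maxBytes)
    {
        FileInfo current = new FileInfo(currentPath);
        if (!current.Exists || current.Length < maxBytes)
        {
            return; // nothing to do yet
        }

        // Drop the previous 'OldLog', if any, then rename 'CurrentLog' to 'OldLog'.
        if (File.Exists(oldPath))
        {
            File.Delete(oldPath);
        }
        File.Move(currentPath, oldPath);

        // Create a new, empty 'CurrentLog' and release the handle immediately.
        using (File.Create(currentPath)) { }
    }
}
```

Checking FileInfo.Length is cheap, so this can run before every append without reading the log itself.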

Thank you,
Nick Z.

Author
12 Oct 2005 5:16 PM
Jon Skeet [C# MVP]
Nick Z. <pace***@gmail.com> wrote:
> I just thought of something.
>
> I think I'm going to use two files, say 'OldLog' and 'CurrentLog'.
> First have only 'CurrentLog' then once it fills up rename it to 'OldLog'
> and create a new, empty 'CurrentLog'. Once that fills up, delete the old
> 'OldLog' and rename the 'CurrentLog' to 'OldLog', then create a new
> 'CurrentLog'... etc.
>
> This seems like an easier approach than trimming every set period of
> time/posts.

Yup, that certainly sounds like it would work.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
