
XmlReader and first char

Author
13 Sep 2006 11:55 AM
Steve B.
Hi,

I have a string that contains an XSL transformation:

string tranform = @"
<xml version=""1.0"">
<xsl .....>
";

Notice that the first chars are \r\n

I write this string into a MemoryStream, then I create a new XmlReader like
this :

XmlReader reader = XmlReader.Create(myMemoryStream);

This line throws an exception: the char 0x00 is not valid.
If I slightly change the code :

string tranform = @"<xml version=""1.0"">
<xsl .....>
";

(the content starts with the declaration tag)

The code then works correctly.

So my question is : why does the first \r\n make the XmlReader throw an
Exception ? I thought spaces are ignored (W3C specs).

Thanks in advance for any clarifications
Steve

Author
13 Sep 2006 12:22 PM
Cowboy (Gregory A. Beamer)
When you stream, the streamreader will attempt to determine if the entire
transmission is bogus or not. A null char is a definite signal that the
stream is bad or non-existent. I am not sure why the white space in the
first position would be seen as a null char, but I am not overly surprised
either.

Remember that files, with the exception of ascii files, begin with real
characters, not white space. It appears the mentality of working with binary
files was extended to XML (incorrectly? perhaps).

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

*************************************************
Think outside of the box!
*************************************************
Author
13 Sep 2006 7:19 PM
Jon Skeet [C# MVP]
Cowboy (Gregory A. Beamer) <NoSpamMgbworld@comcast.netNoSpamM> wrote:
> When you stream, the streamreader will attempt to determine if the entire
> transmission is bogus or not. A null char is a definite signal that the
> stream is bad or non-existent. I am not sure why the white space in the
> first position would be seen as a null char, but I am not overly surprised
> either.

My guess is that the stream has been created using a UTF-16 (or
similar) encoding, giving 0 as the first byte.
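
A minimal sketch of that guess, assuming the MemoryStream was filled from Encoding.Unicode.GetBytes (the original post does not say which encoding was used, and the two strings here are hypothetical stand-ins for the ones in the question). In little-endian UTF-16 the zero is the second byte rather than the first; in big-endian UTF-16 it would be the first. The XML specification's non-normative appendix on encoding autodetection relies on the bytes of "<?xml" at the very start of the stream, so a parser that sees 0D 00 0A 00 instead may well fall back to UTF-8 and then reject the 0x00 byte as a character.

```csharp
using System;
using System.Text;

class Utf16Bytes
{
    static void Main()
    {
        // Hypothetical stand-ins for the two strings in the question.
        string withLeadingNewline = "\r\n<?xml version=\"1.0\"?>";
        string withoutLeadingNewline = "<?xml version=\"1.0\"?>";

        // Encoding.Unicode is little-endian UTF-16; GetBytes emits no byte order mark.
        byte[] a = Encoding.Unicode.GetBytes(withLeadingNewline);
        byte[] b = Encoding.Unicode.GetBytes(withoutLeadingNewline);

        // First eight bytes of each buffer, as hex.
        Console.WriteLine(BitConverter.ToString(a, 0, 8)); // 0D-00-0A-00-3C-00-3F-00
        Console.WriteLine(BitConverter.ToString(b, 0, 8)); // 3C-00-3F-00-78-00-6D-00
    }
}
```

Only the second buffer begins with 3C 00 3F 00, the little-endian UTF-16 pattern for "<?".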

> Remember that files, with the exception of ascii files, begin with real
> characters, not white space.

On what grounds? There's no reason why a text file stored in UTF-16 shouldn't
start with spaces, for instance. It wouldn't be valid XML, but it's a
perfectly reasonable text file.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
14 Sep 2006 12:44 PM
Cowboy (Gregory A. Beamer)
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MPG.1f726a1fbddb259798d464@msnews.microsoft.com...
> Cowboy (Gregory A. Beamer) <NoSpamMgbworld@comcast.netNoSpamM> wrote:
>> When you stream, the streamreader will attempt to determine if the entire
>> transmission is bogus or not. A null char is a definite signal that the
>> stream is bad or non-existent. I am not sure why the white space in the
>> first position would be seen as a null char, but I am not overly surprised
>> either.
>
> My guess is that the stream has been created using a UTF-16 (or
> similar) encoding, giving 0 as the first byte.

That would make sense. I will file that one away in a place where it is
accessible. :-)

>> Remember that files, with the exception of ascii files, begin with real
>> characters, not white space.
>
> On what grounds? There's no reason why a text file stored in UTF-16 shouldn't
> start with spaces, for instance. It wouldn't be valid XML, but it's a
> perfectly reasonable text file.


Okay, you have me on point #2. :-)

I got a bit too focused on XML.


Author
13 Sep 2006 7:15 PM
Jon Skeet [C# MVP]
Steve B. <steve_beauge@com.msn_swap_msn_and_com> wrote:

<snip>

> So my question is : why does the first \r\n make the XmlReader throw an
> Exception ? I thought spaces are ignored (W3C specs).

Spaces are ignored (to some extent) in most of XML, but looking at the
specs I can't see anything in the definition of the prolog part of the
XML which allows whitespace before the XMLDecl part.
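
A rough way to check this independently of any stream encoding is to parse from a StringReader, which hands the reader characters rather than bytes. The sketch below is an illustration only: the Parses helper and the one-element document are hypothetical, not the poster's code. It also shows one possible workaround, trimming the leading whitespace before parsing.

```csharp
using System;
using System.IO;
using System.Xml;

class LeadingWhitespace
{
    // Hypothetical helper: true if the whole document reads without an XmlException.
    static bool Parses(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Hypothetical document with \r\n ahead of the XML declaration.
        string text = "\r\n<?xml version=\"1.0\"?><root/>";

        Console.WriteLine(Parses(text));             // False
        Console.WriteLine(Parses(text.TrimStart())); // True
    }
}
```

Leading whitespace is only a problem when an XML declaration follows it; a document with no declaration at all may start with whitespace.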

