
vertical tab in XML

Author
6 Nov 2007 3:16 PM
Andy Fish
hello,

I have an xml document that contains an element like this:

<foo title="hello, world&#xB;"/>

I can edit the file with visual studio or XML spy without any warnings.
however, if I try to process it using .Net 2.0 XslCompiledTransform, I get
the error:

System.ArgumentException: ' ', hexadecimal value 0x0B, is an invalid
character.

running the same transformation in .Net 1.1 or XMLSpy's built-in XSLT
processor does not give an error.

I have seen in the XML specification that character code 0B (vt) is not a
valid XML character but I'm not quite clear on whether this means that a
character reference to vt is also invalid.

either way surely something is wrong? - I created the file in .Net 2.0 using
XmlDocument.Save() but I can't process it in .net 2.0. this is exactly the
sort of problem I thought using standard XML libraries was supposed to
protect me from.

Andy

Author
6 Nov 2007 3:36 PM
Martin Honnen
Andy Fish wrote:

> I have an xml document that contains an element like this:
>
> <foo title="hello, world&#xB;"/>
>
> I can edit the file with visual studio or XML spy without any warnings.
> however, if I try to process it using .Net 2.0 XslCompiledTransform, I get
> the error:
>
> System.ArgumentException: ' ', hexadecimal value 0x0B, is an invalid
> character.
>
> running the same transformation in .Net 1.1 or XMLSpy's built-in XSLT
> processor does not give an error.
>
> I have seen in the XML specification that character code 0B (vt) is not a
> valid XML character but I'm not quite clear on whether this means that a
> character reference to vt is also invalid.

With XML 1.0 the character reference is not allowed.

> either way surely something is wrong? - I created the file in .Net 2.0 using
> XmlDocument.Save() but I can't process it in .net 2.0. this is exactly the
> sort of problem I thought using standard XML libraries was supposed to
> protect me from.

.NET 1.x did allow such character references. With .NET 2.0 you have the
choice to either allow them or disallow them, using the CheckCharacters
property on XmlWriterSettings (when you create XML) or XmlReaderSettings
(when you parse XML).
See
<URL:http://msdn2.microsoft.com/en-us/library/System.Xml.XmlWriterSettings.CheckCharacters.aspx>
and
<URL:http://msdn2.microsoft.com/en-us/library/System.Xml.XmlReaderSettings.CheckCharacters.aspx>

Thus if you do e.g.
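   XmlWriterSettings settings = new XmlWriterSettings();
   settings.CheckCharacters = true;  // this is also the default
   // pass the settings to Create, otherwise they are never applied
   using (XmlWriter writer = XmlWriter.Create("file.xml", settings))
   {
     xmlDocumentInstance.Save(writer);
   }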
then you should already get an error when trying to create the XML.





--

    Martin Honnen --- MVP XML
    http://JavaScript.FAQTs.com/
Author
6 Nov 2007 3:59 PM
Andy Fish
Thanks martin. a couple more questions if you happen to know the answer off
the top of your head:

1. can I turn off the CheckCharacters setting for XslCompiledTransform?

2. I would like to filter the document to remove such invalid data. is there
a built-in function anywhere to test if a character is valid in XML 1.0?

TIA

Andy

Author
6 Nov 2007 4:13 PM
Martin Honnen
Andy Fish wrote:


> 1. can I turn off the CheckCharacters setting for XslCompiledTransform?

You can pass an XmlReader created with XmlReaderSettings where
CheckCharacters is set to false to the Load method of
XslCompiledTransform or to the Transform method.
That reader should then allow the escaped characters, even if they are
not allowed according to the spec.
But I would rather avoid that and make sure your code does not create
any such XML by using XmlWriter with XmlWriterSettings and
CheckCharacters set to true.
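
A minimal sketch of the reader-based workaround described above. The file
names (input.xml, transform.xsl, output.xml) and the class and method names
are hypothetical. Relaxing the check on the output writer as well is an
assumption of this sketch, not something stated in the thread: the writer
performs its own character check, so the sketch switches it off too, which
means the result can again contain references that XML 1.0 does not allow.

```csharp
using System.Xml;
using System.Xml.Xsl;

public static class LenientTransform
{
    // Hypothetical helper: runs a stylesheet over an input document while
    // tolerating character references such as &#xB; that XML 1.0 forbids.
    public static void Run(string inputPath, string stylesheetPath, string outputPath)
    {
        // Reader side: do not reject invalid character references.
        XmlReaderSettings readerSettings = new XmlReaderSettings();
        readerSettings.CheckCharacters = false;

        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(stylesheetPath);

        // Writer side (assumption): start from the stylesheet's xsl:output
        // settings and relax the character check on a writable copy.
        XmlWriterSettings writerSettings = xslt.OutputSettings.Clone();
        writerSettings.CheckCharacters = false;

        using (XmlReader reader = XmlReader.Create(inputPath, readerSettings))
        using (XmlWriter writer = XmlWriter.Create(outputPath, writerSettings))
        {
            xslt.Transform(reader, writer);
        }
    }

    public static void Main()
    {
        // Placeholder paths, for illustration only.
        Run("input.xml", "transform.xsl", "output.xml");
    }
}
```

This only postpones the problem: any stricter consumer of output.xml will
still refuse the document, so cleaning the data at the source remains the
safer route.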


> 2. I would like to filter the document to remove such invalid data. is there
> a built-in function anywhere to test if a character is valid in XML 1.0?

I don't think there is a method or function in the .NET framework class
library. You could implement your own, making use of an XmlWriter in
fragment mode (ConformanceLevel.Fragment): Write the character(s) you want
to check and catch any exception.
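
As an alternative to the exception-based test, the check can be written
directly against the Char production of the XML 1.0 specification
(#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]).
The sketch below is one possible way to do it. The class and method names
are hypothetical, and it works on decoded string values (for example an
attribute value taken from the DOM), not on serialized markup that still
contains the text "&#xB;". As far as I know, later framework versions
(.NET 4.0 onwards) also offer XmlConvert.IsXmlChar for the same purpose.

```csharp
using System;
using System.Text;

public static class Xml10Chars
{
    // True if the code point matches the XML 1.0 "Char" production.
    public static bool IsValid(int codePoint)
    {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Returns a copy of text with every character removed that XML 1.0
    // does not allow (including unpaired surrogates).
    public static string Strip(string text)
    {
        StringBuilder result = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                // A well-formed surrogate pair encodes a code point in
                // #x10000-#x10FFFF, which is always allowed.
                result.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (IsValid(c))
            {
                // Lone surrogates (#xD800-#xDFFF) fail IsValid and are dropped.
                result.Append(c);
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        // The vertical tab (0x0B) from the original post is removed.
        Console.WriteLine(Strip("hello, world\u000B") == "hello, world"); // prints True
    }
}
```

Applying Strip to attribute and text node values before calling Save would
keep the forbidden characters out of the file in the first place.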

--

    Martin Honnen --- MVP XML
    http://JavaScript.FAQTs.com/
Author
6 Nov 2007 3:59 PM
Pavel Lepin
Andy Fish <ajf***@blueyonder.co.uk> wrote in
<Jz%Xi.93842$vI1.19***@fe1.news.blueyonder.co.uk>:
> I have an xml document that contains an element like this:
>
> <foo title="hello, world&#xB;"/>

No you don't:

pavel@debian:~/dev/xml$ xmllint vtab.xml
vtab.xml:1: parser error : xmlParseCharRef: invalid xmlChar
value 11
<foo title="hello, world&#xB;"/>
                             ^
pavel@debian:~/dev/xml$

> I can edit the file with visual studio or XML spy without
> any warnings. however, if I try to process it using .Net
> 2.0 XslCompiledTransform, I get the error:
>
> System.ArgumentException: ' ', hexadecimal value 0x0B, is
> an invalid character.

Right on the money.

> running the same transformation in .Net 1.1 or XMLSpy's
> built-in XSLT processor does not give an error.

Those are broken then. Complain about broken tools to the
tools' vendors (well, Microsoft seems to have fixed it, and
good luck trying to get Altova to comply).

> I have seen in the XML specification that character code
> 0B (vt) is not a valid XML character but I'm not quite
> clear on whether this means that a character reference to
> vt is also invalid.

It does.

  2.2 Characters

  [Definition: A parsed entity contains text, a sequence of
  characters, which may represent markup or character data.]
  [Definition: A character is an atomic unit of text as
  specified by ISO/IEC 10646:2000 [ISO/IEC 10646]. Legal
  characters are tab, carriage return, line feed, and the
  legal characters of Unicode and ISO/IEC 10646. The
  versions of these standards cited in A.1 Normative
  References were current at the time this document was
  prepared. New characters may be added to these standards
  by amendments or new editions. Consequently, XML
  processors MUST accept any character in the range
  specified for Char. ]

As I read it, it doesn't speak about the default
serialisation, but about the very XML infoset.

> either way surely something is wrong? - I created the file
> in .Net 2.0 using XmlDocument.Save() but I can't process
> it in .net 2.0.

Still broken then.

> this is exactly the sort of problem I thought using
> standard XML libraries was supposed to protect me from.

It is. So complain to the people who wrote those tools.

--
"I can't help but wonder if you... don't know a hell of a
lot more about practically every subject than Solomon ever
did."
Author
6 Nov 2007 8:06 PM
Richard Tobin
[I'm not honouring the Followup-To: line since we don't have that
newsgroup here, and anyway this is relevant in comp.text.xml.]

In article <fgq324$mr***@aioe.org>, Pavel Lepin  <p.le***@ctncorp.com> wrote:

>> I have an xml document that contains an element like this:

>> <foo title="hello, world&#xB;"/>

>No you don't:

Here is an XML document containing that element:

<?xml version="1.1"?>
<foo title="hello, world&#xB;"/>

Whether your tools support XML 1.1 is another matter.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
