
XML parsing: ... illegal xml character

Author
11 Mar 2009 5:17 PM
Josh

Why do I get "XML parsing: line 3, character 13, illegal xml character"
when using UTF-8?
declare @myxml xml;
set @myxml = '<?xml version="1.0" encoding="utf-8"?>
<myFields>
    <NAME>Kaboré</NAME>
</myFields>';
SELECT @myxml;

However, change it to UTF-16 and it's happy:

declare @myxml xml;
set @myxml = N'<?xml version="1.0" encoding="utf-16"?>
<myFields>
    <NAME>Kaboré</NAME>
</myFields>';
SELECT @myxml;

My problem is that I have a bunch of InfoPath forms saved to SharePoint
where the encoding is set to UTF-8.  It's going to be ugly, and perhaps
problematic, to try switching to UTF-16.

Thanks for your help!
--
Josh Smith

Author
11 Mar 2009 6:58 PM
Josh
Sorry, I cross-posted this before I saw the sqlxml forum.  In any case, I
have one reply so far, but am still not quite comfortable with the situation.

Thanks.
--
Josh Smith



Author
11 Mar 2009 7:04 PM
Josh
As I was saying... the other post is in SQL Programming at http://www.microsoft.com/communities/newsgroups/en-us/default.aspx?dg=microsoft.public.sqlserver.programming&mid=e56f26b2-e15a-4fcc-b57c-974e54188cdc

--
Josh Smith


Author
19 Mar 2009 12:15 AM
Michael Coles
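As two people already told you over there, é cannot be stored as a single
byte in UTF-8.  Your first literal has no N prefix, so it is a varchar string
and é arrives as the single byte 0xE9 (assuming a Latin-1 code page such as
Windows-1252), which is not a valid UTF-8 sequence.  If you want validation
on this, look here:
http://www.tony-franks.co.uk/UTF-8.htm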

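If you want to see the byte-level difference for yourself, here is a small
sketch in Python (not T-SQL, only because it makes the raw bytes easy to
inspect).  It assumes the single-byte code page is Windows-1252; the names
DOC and parses are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The same document as in the question, declared as UTF-8.
DOC = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<myFields>\n'
    '    <NAME>Kaboré</NAME>\n'
    '</myFields>'
)


def parses(data: bytes) -> bool:
    """Return True if the bytes parse as XML under their declared encoding."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Assumption: the non-Unicode literal is stored in Windows-1252,
# where é is the single byte 0xE9.
single_byte = DOC.encode("cp1252")
# Genuine UTF-8 stores é as the two bytes 0xC3 0xA9.
real_utf8 = DOC.encode("utf-8")

print("é".encode("cp1252"))  # b'\xe9'
print("é".encode("utf-8"))   # b'\xc3\xa9'
print(parses(single_byte))   # the lone 0xE9 byte is rejected
print(parses(real_utf8))     # the two-byte form is accepted
```

The declaration says utf-8 in both cases; only the bytes that actually reach
the parser differ.
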
Notice the code point for é is U+00E9 (decimal 233), a value which is outside
the range that UTF-8 can represent in one byte (0 to 127); UTF-8 encodes it as
the two bytes 0xC3 0xA9.  It doesn't have to be overly troubling or
problematic.  Assign the document to an nvarchar string first and replace
"utf-8" with "utf-16":

DECLARE @strXML nvarchar(1000);
SET @strXML = N'<?xml version="1.0" encoding="utf-8"?>
<myFields>
<NAME>Kaboré</NAME>
</myFields>';
SET @strXML = REPLACE(@strXML, N'"utf-8"', N'"utf-16"');
SELECT @strXML;

DECLARE @myxml xml;
SET @myxml = @strXML;
SELECT @myxml;

You can encapsulate this additional logic in a stored proc on the server, or
you can do it on the client or in the middle tier before you even send it to
the server.
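
For the client or middle-tier route, a minimal sketch in Python might look
like the following.  It assumes the form is read as raw UTF-8 bytes and that
your database driver sends Python strings to the server as Unicode (nvarchar);
the function name redeclare_as_utf16 is hypothetical.

```python
import re

# Matches the encoding attribute of the XML declaration, with either quote style.
_DECL_ENCODING = re.compile(
    r'(<\?xml[^>]*?encoding=)(["\'])utf-8\2', re.IGNORECASE
)


def redeclare_as_utf16(raw: bytes) -> str:
    """Decode UTF-8 bytes and relabel the XML declaration as utf-16.

    The result is a Unicode string meant to be passed as an nvarchar
    parameter, mirroring the REPLACE step in the T-SQL above.
    """
    # "utf-8-sig" also strips a leading byte order mark if one is present.
    text = raw.decode("utf-8-sig")
    return _DECL_ENCODING.sub(r"\g<1>\g<2>utf-16\g<2>", text, count=1)


if __name__ == "__main__":
    raw = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<myFields>\n<NAME>Kaboré</NAME>\n</myFields>"
    ).encode("utf-8")
    print(redeclare_as_utf16(raw))
```

Only the declaration is touched; the element content is left exactly as
decoded.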

--

========
Michael Coles
"Pro T-SQL 2008 Programmer's Guide"
http://www.amazon.com/T-SQL-2008-Programmer-rsquo-Guide/dp/143021001X

