Author
13 Jan 2005 4:13 PM
Luke
I am using the following code to decode a UTF-8 string from an XML element.
The only font in use is Arial Unicode MS, and the characters are not
corrupt, as they display correctly within the raw XML page.  The code below
works fine for the majority of characters, with the exception of a few
such as U+00FC and U+00FE.  Any assistance would be greatly
appreciated.

Luke

Dim MessageString As String
Dim MessageBuffer() As Byte
Dim MessageChar() As Char
Dim Decoder As System.Text.Decoder
Dim UTF8Code As String
'
MessageString = CurrentMessage.Value
'
' Decode the UTF bytes that are displayed within the XML Service
MessageBuffer = System.Text.Encoding.UTF8.GetBytes(MessageString.ToCharArray)

ReDim MessageChar(MessageBuffer.Length)
Decoder = System.Text.Encoding.UTF8.GetDecoder()
Decoder.GetChars(MessageBuffer, 0, MessageBuffer.Length, MessageChar, 0)
'Loop through the Char() array
UTF8Code = String.Empty
For Each Character As Char In MessageChar
            ' Format for RTB output
            UTF8Code &= "\u" & Convert.ToUInt32(Character).ToString() & "?"
Next Character
MessageString = UTF8Code
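
Two details of the code above can be illustrated with a sketch. A .NET String is already UTF-16, so encoding it to UTF-8 bytes and decoding it again should not change its characters, and ReDim MessageChar(MessageBuffer.Length) allocates Length + 1 elements, so any unused trailing elements would be emitted as \u0?. The following is a minimal sketch, not the poster's method, assuming the goal is to produce RTF \uN? escapes for a RichTextBox. The module and the helper name ToRtfUnicodeEscapes are hypothetical. RTF expects N as a signed 16-bit decimal, so values above 32767 are written as negative numbers.

```vb
Imports System
Imports System.Text

Module RtfEscapeSketch
    ' Hypothetical helper (not from the original post): walks the string
    ' directly and emits RTF \uN? escapes for non-ASCII characters.
    Function ToRtfUnicodeEscapes(ByVal source As String) As String
        Dim result As New StringBuilder()
        For Each c As Char In source
            Dim code As Integer = AscW(c)
            If c = "\"c OrElse c = "{"c OrElse c = "}"c Then
                ' These three characters are RTF syntax and need a backslash
                result.Append("\"c).Append(c)
            ElseIf code < 128 Then
                ' Plain ASCII passes through unchanged
                result.Append(c)
            Else
                ' RTF wants a signed 16-bit decimal value after \u
                If code > 32767 Then code -= 65536
                result.Append("\u").Append(code).Append("?"c)
            End If
        Next
        Return result.ToString()
    End Function

    Sub Main()
        ' U+00FC and U+00FE are the characters named in the question
        Console.WriteLine(ToRtfUnicodeEscapes("f" & ChrW(&HFC) & "r " & ChrW(&HFE)))
        ' prints: f\u252?r \u254?
    End Sub
End Module
```

Iterating over the string itself avoids the oversized Char() array, so no trailing \u0? entries appear in the output.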

Author
13 Jan 2005 5:56 PM
Jon Skeet [C# MVP]
Luke <L***@discussions.microsoft.com> wrote:
> I am using the following code to decode a UTF-8 string from an XML element.
> The only font in use is Arial Unicode MS, and the characters are not
> corrupt, as they display correctly within the raw XML page.  The code below
> works fine for the majority of characters, with the exception of a few
> such as U+00FC and U+00FE.  Any assistance would be greatly
> appreciated.
>
> Luke
>
> Dim MessageString As String
> Dim MessageBuffer() As Byte
> Dim MessageChar() As Char
> Dim Decoder As System.Text.Decoder
> Dim UTF8Code As String
> '
> MessageString = CurrentMessage.Value
> '
> ' Decode the UTF bytes that are displayed within the XML Service
> MessageBuffer = System.Text.Encoding.UTF8.GetBytes(MessageString.ToCharArray)
>
> ReDim MessageChar(MessageBuffer.Length)
> Decoder = System.Text.Encoding.UTF8.GetDecoder()
> Decoder.GetChars(MessageBuffer, 0, MessageBuffer.Length, MessageChar, 0)
> 'Loop through the Char() array
> UTF8Code = String.Empty
> For Each Character As Char In MessageChar
>             ' Format for RTB output
>             UTF8Code &= "\u" & Convert.ToUInt32(Character).ToString() & "?"
> Next Character
> MessageString = UTF8Code

It's not entirely clear what you're seeing that you don't expect.

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.
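
As an illustration only, a short but complete program for this kind of question might look like the sketch below. It is an assumption about what such a test case could contain, not code from either poster: a console module (the names RoundTripDemo and RoundTrip are hypothetical) that pushes U+00FC and U+00FE through a UTF-8 encode and decode and prints the resulting code points, so the actual output can be compared with the expected output.

```vb
Imports System
Imports System.Text

Module RoundTripDemo
    ' Hypothetical helper: encodes a string to UTF-8 bytes and decodes it back
    Function RoundTrip(ByVal source As String) As String
        Dim bytes() As Byte = Encoding.UTF8.GetBytes(source)
        Return Encoding.UTF8.GetString(bytes)
    End Function

    Sub Main()
        ' The two characters named in the question
        Dim original As String = ChrW(&HFC) & ChrW(&HFE)
        Dim roundTripped As String = RoundTrip(original)

        ' Print each resulting code point in U+XXXX form
        For Each c As Char In roundTripped
            Console.WriteLine("U+" & AscW(c).ToString("X4"))
        Next

        ' True when the round trip preserved every character
        Console.WriteLine(String.Equals(original, roundTripped, StringComparison.Ordinal))
    End Sub
End Module
```

A program of this size can be pasted into a new console project and run as-is, which makes it easy to state exactly which output differs from what was expected.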

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
