UTF8 Decoder

I am using the following code to decode a UTF string from an XML element. The only font being used is Arial Unicode MS, and the UTF characters are not corrupt, as they display correctly within the raw XML page. The code below works fine with the majority of characters, with the exception of various characters such as U+00FC and U+00FE. Any assistance would be greatly appreciated.

Luke

Dim MessageString As String
Dim MessageBuffer() As Byte
Dim MessageChar() As Char
Dim Decoder As System.Text.Decoder
Dim UTF8Code As String
'
MessageString = CurrentMessage.Value
'
' Decode the UTF bytes that are displayed within the XML Service
MessageBuffer = System.Text.Encoding.UTF8.GetBytes(MessageString.ToCharArray)

ReDim MessageChar(MessageBuffer.Length)
Decoder = System.Text.Encoding.UTF8.GetDecoder()
Decoder.GetChars(MessageBuffer, 0, MessageBuffer.Length, MessageChar, 0)
'Loop through the Char() array
UTF8Code = String.Empty
For Each Character As Char In MessageChar
    ' Format for RTB output
    UTF8Code &= "\u" & Convert.ToUInt32(Character).ToString() & "?"
Next Character
MessageString = UTF8Code

Luke <L***@discussions.microsoft.com> wrote:
> I am using the following code to decode a UTF string from an XML element.

It's not entirely clear what you're seeing that you don't expect.

> The only font being used is Arial Unicode MS, and the UTF characters are
> not corrupt, as they display correctly within the raw XML page. The code
> below works fine with the majority of characters, with the exception of
> various characters such as U+00FC and U+00FE. Any assistance would be
> greatly appreciated.
>
> Luke
>
> Dim MessageString As String
> Dim MessageBuffer() As Byte
> Dim MessageChar() As Char
> Dim Decoder As System.Text.Decoder
> Dim UTF8Code As String
> '
> MessageString = CurrentMessage.Value
> '
> ' Decode the UTF bytes that are displayed within the XML Service
> MessageBuffer = System.Text.Encoding.UTF8.GetBytes(MessageString.ToCharArray)
>
> ReDim MessageChar(MessageBuffer.Length)
> Decoder = System.Text.Encoding.UTF8.GetDecoder()
> Decoder.GetChars(MessageBuffer, 0, MessageBuffer.Length, MessageChar, 0)
> 'Loop through the Char() array
> UTF8Code = String.Empty
> For Each Character As Char In MessageChar
>     ' Format for RTB output
>     UTF8Code &= "\u" & Convert.ToUInt32(Character).ToString() & "?"
> Next Character
> MessageString = UTF8Code

Could you post a short but complete program which demonstrates the problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of what I mean by that.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
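
For illustration, a short but complete program of the kind requested might look like the sketch below. It is not from the thread: the module name, the `ToRtfEscapes` helper, and the sample input (the letter "a" followed by U+00FC and U+00FE) are all assumptions, and `CurrentMessage.Value` is replaced by a hard-coded string because the XML source was not posted. It repeats the same GetBytes/GetChars round trip and prints the byte count, the decoded character count, and the array length so they can be compared.

```vbnet
Imports System
Imports System.Text

Module Utf8RoundTripDemo
    ' Hypothetical helper (not in the original post): formats each
    ' UTF-16 code unit of the input as an RTF-style \uN? escape.
    Function ToRtfEscapes(ByVal value As String) As String
        Dim builder As New StringBuilder()
        For Each character As Char In value
            builder.Append("\u")
            builder.Append(Convert.ToUInt32(character).ToString())
            builder.Append("?")
        Next
        Return builder.ToString()
    End Function

    Sub Main()
        ' Assumed sample input standing in for CurrentMessage.Value:
        ' "a" plus the two characters named in the post.
        Dim messageString As String = "a" & Convert.ToChar(&HFC) & Convert.ToChar(&HFE)

        ' Same round trip as the posted code: String -> UTF-8 bytes -> Char().
        Dim messageBuffer() As Byte = Encoding.UTF8.GetBytes(messageString)

        ' In VB the number in parentheses is the upper bound, so this
        ' array has messageBuffer.Length + 1 elements.
        Dim messageChar(messageBuffer.Length) As Char

        Dim utf8Decoder As Decoder = Encoding.UTF8.GetDecoder()
        Dim charsDecoded As Integer = utf8Decoder.GetChars( _
            messageBuffer, 0, messageBuffer.Length, messageChar, 0)

        Console.WriteLine("Bytes encoded: " & messageBuffer.Length.ToString())
        Console.WriteLine("Chars decoded: " & charsDecoded.ToString())
        Console.WriteLine("Array length:  " & messageChar.Length.ToString())

        ' Looping over the whole array, as the posted code does.
        Console.WriteLine("Whole array:   " & ToRtfEscapes(New String(messageChar)))

        ' Looping over only the characters the decoder actually wrote.
        Console.WriteLine("Decoded only:  " & _
            ToRtfEscapes(New String(messageChar, 0, charsDecoded)))
    End Sub
End Module
```

With this assumed input, the array is longer than the decoded text, because each non-ASCII character takes more than one UTF-8 byte and the array is sized from the byte count. Looping over the whole array therefore also emits escapes for the unused trailing elements. Whether that is the behaviour Luke was seeing is not established by the thread; the sketch only shows the form a reproducible test case could take.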