Character encoding - 1252 vs. ISO-8859-1

I was wondering why one would specify character encoding of 1252 vs. ISO-8859-1 when retrieving data via HTTP. My circumstance is that I am retrieving XML via HTTP with French characters in it, and I have specified the encoding as follows:

    Dim str As New StreamReader([data source], System.Text.Encoding.GetEncoding("ISO-8859-1"))

Doing this works fine, and I retrieve the data without the special French characters being dropped. When I change the above line of code to the following:

    Dim str As New StreamReader([data source], System.Text.Encoding.GetEncoding(1252))

the end result is the same.

Is there any advantage to one encoding over another?

Thus wrote js,
> I was wondering why one would specify character encoding of 1252 vs.

Well, both are dated. Windows-1252 is actually an extension of ISO-8859-1.

> ISO-8859-1 when retrieving data via HTTP. My circumstance is that I
> am retrieving XML via HTTP with French characters in it and I have
> specified the encoding as follows:
>
> Dim str as New StreamReader([data source],
> system.text.encoding.getencoding("ISO-8859-1"))
> Doing this works fine and I retrieve the data without the special
> French characters being dropped. When I change the above line of code
> to the following:
>
> Dim str as New StreamReader([data source],
> System.Text.Encoding.GetEncoding(1252))
> The end result is the same.
>
> Is there any advantage to one encoding over another?

See http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx and http://www.microsoft.com/globaldev/reference/iso/28591.mspx.

ISO-8859-1 does not contain €, nor the uppercase and lowercase "oe" ligature (Unicode \u0152 and \u0153). Windows-1252 contains both.

Modern applications should rather use one of the Unicode Transformation Formats like UTF-8.

Cheers,
--
Joerg Jooss
news-re***@joergjooss.de

> Well, both are dated. Windows-1252 is actually an extension of
> ISO-8859-1. See
> http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx and
> http://www.microsoft.com/globaldev/reference/iso/28591.mspx.
> ISO-8859-1 does not contain €, nor the uppercase and
> lowercase "oe" ligature (Unicode \u0152 and \u0153).
> Windows-1252 contains both.
>
> Modern applications should rather use one of the Unicode
> Transformation Formats like UTF-8.

Okay, that is what I was thinking (in terms of the difference between the two of them) when I was researching the issue but figured that there must be something else I was missing. Unfortunately I cannot get our remote partners to switch to UTF-8 (or something else more current) so I am stuck with it but at least I feel comfortable with what I am doing.
Thank you, Joerg; great information and assistance, as always.

J.
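
For anyone comparing the two encodings later, here is a minimal sketch of where they actually diverge. The two code pages agree on bytes 0xA0 to 0xFF, which is where the accented French letters such as "é" live, so typical French text decodes identically either way. They differ only in the range 0x80 to 0x9F, where ISO-8859-1 has control characters and Windows-1252 has printable ones, including "€", "Œ" and "œ". The sample bytes, the module name and the `CodePoints` helper below are assumptions made up for illustration; they are not from the original posts.

```vb
Imports System
Imports System.Text

Module EncodingComparison

    ' Hypothetical helper: formats each character of s as a U+XXXX code point.
    Function CodePoints(ByVal s As String) As String
        Dim sb As New StringBuilder()
        For Each ch As Char In s
            sb.Append("U+").Append(Convert.ToInt32(ch).ToString("X4")).Append(" ")
        Next
        Return sb.ToString().TrimEnd()
    End Function

    Sub Main()
        ' On .NET Core / .NET 5+, code page 1252 is only available after
        ' registering the code pages provider:
        '   Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        ' On .NET Framework no registration is needed.
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")
        Dim cp1252 As Encoding = Encoding.GetEncoding(1252)

        ' Illustrative sample bytes (an assumption, not data from the thread):
        ' 0xE9 is "é" in both encodings; 0x80, 0x8C and 0x9C are
        ' "€", "Œ" and "œ" in Windows-1252 only.
        Dim sample As Byte() = New Byte() {&HE9, &H80, &H8C, &H9C}

        Console.WriteLine("ISO-8859-1:   " & CodePoints(latin1.GetString(sample)))
        Console.WriteLine("Windows-1252: " & CodePoints(cp1252.GetString(sample)))
    End Sub

End Module
```

Both lines should start with U+00E9 for "é". The remaining three bytes come out as U+0080, U+008C and U+009C under ISO-8859-1, and as U+20AC, U+0152 and U+0153 under Windows-1252. So if the partner's data can ever contain "€" or "œ" (as in "cœur"), and it is really produced as Windows-1252, reading it as 1252 is the safer of the two choices; for plain accented letters it makes no difference.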