string.Trim() and White spaces list?

Hi

I'm working on the documentation for my application. I need to explain to the reader that white space will be removed from a text. I use the string.Trim() method. Note: no arguments are passed to the method.

It is not enough to tell this to an untrained person; I need to give him the complete list of white-space characters, like:
1. space: ' '
2. tab: '\t'

My knowledge of what "whitespace" means stops here: the space character and the tab character. What else?

Can I query the framework dynamically for the complete list of whitespace characters? I'm only able to test whether a particular character is whitespace or not (using char.IsWhiteSpace(...)).

Thanks.

Hi Adi,
There is a list of whitespace characters in the documentation for String.Trim():

http://msdn2.microsoft.com/en-us/library/t97s7bs3(VS.80).aspx

On Tue, 07 Nov 2006 07:48:23 +0100, adi <adrian.rot***@ikonsoft.ro> wrote:
> It is not enough to tell this to an untrained person; I need to tell
> him the complete list of white spaces, like:
> 1. space: ' '
> 2. tab: '\t'

--
Happy Coding!
Morten Wennevik [C# MVP]

Thanks Morten
The list is very useful.
Now, for the second part of my question: is there a way to get this list at runtime?
Note: I'm (still) using version 1.1 of the framework, but solutions for later versions are welcome.

Thanks.
Adi.

Morten Wennevik wrote:
> There is a list of whitespace characters under the documentation for
> String.Trim()
>
> http://msdn2.microsoft.com/en-us/library/t97s7bs3(VS.80).aspx

The list is the same for .NET 1.0, 1.1 and 2.0, and possibly later versions too.
As for getting this list at runtime, I don't see how you can do that other than by testing Char.IsWhiteSpace over the whole range of character codes, which may take some time to compute. I did a few tests and, using Char.IsWhiteSpace, ended up with a list with far more characters than are listed under String.Trim.

Why do you need this list programmatically anyway?

On Tue, 07 Nov 2006 08:42:14 +0100, adi <adrian.rot***@ikonsoft.ro> wrote:
> Now, for the second part of my question: is there a possibility to get
> this list in runtime?
> Note: I'm (still) using the 1.1 version of the framework, but solutions
> for later versions are welcome.

--
Happy Coding!
Morten Wennevik [C# MVP]

Actually, you can't use IsWhiteSpace to determine which characters are trimmed, as there are whitespace characters that are not trimmed. Furthermore, there are characters that are trimmed but still not listed in the documentation.

In the end, to get the proper list you may need to try to trim every single character to determine whether it will be trimmed by String.Trim().

The code below will display which characters are considered whitespace and which will be trimmed.

    // Requires: using System.Text; using System.Windows.Forms;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i <= char.MaxValue; i++)    // every UTF-16 code unit, U+FFFF included
    {
        char c = (char)i;
        string s = c.ToString();
        bool isWhiteSpace = char.IsWhiteSpace(c);
        bool isTrimmed = s.Trim().Length == 0;  // a one-char string that trims to empty

        if (isWhiteSpace || isTrimmed)
        {
            // Code point as four hex digits, then one column per property
            sb.Append(i.ToString("X").PadLeft(4, '0'));
            if (isWhiteSpace)
                sb.Append("\tWhiteSpace");
            else
                sb.Append("\t\t");
            if (isTrimmed)
                sb.Append("\tTrimmed");
            sb.AppendLine();    // use sb.Append("\r\n"); for .NET 1.1
        }
    }
    MessageBox.Show(sb.ToString());

Compared to the documented list, this indicates that U+0085, U+1680, U+2028 and U+2029 will also be trimmed, despite not being listed, while the whitespace characters U+180E, U+202F and U+205F will not be trimmed. The characters U+200B and U+FEFF are not considered whitespace characters but will be trimmed anyway.

Upon even further research: in .NET 1.1 the list is correct and only the documented characters are trimmed, but the documentation has not been updated for .NET 2.0.

On Tue, 07 Nov 2006 09:16:53 +0100, Morten Wennevik <MortenWenne***@hotmail.com> wrote:
> The list is the same for any .Net 1.0, 1.1 or 2.0 or possibly above too.

--
Happy Coding!
Morten Wennevik [C# MVP]

Many thanks
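For anyone who wants the list as data rather than in a message box, here is a minimal sketch of the same probing idea wrapped in a reusable method. The names TrimProbe, GetTrimmedChars and Format are made up for illustration and are not part of the framework. List<char> assumes .NET 2.0 or later; on 1.1 an ArrayList would be needed instead. As the thread shows, the result depends on the framework version the code runs on, so treat it as a way to inspect your own runtime rather than as a fixed list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a framework API): asks the running framework
// which characters the parameterless String.Trim() actually removes.
public static class TrimProbe
{
    // Returns every UTF-16 code unit that Trim() strips when it is the
    // only character in the string.
    public static List<char> GetTrimmedChars()
    {
        List<char> trimmed = new List<char>();
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            // If the one-character string trims to empty, c was removed.
            if (c.ToString().Trim().Length == 0)
            {
                trimmed.Add(c);
            }
        }
        return trimmed;
    }

    // Formats a character as a code point label such as "U+0020",
    // which is easier to paste into documentation than the raw character.
    public static string Format(char c)
    {
        return "U+" + ((int)c).ToString("X4");
    }
}
```

Calling TrimProbe.GetTrimmedChars() once at startup and writing TrimProbe.Format(c) for each entry gives a list that matches what Trim() really does on that machine, which is the point Morten makes about not relying on char.IsWhiteSpace or on the documented list alone.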