
string.Trim() and White spaces list?

Author
7 Nov 2006 6:48 AM
adi
Hi

I'm working on the documentation for my application.
I need to explain to the reader that white space will be removed from
a text.
I use the string.Trim() method. Note: no arguments are passed to the method.
It is not enough to tell this to an untrained person; I need to give
him the complete list of white-space characters, like:
1. space: ' '
2. tab: '\t'

My knowledge of what "whitespace" means stops here: the space character and
the tab character. What else?
Can I query the framework at runtime for the complete list of whitespace
characters? I'm only able to test whether a particular character is
whitespace (using char.IsWhiteSpace(...)).

Thanks.

Author
7 Nov 2006 7:19 AM
Morten Wennevik
Hi Adi,

There is a list of whitespace characters under the documentation for 
String.Trim()

http://msdn2.microsoft.com/en-us/library/t97s7bs3(VS.80).aspx






--
Happy Coding!
Morten Wennevik [C# MVP]
Author
7 Nov 2006 7:42 AM
adi
Thanks Morten

The list is very useful.
Now, for the second part of my question: is it possible to get
this list at runtime?
Note: I'm (still) using the 1.1 version of the framework, but solutions
for later versions are welcome.

Thanks.
Adi.


Author
7 Nov 2006 8:16 AM
Morten Wennevik
The list is the same for .Net 1.0, 1.1 and 2.0, and possibly later versions too.

As for getting this list at runtime I don't see how you can do that other 
than testing for Char.IsWhiteSpace for a whole range of numbers, which may 
take some time to compute.  I did a few tests and I ended up with a list 
with far more characters than listed under String.Trim when using 
Char.IsWhiteSpace.

Why do you need this list programmatically anyway?
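
For reference, the brute-force scan described above can be sketched roughly as follows. This is only a minimal sketch: the class and method names (WhiteSpaceScan, FindWhiteSpaceChars) are made up for illustration, and List<T> assumes .Net 2.0 or later (on 1.1 an ArrayList would be the likely substitute). The loop covers just 65,536 values, so it should finish quickly in practice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework.
static class WhiteSpaceScan
{
    // Returns every UTF-16 code unit for which char.IsWhiteSpace is true
    // on the runtime the program is executed on.
    public static List<char> FindWhiteSpaceChars()
    {
        List<char> result = new List<char>();
        // An int counter is used on purpose: a char counter would wrap
        // around at char.MaxValue and the loop would never end.
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.IsWhiteSpace(c))
                result.Add(c);
        }
        return result;
    }

    static void Main()
    {
        // Print each code unit as U+XXXX.
        foreach (char c in FindWhiteSpaceChars())
            Console.WriteLine("U+{0:X4}", (int)c);
    }
}
```

The result depends on the Unicode data shipped with the runtime, so the printed list may differ between framework versions.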




--
Happy Coding!
Morten Wennevik [C# MVP]
Author
7 Nov 2006 8:59 AM
Morten Wennevik
Actually, you can't use IsWhiteSpace to determine which characters are
trimmed, as there are whitespace characters that are not trimmed.
Furthermore, there are characters that are trimmed but not listed in
the documentation.

In the end, to get the proper list you may need to try trimming every
single character to determine whether String.Trim() removes it.

The code below displays which characters are considered whitespace and
which are trimmed.

             // Requires: using System.Text; using System.Windows.Forms;
             StringBuilder sb = new StringBuilder();
             // The upper bound is inclusive so that U+FFFF is tested as well.
             for (int i = 0; i <= 65535; i++)
             {
                 char c = (char)i;
                 string s = c.ToString();

                 // List the character if either test considers it white space.
                 if (char.IsWhiteSpace(c) || s.Trim().Length == 0)
                 {
                     // Code point as four hex digits, e.g. 0020.
                     sb.Append(i.ToString("X").PadLeft(4, '0'));
                     if (char.IsWhiteSpace(c))
                         sb.Append("\tWhiteSpace");
                     else
                         sb.Append("\t\t");
                     if (s.Trim().Length == 0)
                         sb.Append("\tTrimmed");
                     sb.AppendLine(); // use sb.Append("\r\n"); for .Net 1.1
                 }
             }
             MessageBox.Show(sb.ToString());

Compared to the documented list, this indicates that U+0085, U+1680,
U+2028 and U+2029 are also trimmed, despite not being listed, while
whitespace characters U+180E, U+202F and U+205F are not trimmed.
Characters U+200B and U+FEFF are not considered whitespace characters but
are trimmed anyway.

Upon even further research: in .Net 1.1 the list is correct and only
documented characters are trimmed, but the documentation has not
been updated for .Net 2.0.
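
If only the discrepancies matter, a variation that lists just the characters where the two tests disagree might look like the sketch below. The names TrimProbe, IsTrimmed and FindDisagreements are hypothetical, and List<T> assumes .Net 2.0 or later. As far as I know, the later .NET documentation states that from .NET Framework 4 onward Trim() removes exactly the characters for which Char.IsWhiteSpace returns true, so on those runtimes the list is expected to come out empty; treat that as an assumption to verify on your own runtime.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the framework.
static class TrimProbe
{
    // True if the parameterless Trim() removes c from a one-character string.
    public static bool IsTrimmed(char c)
    {
        return c.ToString().Trim().Length == 0;
    }

    // Code units on which Trim() and char.IsWhiteSpace disagree
    // for the runtime the program is executed on.
    public static List<char> FindDisagreements()
    {
        List<char> result = new List<char>();
        // int counter: a char counter would wrap around at char.MaxValue.
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (IsTrimmed(c) != char.IsWhiteSpace(c))
                result.Add(c);
        }
        return result;
    }

    static void Main()
    {
        List<char> diff = FindDisagreements();
        Console.WriteLine("Characters where Trim() and IsWhiteSpace disagree: {0}", diff.Count);
        foreach (char c in diff)
            Console.WriteLine("U+{0:X4}  IsWhiteSpace={1}  Trimmed={2}",
                (int)c, char.IsWhiteSpace(c), IsTrimmed(c));
    }
}
```

Probing Trim() directly, rather than relying on a documented list, keeps the answer tied to whatever the installed framework actually does.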






--
Happy Coding!
Morten Wennevik [C# MVP]
Author
7 Nov 2006 10:59 AM
adi
Many thanks


