
Custom character escape in a regular expression?

Author
19 Apr 2006 9:57 PM
Bob
I am trying to parse a fairly simple expression:

arg1|arg2|arg3

The problem is that in arg2, sometimes I see the pipe character.  The
difference is that the pipe character is prefixed by a backslash.  For
example:

arg1|a\|rg2|arg3

Still at this point it wouldn't be too hard to parse, but the problem is that
these arguments may also contain any other standard character escapes.  For
example:

arg2\n|a\\\|rg2\\|arg3

Should match:

arg2{lf}
a\|rg2\
arg3

Is it possible to do this?

Thanks

Author
20 Apr 2006 11:13 AM
Kevin Spencer
(?:\\\\|\\\||\\\w+|\w+)+

I used a slight variation of yours with a line break to test:

arg2\n|a\\\|rg2\\|arg3|
a\|rg4
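
For anyone who wants to try the pattern on that input, here is a minimal sketch. Python is an assumption on my part, since the thread does not name a language; the names FIELD, text and fields are just for illustration.

```python
import re

# The pattern from the post, written as a Python raw string.
FIELD = re.compile(r"(?:\\\\|\\\||\\\w+|\w+)+")

# The two-line test input; every backslash here is a literal character.
text = r"arg2\n|a\\\|rg2\\|arg3|" + "\n" + r"a\|rg4"

# Each match is one item. Unescaped pipes and the line break are not
# matched by any alternative, so they end the current match.
fields = FIELD.findall(text)
for field in fields:
    print(field)
```

Note that the escapes are still in their raw form in each match; the pattern only isolates the items.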

The way this regular expression works is by alternation. In regular
expressions, alternatives are always tried in the order in which they appear.
That is, a later alternative is only tried if the ones before it fail to
match. So, first, it looks for "\\" - which effectively eliminates all "\\"
combinations from the group. This leaves:

arg2\n|a\|rg2|arg3|
a\|rg4

Next, it looks for the "\|" combination, and by matching that, effectively
eliminates it from the group, leaving:

arg2\n|arg2|arg3|
arg4

Next, it looks for the "\" + "\w+" combination (a backslash followed by one
or more "word" characters), leaving:

arg2|arg2|arg3|
arg4

Finally, it checks for any run of "\w" (word) characters, leaving:

|||
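
The elimination steps above can be reproduced with one substitution per alternative. This is only a sketch of the walk-through, again assuming Python. The real engine does not make separate passes; it tries the alternatives at each position in a single left-to-right scan.

```python
import re

# Same two-line test input as in the post.
text = r"arg2\n|a\\\|rg2\\|arg3|" + "\n" + r"a\|rg4"

# One alternative per pass, in the order they appear in the pattern.
passes = [r"\\\\", r"\\\|", r"\\\w+", r"\w+"]

remaining = text
for alternative in passes:
    # Delete everything this alternative matches and show what is left.
    remaining = re.sub(alternative, "", remaining)
    print(repr(remaining))
```

After the last pass only the three pipes and the line break are left, which are exactly the separators.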

The alternatives are wrapped in a single non-capturing group that is repeated
one or more times, so that each item comes back as one match.
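
The pattern only isolates the items, so the escapes still have to be decoded to get the output Bob listed. A rough sketch of that second step, assuming Python; the unescape and parse helpers and the NAMED table are hypothetical names of mine, not part of any library.

```python
import re

FIELD = re.compile(r"(?:\\\\|\\\||\\\w+|\w+)+")
ESCAPE = re.compile(r"\\(.)", re.DOTALL)

# Hypothetical escape table; extend it with whatever escapes the data uses.
NAMED = {"n": "\n", "t": "\t", "r": "\r"}

def unescape(field):
    # Named escapes become control characters. Anything else, such as
    # an escaped backslash or pipe, becomes the escaped character itself.
    return ESCAPE.sub(lambda m: NAMED.get(m.group(1), m.group(1)), field)

def parse(line):
    # Split into items with the pattern, then decode each item.
    return [unescape(f) for f in FIELD.findall(line)]

result = parse(r"arg2\n|a\\\|rg2\\|arg3")
print(result)
```

Two caveats: because the pattern accepts only word characters and backslash escapes, an item containing spaces or punctuation would be split into several matches, and empty items between two pipes are skipped.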

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

Hard work is a medication for which
there is no placebo.

