
Regular Expression Bug?

Author
10 Nov 2006 1:54 PM
Ringwraith
While working with regular expressions, I found an inconsistency between the
Escape and Match methods.

Does anyone have an explanation for this behaviour, or is this a bug?

A description of the problem follows:

'Escape'  DOES NOT recognize ']' as a metacharacter.
'Match(es)' DOES recognize ']' as a metacharacter.

void Wanted_but_not_working()
{
  string OpenMask = "[";
  string CloseMask = "]";
  OpenMask = Regex.Escape(OpenMask);
  OpenMask = Regex.Escape(OpenMask);
  //Expected Result: OpenMask="\\\[";
  CloseMask = Regex.Escape(CloseMask);
  CloseMask = Regex.Escape(CloseMask);
  //Expected Result: CloseMask="\\\]";  
  //Why is ']' not a metacharacter?
  //It has to be. Otherwise the Match below should work.
  //Either the Escape or the Match operation does not work properly.

  string actualDefinition = "[KG]*[IND]*[VHW]*[KZ]";
  actualDefinition = Regex.Escape(actualDefinition);
  //Expected Result: "\[KG\]\*\[IND\]\*\[VHW\]\*\[KZ\]"

  string pattern = String.Concat(OpenMask, "[^", OpenMask, CloseMask, "]+", CloseMask);
  //Expected Result: "\\\[[^\\\[\\\]]+\\\]"
  MatchCollection ma = Regex.Matches(actualDefinition, pattern,
    RegexOptions.Compiled | RegexOptions.CultureInvariant |
    RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
  //Expected Result: 4 matches... NOTHING
}

void Manual_and_working_but_not_wanted()
{
  string OpenMask = @"\\\[";
  string CloseMask = @"\\\]";

  string actualDefinition = @"\[KG\]\*\[IND\]\*\[VHW\]\*\[KZ\]";
  string pattern = String.Concat(OpenMask, "[^", OpenMask, CloseMask, "]+", CloseMask);
  MatchCollection ma = Regex.Matches(actualDefinition, pattern,
    RegexOptions.Compiled | RegexOptions.CultureInvariant |
    RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
  //Expected Result: 4 matches ... OK
}

void Crazy_Workaround()
{
  string OpenMask = "[";
  string CloseMask = @"\]";
  OpenMask = Regex.Escape(OpenMask);
  OpenMask = Regex.Escape(OpenMask);
  //Expected Result: OpenMask="\\\[";

  string actualDefinition = "[KG]*[IND]*[VHW]*[KZ]";
  actualDefinition = Regex.Escape(actualDefinition);
  string pattern = String.Concat(OpenMask, "[^", OpenMask, CloseMask, "]+", CloseMask);
  MatchCollection ma = Regex.Matches(actualDefinition, pattern,
    RegexOptions.Compiled | RegexOptions.CultureInvariant |
    RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
  //Expected Result: 4 matches... OK
}
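
A less ad hoc variant of the workaround might wrap Regex.Escape in a small helper that also escapes the closing characters it leaves alone. This is only a sketch: EscapeWithClosers and CountMaskedTokens are hypothetical names, not framework methods, and the helper assumes that a ']' or '}' in the input should always be treated as a literal.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeHelper
{
    // Hypothetical helper (not part of the framework): escape like Regex.Escape,
    // then also escape the closers ']' and '}' that Regex.Escape leaves as-is.
    public static string EscapeWithClosers(string text)
    {
        return Regex.Escape(text).Replace("]", @"\]").Replace("}", @"\}");
    }

    // Rebuilds the pattern from the first method above using the helper.
    public static int CountMaskedTokens()
    {
        string openMask = EscapeWithClosers(EscapeWithClosers("["));   // \\\[
        string closeMask = EscapeWithClosers(EscapeWithClosers("]"));  // \\\]
        string actualDefinition = EscapeWithClosers("[KG]*[IND]*[VHW]*[KZ]");
        string pattern = String.Concat(openMask, "[^", openMask, closeMask, "]+", closeMask);
        return Regex.Matches(actualDefinition, pattern,
            RegexOptions.CultureInvariant | RegexOptions.IgnoreCase |
            RegexOptions.IgnorePatternWhitespace).Count;
    }

    static void Main()
    {
        Console.WriteLine(CountMaskedTokens()); // 4
    }
}
```

Because Regex.Escape never adds a ']' or '}' of its own, every one left in its output came from the input, so the two Replace calls cannot double-escape anything.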

Author
10 Nov 2006 3:12 PM
Carl Daniel [VC++ MVP]
Ringwraith wrote:
> While working with regular expressions, I found an inconsistency between
> the Escape and Match methods.
>
> Does anyone have an explanation for this behaviour, or is this a bug?

I'd call it an oversight in Regex.Escape.  The issue is that ] is
technically not a metacharacter; nor is }.  These characters only have
special meaning when they follow an unmatched, unescaped instance of the
corresponding "open" character:  they're contextual metacharacters, if you
will.  (The same argument could be made for ), but .NET rejects an
unmatched ) in a pattern, and Regex.Escape does escape it.)
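
To make the "contextual" point concrete, here is a minimal sketch of how the closing bracket behaves with and without a preceding open bracket. The class name ContextualMetaDemo and the helper IsLeftAlone are made up for illustration; only Regex.Escape and Regex.IsMatch are framework calls.

```csharp
using System;
using System.Text.RegularExpressions;

static class ContextualMetaDemo
{
    // Hypothetical helper: true when Regex.Escape returns its input unchanged.
    public static bool IsLeftAlone(string text)
    {
        return Regex.Escape(text) == text;
    }

    static void Main()
    {
        Console.WriteLine(Regex.Escape("["));          // \[
        Console.WriteLine(IsLeftAlone("]"));           // True
        Console.WriteLine(IsLeftAlone("}"));           // True

        // With no open bracket before it, ']' is matched literally.
        Console.WriteLine(Regex.IsMatch("a]b", "]"));  // True

        // After an unescaped '[', the same character closes the character class.
        Console.WriteLine(Regex.IsMatch("]", "[x]"));  // False
        Console.WriteLine(Regex.IsMatch("x", "[x]"));  // True
    }
}
```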

You can submit a bug report on
http://connect.microsoft.com/feedback/default.aspx?SiteID=210

I'd expect that it will be closed as by-design though, since Regex.Escape is
working exactly as documented:

http://msdn2.microsoft.com/en-gb/library/system.text.regularexpressions.regex.escape.aspx

-cd
Author
11 Nov 2006 2:19 PM
Ringwraith
Thanks for the link to the feedback site.

This 'bug' was already submitted in March 2005. https://connect.microsoft.com/feedback/viewfeedback.aspx?FeedbackID=95536&wa=wsignin1.0&siteid=210
