strings vs regular expressions

hi,
I need to compare or check for substrings in a given string. Which would give better performance: string-related compare functions or regular expressions?

Hello AVL,
AFAIK String.Contains will be the fastest, because it only calls IndexOf, whereas a regexp does a lot of wasted processing.

---
WBR, Michael Nemtsev [.NET/C# MVP].
My blog: http://spaces.live.com/laflour
Team blog: http://devkids.blogspot.com/

"The greatest danger for most of us is not that our aim is too high and we miss it, but that it is too low and we reach it" (c) Michelangelo

Hello,
but string.IndexOf has a very bad implementation. If you want a fast string search, look for a .NET implementation of the Boyer-Moore algorithm - this is also used in the regular expression internals. Depending on the length of the text being searched and the frequency, you might want to consider a precompiled regex.

Anyway, you should perform some performance testing yourself. It really depends on the circumstances.

Best regards,
Henning Krause

Hello Henning Krause [MVP - Exchange],
H> Anyway, you should perform some performance testing yourself. It
H> really depends on the circumstances.

That's the point, because we don't know what the OP is looking for.

On Apr 11, 9:48 am, "Henning Krause [MVP - Exchange]"
<newsgroups_rem***@this.infinitec.de> wrote:
> but string.IndexOf has very bad implementation. If you want a fast string
> search, look for a .NET implementation of the Boyer-Moore algorithm - this
> is also used in regular expressions internals.

I wouldn't say that IndexOf has a "very bad" implementation. In *some* cases it won't be as fast as doing the "pre-work" involved for Boyer-Moore, but I suspect in the vast majority of cases used in the real world, it's far quicker to use the "brute force" method, given that you're only looking for the string once (as far as String.IndexOf is concerned - you may be calling it multiple times, of course).

I suppose String.IndexOf could apply some heuristics and guess whether it's worth building the tables (or whatever) for Boyer-Moore, but as I say, in the vast majority of real cases it won't make any odds.

> Depending on the length of the text being searched and the frequency, you
> might want to consider a precompiled regex.
>
> Anyway, you should perform some performance testing yourself. It really
> depends on the circumstances.

Agreed. If you know you're going to have to search for the same string lots of times in a performance-critical environment, it may be worth using regular expressions. I would use Contains until I'd actually proved it was a bottleneck though :)

Jon

Hi,
> Agreed. If you know you're going to have to search for the same string
> lots of times in a performance-critical environment, it may be worth
> using regular expressions. I would use Contains until I'd actually
> proved it was a bottleneck though :)

"The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." - Michael A. Jackson

Hi AVL,
Just to clear things up regarding regular expressions versus string functions: use regular expressions when looking for a *pattern* of characters in a string, which may be different characters in the same pattern, and string functions when looking for substrings.

What I mean by "patterns" is, for example, a hyperlink in an HTML document. A hyperlink is a string that must follow certain rules. It must begin with the character sequence "<a" followed by one or more white space characters, followed by 0 or more attribute name=value pairs, followed by the ">" character. This is followed by a string of text that is followed by the "</a>" character sequence. Note that only some of the characters are specified, and you don't know what the rest of them will be. So, how do you look for a string that satisfies these rules? Example:

(?m)(?i)(?<=<a)(?:(?:\s+href=(?<href>[^>]+))|(?:\s+[^=>]+=[^>]+))*>(?<innerHtml>[^<]*)(?=</a>)

The above is a regular expression that identifies substrings that satisfy those rules. In addition, it captures 2 groups: one for the href attribute value, and one for the innerHtml (the link text) of the anchor. You could not use a string function to find this pattern.

Generally, string functions are faster than regular expressions, but when looking for patterns (groups of characters that satisfy rules), regular expressions are the fastest method.

--
HTH,
Kevin Spencer
Microsoft MVP

Printing Components, Email Components, FTP Client Classes, Enhanced Data Controls, much more.
DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net
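
To make Kevin's distinction concrete, here is a minimal C# sketch, not anything posted in the thread. It uses String.Contains for the plain substring check and the pattern from Kevin's post, copied as-is, for the hyperlink case. The class name, the method names and the sample HTML are assumptions made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class SubstringVsPattern
{
    // Pattern copied from Kevin's post; it is not a general-purpose HTML parser.
    const string AnchorPattern =
        @"(?m)(?i)(?<=<a)(?:(?:\s+href=(?<href>[^>]+))|(?:\s+[^=>]+=[^>]+))*>(?<innerHtml>[^<]*)(?=</a>)";

    static readonly Regex AnchorRegex = new Regex(AnchorPattern);

    // Fixed substring: a string function is enough (ordinal, case-sensitive).
    public static bool HasSubstring(string text, string value)
    {
        return text.Contains(value);
    }

    // Pattern with unknown characters in between: use the regular expression.
    public static Match FindFirstAnchor(string html)
    {
        return AnchorRegex.Match(html);
    }

    static void Main()
    {
        // Hypothetical sample input.
        string html = "<p>See <a href=\"http://example.com\">Example</a> for details.</p>";

        Console.WriteLine(HasSubstring(html, "example.com"));

        Match m = FindFirstAnchor(html);
        if (m.Success)
        {
            // The href group keeps the surrounding quotes, because [^>]+
            // takes everything up to the closing '>'.
            Console.WriteLine("href:      " + m.Groups["href"].Value);
            Console.WriteLine("innerHtml: " + m.Groups["innerHtml"].Value);
        }
    }
}
```

The substring check needs only the literal text, while the anchor lookup has to describe what may appear between "<a" and "</a>", which is where the regular expression is the right tool.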
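
Henning and Jon both recommend measuring before choosing. The following is a rough sketch of such a test, assuming made-up data, a made-up search term ("needle") and hypothetical helper names. It times String.Contains against a Regex built with RegexOptions.Compiled, using System.Diagnostics.Stopwatch. It is not a rigorous benchmark: there is no warm-up and only a single run, so treat any numbers as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class SearchTiming
{
    // Counts how many lines satisfy the given predicate.
    public static int CountHits(string[] lines, Func<string, bool> matches)
    {
        int hits = 0;
        foreach (string line in lines)
        {
            if (matches(line)) hits++;
        }
        return hits;
    }

    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        // Hypothetical test data: every tenth line contains the search term.
        string[] lines = new string[100000];
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = "line " + i + (i % 10 == 0 ? " needle" : " hay");
        }

        // Precompiled regex for the same literal text, escaped to be safe.
        Regex compiled = new Regex(Regex.Escape("needle"), RegexOptions.Compiled);

        int containsHits = 0, regexHits = 0;
        TimeSpan containsTime = Time(() => containsHits = CountHits(lines, s => s.Contains("needle")));
        TimeSpan regexTime = Time(() => regexHits = CountHits(lines, s => compiled.IsMatch(s)));

        Console.WriteLine("Contains: " + containsHits + " hits in " + containsTime.TotalMilliseconds + " ms");
        Console.WriteLine("Regex:    " + regexHits + " hits in " + regexTime.TotalMilliseconds + " ms");
    }
}
```

Results depend on the text length, the search term and the runtime version, so run it against data that resembles your real input.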