RegEx to NOT find matches

In my project, I get the HTML code of a website, and I need to check for three texts: "Mail was send to...", "No data matching your search input" and "Request timed out".

The problem is that one of the matches always fits and comes back very fast, while the other two take forever to execute (and confirm that the text is NOT in the page given back). By "fast" I mean too quick to measure, and by "slow" I mean around 3 minutes!

The webpage (the text being searched) is between 60 and 150 KB long. The patterns are searched with:

Dim m4 As Match = Regex.Match(ResponsePage, ".*Script\stimed\sout.*", RegexOptions.Singleline Or RegexOptions.IgnoreCase)

Is it normal for a RegEx to take so long to confirm that something is NOT in the text? Can I speed that up or avoid it?

Thanks!
Ralf

On Fri, 29 Dec 2006 13:14:54 +0100, Ralf Hermanns wrote:
> In my project, I get the HTML code of a website, and I need to check for
> three texts: "Mail was send to...", "No data matching your search input"
> and "Request timed out"
>
> Problem is, one of the matches fits always and comes back very fast, the
> two others take forever to execute (and confirm this text is NOT in the
> page given back). With "fast" I mean "not measureable quick" and by
> "slow" I mean around 3 minutes!
>
> The webpage (=the text searched in) is between 60 and 150 kb long, the
> patters are searched with
> Dim m4 As Match = Regex.Match(ResponsePage, ".*Script\stimed\sout.*",
> RegexOptions.Singleline Or RegexOptions.IgnoreCase)
>
> Is that the normal behaviour of a RegEx to need so long to confirm
> something is NOT in the text? Can I speed that up or avoid or???
>
> Thanks!
> Ralf

Hey Ralf,

You could try a couple of things:

1) Pre-compile the regular expression, since you are clearly using it heavily.
2) Remove IgnoreCase and specify the error text in its exact case, so the regex has less work to do.
3) If the error page has an error number (e.g. an IIS error), look for the error number instead (e.g. 500, 401, etc.).
4) You might also want to investigate lazy quantifiers. They MIGHT help.

Hey Ralf,
As you are only concerned with the presence of a match, I would use the static (Shared) Regex.IsMatch method, as it returns a Boolean indicating whether there is a match rather than a match object.

Also, I would remove the .* from the start and end of the pattern, as they are not needed, and as Rad said I would remove the IgnoreCase option, as the Regex class will call input.ToLower() if the option is specified. I would then adjust the pattern as such:

"[Ss]cript\s[Tt]imed\s[Oo]ut"

Character classes are very fast, but this pattern only tolerates case differences in the first letter of each word.

The best course of action is to initialise the Regex once (instance members of the Regex class are thread safe), assign it to a static member and then use the instance IsMatch method.

Andy.
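
Put together, the suggestions in this thread might look like the VB.NET sketch below. It is not code from any of the posters: the module name ResponseChecks, the field TimedOut and the function HasTimedOut are made up for illustration, and reading "pre-compile" as RegexOptions.Compiled is an assumption. The leading and trailing .* are dropped because, in a backtracking engine, a leading .* under Singleline can run to the end of the page and then back off one character at a time from every starting position, which is the likely reason the failed searches were so slow.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical helper module; all names here are illustrative only.
Module ResponseChecks

    ' Built once and reused. No leading/trailing ".*", no IgnoreCase;
    ' the character classes cover a capital or lower-case first letter.
    ' RegexOptions.Compiled is one possible reading of "pre-compile".
    Private ReadOnly TimedOut As New Regex("[Ss]cript\s[Tt]imed\s[Oo]ut", RegexOptions.Compiled)

    ' Returns True if the page contains the "Script timed out" text.
    ' IsMatch only reports whether a match exists.
    Public Function HasTimedOut(ByVal responsePage As String) As Boolean
        Return TimedOut.IsMatch(responsePage)
    End Function

    Sub Main()
        ' Stand-in for a 150 KB response page that does NOT contain the text.
        Dim cleanPage As String = New String("x"c, 150000)
        ' The same page with the error text appended.
        Dim errorPage As String = cleanPage & " Script timed out"

        Console.WriteLine(HasTimedOut(cleanPage)) ' prints False
        Console.WriteLine(HasTimedOut(errorPage)) ' prints True
    End Sub

End Module
```

The same shape would apply to the other two texts: one shared Regex per pattern, created once, and a call to its instance IsMatch method per page.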