Author
15 Jun 2006 10:07 PM
Serge
Hello,

I have a log file that looks something like this

ABCDEF ddfs adasd
A BRE asdd asd dfddf
EROI DFIOU eeroo
B BRE errt ssdrr
AAA eIR DFDF
C BRE AAA asdd

All lines are separated by NEWLINE (\r\n). I want to extract lines that start
with BRE all the way to the end of the line and put them into a collection or
an array. So in this case I want lines 2, 4, and 6.

Do any of you RegularExpressions gurus have an idea?

Thank you

Author
15 Jun 2006 10:17 PM
Sanjib Biswas
In your example, you said the line starts with BRE, but at the end you said you
want lines 2, 4, and 6. So I am assuming you mean lines containing BRE.
In that case, after you have opened the log file, read a line and do a string
match to see whether that line contains BRE, and if it's true then store that
line in an array or collection.

Regards
Sanjib

Author
15 Jun 2006 10:28 PM
Serge
That's the thing... I'm asking for a RegularExpression pattern. I know I can
just loop through all lines and use IndexOf and Substring, but I have a huge
file and it will take forever. That is why I'm asking if anyone has more
experience with RegularExpressions, because they are new to me.

Also yes, I said I want to extract all lines that start with BRE. So in my
example I want lines

BRE asdd asd dfddf
BRE errt ssdrr
BRE AAA asdd

Notice how I don't want the first character (or it could be more than 1
character). I just want the part of the line that starts with BRE and runs to
the end of the line.

Thank you very much for your time

Serge


Author
15 Jun 2006 11:03 PM
Kevin Spencer
Hi Serge,

I'm a little confused by your first and second example. You mentioned
something about not wanting the "first letter," and in your first example,
the lines with "BRE" in them all started with a single letter, but the other
lines did not, and in your second example, the lines did *not* start with a
first letter.

Be that as it may, I know you're chomping at the bit to use regular
expressions here, but in this case you don't want to use a Regular
Expression, even though it would be easy enough to write. Why? Because you
said "I have a huge file." Regular Expressions work with strings, and I
don't think that (1) you want to read a "huge file" into a single string,
and (2) use a regular expression on a string that large.

In fact, from what you've described about the size of the file, and wanting
to parse by line, your best bet (IMHO) would be to use a TextReader to read
the file one line at a time, and use String.IndexOf to evaluate whether or
not to include that line in your results. You could, for example, read the
lines one at a time into a single reusable character array.
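
A minimal sketch of the line-at-a-time approach described above. The thread itself contains no code, so this is written in Python rather than .NET purely for illustration: `str.find` stands in for String.IndexOf, and the function name `extract_from_marker` is hypothetical. The sample data is the example from the first post.

```python
import io

def extract_from_marker(lines, marker="BRE"):
    """Return the tail of each line, starting at the first occurrence of marker."""
    results = []
    for line in lines:
        idx = line.find(marker)              # -1 when the marker is absent
        if idx != -1:
            # Keep everything from the marker onward, minus the line terminator.
            results.append(line[idx:].rstrip("\r\n"))
    return results

# In-memory stand-in for the log file; a real file object can be passed the same way.
sample = io.StringIO(
    "ABCDEF ddfs adasd\r\n"
    "A BRE asdd asd dfddf\r\n"
    "EROI DFIOU eeroo\r\n"
    "B BRE errt ssdrr\r\n"
    "AAA eIR DFDF\r\n"
    "C BRE AAA asdd\r\n"
)
matches = extract_from_marker(sample)
```

Only one line is held in memory at a time, which is the point of the suggestion; whether it is fast enough for a 10-50 MB file would have to be measured.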

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

A lifetime is made up of
Lots of short moments.

Author
15 Jun 2006 11:19 PM
Serge
Thank you for your response. I already tried it on a PER LINE basis. It
just takes too long. By HUGE file I mean about 10-50 MB. The example I
showed is not the actual file. I was just trying to get a sense of what
pattern to use, as RegEx is a lot faster at string manipulation than the
IndexOf, Substring thing. Basically here are 4 lines from the actual log file.


3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:21AM
3148:48 ALD: 17 13 51: penelope baraz King-Ice1 12:26AM
3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:34AM
3148:48 LLD: 17 13 51: penelope baraz King-Ice1 12:45AM

As you can see I have 4 lines in this example but I want to extract lines:

BRE: 17 13 51: penelope baraz King-Ice1 12:21AM
BRE: 17 13 51: penelope baraz King-Ice1 12:34AM

Notice that before the word BRE there is some other info that I don't want.

Looping through all lines takes too much time; however, it takes only a sec or
2 to read it into a string variable, so I don't think it's a problem.

Thank you very much.
Author
16 Jun 2006 2:12 AM
Kevin Spencer
Hi Serge,

I see. Well, parsing it is likely to add some memory to the equation, but
you could read it in blocks if necessary. I still think a Regular Expression
would not be the way to go, though. Regular Expressions do some
backtracking, and I think that wouldn't be necessary. How about if you read
a block (or the whole) into an array of characters? You could then move
through the array one character at a time. The sequence would be a loop (in
pseudo-code):

Start at the beginning of the string, or at the first line break character
or sequence ("\r\n" or '\r' - depending on the document type). Read one
character at a time.
Find the character 'B'.
See if it is followed by 'R'.
If so, see if it is followed by an 'E'.
If all 3 are found in a row, read to the next line break.

Basically, that is what a regular expression does, but in a more roundabout
fashion, with backtracking, etc., because it is not looking for literal
characters, but for patterns. Since you're looking for literal characters in
a specific sequence, this solution would be faster, especially if you used a
pointer.
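
The steps above can be sketched roughly as follows. This is an illustrative Python version, not code from the thread: the function name `scan_for_marker` is hypothetical, the whole text is assumed to already be in memory, and `str.startswith(marker, i)` is used as shorthand for checking 'B', then 'R', then 'E' at position i. The sample lines are the log lines quoted earlier in the thread.

```python
def scan_for_marker(text, marker="BRE"):
    """Walk the text one position at a time and collect marker-to-end-of-line spans."""
    results = []
    i, n = 0, len(text)
    while i < n:
        if text.startswith(marker, i):       # 'B', 'R', 'E' found in a row at i
            end = i
            # Read forward to the next line break (or the end of the text).
            while end < n and text[end] not in "\r\n":
                end += 1
            results.append(text[i:end])
            i = end                          # resume scanning at the line break
        else:
            i += 1
    return results

log = (
    "3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:21AM\r\n"
    "3148:48 ALD: 17 13 51: penelope baraz King-Ice1 12:26AM\r\n"
    "3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:34AM\r\n"
    "3148:48 LLD: 17 13 51: penelope baraz King-Ice1 12:45AM\r\n"
)
found = scan_for_marker(log)
```

The sketch only mirrors the logic of the steps. In an interpreted language a hand-written character loop like this is usually slower than built-in search routines, so it says nothing about the speed claim made for a pointer-based version.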

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

A lifetime is made up of
Lots of short moments.

Author
16 Jun 2006 2:49 PM
Serge
Thank you for your response. I will try to do it by using an array of
characters. Meanwhile, could you show me the RegEx pattern that I can use? I
just want to compare the 2 approaches and use the faster of the 2.

Thank you very much

Author
16 Jun 2006 3:30 PM
Kevin Spencer
Hi Serge,

Here you go:

(?m)BRE.*$

Explanation:

"(?m)" means caret (^) and dollar sign ($) match at line breaks.
Match the letters "BRE" followed by zero or more characters that are not
line breaks (.*), followed by a line break or the end of the string ($).
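
A small sketch of how this pattern behaves on "\r\n"-terminated text, using Python's `re` module as a stand-in for the .NET engine (an assumption; the two flavors are not identical). In Python, "." matches every character except "\n" and a multiline "$" matches just before "\n", so each match keeps a trailing "\r" that has to be stripped. As far as I know .NET treats "\r" the same way, but that is worth checking against its documentation.

```python
import re

log = (
    "3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:21AM\r\n"
    "3148:48 ALD: 17 13 51: penelope baraz King-Ice1 12:26AM\r\n"
    "3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:34AM\r\n"
    "3148:48 LLD: 17 13 51: penelope baraz King-Ice1 12:45AM\r\n"
)

# (?m) makes $ match at the end of every line, not only at the end of the string.
raw = re.findall(r"(?m)BRE.*$", log)

# Each raw match still ends with the carriage return that precedes "\n".
cleaned = [m.rstrip("\r") for m in raw]
```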

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

A lifetime is made up of
Lots of short moments.

Author
16 Jun 2006 3:58 PM
Serge
Thank you very much for your help. I've tried using your pattern but for some
reason it didn't return any matches. But after playing around with it I finally
came up with a pattern that works.

BRE.*?\n

Also it works about 10 times faster than looping through each line. Thank
you very much Kevin for inspiring me :)
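
Two things to watch with a pattern that ends in "\n", sketched here in Python's `re` module (an assumption; the thread is about .NET, whose regex flavor differs in places). The matched text includes the line terminator, and a final line with no trailing newline is not matched at all. The sample text below is made up for illustration, and `BRE[^\r\n]*` is one possible alternative, not a pattern from the thread.

```python
import re

# Hypothetical sample: the last line has no trailing newline.
log = "1 BRE: first\r\n2 ALD: skip\r\n3 BRE: last line, no newline"

# The lazy .*? stops at the first "\n", so the match carries "\r\n" with it.
hits = re.findall(r"BRE.*?\n", log)

# A negated character class stops before any line-break character instead,
# and also matches when the text ends without one.
safer = re.findall(r"BRE[^\r\n]*", log)
```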

Serge

Show quote
"Kevin Spencer" wrote:

> Hi Serge,
>
> Here you go:
>
> (?m)BRE.*$
>
> Explanation:
>
> "(?m)" means caret (^) and dollar sign ($) match at line breaks.
> Match the letters "BRE" followed by zero or more characters that are not
> line break (.*), followed by a line break or the end of the string ($).
>
> --
> HTH,
>
> Kevin Spencer
> Microsoft MVP
> Professional Chicken Salad Alchemist
>
> A lifetime is made up of
> Lots of short moments.
>
> "Serge" <Se***@discussions.microsoft.com> wrote in message
> news:A66BA6B9-E908-4CA7-A7BE-EC8A1FFD1354@microsoft.com...
> > Thank you for your response. I will try to do it by using an array of
> > characters. Meanwhile, could you show me the RegEx patter that I can use?
> > I
> > just want to compare the 2 approaches and use the faststest of the 2.
> >
> > Thank you very much
> >
> > "Kevin Spencer" wrote:
> >
> >> Hi Serge,
> >>
> >> I see. Well, parsing it is likely to add some memory to the equation, but
> >> you could read it in blocks if necessary. I still think a Regular
> >> Expression
> >> would not be the way to go, though. Regular Expressions do some
> >> backtracking, and I think that wouldn't be necessary. How about if you
> >> read
> >> a block (or the whole) into an array of characters? You could then move
> >> through the array one character at a time. The sequence would be a loop
> >> (in
> >> pseudo-code):
> >>
> >> Start at the beginning of the string, or at the first line break
> >> character or sequence ("\r\n" or '\r' - depending on the document type).
> >> Read one character at a time.
> >> Find the character 'B'.
> >> See if it is followed by 'R'.
> >> If so, see if it is followed by an 'E'.
> >> If all 3 are found in a row, read to the next line break.
> >>
> >> Basically, that is what a regular expression does, but in a more
> >> roundabout
> >> fashion, with backtracking, etc., because it is not looking for literal
> >> characters, but for patterns. Since you're looking for literal characters
> >> in
> >> a specific sequence, this solution would be faster, especially if you
> >> used a
> >> pointer.
> >>
> >> --
> >> HTH,
> >>
> >> Kevin Spencer
> >> Microsoft MVP
> >> Professional Chicken Salad Alchemist
> >>
> >> A lifetime is made up of
> >> Lots of short moments.
> >>
> >> "Serge" <Se***@discussions.microsoft.com> wrote in message
> >> news:85214A0F-A301-4E06-BA7A-A3D1EBD43200@microsoft.com...
> >> > Thank you for your response. I already tried using it on a PER LINE
> >> > basis. It just takes too long. By HUGE file I mean about 10-50 MB. The
> >> > example I showed is not the actual file. I was just trying to get a
> >> > sense of what pattern to use, as RegEx is a lot faster at string
> >> > manipulation than the IndexOf, Substring thing. Basically here are a
> >> > few lines from the actual log file.
> >> >
> >> >
> >> > 3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:21AM
> >> > 3148:48 ALD: 17 13 51: penelope baraz King-Ice1 12:26AM
> >> > 3148:48 BRE: 17 13 51: penelope baraz King-Ice1 12:34AM
> >> > 3148:48 LLD: 17 13 51: penelope baraz King-Ice1 12:45AM
> >> >
> >> > As you can see I have 4 lines in this example but i want to extract
> >> > lines:
> >> >
> >> > BRE: 17 13 51: penelope baraz King-Ice1 12:21AM
> >> > BRE: 17 13 51: penelope baraz King-Ice1 12:34AM
> >> >
> >> > Notice that before the word BRE there are some other info that I dont
> >> > want.
> >> >
> >> > Looping through all lines takes too much time however it takes only a
> >> > sec
> >> > or
> >> > 2 to read it into a string variable so I dont think its a problem.
> >> >
> >> > Thank you very much.
> >> > "Kevin Spencer" wrote:
> >> >
> >> >> Hi Serge,
> >> >>
> >> >> I'm a little confused by your first and second example. You mentioned
> >> >> something about not wanting the "first letter," and in your first
> >> >> example,
> >> >> the lines with "BRE" in them all started with a single letter, but the
> >> >> other
> >> >> lines did not, and in your second example, the lines did *not* start
> >> >> with
> >> >> a
> >> >> first letter.
> >> >>
> >> >> Be that as it may, I know you're chomping at the bit to use regular
> >> >> expressions here, but in this case you don't want to use a Regular
> >> >> Expression, even though it would be easy enough to write. Why? Because
> >> >> you
> >> >> said "I have a huge file." Regular Expressions work with strings, and
> >> >> I
> >> >> don't think that (1) you want to read a "huge file" into a single
> >> >> string,
> >> >> and (2) use a regular expression on a string that large.
> >> >>
> >> >> In fact, from what you've described about the size of the file, and
> >> >> wanting
> >> >> to parse by line, your best bet (IMHO) would be to use a TextReader to
> >> >> read
> >> >> the file one line at a time, and use String.IndexOf to evaluate
> >> >> whether
> >> >> or
> >> >> not to include that line in your results. You could, for example, use
> >> >> a
> >> >> single character array to read the lines into one at a time.
> >> >>
> >> >> --
> >> >> HTH,
> >> >>
> >> >> Kevin Spencer
> >> >> Microsoft MVP
> >> >> Professional Chicken Salad Alchemist
> >> >>
> >> >> A lifetime is made up of
> >> >> Lots of short moments.
> >> >>
> >> >> "Serge" <Se***@discussions.microsoft.com> wrote in message
> >> >> news:A5EFF452-BB34-44CB-8477-7FAE1A3EC505@microsoft.com...
> >> >> > Thats the thing...Im asking for a RegularExpression pattern. I know
> >> >> > i
> >> >> > can
> >> >> > just loop through all lines and use IndexOf and Substring...but I
> >> >> > have
> >> >> > a
> >> >> > huge
> >> >> > file and it will take forever. That is why Im asking if anyone has
> >> >> > more
> >> >> > experience with RegularExpressions cause it is new to me.
> >> >> >
> >> >> > Also yes, I said I want to extract all lines that start with BRE. So
> >> >> > in
> >> >> > my
> >> >> > example I want lines
> >> >> >
> >> >> > BRE asdd asd dfddf
> >> >> > BRE errt ssdrr
> >> >> > BRE AAA asdd
> >> >> >
> >> >> > Notice how I dont want the first character (or it could be more than
> >> >> > 1
> >> >> > character) I just want to get the line that starts with BRE to the
> >> >> > end
> >> >> > of
> >> >> > the
> >> >> > line.
> >> >> >
> >> >> > Thank you very much for your time
> >> >> >
> >> >> > Serge
> >> >> >
> >> >> >
> >> >> > "Sanjib Biswas" wrote:
> >> >> >
> >> >> >> In your example, you said line starts with BRE but at the end you
> >> >> >> said
> >> >> >> you
> >> >> >> want lines 2,4, and 6. So I am assuming you mean to say line
> >> >> >> containing
> >> >> >> BRE.
> >> >> >> In that case after you have open the log file, read a line and do a
> >> >> >> string
> >> >> >> match to see whether that line contains BRE and if its true then
> >> >> >> store
> >> >> >> that
> >> >> >> line into an array or collection.
> >> >> >>
> >> >> >> Regards
> >> >> >> Sanjib
> >> >> >>
> >> >> >> "Serge" <Se***@discussions.microsoft.com> wrote in message
> >> >> >> news:501E40F3-F275-457B-9956-440D1324C7A8@microsoft.com...
> >> >> >> > Hello,
> >> >> >> >
> >> >> >> > I have a log file that looks something like this
> >> >> >> >
> >> >> >> > ABCDEF ddfs adasd
> >> >> >> > A BRE asdd asd dfddf
> >> >> >> > EROI DFIOU eeroo
> >> >> >> > B BRE errt ssdrr
> >> >> >> > AAA eIR DFDF
> >> >> >> > C BRE AAA asdd
> >> >> >> >
> >> >> >> > All lines are seperated by NEWLINE (\r\n) I want to extract lines
> >> >> >> > that
> >> >> >> > start
> >> >> >> > with BRE all the way to the end of the line and put them into a
> >> >> >> > collection
> >> >> >> > or
> >> >> >> > an array. So in this case I want line 2,4,6
> >> >> >> >
> >> >> >> > Does any of you RegularExpressions gurus have an idea?
> >> >> >> >
> >> >> >> > Thank you
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>
>
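
The character-by-character loop described in Kevin's quoted pseudo-code could be sketched roughly as follows. This is an illustration in Python rather than the .NET code the thread has in mind. The function name scan_for_bre is hypothetical, str.startswith stands in for the separate 'B', 'R', 'E' checks, and no claim is made here about how its speed compares with the regex approach.

```python
def scan_for_bre(text):
    """Collect every run that starts at "BRE" and ends before the next line break."""
    results = []
    i, n = 0, len(text)
    while i < n:
        # Check for 'B' followed by 'R' followed by 'E' at the current position.
        if text.startswith("BRE", i):
            j = i
            # Read forward to the next line break character (\r or \n) or the end.
            while j < n and text[j] not in "\r\n":
                j += 1
            results.append(text[i:j])
            i = j
        else:
            i += 1
    return results


if __name__ == "__main__":
    # Sample taken from the first post in the thread, joined with \r\n.
    sample = "\r\n".join([
        "ABCDEF ddfs adasd",
        "A BRE asdd asd dfddf",
        "EROI DFIOU eeroo",
        "B BRE errt ssdrr",
        "AAA eIR DFDF",
        "C BRE AAA asdd",
    ])
    print(scan_for_bre(sample))
```

Because the scan stops before \r or \n, the collected lines carry no trailing line-ending characters.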
Author
16 Jun 2006 4:00 PM
Serge
Never mind... :) Your solution worked as well. Thank you Kevin

Author
16 Jun 2006 8:50 PM
Kevin Spencer
:-D

--

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

I recycle.
I send everything back to the planet it came from.

