A nagging Regex question

Given some text that looks like:

random Text Key 1.00
More random Text
Item more random text = "Wanted1", "More random Text"
even more random Text
EndItem
Item more random text = "Wanted2", "Quoted Random Text" more Random Text
even more random Text
even more random Text
EndItem
even more random Text
even more random Text

(where the random Text does not contain "Item", "=", or any """")

I would like to, in one Regex, capture such that
Group(1) = 1.00
Group(2) = Wanted1
Group(3) = Wanted2

I have been reduced to using 2 Regexs:
Key (\d\.\d+) -- to capture the 1.00 and
Item[^=]+=\s?"(\w+)" to capture Wanted1 and Wanted2

All attempts to combine the two Regexs result in (at best):
Group(1) = 1.00
Group(2) = Wanted2

I understand that the greedy match leads me to this result, but it seems to me that this should be doable in 1 Regex.

--
Thanks for any guidance,
Jim Parsells

> I would like to, in one Regex, capture such that
> Group(1) = 1.00
> Group(2) = Wanted1
> Group(3) = Wanted2

Regular Expressions match patterns. A pattern is a set of rules regarding the text to match. What you have posted is not a set of rules. Therefore, it is not possible to determine what your Regular Expression should be.

For example, do you really want to capture the literal string "1.00"? If so, you can use "1.00". But your Regular Expression indicates that the rules for this match are:

(\d\.\d+) - Exactly 1 digit character, followed by exactly 1 dot, followed by at least 1 digit character.

So, that Regular Expression would match:
2.1
1.0
3.987654321

But would *not* match:
5
100
67.9 (it would capture "7.9")
-9.5 (it would capture "9.5")

Could you please define the rules exactly for both patterns?

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer

Presuming that God is "only an idea" -
Ideas exist.
Therefore, God exists.
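
The match/no-match list for (\d\.\d+) can be checked with a short script. This is only a sketch: the thread concerns .NET's Regex class, and Python's `re` module is swapped in here for illustration. The helper name `first_capture` is made up for this example. Because the pattern is not anchored, a search finds "7.9" inside "67.9" rather than rejecting the input outright.

```python
import re

# The pattern from the post: one digit, one literal dot, one or more digits.
KEY_NUMBER = re.compile(r"(\d\.\d+)")


def first_capture(text):
    """Return what group 1 captures on an unanchored search, or None."""
    m = KEY_NUMBER.search(text)
    return m.group(1) if m else None


# Samples taken from the post above.
for sample in ["2.1", "1.0", "3.987654321", "5", "100", "67.9", "-9.5"]:
    print(f"{sample!r:>14} -> {first_capture(sample)!r}")
```

If whole-string matching were wanted instead, `KEY_NUMBER.fullmatch(text)` would reject "67.9" and "-9.5" entirely.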

Indeed, the (\d\.\d+) is capturing exactly what I want. If the data is
3.987654321, then that is what I want to capture.

By the same token, I wish to capture the first quoted value following each literal occurrence of "Item" that is not a part of "EndItem" (though the pattern I am using doesn't exclude "EndItem", I know).

The meat of the question is that "Item" .....quoted text ..... "EndItem" will occur 1 or more times, and I wish to capture ALL of the quoted text occurrences, not just the first or last. The pattern Item[^=]+=\s?"(\w+)", used alone, will do just that. However, for example, the pattern:

(?:Key (\d\.\d+).*){1,1}.?(?:(?:Item[^=]+=\s?"(\w+)"))+?

will only capture "1.00" and "Wanted2" because of the greedy match behavior of the engine. I am looking for a pattern that will capture ALL instances of:
first quoted text occurring
After "Item" followed by NOT "=" followed by "="
and prior to "EndItem".

This is actually a recurring requirement: A need to capture some info at the beginning of a string followed by recurring groups of identically formatted information. For example, segments of XML may follow this pattern.

Thanks, Jim Parsells
--
Jim Parsells
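
The behavior described in this post can be reproduced with a small script. It is a sketch under two assumptions: Python's `re` module stands in for the .NET engine the thread is about, and `re.DOTALL` is used so that `.` crosses line breaks (without it, the combined pattern finds no match at all on this multi-line sample). The greedy `.*` runs to the end of the input and backtracks only as far as the last place where the Item part can match, so the single match sees just "Wanted2".

```python
import re

# Sample text from the original post.
SAMPLE = """random Text Key 1.00
More random Text
Item more random text = "Wanted1", "More random Text"
even more random Text
EndItem
Item more random text = "Wanted2", "Quoted Random Text" more Random Text
even more random Text
even more random Text
EndItem
even more random Text
even more random Text
"""

# The Item pattern on its own, applied repeatedly across the text.
ITEM = re.compile(r'Item[^=]+=\s?"(\w+)"')

# The combined pattern from the post; DOTALL is an assumption (see above).
COMBINED = re.compile(
    r'(?:Key (\d\.\d+).*){1,1}.?(?:(?:Item[^=]+=\s?"(\w+)"))+?',
    re.DOTALL,
)

items_alone = ITEM.findall(SAMPLE)
combined = COMBINED.search(SAMPLE)

print(items_alone)        # every quoted value after an Item
print(combined.groups())  # one match: the key and only the last item
```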

Hi Jim,
Forgive me, but I am very careful about making sure that I know what I'm discussing before I say anything about it, so I tend to ask more questions!

Now that you've confirmed what the rules are, I think the solution is fairly easy:

Key (\d\.\d+)|Item[^=]+=\s?"(\w+)"

What this does is combine the 2 Regular Expressions using the "|" (or) operator, grouping the results as Group 1 (Key value) and Group 2 (Item value).

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
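
A minimal sketch of how the alternation pattern might be used, again with Python's `re` module standing in for .NET (there, iterating over the result of Regex.Matches would play the same role). The function name `extract` is hypothetical. Note that the pattern does not yield one match with Group(1) through Group(3); it yields several matches, each filling either group 1 (the Key value) or group 2 (an Item value), and the caller collects them.

```python
import re

# Sample text from the original post.
SAMPLE = """random Text Key 1.00
More random Text
Item more random text = "Wanted1", "More random Text"
even more random Text
EndItem
Item more random text = "Wanted2", "Quoted Random Text" more Random Text
even more random Text
even more random Text
EndItem
even more random Text
even more random Text
"""

# The alternation pattern suggested above.
KEY_OR_ITEM = re.compile(r'Key (\d\.\d+)|Item[^=]+=\s?"(\w+)"')


def extract(text):
    """Collect the Key value and every Item value from successive matches."""
    key = None
    items = []
    for m in KEY_OR_ITEM.finditer(text):
        if m.group(1) is not None:
            key = m.group(1)          # the "Key ..." branch matched
        else:
            items.append(m.group(2))  # the "Item ..." branch matched
    return key, items


key, items = extract(SAMPLE)
print(key, items)
```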

Thanks Kevin. I must have some mental block about "|". That is exactly what
I was looking for.
--
Jim Parsells

> I must have some mental block about "|".

Don't be hard on yourself, Jim. Writing efficient Regular Expressions is a challenging art. I am constantly on the lookout for more efficient solutions than I have thought of, and constantly finding them!

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer

I am not 100% sure of what you are trying to do with your Regex, so I cannot
work out the string and be sure it will be of help to you, but I can give you a way of setting it up.

Here is what I think you are trying to do (although I have some questions from the way your post is written): Find a line where there is an equal sign that has either a) a word value or b) a decimal value.

Take it to the next level and pseudo code it:

(Find Equal sign) with ((word) or (decimal value))

If you need something before the equal sign (the area I am fuzzy on looking at your post), you can set the condition(s) before the find on the equal sign.

Great site for playing:
http://www.regular-expressions.info/tutorial.html

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

***************************
Think Outside the Box!
***************************

Thanks for the response. However, the question has been answered.
See my elaboration to Kevin and his reply. Problem solved.

Key (\d\.\d+)|Item[^=]+=\s?"(\w+)"

I was looking for a more complex solution when the simple did the job.

--
Jim Parsells