Regular Expressions

I need to take a short piece of html/xml as a string
For example: <tagName attribute1='value' attribute2=value/>

Note that one attribute value is quoted and one is not.

The input string could have any number of attributes, with any combination of quoted and unquoted values.

The output string needs all values quoted.

Can I do this with regular expressions, or can you offer another suggestion?

--
Bill

Bill Long wrote:
> I need to take a short piece of html/xml as a string
>
> For example: <tagName attribute1='value' attribute2=value/>
>
> Note that one attribute value is quoted and one is not.
>
> The input string could have any number of attributes with any combination
> of quoted and unquoted values.
>
> The output string needs all values quoted.
>
> Can I do this with regular expressions or can you offer another suggestion?

Sure, you can do that with regular expressions. For example, try to replace this expression

    =(?<val>[A-Za-z0-9_-]+)

with

    ="${val}"

I say "for example" because you'll need to figure out which characters you want to allow in a non-quoted attribute value. The example expression allows upper and lower case letters, digits, the underscore and the dash.

Oliver Sturm
--
Expert programming and consulting services available
See http://www.sturmnet.org (try /blog as well)

Here's one we worked out for getting HTML form field tag attributes. It also
checks for the 2 unnamed form field attributes "selected" or "checked". It can work with or without the trailing slash character. It puts the name of the attribute into group 1, the value into group 2, and any "selected" or "checked" into group 3. You can remove the condition for "selected" or "checked" if you like.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
Ambiguity has a certain quality to it.

Whoa. Just realized I didn't post the regular expression!

(?i)\s+(?:(\w+)=(?:["']?([^"'/>=]*)["']?)(?<!\s*(?:selected|checked))(?=\s|/?>)|(?:\s*(selected|checked))(?=\s|/?>))

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
Ambiguity has a certain quality to it.

I'll give this a try. Thank you

--
Bill
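
For anyone who wants to try the "replace =(?<val>...) with quoted ${val}" idea from earlier in the thread outside .NET, here is a minimal sketch in Python. It is only a port under stated assumptions: Python's re module spells the named group (?P<val>...) and the replacement reference \g<val>, and the function names quote_unquoted and quote_unquoted_safe are made up for this illustration. The second function is a variation that is not from the thread: it also matches values that are already quoted and returns them unchanged, so an "=" inside a quoted value is not touched.

```python
import re

# Port of the substitution suggested in the thread. The character class is the
# same example set: letters, digits, underscore and dash.
UNQUOTED_VALUE = re.compile(r'=(?P<val>[A-Za-z0-9_-]+)')


def quote_unquoted(tag: str) -> str:
    """Wrap every bare value that directly follows '=' in double quotes."""
    return UNQUOTED_VALUE.sub(r'="\g<val>"', tag)


# Variation (an assumption, not from the thread): try a quoted value first so
# that it is consumed whole, and fall back to a bare value otherwise.
ATTRIBUTE_VALUE = re.compile(
    r"""=(?:(?P<quoted>"[^"]*"|'[^']*')|(?P<bare>[A-Za-z0-9_-]+))"""
)


def quote_unquoted_safe(tag: str) -> str:
    """Quote bare values, leaving already quoted values exactly as they are."""
    def fix(match: re.Match) -> str:
        if match.group('quoted') is not None:
            return match.group(0)  # already quoted: keep unchanged
        return '="' + match.group('bare') + '"'
    return ATTRIBUTE_VALUE.sub(fix, tag)


if __name__ == '__main__':
    print(quote_unquoted("<tagName attribute1='value' attribute2=value/>"))
    print(quote_unquoted_safe("<a href='page?x=1' target=_blank>"))
```

With the sample tag from the original question both functions give the same result. They differ only when a quoted value itself contains "=" followed by one of the allowed characters, as in the second call above, where the plain substitution would also rewrite the "=1" inside the href value.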