I know better than to work late at night, but sometimes it just can't be helped :-)

I'm doing a simple s///, converting "www." to "http://www." when "www." occurs without a preceding "http://". Here's what I'm doing:

$text = "www.example.com";
$text =~ s#[^(http://)]www\.#http://www\.#gi;
print $text;

If $text is this, though:

$text = "<div>www.example.com</div>";

the regex is catching the > in <div>, printing:

<divhttp://www.example.com</div>

Where am I screwing up?
On 18.07.2012 at 06:01, Jason C wrote:
> I know better than to work late at night, but sometimes it just can't be helped :-)
>
> I'm doing a simple s///, converting "www." to "http://www."
> when "www." occurs without a preceding "http://". Here's what I'm doing:
>
> $text = "www.example.com";
> $text =~ s#[^(http://)]www\.#http://www\.#gi;
> print $text;
>
> If $text is this, though:
>
> $text = "<div>www.example.com</div>";
>
> the regex is catching the > in <div>, printing:
>
> <divhttp://www.example.com</div>
>
> Where am I screwing up?

You don't want to use a character class (square brackets).
[^(http://)] tells perl to look for any character not listed
inside the square brackets after the negation (^), so this might as well
read [^)(/:hpt].

What you're trying to do is a zero width negative look-behind
assertion.
s#(?<!http://)www\.#http://www.#gi should do the trick.
The "(?<!...)" tells the regex engine to only match the following
pattern if it is not preceded by the pattern in the look-behind,
without capturing anything.

"perldoc perlre" has good explanations for character classes
and look-around assertions.

-Chris
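
For comparison, here is a minimal sketch that runs both substitutions side by side. The sub names add_scheme and add_scheme_broken are hypothetical, made up for this illustration; only the two s### expressions come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix "www." with "http://" unless "http://"
# already sits immediately before it.
sub add_scheme {
    my ($text) = @_;
    # (?<!http://) is zero-width: it inspects the preceding text
    # but consumes nothing, so nothing before "www." is replaced.
    $text =~ s#(?<!http://)www\.#http://www.#gi;
    return $text;
}

# The character-class attempt from the question, kept for comparison.
sub add_scheme_broken {
    my ($text) = @_;
    # [^(http://)] consumes ONE character that is not ( ) h t p : or /
    # and that character is then lost in the replacement.
    $text =~ s#[^(http://)]www\.#http://www.#gi;
    return $text;
}

print add_scheme("www.example.com"), "\n";                    # http://www.example.com
print add_scheme("<div>www.example.com</div>"), "\n";         # <div>http://www.example.com</div>
print add_scheme("http://www.example.com"), "\n";             # http://www.example.com
print add_scheme_broken("<div>www.example.com</div>"), "\n";  # <divhttp://www.example.com</div>
print add_scheme_broken("www.example.com"), "\n";             # www.example.com (no match at all)
```

Note that the character-class version also fails on the bare "www.example.com" case, because the class needs a character to consume before "www." and there is none at the start of the string.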
On Wednesday, July 18, 2012 12:57:00 AM UTC-4, thepoet wrote:
> What you're trying to do is a zero width negative look-behind
> assertion.
> s#(?<!http://)www\.#http://www.#gi should do the trick.
> The "(?<!...)" tells the regex engine to only match the following
> pattern if it is not preceded by the pattern in the look-behind,
> without capturing anything.
>
> "perldoc perlre" has good explanations for character classes
> and look-around assertions.
>
> -Chris

Thanks for the help, Chris. Character classes aren't exactly intuitive when a symbol changes definition completely based on context, so I'm still struggling with that a little.

The modification you suggested was perfect, though! Thanks again :-)
Jason C <jwcarlton@gmail.com> writes:
> On Wednesday, July 18, 2012 12:57:00 AM UTC-4, thepoet wrote:
>> What you're trying to do is a zero width negative look-behind
>> assertion.
>> s#(?<!http://)www\.#http://www.#gi should do the trick.
>> The "(?<!...)" tells the regex engine to only match the following
>> pattern if it is not preceded by the pattern in the look-behind,
>> without capturing anything.
>>
>> "perldoc perlre" has good explanations for character classes
>> and look-around assertions.
>>
>> -Chris
>
> Thanks for the help, Chris. Character classes aren't exactly
> intuitive when a symbol changes definition completely based on
> context, so I'm still struggling with that a little.

A character class denotes an unordered set of characters, meaning

[^http://] [^htp:/] [^:ppppth/] [^:/hpt] [^h:t/p]

all represent identical sets and they all match a single character. But you wanted to match the string http://, and a regex matching a string is just the string itself; in other words, THIS sequence of characters.
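
To see the "unordered set" point in practice, the following sketch applies several spellings of the class to the same sample string and prints which characters each one matches. The helper matched_chars and the sample string are assumptions made for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Several spellings of the same negated class: order and
# repetition inside the brackets make no difference.
my @classes = ('[^http://]', '[^htp:/]', '[^:/hpt]', '[^h:t/p]');

# Hypothetical helper: return the characters of $sample that
# the given class matches, joined into one string.
sub matched_chars {
    my ($class, $sample) = @_;
    my $re = qr/$class/;
    return join '', grep { /$re/ } split //, $sample;
}

my $sample = 'http://www.example.com';
for my $class (@classes) {
    # Every line prints the same characters: www.examle.com
    printf "%-12s %s\n", $class, matched_chars($class, $sample);
}

# A plain string in a regex is an ordered sequence instead.
print "literal matches http://\n"       if 'http://www' =~ m{http://};
print "literal does not match ptth://\n" unless 'ptth://www' =~ m{http://};
```

Each class matches exactly one character at a time, and the "p" in "example" is rejected by all of them, which is why it is missing from every output line.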