Wouldn’t it be great if books had embedded videos? That would make programming textbooks so much easier.
We have to give some hints about how regexes work in the first regex chapter in Learning Perl. It’s hard to describe something like greedy matching and backtracking with only words. It seems like it should be simple to describe, but you are probably like me: you think that because you already understand those concepts.
The Regexp::Debugger module can animate a regular expression to show the steps Perl takes as it applies parts of a pattern. Unfortunately, we haven’t shown readers how to install Perl modules yet. That’s not a big deal. Since we can’t show videos in the book (not even in ebooks!), we have to do it without them.
We can show animations on this website though!
We include an example of the * quantifier to match zero or more times. The quantifier is greedy, so inside the regex engine, the .* grabs the rest of the string. When Perl needs to match the rest of the pattern, it has to backtrack. Here’s the code I wish I could animate in the book:
# greedy
$_ = 'Bamm bamm';
if (/B.*m/) {
    print "It matched!\n";
}
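Without the animation, one way to see the effect of greediness is to capture what the pattern actually matched. The sketch below is not from the book; it adds capture parentheses and, purely for contrast, a non-greedy .*? version of the same pattern. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $string = 'Bamm bamm';

# Greedy: .* first runs to the end of the string, then gives back
# characters one at a time until the final m can match.
my ($greedy) = $string =~ /(B.*m)/;

# Non-greedy, for contrast only: .*? takes as little as it can
# and grows one character at a time.
my ($lazy) = $string =~ /(B.*?m)/;

print "greedy: '$greedy'\n";   # greedy: 'Bamm bamm'
print "lazy:   '$lazy'\n";     # lazy:   'Bam'
```

The greedy version ends on the last m in the string because the engine only had to give back one character before the rest of the pattern could match.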
To run this under Regexp::Debugger, you load that module. You can do this without changing the program by loading it on the command line:
% perl -MRegexp::Debugger greedy
There’s an rxrx convenience program from that distribution that does the same thing:
% rxrx greedy
In the animation, the blue highlights mark the parts that matched. The magenta color shows parts that are in the process of matching (and might yet fail). Notice the step counter in the lower right, which shows how hard the regex engine has to work.
The next example has a more extreme version of backtracking.
$_ = "fred and barney went bowling last night";
if (/fred.+barney/) {
    print "It matched!\n";
}
It’s tedious, but perhaps by experiencing that tedium you’ll write better regular expressions.
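To get a rough feel for that tedium without the animation, here is a hand-rolled model of the backtracking. It is only a sketch under simplifying assumptions: it is not how Perl’s regex engine is implemented (the real engine has optimizations this ignores), and the variable names are invented for illustration. It lets the .+ take everything after fred, then gives back one character at a time and checks whether barney starts at that spot.

```perl
use strict;
use warnings;

my $string = "fred and barney went bowling last night";
my $prefix = 'fred';
my $suffix = 'barney';

# Offset where the .+ would start, right after 'fred'.
my $start = index($string, $prefix) + length $prefix;

my $backtracks = 0;
my $found;

# Start with .+ holding the rest of the string, then shrink it by one
# character per step. The .+ must keep at least one character.
for (my $end = length $string; $end > $start; $end--) {
    if (substr($string, $end, length $suffix) eq $suffix) {
        $found = $end;
        last;
    }
    $backtracks++;
}

print "Found '$suffix' at offset $found after giving back $backtracks times\n";
```

In this simplified model the .+ has to back up one position at a time across everything after barney before the rest of the pattern can succeed.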
Curious why, in the second example, as it is searching in reverse order, the checking doesn’t jump by the length of the string it’s trying to match (i.e. barney)?
I wish I had this module when I read Jeffrey Friedl’s “Mastering Regular Expressions”. Frankly, I can’t believe that it wasn’t listed as a resource in the book.
That’s wonderful! I can’t wait until chapter 11 to understand how I should use this on my own system!