Indexed: 2019/09/19 00:16:56 Time: 2008-12-07 02:47:20 Tags: regex,perl,substitution

I have been programming in Perl, off and on, for years now, although it is only sporadically my primary language. Because I often go months without writing any Perl, I rely heavily on my dog-eared Camel Book to remind me how to do things. However, it bothers me when I copy recipes verbatim without understanding them. This is one of the most vexing: on page 154 of the 3rd edition Camel, there is an example for "modifying strings en passant", which reads like this:

($lotr = $hobbit) =~ s/Bilbo/Frodo/g;

Q1) What is going on here? On what, exactly, is the regex operating?

Q2) Is this near-magical syntax necessary for such a basic operation as "take a string from $a, modify it with a regex, place result in $b"?

Q3) How do I do this operation using the loop default variable as the initial string?

Apologies in advance to Perl dreamers for whom the above looks perfectly natural.

Hmmm... ($lotr=$hobbit) =~ s/Bilbo/Frodo/g is one of the many magicks of Perl. Now for some answers.

Q1) $lotr is being assigned the value contained in $hobbit. After the assignment, we can forget about the source variable. Treat ($lotr = $hobbit) as its own statement, as if we had written:

$lotr = $hobbit;
$lotr =~ s/Bilbo/Frodo/g;

instead. The regex is operating on $lotr.
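To check that equivalence, here is a small sketch that runs both forms on the same input. The sample string and the names $one_line and $two_step are made up for illustration; they are not from the Camel Book.

```perl
use strict;
use warnings;

my $hobbit = "Bilbo Baggins, also known as Bilbo";

# One-line form: the parenthesized assignment yields $one_line itself,
# and the substitution is then bound to that variable.
(my $one_line = $hobbit) =~ s/Bilbo/Frodo/g;

# Two-statement form: copy first, then substitute on the copy.
my $two_step = $hobbit;
$two_step =~ s/Bilbo/Frodo/g;

print "one-line: $one_line\n";   # Frodo Baggins, also known as Frodo
print "two-step: $two_step\n";   # Frodo Baggins, also known as Frodo
print "source:   $hobbit\n";     # Bilbo Baggins, also known as Bilbo
```

Both copies end up identical, and $hobbit keeps its original value.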

Q2) The syntax is simply a one-line version of the snippet given above. Think of it as "copy the string from $a into $b, then modify $b with the regex" instead of "take a string from $a, modify it with a regex, place result in $b".
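If the second phrasing is really what you want, newer Perls offer it directly: assuming Perl 5.14 or later (released well after this thread), the /r modifier makes s/// return the modified string and leave its operand alone. A minimal sketch, with a made-up sample string:

```perl
use strict;
use warnings;
use 5.014;  # the /r modifier is not available before Perl 5.14

my $hobbit = "Bilbo Baggins";

# =~ binds tighter than =, so this assigns the result of the
# non-destructive substitution to $lotr.
my $lotr = $hobbit =~ s/Bilbo/Frodo/gr;

print "$lotr\n";     # Frodo Baggins
print "$hobbit\n";   # Bilbo Baggins
```

On older Perls the parenthesized-assignment idiom remains the usual way to get the same effect.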

Q3) I'm assuming that by "loop default variable" you mean the default pattern-searching space, $_. In that case, just use $_ instead of $hobbit:

while (<>) {
  chomp;
  ($lotr = $_) =~ s/Bilbo/Frodo/g;
  print "\$lotr = [$lotr]\n";
  print "\$_    = [$_]\n";
}

Interestingly enough, the magic var $_ is not modified by this operation. This is how you can conclude that the assignment happens before the regex substitution and that the substitution does not touch the default pattern space in any way.
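The same thing can be seen without reading standard input. The sketch below loops over a hard-coded list instead (the @names and @renamed arrays are invented for the example). In a for loop $_ is an alias for the current array element, so leaving $_ alone also means the original array survives intact.

```perl
use strict;
use warnings;

my @names = ("Bilbo Baggins", "Bilbo of the Shire");
my @renamed;

for (@names) {
    # $_ aliases the current element of @names; copy it first,
    # then run the substitution on the copy only.
    (my $lotr = $_) =~ s/Bilbo/Frodo/g;
    push @renamed, $lotr;
}

print "renamed: ", join(" | ", @renamed), "\n";  # Frodo Baggins | Frodo of the Shire
print "names:   ", join(" | ", @names),   "\n";  # Bilbo Baggins | Bilbo of the Shire
```

A bare s/Bilbo/Frodo/g inside that loop would have operated on $_ and therefore rewritten the elements of @names in place.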

As for the remark about experienced Perl programmers... I don't know too many people that aren't thrown by some piece of Perl syntax, regardless of how long they have been staring at it, 'cept Mr. Schwartz of course ;)

Different languages do different things with the assignment operator. In some, it is a statement, not an expression, so it can't be combined into a larger expression. In some, it returns the value assigned, but only as a value (often called an rvalue), not as something that itself can be modified or assigned (an lvalue). An example of this is C, where you can write:

 a = b = c;  /* assign value of c to b and a */

but not:

 ++(b = a);  /* assign value of a to b, then increment b */

And in some, the result of an assignment is the lvalue assigned to. (In Perl, this applies only to scalar assignments; list assignments are a more complex beast.) So you can do ++($b=$a) or ($b=$a) =~ s/a/b/ or any other operation that expects an lvalue on the assignment, and it will first do the assignment and then further modify the lvalue assigned to.
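A short sketch of that lvalue behaviour, using $x and $y rather than $a and $b (all variable names and values here are illustrative):

```perl
use strict;
use warnings;

my $x = 41;
my $y;

# The scalar assignment yields $y as an lvalue, so ++ increments $y.
++($y = $x);
print "x=$x y=$y\n";              # x=41 y=42

# Any operator that needs an lvalue works the same way, e.g. tr///.
my $word = "banana";
(my $copy = $word) =~ tr/a/o/;
print "word=$word copy=$copy\n";  # word=banana copy=bonono
```

In both cases only the variable on the left of the assignment changes.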

If in Q2 you mean to take a sub-string from $a, you might find this idiom useful:

($b) = $a =~ /(substring-to-match)/;
$b =~ s/regex-on-substring/result-string/;

Also do note that $a and $b are not normal variables in Perl since they have special scope rules related to the sort function. See 'perldoc perlvar' for details.
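For a concrete version of the substring idiom, here is a sketch with made-up data that uses $src and $dst in place of $a and $b, which sidesteps the sort-related caveat. The defined check is an addition for safety: if the match fails, the list assignment leaves $dst undefined.

```perl
use strict;
use warnings;

my $src = "The ring went to Bilbo Baggins of Bag End";

# The parentheses around $dst put the match in list context,
# so it returns the captured group rather than a true/false value.
my ($dst) = $src =~ /(Bilbo \w+)/;

# A failed match would leave $dst undefined, so guard before modifying it.
if (defined $dst) {
    $dst =~ s/Bilbo/Frodo/;
    print "$dst\n";   # Frodo Baggins
}

print "$src\n";       # The ring went to Bilbo Baggins of Bag End
```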