$ /usr/bin/perl --version
This is perl, v5.8.0 built for i386-linux-thread-multi (with 1 registered patch, see perl -V for more detail)
Full Perl license text removed for brevity
$ /usr/local/bin/perl --version
This is perl, v5.8.8 built for i686-linux
Full Perl license text removed for brevity
$ echo á | /usr/bin/perl -pe 's/á/aye/'
á
$ echo á | /usr/local/bin/perl -pe 's/á/aye/'
aye
So, is there any way to work around this problem? Nope. You might
think "use utf8" will fix this issue. It doesn't.
$ cat unicode.char
á
$ cat unicode.script
use utf8;
open(A,"< unicode.char");
while(<A>) {
$_ =~ s/á/aye/;
print;
}
$ /usr/bin/perl unicode.script
aye
$ /usr/local/bin/perl unicode.script
á
As you can see, "use utf8" makes Perl 5.8.0 do the right thing,
but it breaks Perl 5.8.8, which worked fine without it. So maybe
we can fix this with a conditional statement.
$ cat unicode.script.2
$vers=sprintf("%vd",$^V);
if($vers =~ /5.8.0/) {
use utf8;
}
open(A,"< unicode.char");
while(<A>) {
$_ =~ s/á/aye/;
print;
}
$ /usr/bin/perl unicode.script.2
á
$ /usr/local/bin/perl unicode.script.2
aye
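The conditional changes nothing: both interpreters behave exactly
as they did with no "use utf8" at all. ("use" takes effect at
compile time and is scoped to its enclosing block, so the runtime
"if" never controls it.) At this point, I gave up. These days, I
write in either awk (for simple stuff) or Python (for complicated
stuff). For example, none of these four freely downloadable awk
interpreters has this problem: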
$ echo á | busybox awk '{gsub(/á/,"aye");print}'
aye
$ echo á | gawk '{gsub(/á/,"aye");print}'
aye
$ echo á | mawk '{gsub(/á/,"aye");print}'
aye
$ echo á | bwk-awk '{gsub(/á/,"aye");print}'
aye
The nice thing about awk is that there is a POSIX standard for
it, so awk scripts that stick to the standard should work on any
modern system with an awk interpreter.
The nice thing about Python is that there is a strong commitment
from the Python community not to arbitrarily change behavior in
ways that break scripts between bugfix releases.
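
To make the Python side concrete, here is a minimal sketch of the
same substitution, assuming Python 3 and UTF-8 input. The helper
name replace_first is made up for illustration; it does not come
from any of the scripts above. The character is written as the
escape \u00e1 so the result does not depend on how the source file
itself is encoded.

```python
import re

# U+00E1 is LATIN SMALL LETTER A WITH ACUTE; written as an escape so the
# encoding of this source file does not matter.
A_ACUTE = re.compile("\u00e1")


def replace_first(raw_line, encoding="utf-8"):
    """Replace the first a-acute in one line of bytes with 'aye'.

    Hypothetical helper: count=1 mirrors Perl's s/.../.../ without /g.
    The UTF-8 default is an assumption about the input.
    """
    text = raw_line.decode(encoding)          # bytes -> str, explicitly
    text = A_ACUTE.sub("aye", text, count=1)  # first match only
    return text.encode(encoding)              # str -> bytes for output


# The input is built from an escape, so this line is pure ASCII.
print(replace_first("\u00e1\n".encode("utf-8")).decode("utf-8"), end="")
```

To filter standard input the way the one-liners above do, one
option is to loop over sys.stdin.buffer and write each result to
sys.stdout.buffer, which keeps the decoding step explicit instead
of leaving it to the locale.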