
s///x


Dr.Ruud

Nov 2, 2005, 7:31:43 AM
I was trying the s///x syntax and got unexpected results.
Would somebody care to explain?

Simplified example:

#!/usr/bin/perl

use warnings;
use strict;

local ($,, $\) = ("\t", "\n");
my $x;

$_ = "abc 123 def 123 ghi";

$x = s/ # Replace
1 # ONE
2 # TWO
3 # THREE
/ # by
4 # FOUR
5 # FIVE
6 # SIX
/gsx; # global, single line, extended format

print 'Made', $x, 'replacements.';
print;


This printed:

Made 2 replacements.
abc # by
4 # FOUR
5 # FIVE
6 # SIX
def # by
4 # FOUR
5 # FIVE
6 # SIX
ghi

I expected: abc 456 def 456 ghi

--
Affijn, Ruud & perl, v5.8.6 built for i386-freebsd-64int

"Gewoon is een tijger."

Anno Siegel

Nov 2, 2005, 8:00:54 AM
Dr.Ruud <rvtol...@isolution.nl> wrote in comp.lang.perl.misc:

The changes by /x only affect the regex proper. The replacement part
is still an ordinary double-quotish string.

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.

Paul Lalli

Nov 2, 2005, 8:02:52 AM

Your expectations were incorrect.

The /x modifier causes whitespace in the *pattern match* to be ignored.
The replacement portion of a s/// operation is not a pattern match -
it is a double-quoted string. /x has no effect on this replacement.
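
To make that concrete, here is a minimal sketch (the variable names are mine, not from the original post) showing that whitespace is dropped from the pattern under /x but kept verbatim in the replacement:

```perl
use strict;
use warnings;

my $text = "abc 123 def 123 ghi";

# Under /x the spaces inside the pattern are ignored, so this matches "123".
# The replacement is an ordinary double-quoted string: its spaces are kept.
(my $spaced = $text) =~ s/ 1 2 3 / 4 5 6 /gx;

# Keeping the replacement free of padding gives the intended result.
(my $tight = $text) =~ s/ 1 2 3 /456/gx;

print "$spaced\n";   # abc  4 5 6  def  4 5 6  ghi
print "$tight\n";    # abc 456 def 456 ghi
```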

Paul Lalli

Gunnar Hjalmarsson

Nov 2, 2005, 8:09:00 AM

Try:

$x = s/ # Replace
1 # ONE
2 # TWO
3 # THREE

/456/gx; # global, extended format

Note that the /s modifier is redundant (see "perldoc perlre").

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Dr.Ruud

Nov 2, 2005, 8:37:42 AM
Gunnar Hjalmarsson schreef:


> [s/ 123 #one-two-three / 456 #four-five-six /x]


>
> Try:
>
> $x = s/ # Replace
> 1 # ONE
> 2 # TWO
> 3 # THREE
> /456/gx; # global, extended format

Of course what is in my real and working code is a lot more like that.
But I like the commented format much better and was really disappointed
that it didn't work.


> Note that the /s modifier is redundant (see "perldoc perlre").

I don't consider the /s modifier redundant. It was not needed in my
example, so maybe you meant "redundant here"?

--
Affijn, Ruud

"Gewoon is een tijger."

Paul Lalli

Nov 2, 2005, 9:05:04 AM
Dr.Ruud wrote:
> Gunnar Hjalmarsson schreef:

> > Note that the /s modifier is redundant (see "perldoc perlre").
>
> I don't consider the /s modifier redundant. It was not needed in my
> example, so maybe you meant "redundant here"?

Redundant would be if you had something in your pattern match like:
/stuff(?:.|\n)stuff/s

Here, I think /s is simply extraneous.

Paul Lalli

Gunnar Hjalmarsson

Nov 2, 2005, 10:45:09 AM
Dr.Ruud wrote:
> Gunnar Hjalmarsson schreef:
>>Note that the /s modifier is redundant (see "perldoc perlre").
>
> I don't consider the /s modifier redundant. It was not needed in my
> example, so maybe you meant "redundant here"?

Okay, redundant (or extraneous...) here. I mentioned it because people
misunderstand the meaning of it all the time, and I believe one reason
for that is that "perldoc perlre" - unlike e.g. "perldoc perlop" - is
the only place in the docs (to my knowledge) where its meaning is
properly explained.

Dr.Ruud

Nov 2, 2005, 3:58:04 PM
Gunnar Hjalmarsson:
> Dr.Ruud:
>> Gunnar Hjalmarsson:

>>> Note that the /s modifier is redundant (see "perldoc perlre").
>>
>> I don't consider the /s modifier redundant. It was not needed in my
>> example, so maybe you meant "redundant here"?
>
> Okay, redundant (or extraneous...) here. I mentioned it because people
> misunderstand the meaning of it all the time, and I believe one reason
> for that is that "perldoc perlre" - unlike e.g. "perldoc perlop" - is
> the only place in the docs (to my knowledge) where its meaning is
> properly explained.

OK. It would be nice to have an educational piece of code about /m and
/s.

Let me make a start:

# a.1: without /s, the .* will match up to the first \n
$ echo 'first
second
third' | perl -pe 's/.*/#/'
#
#
#

# a.2: with /s, the .* will match until the very end
$ echo 'first
second
third' | perl -pe 's/.*/#/s'
###


# b.1: without /s or /m, the .$ will match nothing if there are
# two newlines at the end
$ echo 'first
second
third' | perl -pe '$_.="\n"; s/.$/#/'
first

second

third


# b.2: with /s, the .$ will match anything before the last \n
$ echo 'first
second
third' | perl -pe '$_.="\n"; s/.$/#/s'
first#
second#
third#


# b.3: with /m, the .$ will match anything before the first \n
$ echo 'first
second
third' | perl -pe '$_.="\n"; s/.$/#/m'
firs#

secon#

thir#
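
The same effects can be seen on a single multi-line string, without the line-by-line splitting that -p does. This is only a sketch; the variable names are made up for the illustration, and the copy-then-substitute idiom is used so it also runs on older perls:

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";

# No modifiers: "." stops at the first newline.
(my $plain = $text) =~ s/.*/#/;        # "#\nsecond\nthird\n"

# /s: "." also matches newlines, so ".*" swallows the whole string.
(my $single = $text) =~ s/.*/#/s;      # "#"

# /m: "^" and "$" anchor at every line; with /g each line is replaced.
(my $multi = $text) =~ s/^.*$/#/mg;    # "#\n#\n#\n"

# /m again: ".$" now hits the last character of the first line.
(my $first_eol = $text) =~ s/.$/#/m;   # "firs#\nsecond\nthird\n"

print $plain, $single, "\n", $multi, $first_eol;
```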

Abigail

Nov 2, 2005, 4:45:33 PM
Gunnar Hjalmarsson (nor...@gunnar.cc) wrote on MMMMCDXLVI September
MCMXCIII in <URL:news:3ss54cF...@individual.net>:
** Dr.Ruud wrote:
** > Gunnar Hjalmarsson schreef:
** >>Note that the /s modifier is redundant (see "perldoc perlre").
** >
** > I don't consider the /s modifier redundant. It was not needed in my
** > example, so maybe you meant "redundant here"?
**
** Okay, redundant (or extraneous...) here. I mentioned it because people
** misunderstand the meaning of it all the time, and I believe one reason
** for that is that "perldoc perlre" - unlike e.g. "perldoc perlop" - is
** the only place in the docs (to my knowledge) where its meaning is
** properly explained.


Damian makes a good argument in PBP to always use /s and /m.

I don't think it's worth raising your finger if someone uses /s or /m
on a regex where it doesn't matter. It's like complaining someone uses
'use warnings' on a piece of code where it didn't matter.

Abigail
--
perl -we 'print split /(?=(.*))/s => "Just another Perl Hacker\n";'

Dr.Ruud

Nov 2, 2005, 6:26:59 PM
Anno Siegel schreef:

> The changes by /x only affect the regex proper. The replacement part
> is still an ordinary double-quotish string.

OK. I am still trying to think up why it was chosen to not affect the
replacement part. I have no doubt that there is a simple explanation why
it is not feasible, but I just can't think it up (tired of working some
very long days, but very satisfied with the results and very happy with
Perl).

Gunnar Hjalmarsson

Nov 2, 2005, 8:46:10 PM
Abigail wrote:
> Gunnar Hjalmarsson (nor...@gunnar.cc) wrote on MMMMCDXLVI September
> MCMXCIII in <URL:news:3ss54cF...@individual.net>:
> ** Dr.Ruud wrote:
> ** > I don't consider the /s modifier redundant. It was not needed in my
> ** > example, so maybe you meant "redundant here"?
> **
> ** Okay, redundant (or extraneous...) here. I mentioned it because people
> ** misunderstand the meaning of it all the time, and I believe one reason
> ** for that is that "perldoc perlre" - unlike e.g. "perldoc perlop" - is
> ** the only place in the docs (to my knowledge) where its meaning is
> ** properly explained.
>
> Damian makes a good argument in PBP to always use /s and /m.

What's PBP?

> I don't think it's worth raising your finger if someone uses /s or /m
> on a regex where it doesn't matter. It's like complaining someone uses
> 'use warnings' on a piece of code where it didn't matter.

A better parallel IMO is that it's like complaining when someone calls a
function using '&' without knowing the implications of doing so. It
'works' most of the time, but not always...
(Not saying that Dr. Ruud doesn't know the implications of using the /s
modifier. It's now obvious that he does.)

John Bokma

Nov 2, 2005, 9:30:47 PM
Gunnar Hjalmarsson <nor...@gunnar.cc> wrote:

> A better parallel IMO is that it's like complaining when someone calls
> a function using '&' without knowing the implications of doing so. It
> 'works' most of the time, but not always...

Yup, I agree on that one. If I see &sub, I assume that the user requires
the & there. Same with /s or /m. It confuses me if it's just there and adds
line noise.


--
John Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
I ploink googlegroups.com :-)

Tad McClellan

Nov 2, 2005, 9:02:05 PM
Dr.Ruud <rvtol...@isolution.nl> wrote:
> Anno Siegel schreef:
>
>> The changes by /x only affect the regex proper. The replacement part
>> is still an ordinary double-quotish string.
>
> OK. I am still trying to think up why it was chosen to not affect the
> replacement part.


Because spaces are _supposed_ to matter when they are in a string.


--
Tad McClellan SGML consulting
ta...@augustmail.com Perl programming
Fort Worth, Texas

Stan R.

Nov 2, 2005, 10:27:07 PM
Tad McClellan wrote:
> Dr.Ruud <rvtol...@isolution.nl> wrote:
>> Anno Siegel schreef:
>>
>>> The changes by /x only affect the regex proper. The replacement
>>> part is still an ordinary double-quotish string.
>>
>> OK. I am still trying to think up why it was chosen to not affect the
>> replacement part.
>
>
> Because spaces are _supposed_ to matter when they are in a string.

Tad, what I think he might be getting at is whether there is some
possibility of a modifier on literal strings to allow comments. I can see
how doing that might not make a lot of sense in many ways (it's a string,
for cryin' out loud!), but I just thought I'd point out that his
thinking seems to be hinting in that general direction.

--
Stan


Tad McClellan

Nov 2, 2005, 9:47:53 PM
Abigail <abi...@abigail.nl> wrote:


> Damian makes a good argument in PBP to always use /s and /m.


I'd better go read it.


> I don't think it's worth raising your finger if someone uses /s or /m
> on a regex where it doesn't matter.


To me, modifiers mean "something out of the ordinary here, pay attention!".

I feel tricked when I try to figure out why the programmer wanted dot
to match newline, only to find that there isn't even a dot in the pattern.


> It's like complaining someone uses
> 'use warnings' on a piece of code where it didn't matter.


'use warnings' always matters.[1] (heh)

[1] Message-ID: <slrn99mn0h.n9t....@gdndev32.lido-tech>

Tad McClellan

Nov 2, 2005, 9:51:19 PM
Gunnar Hjalmarsson <nor...@gunnar.cc> wrote:


> What's PBP?


Peanut Butter Perl? :-)


Or "Perl Best Practices":

http://www.oreilly.com/catalog/perlbp/

Abigail

Nov 3, 2005, 3:13:25 AM
Tad McClellan (ta...@augustmail.com) wrote on MMMMCDXLVII September
MCMXCIII in <URL:news:slrndmiuip...@magna.augustmail.com>:
)) Abigail <abi...@abigail.nl> wrote:
))
))
)) > Damian makes a good argument in PBP to always use /s and /m.
))
))
)) I'd better go read it.


It's a good read. One of the best Perl books published by O'Reilly.


)) > I don't think it's worth raising your finger if someone uses /s or /m
)) > on a regex where it doesn't matter.
))
))
)) To me, modifiers mean "something out of the ordinary here, pay attention!".
))
)) I feel tricked when I try to figure out why the programmer wanted dot
)) to match newline, only to find that there isn't even a dot in the pattern.


That could be, but that's _your_ problem. That's not a reason at all why
said programmer shouldn't use /s or /m. I don't expect you to program in
a style that suits me, so I don't expect you to demand that from someone
else. Your inability to understand code is something you have to solve
yourself. (Practise! ;-))

Damian's argument is that most programmers expect "." to match any
character. And for "^" and "$" to match the beginning and end of a
line. He says that if you always use /sm, you _never_ have to wonder
whether "." matches a newline or not.

Abigail
--
BEGIN {print "Just " }
INIT {print "Perl " }
END {print "Hacker\n"}
CHECK {print "another "}

Anno Siegel

Nov 3, 2005, 4:46:13 AM
Abigail <abi...@abigail.nl> wrote in comp.lang.perl.misc:

> Gunnar Hjalmarsson (nor...@gunnar.cc) wrote on MMMMCDXLVI September
> MCMXCIII in <URL:news:3ss54cF...@individual.net>:
> ** Dr.Ruud wrote:
> ** > Gunnar Hjalmarsson schreef:
> ** >>Note that the /s modifier is redundant (see "perldoc perlre").
> ** >
> ** > I don't consider the /s modifier redundant. It was not needed in my
> ** > example, so maybe you meant "redundant here"?
> **
> ** Okay, redundant (or extraneous...) here. I mentioned it because people
> ** misunderstand the meaning of it all the time, and I believe one reason
> ** for that is that "perldoc perlre" - unlike e.g. "perldoc perlop" - is
> ** the only place in the docs (to my knowledge) where its meaning is
> ** properly explained.
>
>
> Damian makes a good argument in PBP to always use /s and /m.

The recommendation is to use /xms on all regular expressions, whether
the modifiers make a difference or not. It is not an invitation to add
combinations of /x, /m and /s at random.

> I don't think it's worth raising your finger if someone uses /s or /m
> on a regex where it doesn't matter. It's like complaining someone uses
> 'use warnings' on a piece of code where it didn't matter.

...or like using "sort keys ..." where "keys ..." would have done?

It really depends on what the rest of the code is like -- context. If
the general quality of the code is good, a redundant /m is, of course,
no big deal. In code that is clearly written by a beginner, it is a
sign of insecurity and/or cargo culting and ought to be pointed out.

As a reader of a piece of code, it is important to develop a feeling
for the author's competence -- how far can you trust the code. Redundant
constructs are an important indicator *against* the author's competence.
That's why it is generally a good idea to avoid them.

Dr.Ruud

Nov 3, 2005, 5:15:28 AM
Abigail:

> Damian's argument is that most programmers expect "." to match any
> character. And for "^" and "$" to match the beginning and end of a
> line. He says that if you always use /sm, you _never_ have to wonder
> whether "." matches a newline or not.

/sm would be a nice default. But then you need a way to disable it: /SM.

Dr.Ruud

Nov 3, 2005, 5:04:23 AM
Tad McClellan:
> Dr.Ruud:
>> Anno Siegel:

>>> The changes by /x only affect the regex proper. The replacement
>>> part is still an ordinary double-quotish string.
>>
>> OK. I am still trying to think up why it was chosen to not affect the
>> replacement part.
>
> Because spaces are _supposed_ to matter when they are in a string.

There can also be spaces in the regex, and there are several ways to
present them.
I use \s where possible, and also "\x{0020}", "\x{20}", even "[ ]", "\ ",
depending on the context.

So I still see no reason why unprotected spaces should not be ignored in
the replacement part.
"\x{20}" and "\ " would work fine there too.
Or use a variable with a run of spaces, like $space42 = ' 'x42, and an
o-modifier.
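
For the regex side, a small sketch of those spellings under /x (the test string and variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = "abc 123";

# A bare space is stripped by /x, so this looks for "abc123" and fails.
my $bare    = $text =~ / abc 123 /x        ? 1 : 0;   # 0

# Each of these keeps one literal space in the pattern under /x.
my $class   = $text =~ / abc [ ] 123 /x    ? 1 : 0;   # 1
my $escaped = $text =~ / abc \  123 /x     ? 1 : 0;   # 1
my $hex     = $text =~ / abc \x{20} 123 /x ? 1 : 0;   # 1

print "$bare $class $escaped $hex\n";   # 0 1 1 1
```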

--
Affijn, Ruud (gimme a \X)

"Gewoon is een tijger."

Anno Siegel

Nov 3, 2005, 5:39:16 AM
Dr.Ruud <rvtol...@isolution.nl> wrote in comp.lang.perl.misc:
> Abigail:
>
> > Damian's argument is that most programmers expect "." to match any
> > character. And for "^" and "$" to match the beginning and end of a
> > line. He says that if you always use /sm, you _never_ have to wonder
> > whether "." matches a newline or not.
>
> /sm would be a nice default. But then you need a way to disable it: /SM.

As noted in another thread, PBP recommends /xsm for all regexes. That
is also the standard in Perl 6.

Dr.Ruud

Nov 3, 2005, 5:40:12 AM
Anno Siegel:

> Redundant constructs are an important indicator *against* the author's
> competence. That's why it is generally a good idea to avoid them.

I generally agree.

The Posting Guidelines say: "Do not provide too much information", so I
did cut down my code to an example of a few lines. But I forgot to toss
the s-modifier.

And now that I have inserted the m- and x-modifiers in all the
appropriate places (I had read that chapter of PBP before but had
forgotten about it), I won't even have to do that anymore. :)

Dr.Ruud

Nov 3, 2005, 5:50:45 AM
Anno Siegel schreef:
> Dr.Ruud:
>> Abigail:

>>> Damian's argument is that most programmers expect "." to match any
>>> character. And for "^" and "$" to match the beginning and end of a
>>> line. He says that if you always use /sm, you _never_ have to wonder
>>> whether "." matches a newline or not.
>>
>> /sm would be a nice default. But then you need a way to disable it:
>> /SM.
>
> As noted in another thread, PBP recommends /xsm for all regexes. That
> is also the standard in Perl 6.

OK, great. I tend to write it as /msx, so in alphabetical order.

What is the way to make '.' not match "\n" in Perl6?

I guess Perl-5-mode, or convert to something like "[^\n]".

Anno Siegel

Nov 3, 2005, 6:14:13 AM
Dr.Ruud <rvtol...@isolution.nl> wrote in comp.lang.perl.misc:
> Anno Siegel schreef:
> > Dr.Ruud:
> >> Abigail:
>
> >>> Damian's argument is that most programmers expect "." to match any
> >>> character. And for "^" and "$" to match the beginning and end of a
> >>> line. He says that if you always use /sm, you _never_ have to wonder
> >>> whether "." matches a newline or not.
> >>
> >> /sm would be a nice default. But then you need a way to disable it:
> >> /SM.
> >
> > As noted in another thread, PBP recommends /xsm for all regexes. That
> > is also the standard in Perl 6.
>
> OK, great. I tend to write it as /msx, so in alphabetical order.

I'm still hesitant about adopting this, but if I do, I'll use /xms,
following the pattern in PBP. I'd like to make it as clear as possible
just what convention I'm following.

> What is the way to make '.' not match "\n" in Perl6?
>
> I guess Perl-5-mode, or convert to something like "[^\n]".

I guess it's /[^\n]/. The simplification is that the dot *always* matches
all characters, no exceptions. That won't be broken. Switching to Perl 5
for the purpose would be obscure once people have forgotten the quirks
/./ used to have.

though...@gmail.com

Nov 3, 2005, 7:42:08 AM
> What is the way to make '.' not match "\n" in Perl6?

You *can't* make . not match \n in Perl 6.

But there is a new metacharacter that matches "everything except \n".
Here's the table that illustrates the underlying pattern:

              Match...    Match anything but...

Whitespace    \s          \S
Word char     \w          \W
Digit         \d          \D
Newline       \n          \N
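
The same \N escape also exists in Perl 5 from 5.12 on (it did not at the time of this thread), where it is unaffected by /s. A minimal sketch, assuming such a perl; the variable names are illustrative:

```perl
use strict;
use warnings;

my $text = "a\nb";

my $dot_s = $text =~ /a.b/s  ? 1 : 0;   # 1: under /s the dot matches "\n"
my $not_n = $text =~ /a\Nb/s ? 1 : 0;   # 0: \N never matches "\n", even with /s

print "$dot_s $not_n\n";   # 1 0
```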

Damian

Anno Siegel

Nov 3, 2005, 9:13:21 AM
Dr.Ruud <rvtol...@isolution.nl> wrote in comp.lang.perl.misc:

> "Gewoon is een tijger."

"Habit is a tiger"?

Yeah, a sleepy one, but when you want him to move he's got teeth.

Dr.Ruud

Nov 3, 2005, 9:58:44 AM
Anno Siegel:
> Dr.Ruud:

>> "Gewoon is een tijger."
>
> "Habit is a tiger"?
>
> Yeah, a sleepy one, but when you want him to move he's got teeth.

A bit like that yes. It covers adjectives like 'normal', 'usual',
'habitual', 'customary', 'ordinary', 'general', 'common', 'simple',
'just', and most related adverbs too.

It is something my 3 year old answered when she got fed up with me
asking her several times in a row what she meant with 'Gewoon.' (I knew
that she meant "Just because." and she knew that I knew).

Abigail

Nov 3, 2005, 2:51:33 PM
though...@gmail.com (though...@gmail.com) wrote on MMMMCDXLVII
September MCMXCIII in <URL:news:1131021728.3...@g14g2000cwa.googlegroups.com>:

}} > What is the way to make '.' not match "\n" in Perl6?
}}
}} You *can't* make . not match \n in Perl 6.


Sure you can. Just use a 'p5' prefix and tell Perl 6 you're using Perl 5
style regexes. ;-)

Abigail
--
($;,$_,$|,$\)=("\@\x7Fy~*kde~box*Zoxf*Bkiaox","X"x25,1,"\r");
s/./ /;{vec($_=>1+$"=>8)=ord($/^substr$;=>$"=int rand 24=>1);
print&&select$,,$,,$,,$|/($|+tr/X//c);redo if y/X//};sleep 1;

Ala Qumsieh

Nov 3, 2005, 3:32:35 PM
Dr.Ruud wrote:

> OK. I am still trying to think up why it was chosen to not affect the
> replacement part. I have no doubt that there is a simple explanation why
> it is not feasible, but I just can't think it up (tired of working some
> very long days, but very satisfied with the results and very happy with
> Perl).

I don't think the reason is that it's not feasible, but rather that it's
not intuitive. Regular expressions can be messy, so having an option to
add comments, and 'beautify' them is a good idea. The replacement part
of an s/// is simply a string, and won't really benefit much from such
an option.

Moreover, you CAN add comments in the replacement part if you want to.
You just need to modify your code slightly, and use the /e modifier.
From your example:

$_ = "abc 123 def 123 ghi";

$x = s/ # Replace
1 # ONE
2 # TWO
3 # THREE
/ # by
4 . # FOUR
5 . # FIVE
6 # SIX
/gsex; # global sex

But, I think this can be less readable than the alternative if the
replacement part is a simple string.
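
Another way to get a commented replacement without /e, sketched under the assumption that the replacement is a fixed string (the $replacement variable is my own name, not from the thread):

```perl
use strict;
use warnings;

my $text = "abc 123 def 123 ghi";

# Build the replacement outside the s///, where ordinary
# Perl comments are allowed.
my $replacement = join '',
    '4',    # FOUR
    '5',    # FIVE
    '6';    # SIX

(my $out = $text) =~ s/
    1   # ONE
    2   # TWO
    3   # THREE
/$replacement/gx;

print "$out\n";   # abc 456 def 456 ghi
```

The pattern keeps its /x comments, and the replacement side stays a plain interpolated variable.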

--Ala

Dr.Ruud

Nov 3, 2005, 4:42:59 PM
Ala Qumsieh:


> Regular expressions can be messy, so having an
> option to add comments, and 'beautify' them is a good idea. The
> replacement part of an s/// is simply a string, and won't really
> benefit much from such an option.

I met this with some rather lengthy strings of \x{####} in both the
search and the replacement part.


> Moreover, you CAN add comments in the replacement part if you want to.
> You just need to modify your code slightly, and use the /e modifier.
> From your example:
>
> $_ = "abc 123 def 123 ghi";
>
> $x = s/ # Replace
> 1 # ONE
> 2 # TWO
> 3 # THREE
> / # by
> 4 . # FOUR
> 5 . # FIVE
> 6 # SIX
> /gsex; # global sex

Thanks, that looks workable. Will the o-modifier make up for any lost
performance? I'll test it.


> But, I think this can be less readable than the alternative if the
> replacement part is a simple string.

Yes, but in my case it often isn't. The algorithm needs to be checked by
linguists. They would rather read the Unicode character names and such, so I
like to use the \N{name} format, but (without that e-modifier) that
would give very lengthy lines. I was going to store everything in
variables, but I'll test this format too.

Short example (without backreferences):

$x = s/(?<=\x{0020})
\x{0111}\x{0123}\x{0222}\x{02AA}\x{0123}\x{0223}\x{0221}\x{0241}\x{0247}
\x{02E2}\x{0223}(?=\x{0020})
/\x{0117}\x{000D}\x{0223}\x{02AA}\x{000D}\x{0223}\x{0221}\x{0221}\x{0223}/gmsx;

(actual codes munged)

John W. Krahn

Nov 3, 2005, 5:14:31 PM
Dr.Ruud wrote:
> Ala Qumsieh:

>
>>Moreover, you CAN add comments in the replacement part if you want to.
>>You just need to modify your code slightly, and use the /e modifier.
>> From your example:
>>
>> $_ = "abc 123 def 123 ghi";
>>
>> $x = s/ # Replace
>> 1 # ONE
>> 2 # TWO
>> 3 # THREE
>> / # by
>> 4 . # FOUR
>> 5 . # FIVE
>> 6 # SIX
>> /gsex; # global sex
>
> Thanks, that looks workable. Will the o-modifier make up for any lost
> performance?

Probably not.

perldoc -q /o

John
--
use Perl;
program
fulfillment

Tad McClellan

Nov 3, 2005, 5:01:59 PM
Dr.Ruud <rvtol...@isolution.nl> wrote:
> Ala Qumsieh:

>> $x = s/ # Replace
>> 1 # ONE
>> 2 # TWO
>> 3 # THREE
>> / # by
>> 4 . # FOUR
>> 5 . # FIVE
>> 6 # SIX
>> /gsex; # global sex
>
> Thanks, that looks workable. Will the o-modifier make up for any lost
> performance?


Of course not.

s///o is a no-op when there are no variables in the pattern part.

s///o has no effect whatsoever on the replacement string part.

but

/gosex; # go have sex

would be cute to have in code. :-)

Abigail

Nov 3, 2005, 6:25:55 PM
Dr.Ruud (rvtol...@isolution.nl) wrote on MMMMCDXLVII September MCMXCIII
in <URL:news:dke4nr...@news.isolution.nl>:
** Ala Qumsieh:
**
**
** > Regular expressions can be messy, so having an
** > option to add comments, and 'beautify' them is a good idea. The
** > replacement part of an s/// is simply a string, and won't really
** > benefit much from such an option.
**
** I met this with some rather lengthy strings of \x{####} in both the
** search and the replacement part.
**
**
** > Moreover, you CAN add comments in the replacement part if you want to.
** > You just need to modify your code slightly, and use the /e modifier.
** > From your example:
** >
** > $_ = "abc 123 def 123 ghi";
** >
** > $x = s/ # Replace
** > 1 # ONE
** > 2 # TWO
** > 3 # THREE
** > / # by
** > 4 . # FOUR
** > 5 . # FIVE
** > 6 # SIX
** > /gsex; # global sex
**
** Thanks, that looks workable. Will the o-modifier make up for any lost
** performance? I'll test it.

No. /o only matters if you have a variable inside regexp, and then only
if you encounter the regex more than once with a different value in the
variable. And then only if you want to keep using the old value.

My advice is to *never* use /o. There's no point in using it for speed,
and when it matters for speed, the effect may not be what you want - and
even if you want it, it may confuse anyone else looking at the code.

for (qw /foo bar/) {
print /$_/ ? "Yes 1\n" : "No 1\n";
print /$_/o ? "Yes 2\n" : "No 2\n";
}
__END__
Yes 1
Yes 2
Yes 1
No 1


Abigail
--
INIT {print "Perl " }
CHECK {print "another "}
END {print "Hacker\n"}
BEGIN {print "Just " }

Dr.Ruud

Nov 3, 2005, 7:11:44 PM
Abigail:

> /o only matters if you have a variable inside regexp, and then
> only
> if you encounter the regex more than once with a different value in
> the variable. And then only if you want to keep using the old value.


I have series of substitutions that have to be tried in order on every
line of many files.

To make the code more readable, I can store these substitutions in a
hash (with keys like 'A01' meaning phase A, first substitution).

It is no problem to unloop the code for speed, so it might look like:

$x = s/$re{'A01'}[SRCH]/$re{'A01'}[REPL]/gsx; # or /gosx
print STDERR $re{'A01'}[NAME], $x if ($x > $re{'A01'}[MIN]);

$x = s/$re{'A02'}[SRCH]/$re{'A02'}[REPL]/gsx;
print STDERR $re{'A02'}[NAME], $x if ($x > $re{'A02'}[MIN]);

(and then dozens more)

If possible, I would like the modifiers to be in $re{'key'}[MODS].
(yes, this is all totally untested code yet)

OK, let me first try and test the alternatives. I still have a few days.

Abigail

Nov 3, 2005, 8:05:15 PM
Dr.Ruud (rvtol...@isolution.nl) wrote on MMMMCDXLVIII September
MCMXCIII in <URL:news:dkectf...@news.isolution.nl>:
-: Abigail:
-:
-: > /o only matters if you have a variable inside regexp, and then
-: > only
-: > if you encounter the regex more than once with a different value in
-: > the variable. And then only if you want to keep using the old value.
-:
-:
-: I have series of substitutions that have to be tried in order on every
-: line of many files.
-:
-: To make the code more readable, I can store these substitutions in a
-: hash (with keys like 'A01' meaning phase A, first substitution).
-:
-: It is no problem to unloop the code for speed, so it might look like:
-:
-: $x = s/$re{'A01'}[SRCH]/$re{'A01'}[REPL]/gsx; # or /gosx
-: print STDERR $re{'A01'}[NAME], $x if ($x > $re{'A01'}[MIN]);

The question you should ask here is: does "$re{A01}[SRCH]" change?
And if it does, do you want to keep using the *old* value? If the
answer to both questions is yes, you could use /o (although I would
use qr//). If the latter question is answered with 'no', using /o will
make your program produce the wrong results. If the first
question is answered with 'no', then using /o doesn't matter.

-: $x = s/$re{'A02'}[SRCH]/$re{'A02'}[REPL]/gsx;
-: print STDERR $re{'A02'}[NAME], $x if ($x > $re{'A02'}[MIN]);
-:
-: (and then dozens more)
-:


Suppose you have @lines containing all the lines you want to inspect,
and @regexes with all the regexes (as strings), there is a gigantic
difference between:

for my $line (@lines) {
    for my $regex (@regexes) {
        $line =~ /$regex/
    }
}

and

for my $regex (@regexes) {
    for my $line (@lines) {
        $line =~ /$regex/
    }
}

The first code snippet means that you will be doing

scalar (@lines) * scalar (@regexes)

regex compilations, while in the latter case, you only will be
doing

scalar (@regexes)

compilations. (Except if you have only one regex, then you will be
compiling only once, in both code snippets).
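
If the lines have to stay in the outer loop, one possible way around the recompilation is the qr// mentioned above: compile every pattern once up front and match against the compiled objects. A sketch only; the sample lines and patterns are invented for the illustration:

```perl
use strict;
use warnings;

my @lines    = ("abc 123", "def 456", "ghi 123");
my @patterns = ('\d+', '[a-c]+');          # illustrative patterns only

# Compile each pattern once, up front.
my @regexes = map { qr/$_/ } @patterns;

my $hits = 0;
for my $line (@lines) {
    for my $regex (@regexes) {
        # $regex is already compiled; it is used as the whole pattern here.
        $hits++ if $line =~ $regex;
    }
}

print "$hits\n";   # 4
```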


-: If possible, I would like the modifiers to be in $re{'key'}[MODS].
-: (yes, this is all totally untested code yet)

s/(?$re{key}[MODS])$re{key}[SRCH]/$re{key}[REPL]/

ought to do the trick.
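
A small sketch of that idea, with a made-up table entry (hash fields instead of the constant indexes, purely to keep the example self-contained). Note that only modifiers such as i, m, s and x can go inside (?...); /g is not one of them and has to stay on the operator:

```perl
use strict;
use warnings;

# Hypothetical table entry; the field names follow the post above.
my %re = (
    key => { MODS => 'ix', SRCH => 'a b c', REPL => 'X' },
);

my ($mods, $srch, $repl) = @{ $re{key} }{qw(MODS SRCH REPL)};

# (?ix) switches on /i and /x from inside the pattern.
# /g is not an inline modifier, so it stays on the operator.
(my $out = "ABC abc") =~ s/(?$mods)$srch/$repl/g;

print "$out\n";   # X X
```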

Abigail
--
map{${+chr}=chr}map{$_=>$_^ord$"}$=+$]..3*$=/2;
print "$J$u$s$t $a$n$o$t$h$e$r $P$e$r$l $H$a$c$k$e$r\n";

Dr.Ruud

Nov 3, 2005, 10:29:59 PM
Abigail:
> Ruud:

> does "$re{A01}[SRCH]" change?

No, it's a constant.


> If the first
> question is answered with 'no', then using /o doesn't matter.

OK. I still hesitate that /o really doesn't matter, because I still
expect that a test needs to be done to find out if the variable has
changed or not, but even with such a (fast) test it can hardly matter.


>> If possible, I would like the modifiers to be in $re{'key'}[MODS].

>> (yes, this is all totally untested code yet)
>
> s/(?$re{key}[MODS])$re{key}[SRCH]/$re{key}[REPL]/
>
> ought to do the trick.

Ah, nice. Just another thing that I had read about but hadn't used yet.

Abigail

Nov 4, 2005, 3:07:33 AM
Dr.Ruud (rvtol...@isolution.nl) wrote on MMMMCDXLVIII September
MCMXCIII in <URL:news:dkeo51...@news.isolution.nl>:
~~ Abigail:
~~ > Ruud:
~~
~~ > does "$re{A01}[SRCH]" change?
~~
~~ No, it's a constant.
~~
~~
~~ > If the first
~~ > question is answered with 'no', then using /o doesn't matter.
~~
~~ OK. I still hesitate that /o really doesn't matter, because I still
~~ expect that a test needs to be done to find out if the variable has
~~ changed or not, but even with such a (fast) test it can hardly matter.

It doesn't actually check whether a variable has changed - it just tests
whether, after interpolation, the regex has changed. And compared to
actually executing a regex, this test takes insignificant time. It
doesn't weigh up against the hard-to-trace bugs if the variable does
change and the regex doesn't because you used /o.

Abigail
--
use lib sub {($\) = split /\./ => pop; print $"};
eval "use Just" || eval "use another" || eval "use Perl" || eval "use Hacker";

Anno Siegel

Nov 4, 2005, 4:49:26 AM
Dr.Ruud <rvtol...@isolution.nl> wrote in comp.lang.perl.misc:
> Anno Siegel:
> > Dr.Ruud:
>
> >> "Gewoon is een tijger."
> >
> > "Habit is a tiger"?
> >
> > Yeah, a sleepy one, but when you want him to move he's got teeth.
>
> A bit like that yes. It covers adjectives like 'normal', 'usual',
> 'habitual', 'customary', 'ordinary', 'general', 'common', 'simple',
> 'just', and most related adverbs too.

I should have known better than to base a translation on the "Dutch is
like German" theory.

> It is something my 3 year old answered when she got fed up with me
> asking her several times in a row what she meant with 'Gewoon.' (I knew
> that she meant "Just because." and she knew that I knew).

Oh well... Pedantic educative questions instead of getting on with
whatever you were doing. You had it coming.

Dr.Ruud

Nov 4, 2005, 7:20:41 AM
Anno Siegel:

> Pedantic educative questions instead of getting on with
> whatever you were doing. You had it coming.

It's just me.

Dr.Ruud

Nov 4, 2005, 7:52:12 AM
Abigail schreef:

> [regex without /o]


> It doesn't actually check whether a variable has changed - it just
> tests whether, after interpolation, the regex has changed. And
> compared to actually executing a regex, this test takes insignificant
> time. It doesn't weigh up against the hard-to-trace bugs if the
> variable does change and the regex doesn't because you used /o.

OK, thanks for confirming that.

My /o's meant that the variables will never change after setup.

Is there an efficient way to use constants in regexes? Maybe not useful
when constants are actually subs.

--
Affijn, Ruud (flip-flop)

"Gewoon is een tijger."

John Bokma

Nov 4, 2005, 10:05:42 AM
"Dr.Ruud" <rvtol...@isolution.nl> wrote:

> Is there an efficient way to use constants in regexes? Maybe not useful
> when constants are actually subs.

using qr//?

"Since Perl may compile the pattern at the moment of execution of qr()
operator, using qr() may have speed advantages in some situations, notably
if the result of qr() is used standalone:"

(perlop)

--
John Small Perl scripts: http://johnbokma.com/perl/
Perl programmer available: http://castleamber.com/
I ploink googlegroups.com :-)

Dr.Ruud

Nov 4, 2005, 12:22:18 PM
John Bokma:
> Dr.Ruud:

>> Is there an efficient way to use constants in regexes? Maybe not
>> useful when constants are actually subs.
>
> using qr//?
>
> "Since Perl may compile the pattern at the moment of execution of qr()
> operator, using qr() may have speed advantages in some situations,
> notably if the result of qr() is used standalone:"
>

> (perlop)

Thanks John. Abigail already mentioned it, but I didn't look into it
right away and then I just didn't, so now at last I did.

$re{'A01'}[SRCH] = 'some regex, grouping allowed';
$re{'A01'}[REPL] = 'some replacement, backtracking allowed';
$re{'A01'}[MODS] = 'xs'; # inline (?...) takes only i, m, s, x; /g goes on the s///
:
:
$re{'A01'}[QREX] = qr/(?$re{'A01'}[MODS])$re{'A01'}[SRCH]/; # qompiled regex
:
:
s/$re{'A01'}[QREX]/$re{'A01'}[REPL]/g;
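
One caveat with a table like this, as far as I can tell: a replacement stored in a plain string is interpolated only once, so a $1 written inside that string is not filled in per match. A possible workaround, sketched with a hypothetical code-ref REPL field and /e (none of these names come from the real code):

```perl
use strict;
use warnings;

# Hypothetical entry: the pattern is compiled once with qr//,
# and the replacement is a code ref that receives the captured text.
my %entry = (
    QREX => qr/ ( \d+ ) /x,
    REPL => sub { my ($digits) = @_; return "<$digits>" },
);

my $text = "abc 123 def 456";

# /e evaluates the replacement as code for every match; /g stays on s///.
(my $out = $text) =~ s/$entry{QREX}/$entry{REPL}->($1)/ge;

print "$out\n";   # abc <123> def <456>
```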

Stan R.

Nov 4, 2005, 1:19:41 PM
Abigail wrote:
[...]

> My advice is to *never* use /o. There's no point in using it for
> speed, and when it matters for speed, the effect may not be what you
> want - and even if you want it, it may confuse anyone else looking at
> the code.
>
> for (qw /foo bar/) {
> print /$_/ ? "Yes 1\n" : "No 1\n";
> print /$_/o ? "Yes 2\n" : "No 2\n";
> }
> __END__
> Yes 1
> Yes 2
> Yes 1
> No 1

I get "No 2" on the end, not "No 1"

$ perl -e 'for (qw /foo bar/) {
print /$_/ ? "Yes 1\n" : "No 1\n";
print /$_/o ? "Yes 2\n" : "No 2\n";
}'
Yes 1
Yes 2
Yes 1
No 2

I guess you didn't actually run the code you posted, or you typed from
memory :-P

--
Stan


Abigail

Nov 4, 2005, 5:16:28 PM
Stan R. (stan....@bremove.lz.hmrprint.com) wrote on MMMMCDXLVIII
September MCMXCIII in <URL:news:1131128...@spool6-east.superfeed.net>:

"" Abigail wrote:
"" [...]
"" > My advice is to *never* use /o. There's no point in using it for
"" > speed, and when it matters for speed, the effect may not be what you
"" > want - and even if you want it, it may confuse anyone else looking at
"" > the code.
"" >
"" > for (qw /foo bar/) {
"" > print /$_/ ? "Yes 1\n" : "No 1\n";
"" > print /$_/o ? "Yes 2\n" : "No 2\n";
"" > }
"" > __END__
"" > Yes 1
"" > Yes 2
"" > Yes 1
"" > No 1
""
"" I get "No 2" on the end, not "No 1"

Indeed.

""
"" $ perl -e 'for (qw /foo bar/) {
"" print /$_/ ? "Yes 1\n" : "No 1\n";
"" print /$_/o ? "Yes 2\n" : "No 2\n";
"" }'
"" Yes 1
"" Yes 2
"" Yes 1
"" No 2
""
"" I guess you didn't actually run the code you posted, or you typed from
"" memory :-P

Oh, I ran it. Then copied into my posting. Then modified it, and typed the
last line by hand instead of using the mouse.


Abigail
--
perl -wle'print"Кхуф бопфиет Ретм Ибглет"^"\x80"x24'

robic0

Nov 10, 2005, 2:41:46 AM
On Fri, 4 Nov 2005 13:20:41 +0100, "Dr.Ruud" <rvtol...@isolution.nl>
wrote:

>Anno Siegel:
>
>> Pedantic educative questions instead of getting on with
>> whatever you were doing. You had it coming.
>
>It's just me.

Where's the original thread?
Why don't you email each other, ahhh whatever it is
lovers do.
