Non greedy pattern match in sed

Paul Bryan pa_bryan at yahoo.co.uk
Tue Aug 13 21:35:13 EST 2002


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 13 Aug 2002 21:15, Kim Holburn wrote:

> If you use perl you can do this. From the perl manual (man perlre)
>
>        By default, a quantified subpattern is "greedy", that is,
>        it will match as many times as possible (given a
>        particular starting location) while still allowing the rest of
>        the pattern to match.  If you want it to match the minimum
>        number of times possible, follow the quantifier with a
>        "?".  Note that the meanings don't change, just the
>        "greediness":
>
>            *?     Match 0 or more times
>            +?     Match 1 or more times
>            ??     Match 0 or 1 time
>            {n}?   Match exactly n times
>            {n,}?  Match at least n times
>            {n,m}? Match at least n but not more than m times
>
>  .... | perl -p -e 's/\/\*.*?\*\///g'
>

I don't think this is quite right, because the regex isn't matching multiple
instances of ".*".

What it's doing is looking for the last occurrence of "*/". So using a
quantifier like "?" isn't going to make any difference, because it's the last
"*/" that's causing the problem.

In other words in the command:

echo "/* This is a */ test /* line */" | sed 's/\/\*.*\*\///g'

it finds "*/" before "test", but doesn't stop there! This is where the greedy
part comes into it. It keeps on looking to see if there is another occurrence
of "*/", which it finds at the end of the line. Of course ".*" still matches
everything in between, including the first occurrence of "*/".
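
The greedy behaviour described above can be checked directly. As sed's basic regexes have no lazy quantifier, one common workaround is to spell out "anything that does not close the comment" with bracket expressions. The sketch below assumes a sed that accepts -E for extended regexes (GNU and BSD sed both do). The second pattern is one illustrative way to write it, not the only one.

```shell
# Sample line with two comments on it.
line='/* This is a */ test /* line */'

# Greedy ".*": the match runs on to the last "*/", so the whole line goes.
sed_greedy=$(printf '%s\n' "$line" | sed 's/\/\*.*\*\///g')

# Bracket-expression workaround: after "/*", take non-star characters, then
# one or more stars; repeat for as long as the stars are not followed by "/".
# "#" is used as the delimiter so the slashes need no escaping.
sed_careful=$(printf '%s\n' "$line" | sed -E 's#/\*[^*]*\*+([^/*][^*]*\*+)*/##g')

# Brackets make leading and trailing spaces visible.
printf 'greedy:  [%s]\ncareful: [%s]\n' "$sed_greedy" "$sed_careful"
# greedy:  []
# careful: [ test ]
```

Both forms work on one line at a time, so a comment that spans several lines would need a different approach.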

Just thought I should try and clear that one up.

Cheers,
Paul.

p.s. 

PHP has the "U" pattern modifier which specifies ungreedy matches. This stops 
looking once it finds the first "*/". The only problem there is that the ".*" 
part of the pattern will match the rest of the line including the second 
comment. 
- -- 
Paul Bryan
E-Mail: pa_bryan at yahoo.co.uk

PGP Key
http://www.keyserver.net:11371/pks/lookup?op=get&search=0xB1D405DA

It's always darkest just before it gets pitch black.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9WO703qGyTLHUBdoRAuWRAKCJPz/Z5yl42Ay8X6xHnf/ZO0G7FACgqvZR
RQ56SE0Tt8zpWxypoaNai0E=
=4loX
-----END PGP SIGNATURE-----


