Regular expression help

Paul Bryan pa_bryan at
Tue Dec 10 14:48:15 EST 2002


On Tue, 10 Dec 2002 12:24, Joel Pearson wrote:
> Thanks to everyone who suggested regular expressions to me, they were
> very helpful.
> In the end I decided to go with this regular expression, which should be
> fine as long as there is no more than 1 set of brackets in the name:
> /value=(.{3})> \(([^\)\(]*|[^\(]*\([^\)]*\)[^\)]*)\)+/
> -Sent by Brian Graham

Do you only want the first three characters? I ask because some of your
data has a value parameter with 4 or 5 characters, and this will only match
the three-character items. If you want to match all value parameters
irrespective of how many characters there are, try:


> Kim's suggestion of: /value=(\([A-Z]{3,4}\))> \(([^<]+)\) \1\(
> \&nbsp\)+<\/option/
> Seemed to be the best option, but for some reason or another it didn't
> want to work in php. (Something to do with backreferencing I think)

From the looks of it, your first backreference (array position 1) should be
the value parameter. This will only get 3-4 character parameters though (see
my comment above). The second backreference should be the part inside the

Here are a few notes from the PHP manual to help clarify how PHP treats
regexes. There are also a lot of modifiers that you can use (e.g. the
greediness modifier). Check the PHP manual for details.
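
As a rough illustration of the greediness modifier mentioned above, here is a minimal sketch. The input string is made up for the example and is not from Joel's data; the U pattern modifier is the one the PHP manual documents for inverting quantifier greediness.

```php
<?php
// Hypothetical input, not taken from the original thread.
$html = '<b>one</b> and <b>two</b>';

// Default behaviour is greedy: .* runs on to the LAST </b> in the string.
preg_match('/<b>.*<\/b>/', $html, $greedy);

// With the U modifier the quantifier is ungreedy and stops at the FIRST </b>.
preg_match('/<b>.*<\/b>/U', $html, $ungreedy);

echo $greedy[0], "\n";   // <b>one</b> and <b>two</b>
echo $ungreedy[0], "\n"; // <b>one</b>
```

The same effect can be had for a single quantifier by writing .*? instead of switching the whole pattern with U.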

preg_match_all uses two different systems for returning the backreferences.
From the PHP manual on preg_match_all:

flags can be a combination of the following flags (note that it doesn't make 
sense to use PREG_PATTERN_ORDER together with PREG_SET_ORDER):


PREG_PATTERN_ORDER
Orders results so that $matches[0] is an array of full pattern matches,
$matches[1] is an array of strings matched by the first parenthesized
subpattern, and so on.

PREG_SET_ORDER
Orders results so that $matches[0] is an array of first set of matches,
$matches[1] is an array of second set of matches, and so on.

Make sure that you're using the flag you want, as it changes the array
structure. I usually use PREG_SET_ORDER so that, for each match, the
backreferences are indexed together, e.g. $matches[0][0] is the whole
expression and $matches[0][1] is the first backreference. These are for the
first match; the second match is then $matches[1], and so on.
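
To make the difference between the two flags concrete, here is a small sketch. The markup and the simplified pattern are assumptions made up for the example (loosely modelled on the option list in this thread), not Joel's actual data or final regex.

```php
<?php
// Hypothetical sample markup and a simplified pattern, for illustration only.
$html = '<option value=ABC> (Alpha)</option><option value=DEFG> (Delta)</option>';
$pattern = '/value=([A-Z]+)> \(([^)]*)\)/';

// PREG_PATTERN_ORDER: one array per subpattern.
// $byPattern[0] holds all full matches, $byPattern[1] all first groups, etc.
preg_match_all($pattern, $html, $byPattern, PREG_PATTERN_ORDER);
echo implode(', ', $byPattern[1]), "\n"; // ABC, DEFG

// PREG_SET_ORDER: one array per match.
// $bySet[0] holds everything for the first match, $bySet[1] for the second.
preg_match_all($pattern, $html, $bySet, PREG_SET_ORDER);
foreach ($bySet as $m) {
    // $m[0] is the whole match, $m[1] and $m[2] are the backreferences.
    echo $m[1], ' => ', $m[2], "\n"; // ABC => Alpha, then DEFG => Delta
}
```

With PREG_SET_ORDER the loop body only ever deals with one match at a time, which is why I find it easier to work with.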

From the "pattern syntax" section under "pcre", under the heading

The fact that plain parentheses fulfil two functions is not always helpful.
There are often times when a grouping subpattern is required without a
capturing requirement. If an opening parenthesis is followed by "?:", the
subpattern does not do any capturing, and is not counted when computing the
number of any subsequent capturing subpatterns. For example, if the string
"the white queen" is matched against the pattern

the ((?:red|white) (king|queen))

the captured substrings are "white queen" and "queen", and are numbered 1
and 2. The maximum number of captured substrings is 99, and the maximum
number of all subpatterns, both capturing and non-capturing, is 200.

This should help you figure out exactly what your backreferences are
(referred to above as "capturing subpatterns").
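
The manual's example can be tried directly in PHP. This is just a sketch of it using preg_match; the second pattern, with an ordinary capturing group, is my own addition for comparison.

```php
<?php
// Pattern from the manual: the (?:...) group is used for grouping only.
preg_match('/the ((?:red|white) (king|queen))/', 'the white queen', $m);
// $m[0] = "the white queen", $m[1] = "white queen", $m[2] = "queen"
echo count($m) - 1, " captures: {$m[1]} / {$m[2]}\n";

// Same pattern with a plain capturing group, for comparison (my addition).
preg_match('/the ((red|white) (king|queen))/', 'the white queen', $c);
// Now "white" takes number 2 and "queen" moves to number 3.
echo count($c) - 1, " captures: {$c[1]} / {$c[2]} / {$c[3]}\n";
```

So adding or removing a capturing pair of parentheses renumbers every backreference after it, which is worth checking whenever a \1 in a pattern doesn't behave as expected.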

Also check the top of the "pcre > pattern syntax" section to see how these 
regexes differ from perl.

> Thanks again to everyone for your help
> Joel
> ~~~~~~~~~~~~~~~~~~~~~~~
> Joel Pearson
> Email: pearj at
> ICQ: 1580379
> MSN: joelpearson at

-- 
Paul Bryan
E-Mail: pa_bryan at
What makes us so bitter against people who outwit us is that they think
themselves cleverer than we are.

