[clug] awk or Perl regex question

steve jenkin sjenkin at canb.auug.org.au
Sat Jul 20 08:08:09 UTC 2019


In awk, I’m trying to remove First Names from Full Name strings.
There might be multiple first names, and alternatives are separated by a ‘/’.

Surnames are in UPPERCASE and occur at the end of the string; they may contain a single quote (O’SHEA) or a blank (DE SMETS).

Currently I’ve got a working version doing two different subs: the first is unanchored, the second is anchored to the start of the string (^).

	sub(/Mc[A-Z][a-z]* /, "", A[1]); 
	sub(/^([A-Z][a-z\047]*[ /])+/, "", A[1]);
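
For anyone wanting to reproduce this, here is a minimal self-contained sketch of the two-pass version. The wrapper name strip_first_names and the sample names are made up for illustration; the real data lives in A[1]. The slash inside the bracket expression is written as \/ in case an awk other than gawk is stricter about an unescaped ‘/’ in a regex constant.

```shell
# strip_first_names: hypothetical helper; reads one full name per line on stdin
strip_first_names() {
  awk '{
    name = $0
    # Pass 1 (unanchored): drop a mixed-case "McXxx " given name wherever it occurs.
    sub(/Mc[A-Z][a-z]* /, "", name)
    # Pass 2 (anchored): drop leading capitalised words, each ended by a blank or "/".
    sub(/^([A-Z][a-z\047]*[ \/])+/, "", name)
    print name
  }'
}

# Sample names are invented for illustration only.
printf '%s\n' "John Paul SMITH" "Mary/Maria DE SMETS" "Sean McKenzie O'SHEA" |
  strip_first_names
# prints:
#   SMITH
#   DE SMETS
#   O'SHEA
```

The second pass stops at the surname because an uppercase letter followed by another uppercase letter can never be followed by the blank or ‘/’ that the group requires.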

I’ve tried this single regex, both unanchored and anchored, with ‘?’ for 0 or 1 repeats of the group or ‘*’ for 0 or more repeats.

	(Mc)?([A-Z][a-z\047]*[ /])+
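
A sketch of one possible cause, assuming the trouble shows up when the ‘Mc’ name is not the first word: in the regex above, (Mc)? can match only once, before the repeated group, so the match stops at a later ‘McXxx’ word. The variant below, with (Mc)? moved inside the repetition, is an assumption and has not been run against the real data. The helper name compare_regexes and the sample name are invented for illustration.

```shell
# compare_regexes: hypothetical helper; prints "result as tried | result of variant"
compare_regexes() {
  awk '{
    tried = $0
    variant = $0
    # As tried (anchored form): "Mc" is optional once, ahead of the repeated group.
    sub(/^(Mc)?([A-Z][a-z\047]*[ \/])+/, "", tried)
    # Variant (assumption): "Mc" is optional at the start of every repetition.
    sub(/^((Mc)?[A-Z][a-z\047]*[ \/])+/, "", variant)
    print tried " | " variant
  }'
}

printf '%s\n' "Sean McKenzie O'SHEA" | compare_regexes
# prints: McKenzie O'SHEA | O'SHEA
```

When the ‘Mc’ name comes first, both forms give the same result, which may be why the combined regex appears to work on some inputs.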

Any suggestions for other things to try?

--
Steve Jenkin, IT Systems and Design 
0412 786 915 (+61 412 786 915)
PO Box 38, Kippax ACT 2615, AUSTRALIA

mailto:sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin



