[clug] Circumflex

Scott Ferguson scott.ferguson.clug at gmail.com
Thu Aug 17 14:20:06 UTC 2017



On 17/08/17 22:43, Bryan Kilgallin (iiNet) via linux wrote:
> Thanks, Eyal:
> 
>> In a regular expression the '^' (at the start) indicates that the
>> matching expression must be at the
>> beginning of the examined string:
> 
> {
> More special characters ^$
> 
> Used outside of the square brackets the ^ and $ sign will have new
> meanings. And yes, this is a big part of why Regex is a bit confusing to
> start up with. Instead of the ^ meaning “not these characters” it means
> “start of the string”.
> }
> 
> http://webagility.com/posts/the-basics-of-regex-explained
> 
> I could use drill and practice exercises.

I can highly recommend "Mastering Regular Expressions" by Jeffrey E. F.
Friedl.

You'll probably find your text editor includes RegEx support for Find
and Replace (KWrite certainly does) - which is a good place to practise.

Here's a list of basic RegEx Special Characters, Quantifiers and
Metacharacters. Note that the formatting was lost when I copied it from
my personal wiki; and that there is more than one type of RegEx.... :)
See man 7 regex for that!
(if the following is unreadable there's a PDF version of the original at
https://scottferguson.com.au/uploads/files/regex.pdf for a short time)

RegEx

A regular expression, regex or regexp (sometimes called a rational
expression) is, in theoretical computer science and formal language
theory, a sequence of characters that defines a search pattern, mainly
for use in pattern matching with strings, or using a string searching
algorithm, i.e. “find and replace”-like operations. The concept arose in
the 1950s, when the American mathematician Stephen Cole Kleene
formalised the description of a regular language, and came into common
use with the Unix text processing utilities ed, an editor, and grep, a
filter.

Pattern      Matches
abc…         Letters
123…         Digits
\d           Any digit
\D           Any non-digit character
.            Any character
\.           Period
[abc]        Only a, b, or c
[^abc]       Not a, b, nor c
[a-z]        Characters a to z
[0-9]        Numbers 0 to 9
\w           Any alphanumeric character or underscore
\W           Any character that is not alphanumeric or underscore
{m}          m repetitions
{m,n}        m to n repetitions
*            Zero or more repetitions
+            One or more repetitions
?            Optional character
\s           Any whitespace
\S           Any non-whitespace character
^…$          Start and end of the line
(…)          Capture group
(a(bc))      Capture sub-group
(.*)         Capture all
(abc|def)    Matches abc or def
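
As a rough practice sketch, here are a few of the entries above tried out
with Python's re module. The sample strings are made up for illustration,
and Python's regex flavour differs in places from the POSIX ones
described in the man page, so treat this as one dialect among several.

```python
import re

# Sample text invented for this example.
text = "Order 66 shipped to unit 7b on 2017-08-17."

# \d matches any digit; + asks for one or more repetitions.
numbers = re.findall(r"\d+", text)

# {m} repetitions plus capture groups: split a date into its parts.
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)

# [abc] matches only the listed characters; [^abc] matches anything else.
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# . matches any character, while \. matches only a literal period.
any_char = re.findall(r"7.", text)
literal_dot = re.findall(r"17\.", text)

# (abc|def) matches either alternative.
alternation = re.findall(r"(abc|def)", "abcxdef")

# (a(bc)) captures the outer group and the nested sub-group.
nested = re.search(r"(a(bc))", "xabcx")

print(numbers)          # ['66', '7', '2017', '08', '17']
print(date.groups())    # ('2017', '08', '17')
print(vowels, non_vowels)
print(any_char, literal_dot)
print(alternation, nested.groups())
```

Changing one pattern at a time and re-running is a reasonable way to
drill these.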

Metacharacter  Name                     Matches
.              dot                      any one character
[…]            character class          any character listed
[^…]           negated character class  any character not listed
^              caret                    the position at the start of the line
$              dollar                   the position at the end of the line
\<             backslash less-than      the position at the start of a word
\>             backslash greater-than   the position at the end of a word
|              or, bar                  matches either expression it separates
(…)            parentheses              used to limit scope of |, plus additional uses
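
To tie this back to the two meanings of the circumflex, here is a small
sketch in Python. The sample lines are invented. Note that Python's re
module does not provide \< and \>; it uses \b for a word boundary at
either end, so \b stands in for both here. Tools such as GNU grep do
accept \< and \>.

```python
import re

# Three sample lines, made up for this example.
lines = "cat nap\nconcatenate\nthe cat"

# Outside brackets, ^ is the position at the start of a line.
# re.MULTILINE makes ^ and $ apply to each line, not just the whole string.
starts = re.findall(r"^cat", lines, flags=re.MULTILINE)

# $ is the position at the end of a line.
ends = re.findall(r"cat$", lines, flags=re.MULTILINE)

# \b marks a word boundary, so "concatenate" is skipped.
whole_words = re.findall(r"\bcat\b", lines)

# With no anchors at all, every occurrence matches.
anywhere = re.findall(r"cat", lines)

# Inside brackets, a leading ^ negates the class instead:
# anything that is not c, a, t or whitespace.
negated = re.findall(r"[^cat\s]", "the cat")

print(starts, ends, whole_words, anywhere, negated)
```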

Quantifiers

     Minimum required  Maximum to try  Meaning
?    none              1               one allowed; none required (“one optional”)
*    none              no limit        unlimited allowed; none required (“any amount OK”)
+    1                 no limit        unlimited allowed; one required (“at least one”)
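
The minimum and maximum columns can be checked with a short Python
sketch. The word list and test strings are invented for illustration;
re.fullmatch requires the whole string to match, and re.match only
anchors at the start.

```python
import re

# Hypothetical test words: zero, one and two copies of "u".
words = ["color", "colour", "colouur"]

# ? : none required, at most one allowed.
optional = [w for w in words if re.fullmatch(r"colou?r", w)]

# * : none required, no upper limit.
zero_or_more = [w for w in words if re.fullmatch(r"colou*r", w)]

# + : at least one required, no upper limit.
one_or_more = [w for w in words if re.fullmatch(r"colou+r", w)]

# Quantifiers take as much as they are allowed to ("maximum to try").
greedy_star = re.match(r"a*", "aaab").group()
greedy_optional = re.match(r"a?", "aaab").group()

# * is satisfied by nothing at all, whereas + is not.
empty_star = re.match(r"a*", "b").group()
failed_plus = re.match(r"a+", "b")

print(optional, zero_or_more, one_or_more)
print(repr(greedy_star), repr(greedy_optional), repr(empty_star), failed_plus)
```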


Kind regards

-- 
    A: Because we read from top to bottom, left to right.
    Q: Why should I start my reply below the quoted text?

    A: Because it messes up the order in which people normally read text.
    Q: Why is top-posting such a bad thing?

    A: The lost context.
    Q: What makes top-posted replies harder to read than bottom-posted?

    A: Yes.
    Q: Should I trim down the quoted part of an email to which I'm reply

http://www.idallen.com/topposting.html


