SonicOS 7.1 Objects

Regular Expression Syntax

This section describes the syntax used to build regular expressions.

Regular expression syntax: Single characters
Representation	Definition
.	Any character except \n. Use the /s modifier (stream mode, also known as single-line mode) to match \n as well.
[xyz]	Character class. Escaped characters can also be used inside the brackets. Special characters do not need to be escaped, because they have no special meaning within brackets [ ].
\xdd	Hex input. dd is the hexadecimal value for the character. Two digits are mandatory. For example, \r is \x0d and not \xd.
[a-z][0-9]	Character range.
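
The single-character forms above can be tried out in any general-purpose regex engine. The following is a minimal sketch that uses Python's re module as a stand-in; this is an assumption for illustration only, since it is not the SonicOS engine, and the helper name matches and the sample strings are invented here.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

# \x0d is the two-digit hex form of carriage return (\r).
print(matches(r"\x0d", "\r"))        # True

# [a-z][0-9] matches one lowercase letter followed by one digit.
print(matches(r"[a-z][0-9]", "k7"))  # True
print(matches(r"[a-z][0-9]", "K7"))  # False

# Inside brackets, characters such as . * + are literal.
print(matches(r"[.*+]", "*"))        # True
print(matches(r"[.*+]", "a"))        # False

# Without a single-line modifier, dot does not match a newline.
print(matches(r".", "\n"))           # False
```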

Regular expression syntax: Composites
Representation	Definition
xy	x followed by y
x|y	x or y
(x)	Equivalent to x. Can be used to override precedences.

Regular expression syntax: Repetitions
Representation	Definition
x*	Zero or more x
x?	Zero or one x
x+	One or more x
x{n, m}	Minimum of n and maximum of m sequential x’s. All numbered repetitions are expanded, so making m unreasonably large is ill-advised.
x{n}	Exactly n x’s
x{n,}	Minimum of n x’s
x{,n}	Maximum of n x’s
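
The table states that numbered repetitions are expanded but does not say how. The sketch below shows one possible expansion, written in Python, to illustrate why a large m is costly: the rewritten pattern grows linearly with m. The helper expand_repetition is hypothetical and is not taken from SonicOS; it assumes the atom is a single character or an already parenthesised group.

```python
import re

def expand_repetition(atom: str, n: int, m: int) -> str:
    """Rewrite atom{n,m} using only concatenation and '?' (hypothetical helper)."""
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    required = atom * n            # the n mandatory copies
    optional = ""
    for _ in range(m - n):         # nest the m - n optional copies
        optional = f"({atom}{optional})?"
    return required + optional

expanded = expand_repetition("x", 2, 4)
print(expanded)  # xx(x(x)?)?

# The expanded form accepts the same strings as x{2,4}.
for k in range(7):
    text = "x" * k
    assert bool(re.fullmatch(expanded, text)) == bool(re.fullmatch(r"x{2,4}", text))

# The expanded pattern grows with m: four characters per optional copy here.
print(len(expand_repetition("x", 0, 1000)))  # 4000
```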

Regular expression syntax: Escape sequences
Representation	Definition
\0, \a, \b, \f, \t, \n, \r, \v	C programming language escape sequences (\0 is the NULL character (ASCII character zero)).
\x	Hex input. \x followed by two hexadecimal digits denotes the hexadecimal value of the intended character.
\*, \?, \+, \(, \), \[, \], \{, \}, \\, \/, \<space>, \#	Escape any special character. Comments, which are not processed, are preceded by any number of spaces and a pound sign (#). To match a literal space or pound sign (#), you must therefore use the escape sequences \<space> and \#.
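
When a literal string has to be embedded in an expression, the escapes in the last row can be applied mechanically. The following Python sketch does this with a hypothetical helper, escape_literal, which is not part of SonicOS. It covers only the characters listed in that row; other metacharacters such as . and | are left unchanged, because the table does not list escapes for them.

```python
# Characters the table lists as escapable with a backslash (space and # included).
SPECIAL = set("*?+()[]{}\\/ #")

def escape_literal(text: str) -> str:
    """Prefix each listed special character with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

print(escape_literal("a+b (c) #1"))  # a\+b\ \(c\)\ \#1
```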

Regular expression syntax: Perl-like character classes
Representation	Definition
\d, \D	Digits, Non-digits.
\z, \Z	Non-zero digits ([1-9]), All other characters.
\s, \S	White space, Non-white space. Equivalent to [\t\n\f\r]. \v is not included in Perl white spaces.
\w, \W	Word characters, Non-word characters. Equivalent to [0-9A-Za-z_].
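
Some of these shorthands, \z in particular, do not carry the same meaning in other engines. The sketch below rewrites a few of them into explicit bracket classes so that a pattern can be tried in Python's re module, which is used here only as a stand-in. The mapping and the helper expand_shorthand are illustrative assumptions, and the helper does not handle shorthands that appear inside brackets.

```python
import re

# Bracket forms follow the table above; the mapping itself is illustrative.
SHORTHAND = {
    "d": "[0-9]", "D": "[^0-9]",
    "z": "[1-9]", "Z": "[^1-9]",
    "w": "[0-9A-Za-z_]", "W": "[^0-9A-Za-z_]",
}

def expand_shorthand(pattern: str) -> str:
    """Replace \\d, \\z, \\w and their negations with bracket classes."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Unknown escapes are copied through unchanged.
            out.append(SHORTHAND.get(nxt, ch + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# A three-digit number with no leading zero.
expanded = expand_shorthand(r"\z\d\d")
print(expanded)                                        # [1-9][0-9][0-9]
print(re.fullmatch(expanded, "420") is not None)       # True
print(re.fullmatch(expanded, "042") is not None)       # False
```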

Regular expression syntax: Other ASCII character class primitives
If you want...	...then use	Definition
[:cntrl:]	\c, \C	Control character. [\x00 - \x1F\x7F].
[:digit:]	\d, \D	Digits, Non-Digits. Same as Perl character class.
[:graph:]	\g, \G	Any printable character except space.
[:xdigit:]	\h, \H	Any hexadecimal digit. [a-fA-F0-9]. Note this is different from the Perl \h, which means a horizontal space.
[:lower:]	\l, \L	Any lower case character.
[:ascii:]	\p, \P	Positive, Negative ASCII characters. [0x00 – 0x7F], [0x80 – 0xFF].
[:upper:]	\u, \U	Any upper case character.

Some of the other popular character classes can be built from the above primitives. The following classes do not have their own shorthand because none of the remaining characters offers a good mnemonic for them.

Regular expression syntax: Compound character classes
If you want...	...then use	Definition
[:alnum:]	= [\l\u\d]	The set of all letters and digits.
[:alpha:]	= [\l\u]	The set of all letters.
[:blank:]	= [\t<space>]	The class of blank characters: tab and space.
[:print:]	= [\g<space>]	The class of all printable characters: all graphical characters including space.
[:punct:]	= [^\P\c<space>\d\u\l]	The class of all punctuation characters: no negative ASCII characters, no control characters, no space, no digits, no upper or lower characters.
[:space:]	= [\s\v]	All white space characters. Includes Perl white space and the vertical tab character.
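
The compound classes above are set combinations of the primitives, which can be checked by modelling each primitive as a set of characters. The Python sketch below does this under the assumption that the primitives cover ASCII only; the variable names are invented for the illustration.

```python
import string

# Primitives, modelled as sets of ASCII characters (assumption: ASCII only).
ASCII = {chr(c) for c in range(0x00, 0x80)}               # \p
CNTRL = {chr(c) for c in range(0x00, 0x20)} | {"\x7f"}    # \c
DIGIT = set(string.digits)                                # \d
LOWER = set(string.ascii_lowercase)                       # \l
UPPER = set(string.ascii_uppercase)                       # \u
SPACE = {" "}
GRAPH = ASCII - CNTRL - SPACE                             # \g

# Compound classes, built as in the table.
ALNUM = LOWER | UPPER | DIGIT                             # [\l\u\d]
ALPHA = LOWER | UPPER                                     # [\l\u]
BLANK = {"\t"} | SPACE                                    # [\t<space>]
PRINT = GRAPH | SPACE                                     # [\g<space>]
# [^\P\c<space>\d\u\l]: everything that is ASCII, minus the listed classes.
PUNCT = ASCII - CNTRL - SPACE - DIGIT - UPPER - LOWER

print("".join(sorted(PUNCT)))
print(PUNCT == set(string.punctuation))  # True
```

Under these assumptions the derived punctuation class contains the same 32 characters as Python's string.punctuation.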

Regular expression syntax: Modifiers
Representation	Definition
/i	Case-insensitive
/s	Treat input as single-line. Can also be thought of as stream-mode. That is, dot (.) matches \n too.
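
The effect of the two modifiers can be illustrated with the closest equivalents in Python's re module. Treating re.IGNORECASE as the counterpart of /i and re.DOTALL as the counterpart of /s is an assumption made for this sketch; how the modifiers are attached to an expression in SonicOS is not shown here.

```python
import re

def matches(pattern: str, text: str, flags: int = 0) -> bool:
    """Return True if the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text, flags) is not None

# Case-insensitive matching, analogous to /i.
print(matches(r"get", "GET"))                 # False
print(matches(r"get", "GET", re.IGNORECASE))  # True

# Single-line (stream) mode, analogous to /s: dot also matches \n.
print(matches(r"a.b", "a\nb"))                # False
print(matches(r"a.b", "a\nb", re.DOTALL))     # True
```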

Regular expression syntax: Operators in decreasing order of precedence
Operators	Associativity
[ ], [^]	Left to right
()	Left to right
*, +, ?	Left to right
. (Concatenation)	Left to right
|	Left to right
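
As a worked example of this ordering, the expression ab|cd* groups as (ab)|(c(d*)): the repetition applies to d alone, concatenation binds next, and alternation binds last. The Python sketch below checks this reading against an explicitly parenthesised form. Python's re module is used as a stand-in on the assumption that it applies the same precedence for these operators; the sample strings are invented.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

PATTERN = r"ab|cd*"
GROUPED = r"(ab)|(c(d*))"   # the same expression with explicit grouping

samples = ["ab", "c", "cddd", "abd", "acd", "cdcd"]
results = {s: matches(PATTERN, s) for s in samples}

# Both forms agree on every sample.
assert all(matches(GROUPED, s) == results[s] for s in samples)

for s, ok in results.items():
    print(f"{s!r}: {ok}")
```

Here "acd" is rejected because alternation does not bind tighter than concatenation, and "cdcd" is rejected because * applies to d rather than to cd.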

Comments in Regular Expressions

SonicOS supports comments in regular expressions. Comments are preceded by any number of spaces and a pound sign (#). All text after a space and pound sign is discarded until the end of the expression.
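
The comment rule can be sketched as a small pre-processing step. The Python function below is a hypothetical illustration, not the SonicOS implementation. It assumes that any unescaped pound sign starts a comment, that unescaped spaces directly before it belong to the comment, and that \<space> and \# are always literal.

```python
def strip_comment(expr: str) -> str:
    """Drop a trailing comment from expr (hypothetical helper)."""
    i = 0
    keep = 0  # end of the last token that is not an unescaped space
    while i < len(expr):
        ch = expr[i]
        if ch == "\\" and i + 1 < len(expr):
            # An escaped character such as "\ " or "\#" is kept as is.
            i += 2
            keep = i
        elif ch == "#":
            # Unescaped '#': discard it, the spaces before it, and the rest.
            return expr[:keep]
        else:
            i += 1
            if ch != " ":
                keep = i
    return expr

print(strip_comment(r"foo\d+  # digits after foo"))  # foo\d+
print(strip_comment(r"a\#b # escaped pound sign"))   # a\#b
```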
