This section describes the syntax used to build regular expressions.
| Representation | Definition |
|---|---|
| . | Any character except \n. Use the /s modifier (stream mode, also known as single-line mode) to match \n too. |
| [xyz] | Character class. Escaped characters are also allowed. Special characters do not need to be escaped because they have no special meaning within brackets [ ]. |
| \xdd | Hex input. dd is the hexadecimal value for the character. Two digits are mandatory. For example, \r is \x0d and not \xd. |
| [a-z][0-9] | Character range. |
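
The single-character forms above can be tried out in any conventional regex engine. The sketch below uses Python's `re` module purely as a stand-in; it is not the SonicOS engine, and the sample strings are made up for illustration.

```python
import re

# '.' matches any single character except a newline by default.
dot_hits = re.findall(r".", "a\nb")

# \x0d is the two-digit hex spelling of \r (carriage return).
has_cr = re.search(r"\x0d", "ab\r\ncd") is not None

# Inside [ ], characters such as '.', '*', '+' and '?' are literal.
literal_hits = re.findall(r"[.*+?]", "a.b*c+d?")

# [a-z][0-9] is one lowercase letter followed by one digit.
range_ok = re.fullmatch(r"[a-z][0-9]", "k7") is not None
range_bad = re.fullmatch(r"[a-z][0-9]", "K7") is not None

print(dot_hits, has_cr, literal_hits, range_ok, range_bad)
```
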
| Representation | Definition |
|---|---|
| xy | x followed by y |
| x\|y | x or y |
| (x) | Equivalent to x. Can be used to override precedence. |
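
To illustrate how concatenation, alternation, and grouping combine, here is a small sketch using Python's `re` module as a stand-in engine (an assumption; SonicOS itself is not involved). The candidate strings are hypothetical.

```python
import re

candidates = ("ab", "cd", "abd", "acd")

# Without parentheses, 'ab|cd' reads as "ab" or "cd".
flat = re.compile(r"ab|cd")
flat_hits = [s for s in candidates if flat.fullmatch(s)]

# Parentheses regroup the alternation: 'a(b|c)d' reads as "abd" or "acd".
grouped = re.compile(r"a(b|c)d")
grouped_hits = [s for s in candidates if grouped.fullmatch(s)]

print(flat_hits, grouped_hits)
```
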
| Representation | Definition |
|---|---|
| x* | Zero or more x |
| x? | Zero or one x |
| x+ | One or more x |
| x{n, m} | Minimum of n and maximum of m consecutive x’s. All numbered repetitions are expanded, so making m unreasonably large is ill-advised. |
| x{n} | Exactly n x’s |
| x{n,} | Minimum of n x’s |
| x{,n} | Maximum of n x’s |
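
The note that numbered repetitions are expanded can be made concrete with a sketch. The helper `expand_repetition` below is hypothetical: it shows one plausible way to rewrite `x{n,m}` using only concatenation and `?`, not necessarily how SonicOS does it. Python's `re` is used only to check that the expansion matches the same strings.

```python
import re

def expand_repetition(atom: str, n: int, m: int) -> str:
    """Rewrite atom{n,m} using only concatenation and '?'.

    Hypothetical helper; 'atom' must be a single character or an
    already parenthesized group.
    """
    if not 0 <= n <= m:
        raise ValueError("require 0 <= n <= m")
    required = atom * n
    optional = ""
    # Build nested optionals from the inside out, e.g. (x(x)?)? for two copies.
    for _ in range(m - n):
        optional = f"({atom}{optional})?"
    return required + optional

expanded = expand_repetition("x", 2, 4)

# The expansion accepts exactly the same strings as the counted form.
for k in range(7):
    s = "x" * k
    assert bool(re.fullmatch(expanded, s)) == bool(re.fullmatch(r"x{2,4}", s))

# The expanded pattern grows with m, which is why a huge m is costly.
small = len(expand_repetition("x", 0, 10))
large = len(expand_repetition("x", 0, 1000))

# Python's engine also accepts the x{,n} form (zero up to n copies).
at_most_two = re.compile(r"x{,2}")

print(expanded, small, large)
```

Under this scheme the pattern length grows linearly with m, so a bound in the thousands already produces a pattern thousands of characters long.
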
| Representation | Definition |
|---|---|
| \0, \a, \b, \f, \t, \n, \r, \v | C programming language escape sequences (\0 is the NULL character (ASCII character zero)). |
| \x | Hex input. \x followed by two hexadecimal digits denotes the character with that hexadecimal value. |
| \*, \?, \+, \(, \), \[, \], \{, \}, \\, \/, \<space>, \# | Escape any special character. Comments, which are not processed, are preceded by any number of spaces and a pound sign (#). To match a literal space or pound sign, use the escape sequences \<space> and \#. |
| Representation | Definition |
|---|---|
| \d, \D | Digits, Non-digits. |
| \z, \Z | Non-zero digits ([1-9]), All other characters. |
| \s, \S | White space, Non-white space. \s is equivalent to [\t\n\f\r]. \v is not included in Perl white space. |
| \w, \W | Word characters, Non-word characters. \w is equivalent to [0-9A-Za-z_]. |
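
Some of these shorthands, notably \z and \Z, mean something different or nothing at all in other engines (in Python, for example, `\Z` is an end-of-string anchor). The sketch below therefore spells each class as an explicit bracket expression, taken from the definitions in the table, and runs it through Python's `re` as a stand-in. The `PORTABLE` table and the sample pattern are illustrative assumptions.

```python
import re

# Hypothetical translation table: shorthand from the table above mapped to
# an explicit bracket expression that a conventional engine understands.
PORTABLE = {
    r"\d": "[0-9]",
    r"\D": "[^0-9]",
    r"\z": "[1-9]",
    r"\Z": "[^1-9]",
    r"\s": r"[\t\n\f\r]",
    r"\S": r"[^\t\n\f\r]",
    r"\w": "[0-9A-Za-z_]",
    r"\W": "[^0-9A-Za-z_]",
}

# The shorthand pattern '\z\d*' (a number with no leading zero),
# written with explicit brackets so it can be tested here.
no_leading_zero = re.compile(PORTABLE[r"\z"] + PORTABLE[r"\d"] + "*")

accepted = [s for s in ("7", "407", "047", "0") if no_leading_zero.fullmatch(s)]
print(accepted)
```
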
| If you want... | ...then use | Definition |
|---|---|---|
| [:cntrl:] | \c, \C | Control character. [\x00 - \x1F\x7F]. |
| [:digit:] | \d, \D | Digits, Non-Digits. Same as Perl character class. |
| [:graph:] | \g, \G | Any printable character except space. |
| [:xdigit:] | \h, \H | Any hexadecimal digit. [a-fA-F0-9]. Note that this differs from the Perl \h, which matches horizontal white space. |
| [:lower:] | \l, \L | Any lower case character. |
| [:ascii:] | \p, \P | Positive, Negative ASCII characters. [0x00 – 0x7F], [0x80 – 0xFF]. |
| [:upper:] | \u, \U | Any upper case character. |
Some other popular character classes can be built from the primitives above. The following classes do not have their own shorthand because none of the remaining characters offers a good mnemonic for them.
| If you want... | ...then use | Definition |
|---|---|---|
| [:alnum:] | = [\l\u\d] | The set of all letters and digits. |
| [:alpha:] | = [\l\u] | The set of all letters. |
| [:blank:] | = [\t<space>] | The class of blank characters: tab and space. |
| [:print:] | = [\g<space>] | The class of all printable characters: all graphical characters including space. |
| [:punct:] | = [^\P\c<space>\d\u\l] | The class of all punctuation characters: no negative ASCII characters, no control characters, no space, no digits, no uppercase or lowercase characters. |
| [:space:] | = [\s\v] | All white space characters. Includes Perl white space and the vertical tab character. |
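
The derived classes can be checked by treating each primitive as a set of byte values and combining the sets as the table describes. This is a sketch under stated assumptions: bytes run from 0x00 to 0xFF, \g is taken to be 0x21 through 0x7E (printable ASCII without space), and all constant names are made up for the example.

```python
import string

# Byte-value sets for the primitives (names are hypothetical).
ALL_BYTES = set(range(0x100))
CNTRL = set(range(0x00, 0x20)) | {0x7F}      # \c
DIGIT = set(range(ord("0"), ord("9") + 1))   # \d
GRAPH = set(range(0x21, 0x7F))               # \g (assumed 0x21-0x7E)
LOWER = set(range(ord("a"), ord("z") + 1))   # \l
UPPER = set(range(ord("A"), ord("Z") + 1))   # \u
NON_ASCII = set(range(0x80, 0x100))          # \P
SPACE = {0x20}                               # <space>

# Derived classes, following the table row by row.
ALNUM = LOWER | UPPER | DIGIT                # [\l\u\d]
ALPHA = LOWER | UPPER                        # [\l\u]
PRINT = GRAPH | SPACE                        # [\g<space>]
PUNCT = ALL_BYTES - (NON_ASCII | CNTRL | SPACE | DIGIT | UPPER | LOWER)

punct_chars = "".join(chr(c) for c in sorted(PUNCT))
print(punct_chars)
```

With these assumptions the derived punctuation set contains the same characters as Python's `string.punctuation`, which is a useful sanity check on the negated bracket expression.
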
| Representation | Definition |
|---|---|
| /i | Case-insensitive |
| /s | Treat input as single-line. Can also be thought of as stream-mode. That is, dot (.) matches \n too. |
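
The effect of the two modifiers can be sketched with Python's `re` flags, which are close analogues rather than the same feature: `re.IGNORECASE` stands in for /i and `re.DOTALL` for /s. The sample payload is invented for the example.

```python
import re

# /i analogue: letter case is ignored when matching.
case_sensitive = re.fullmatch(r"sonic", "SoNiC") is not None
case_insensitive = re.fullmatch(r"sonic", "SoNiC", re.IGNORECASE) is not None

# /s analogue: '.' also matches \n, so a pattern can span lines.
payload = "GET /\nHost"
without_s = re.search(r"GET.*Host", payload) is not None
with_s = re.search(r"GET.*Host", payload, re.DOTALL) is not None

print(case_sensitive, case_insensitive, without_s, with_s)
```
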
| Operators | Associativity |
|---|---|
| [ ], [^] | Left to right |
| () | Left to right |
| *, +, ? | Left to right |
| . (Concatenation) | Left to right |
| \| | Left to right |
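
Assuming the table lists operators from highest to lowest precedence, which is the usual convention but is not stated here, repetition binds tighter than concatenation, and concatenation binds tighter than alternation. The sketch below checks that reading with Python's `re` as a stand-in engine.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# '*' applies to 'b' only, not to 'ab'.
star_binds_to_b = matches(r"ab*", "abbb") and not matches(r"ab*", "abab")

# Parentheses make '*' apply to the whole sequence 'ab'.
star_on_group = matches(r"(ab)*", "abab")

# 'a|bc*' parses as a | (b(c*)): "a" and "bccc" match, "ac" does not.
alternation_last = (
    matches(r"a|bc*", "a")
    and matches(r"a|bc*", "bccc")
    and not matches(r"a|bc*", "ac")
)

print(star_binds_to_b, star_on_group, alternation_last)
```
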
SonicOS supports comments in regular expressions. Comments are preceded by any number of spaces and a pound sign (#). All text after a space and pound sign is discarded until the end of the expression.
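
The comment rule can be sketched as a small preprocessing step. The function `strip_comment` below is hypothetical and rests on several assumptions: a backslash protects the character after it, the first unescaped `#` starts the comment, unescaped spaces directly before the comment are dropped, and brackets receive no special handling. It is not the SonicOS implementation.

```python
def strip_comment(expr: str) -> str:
    """Remove a trailing '# comment' from a regular expression (sketch)."""
    kept = []
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == "\\" and i + 1 < len(expr):
            # Keep escape pairs such as '\#' and '\ ' intact.
            kept.append(expr[i:i + 2])
            i += 2
            continue
        if ch == "#":
            # First unescaped pound sign: the rest is a comment.
            break
        kept.append(ch)
        i += 1
    # Drop unescaped spaces that preceded the comment.
    while kept and kept[-1] == " ":
        kept.pop()
    return "".join(kept)

plain = strip_comment(r"foo\d+   # one or more digits")
escaped = strip_comment(r"a\#b\ c # pound and space are literal here")
print(plain)
print(escaped)
```
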