Appendix A - Perl regular expressions
The following table lists and describes some examples of Perl regular expressions.
| Expression | Matches |
|---|---|
| `abc` | “abc” (the exact character sequence, anywhere in the string). |
| `^abc` | “abc” at the beginning of the string. |
| `abc$` | “abc” at the end of the string. |
| `a\|b` | Either “a” or “b”. |
| `^abc\|abc$` | “abc” at the beginning or at the end of the string. |
| `ab{2,4}c` | “a” followed by two, three, or four “b”s followed by a “c”. |
| `ab{2,}c` | “a” followed by at least two “b”s followed by a “c”. |
| `ab*c` | “a” followed by any number (zero or more) of “b”s followed by a “c”. |
| `ab+c` | “a” followed by one or more “b”s followed by a “c”. |
| `ab?c` | “a” followed by an optional “b” followed by a “c”; that is, either “abc” or “ac”. |
| `a.c` | “a” followed by any single character except a newline, followed by a “c”. |
| `a\.c` | The literal string “a.c”; the backslash removes the special meaning of the dot. |
| `[abc]` | Any one of “a”, “b”, and “c”. |
| `[Aa]bc` | Either “Abc” or “abc”. |
| `[abc]+` | Any nonempty string of “a”s, “b”s, and “c”s (such as “a”, “abba”, “acbabcacaa”). |
| `[^abc]+` | Any nonempty string that does not contain any of “a”, “b”, and “c” (such as “defg”). |
| `\d\d` | Any two decimal digits, such as “42”; same as `\d{2}`. |
| `/i` | Makes the pattern case insensitive. For example, `/bad language/i` blocks any instance of “bad language” regardless of case. |
| `\w+` | A “word”: a nonempty sequence of alphanumeric characters and underscores, such as “foo”, “12bar8”, and “foo_1”. |
| `100\s*mk` | The strings “100” and “mk” optionally separated by any amount of white space (spaces, tabs, and newlines). |
| `abc\b` | “abc” when followed by a word boundary (for example, in “abc!” but not in “abcd”). |
| `perl\B` | “perl” when not followed by a word boundary (for example, in “perlert” but not in “perl stuff”). |
| `/x` | Tells the regular expression parser to ignore white space that is neither preceded by a backslash character nor within a character class. Use this to break up a regular expression into slightly more readable parts. |
| `/` | Used to add regular expressions within other text. If the first character in a pattern is a forward slash “/”, the “/” is treated as the delimiter, and the pattern must contain a second “/”. The text between the two “/” characters is taken as a regular expression, and anything after the second “/” is parsed as a list of regular expression options (“i”, “x”, and so on). An error occurs if the second “/” is missing. Leading and trailing spaces in a regular expression are treated as part of the expression. |

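
The delimiter and option rules described above can be tried out with a short sketch. This example uses Python's `re` module, whose syntax is close to Perl's for the expressions in the table, so it approximates rather than reproduces the product's parser. The helper name `compile_delimited` and the `OPTION_FLAGS` mapping are hypothetical and exist only for illustration.

```python
import re

# Assumption: only the two options named in the table are supported here.
OPTION_FLAGS = {"i": re.IGNORECASE, "x": re.VERBOSE}

def compile_delimited(text):
    """Compile a '/pattern/options' string (hypothetical helper)."""
    if not text.startswith("/"):
        raise ValueError("pattern must start with '/'")
    end = text.find("/", 1)  # position of the second '/'
    if end == -1:
        raise ValueError("missing second '/'")
    pattern, options = text[1:end], text[end + 1:]
    flags = 0
    for letter in options:
        if letter not in OPTION_FLAGS:
            raise ValueError(f"unsupported option: {letter!r}")
        flags |= OPTION_FLAGS[letter]
    return re.compile(pattern, flags)

# /i ignores case; /x ignores unescaped white space in the pattern.
print(bool(compile_delimited("/bad language/i").search("No BAD Language here")))  # True
print(bool(compile_delimited("/^abc|abc$/").search("xyzabc")))                    # True
print(bool(compile_delimited("/^abc|abc$/").search("xabcx")))                     # False
print(bool(compile_delimited("/a b c/x").search("abc")))                          # True
```

In this sketch, a string such as `/abc` with no closing “/” raises a `ValueError`, mirroring the error described for a missing second “/”.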
Block common spam phrases
Block common phrases found in spam messages with the following expressions:
/try it for free/i
/student loans/i
/you’re already approved/i
/special[\+\-\*=<>\.\,;!\?%&~#§@\^°\$£\{\}()\[\]\|\\_1]offer/i
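
To see how these four expressions behave on sample text, the following sketch checks them with Python's `re` module, with `re.IGNORECASE` standing in for the `/i` option. The names `SPAM_PATTERNS` and `is_spam` and the sample messages are made up for illustration; the product's own matching may differ in detail.

```python
import re

# The four expressions from this section, with /i expressed as re.IGNORECASE.
SPAM_PATTERNS = [
    re.compile(r"try it for free", re.IGNORECASE),
    re.compile(r"student loans", re.IGNORECASE),
    re.compile(r"you’re already approved", re.IGNORECASE),
    re.compile(
        r"special[\+\-\*=<>\.\,;!\?%&~#§@\^°\$£\{\}()\[\]\|\\_1]offer",
        re.IGNORECASE,
    ),
]

def is_spam(message):
    """Return True if any of the phrases occurs anywhere in the message."""
    return any(p.search(message) for p in SPAM_PATTERNS)

print(is_spam("TRY IT FOR FREE today"))      # True
print(is_spam("A Special*Offer just for you"))  # True
print(is_spam("special offer"))              # False: a space is not in the class
print(is_spam("you're already approved"))    # False: straight apostrophe
```

The last two results show that the character class matches exactly one of the listed separator characters, and that the third expression matches only the typographic apostrophe (’) that it contains.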
Block purposely misspelled words
Random characters are often inserted between the letters of a word to bypass spam-blocking software. The following expressions can help to block those messages:
/^.*v.*i.*a.*g.*r.*o.*$/i
/cr[eéèêë][\+\-\*=<>\.\,;!\?%&§@\^°\$£\{\}()\[\]\|\\_01]dit/i
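
A quick way to judge how aggressive these two expressions are is to run them against sample text. The sketch below does so with Python's `re` module (an approximation of Perl syntax); the variable names and sample strings are illustrative assumptions.

```python
import re

# First expression: the six letters in order, with anything in between.
loose = re.compile(r"^.*v.*i.*a.*g.*r.*o.*$", re.IGNORECASE)

# Second expression: "cr", an e-like letter, one filler character, then "dit".
obfuscated = re.compile(
    r"cr[eéèêë][\+\-\*=<>\.\,;!\?%&§@\^°\$£\{\}()\[\]\|\\_01]dit",
    re.IGNORECASE,
)

print(bool(loose.search("V-i-A-g-R-o")))              # True
print(bool(loose.search("Having a great tomorrow")))  # True: letters occur in order
print(bool(loose.search("Hello world")))              # False

print(bool(obfuscated.search("Get cre*dit now")))     # True
print(bool(obfuscated.search("cré_dit")))             # True
print(bool(obfuscated.search("credit")))              # False: no filler character
```

Because `.*` accepts any run of characters, the first expression also matches ordinary sentences that happen to contain the letters in order, so it may be worth testing against legitimate mail before use.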
Block any word in a phrase
Use the following expression to block any word in a phrase:
/block|any|word/
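
As written, this expression has no `/i` option and no word boundaries, so it is case sensitive and also matches inside longer words. The sketch below illustrates this with Python's `re` module and shows one possible stricter variant built from the `\b` and `/i` entries in the table. The variant, the `(?:...)` non-capturing group, and the variable names are assumptions added for illustration, not part of the original expression.

```python
import re

# The expression exactly as given: case sensitive, matches substrings.
plain = re.compile(r"block|any|word")

# Hypothetical stricter variant: whole words only, any case.
whole_words = re.compile(r"\b(?:block|any|word)\b", re.IGNORECASE)

print(bool(plain.search("many thanks")))           # True: "any" inside "many"
print(bool(plain.search("Block this")))            # False: capital "B"

print(bool(whole_words.search("many thanks")))     # False
print(bool(whole_words.search("password reset")))  # False
print(bool(whole_words.search("Block this")))      # True
```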