Regular Expressions
I can never seem to find a guide that gives an easy explanation of the regular expressions I use on a regular basis. I usually just find bits and pieces, or find the right info but not laid out in a way that’s easy to reference. So here are my own notes on RegEx.
| Character | Usage |
|---|---|
| . | wildcard character - will match any single character (except a newline, unless the s flag is set) |
| \| | or operator - (this\|that) will match either 'this' or 'that' |
| - | range selector - inside brackets, will match a range of characters or numbers, e.g. [1-9] |
| [] | match any single character inside the brackets - [aeiou] would match any one vowel |
| ^ | match the start of a string - ^a will match 'atlas' but not 'mad' |
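| [^] | don't match - negates the characters you put inside, one character at a time: [^cheese] matches any single character other than c, h, e, or s (not the word 'cheese') |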
| $ | match the end of a string - at$ will match 'mat' but not 'saturn' |
| ? | match preceding character 0-1 times - tree? will match 'tre', 'tree', but not 'treee' |
| * | match preceding character 0+ times - same as ? except tree* would also match 'treee' |
| + | match preceding character 1+ times - tree+ will match 'tree' and 'treee', but not 'tre' |
| {#} | match preceding character # times - tree{3} will match 'treeee' |
| {#,#2} | match preceding character at least #, but not more than #2 - tree{3,4} will match 'treeee' or 'treeeee' |
| i | case insensitive flag - add after the closing delimiter, e.g. '#(regex)#i' |
| s | dotall flag - makes the "." character match anything, including newlines |
| x | verbose flag - ignores whitespace in the pattern and allows "#" comments |
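To double-check the table, here is a minimal sketch that runs a few of the rows through preg_match(). The matches() helper is a hypothetical name made up for this example, and the quantifier patterns are wrapped in ^ and $ so the whole string has to fit (without anchors, tree? would still be found inside 'treee').

```php
<?php
// Hypothetical helper (not a built-in): true if $pattern matches anywhere in $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// Anchors: ^ pins the match to the start of the string, $ to the end.
var_dump(matches('#^a#', 'atlas'));    // bool(true)
var_dump(matches('#^a#', 'mad'));      // bool(false)
var_dump(matches('#at$#', 'mat'));     // bool(true)
var_dump(matches('#at$#', 'saturn'));  // bool(false)

// Quantifiers apply only to the last "e" in "tree".
var_dump(matches('#^tree?$#', 'tre'));         // bool(true)
var_dump(matches('#^tree?$#', 'treee'));       // bool(false)
var_dump(matches('#^tree*$#', 'treee'));       // bool(true)
var_dump(matches('#^tree+$#', 'tre'));         // bool(false)
var_dump(matches('#^tree{3}$#', 'treeee'));    // bool(true)
var_dump(matches('#^tree{3,4}$#', 'treeeee')); // bool(true)

// Character classes work one character at a time.
var_dump(matches('#[aeiou]#', 'rhythm'));     // bool(false), no vowel anywhere
var_dump(matches('#^[^cheese]+$#', 'mild'));  // bool(true), no c, h, e or s
var_dump(matches('#^[^cheese]+$#', 'brie'));  // bool(false), contains an "e"

// Flags go after the closing delimiter.
var_dump(matches('#hello#i', 'HeLLo'));  // bool(true)
var_dump(matches('#a.b#', "a\nb"));      // bool(false), "." skips newlines by default
var_dump(matches('#a.b#s', "a\nb"));     // bool(true)

// Verbose flag: "/" is the delimiter here so that "#" is free for comments.
$phone = '/
    ^ [0-9]{3}   # three digits
    - [0-9]{4} $ # a dash, then four digits
/x';
var_dump(matches($phone, '555-1234'));   // bool(true)
```

The verbose example switches to "/" as the delimiter because a "#" inside a '#...#x' pattern would end the pattern early.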
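One piece I use often as a kind of ‘match-all’ is "(.*?)", a lazy capture group that grabs as little as possible. So preg_match_all('#<a>(.*?)</a>#is', $content, $matches); would capture everything in the string that sits between plain <a> and </a> tags. An opening tag with attributes, such as an href, will not match this exact pattern.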
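A small sketch of why the "?" inside "(.*?)" matters. The $content string is made-up sample data; the point is the difference between the lazy and the greedy version of the same pattern.

```php
<?php
// Sample input, invented for this example.
$content = '<p><a>first</a> and <a>second</a></p>';

// Lazy: (.*?) stops at the nearest </a>, so each link is captured separately.
preg_match_all('#<a>(.*?)</a>#is', $content, $matches);
print_r($matches[1]); // 'first' and 'second'

// Greedy: (.*) runs on to the last </a> and swallows everything in between.
preg_match_all('#<a>(.*)</a>#is', $content, $greedy);
print_r($greedy[1]);  // a single entry: 'first</a> and <a>second'
```

With the default ordering, $matches[0] holds the full matches (tags included) and $matches[1] holds what the first set of parentheses captured.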
The delimiter can also be changed. In the previous example it is the "#" pound character, but it could just as easily be "/", as in '/<a>(.*?)<\/a>/is' (note that the "/" inside the pattern then has to be escaped), or whatever character is most convenient for the enclosed string.
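One detail worth checking when switching delimiters: whichever character is the delimiter has to be escaped wherever it appears inside the pattern. The sketch below uses made-up sample data and shows the same pattern with both delimiters, plus preg_quote() as one way to let PHP do the escaping.

```php
<?php
// Sample input, invented for this example.
$content = '<a>home</a>';

// "#" as delimiter: the "/" in "</a>" needs no escaping.
preg_match('#<a>(.*?)</a>#is', $content, $hash);

// "/" as delimiter: the "/" in "</a>" must be written as "\/".
preg_match('/<a>(.*?)<\/a>/is', $content, $slash);

// preg_quote() escapes regex special characters, plus the delimiter
// passed as its second argument, so literal text can be dropped in safely.
$closing = preg_quote('</a>', '/');
preg_match('/<a>(.*?)' . $closing . '/is', $content, $quoted);

var_dump($hash[1], $slash[1], $quoted[1]); // all three are "home"
```

Picking a delimiter that does not occur in the pattern, as "#" does for HTML, keeps the pattern easier to read than a run of backslashes.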