RegularExpression
A RegularExpression is a way of describing search patterns. Letters and the digits 0-9 match themselves (case-sensitively), and "." matches any single character. "*" matches the previous character zero or more times. "\" prevents the next character from having special meaning.

So:
 a*b..e
(any number of a's, followed by a b, any two characters and an e) will match all of:
 aaaaabcde
and:
 bcde
but not all of:
 axbcde
A tool that searches for the pattern anywhere in a line, such as grep(1), will still find the "bcde" inside "axbcde".

Quick cheatsheet

|^__Character__|^__Matches__
| . | Any single character
| ^ | Beginning of line
| $ | End of line
| \''any character'' | Match ''any character'' exactly (even if it's a special character)
| [[''character group''] | Any single character in the group

Things that alter the previous expression

|^__Character__|^__Alteration__
| ? | Match the previous expression zero or one times
| * | Match the previous expression zero or more times
| + | Match the previous expression one or more times

regex(7) explains all the neat things you can do with [RegularExpression]s and the different types. perlre(1) explains perl's extended regexes.

-----

grep(1) is a command to look for a regex in a file. eg:
 grep 'foo' /tmp/baz.txt
will look for the string "foo" in /tmp/baz.txt. More usefully:
 grep 'wlug\.linuxcare\.co\.nz' *
will search for every occurrence of "wlug.linuxcare.co.nz" in all the files in this directory.

-----

sed(1) is a "__s__tream __ed__itor" which uses regexes. sed is usually used for its search and replace capability. For a (simple) example:
 sed 's/foo/baz/g' <a.txt >b.txt
will search for "foo" and replace it with "baz" in a.txt and output the result in b.txt

-----

perl(1) can also be used for in-place substitutions, like so:
 perl -pi -e 's/foo/bar/g' a.txt
will replace all instances of "foo" with "bar" in a.txt

See also:
* perlrun(1)

-----

awk(1) is a tool for processing record-oriented files. It allows you to specify different actions to perform based on regexes.
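As a rough sketch of the awk idea above (one action per regex), the following classifies a few sample strings against the a*b..e pattern from the top of the page. The function name classify and the sample strings are illustrative only, and the anchored form with ^ and $ is an assumption about what "matching the whole line" should mean here.

```shell
# Classify each argument with awk: one action per regular expression.
# "classify" is a hypothetical helper name, not part of any tool.
classify() {
    printf '%s\n' "$@" | awk '
        /^a*b..e$/  { print $0 ": whole line matches"; next }
        /a*b..e/    { print $0 ": matches only part of the line"; next }
                    { print $0 ": no match" }
    '
}

result=$(classify aaaaabcde bcde axbcde abcd)
printf '%s\n' "$result"
# aaaaabcde: whole line matches
# bcde: whole line matches
# axbcde: matches only part of the line
# abcd: no match
```

The first rule is anchored with ^ and $, so it only fires when the entire record fits the pattern; the second rule is unanchored and behaves like a plain grep search.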
-----

See also: File [Glob]s

Tricks and Traps:
* When specifying regexes on the command line, surround them in single quotes (') so that the shell does not interpret special characters such as *, $ and \ before the command sees them.

-----

!!Examples of single-character expressions

To match any lowercase vowel:
 /[[aeiou]/
To match any lowercase or uppercase vowel:
 /[[aeiouAEIOU]/
To match any single digit:
 /[[0123456789]/
The same thing:
 /[[0-9]/
Any single digit or minus:
 /[[0-9\-]/
Any lowercase letter:
 /[[a-z]/

The ^ character can be used to negate a [] pattern:

To match anything __except__ a lowercase letter:
 /[[^a-z]/
To match anything __except__ a lowercase or uppercase letter, digit or underscore:
 /[[^a-zA-Z0-9_]/

These can be used with * too, so:
 /[[0-9]*/
matches any number of digits, including no digits.

!!Character abbreviations:

Note: These apply to perl regular expressions. They are not part of POSIX regular expressions, so support in other tools such as sed and grep varies by implementation, and there may be subtle differences.

To match any digit:
 /[[\d]/ (Equivalent to /[[0-9]/)
To match any 'word' character:
 /[[\w]/ (Equivalent to /[[a-zA-Z0-9_]/)
To match any space character:
 /[[\s]/ (Equivalent to /[[ \r\t\n\f]/)

\D, \W and \S are the negated versions of \d, \w and \s:
 /[[\D]/ is equivalent to /[[^0-9]/
 /[[\W]/ is equivalent to /[[^a-zA-Z0-9_]/
 /[[\S]/ is equivalent to /[[^ \r\t\n\f]/

----

!Perl

As mentioned before, perl uses an extended regular expression syntax explained on the perlre(1) man page. Having said that, here are some hints.

__DON'T__ let a variable be interpreted as a regular expression when you mean its literal contents. If we have
 $text='abc[[c]defc';
 $search='[[c]';
then
 $text =~ s/\Q$search\E/XX/ ;
would replace the substring '[[c]' in $text with "XX", while
 $text =~ s/$search/XX/ ;
would replace the first occurrence of the character 'c' with "XX" (or every occurrence, if the /g modifier is added).
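The same literal-versus-pattern distinction shows up outside Perl. A minimal sketch with sed(1), assuming a POSIX shell and reusing the text from the Perl hint above; the variable names as_pattern and as_literal are only illustrative. sed has no \Q...\E, so the opening bracket is escaped by hand here.

```shell
text='abc[c]defc'

# Unescaped: [c] is a bracket expression, so it matches the single character c.
# Without the g flag only the first c is replaced.
as_pattern=$(printf '%s\n' "$text" | sed 's/[c]/XX/')

# Escaped: \[ is a literal opening bracket; a ] with no open bracket
# expression before it is an ordinary character, so this matches "[c]" itself.
as_literal=$(printf '%s\n' "$text" | sed 's/\[c]/XX/')

printf '%s\n%s\n' "$as_pattern" "$as_literal"
# abXX[c]defc
# abcXXdefc
```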