Differences between version 2 and predecessor to the previous major change of RegularExpression.
Newer page: | version 2 | Last edited on Monday, March 10, 2003 2:19:00 am | by AristotlePagaltzis |
Older page: | version 1 | Last edited on Sunday, March 9, 2003 10:22:04 pm | by JohnMcPherson |
@@ -1 +1,87 @@
-See RegularExpressions for description and examples.
+A RegularExpression is a way of describing search patterns. The letters A-Z and a-z and the digits 0-9 match themselves (case-sensitively), and "." matches any single character. "*" matches the previous character zero or more times. "\" prevents the next character from having its special meaning. So:
+ a*b..e
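+(any number of a's, including none, followed by a b, any two characters and an e)
+will match:
+ aaaaabcde
+and:
+ bbcde
+(a search looks for the pattern anywhere in the line, so here the match is "bcde", starting at the second b)
+
+but not:
+ axbcd
+(no b in it is followed by two characters and then an e)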
+
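The pattern above can be tried out in a few lines of Python. This is only a sketch: Python's re module stands in for whichever regex tool you use, first_match is a helper name invented here, and the test string "abcd" is made up for illustration.

```python
import re

# The example pattern: any number of a's, a b, any two characters, an e.
pattern = re.compile(r"a*b..e")

def first_match(text):
    """Return the first substring the pattern matches, or None."""
    # search() looks for the pattern anywhere in the text, as grep does.
    found = pattern.search(text)
    return found.group(0) if found else None

for text in ["aaaaabcde", "bbcde", "abcd"]:
    print(text, "->", first_match(text))
```

Note that a search does not have to start at the beginning of the text; printing the matched substring shows where the match was actually found.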
+regex(7) explains all the neat things you can do with [RegularExpression]s and the different types of regex. perlre(1) explains Perl's extended regexes.
+-----
+grep(1) is a command to look for a regex in a file, e.g.:
+ grep 'foo' /tmp/baz.txt
+will look for the string "foo" in /tmp/baz.txt. More usefully:
+ grep 'wlug\.linuxcare\.co\.nz' *
+will search for every occurrence of "wlug.linuxcare.co.nz" in all the files in the current directory.
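The backslashes in the second command matter, because an unescaped "." matches any character, not just a dot. A rough sketch in Python (its re module is used as a stand-in for grep; matching_lines and the sample lines are made up for illustration):

```python
import re

escaped = re.compile(r"wlug\.linuxcare\.co\.nz")
unescaped = re.compile(r"wlug.linuxcare.co.nz")

def matching_lines(pattern, lines):
    """Return the lines that contain a match, roughly what grep prints."""
    return [line for line in lines if pattern.search(line)]

# Hypothetical sample input.
lines = [
    "see http://wlug.linuxcare.co.nz/ for details",
    "typo: wlugxlinuxcare-co-nz",
    "nothing relevant here",
]

print(matching_lines(escaped, lines))    # only the first line
print(matching_lines(unescaped, lines))  # the first two lines
```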
+-----
+sed(1) is a "__s__tream __ed__itor" which uses regexes. sed is usually used for its amazing search and replace capability. For a (simple) example:
+ sed 's/foo/baz/g' <a.txt >b.txt
+will read a.txt, replace every occurrence of "foo" with "baz", and write the result to b.txt.
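For comparison, the same search and replace can be sketched with Python's re.sub. This is an illustration rather than a description of how sed works internally; the function names are invented here.

```python
import re

def replace_all(text):
    # Rough equivalent of sed 's/foo/baz/g': replace every match of "foo".
    return re.sub(r"foo", "baz", text)

def replace_first_per_line(text):
    # Without the trailing g, sed replaces only the first match on each line.
    lines = text.split("\n")
    return "\n".join(re.sub(r"foo", "baz", line, count=1) for line in lines)

sample = "foo foo\nfood"
print(replace_all(sample))
print(replace_first_per_line(sample))
```

The pattern is matched as a substring, so "food" becomes "bazd" in both cases.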
+-----
+awk(1) is a tool for processing record-oriented files. It allows you to specify different actions to perform based on regexes.
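The idea of choosing an action by regex can be sketched in Python. This is not awk itself; the record format, the rules, and the process function are all invented for illustration.

```python
import re

# Each rule pairs a pattern with an action, much like awk's /pattern/ { action }.
rules = [
    (re.compile(r"^ERROR"), lambda fields: "error from " + fields[1]),
    (re.compile(r"^WARN"), lambda fields: "warning from " + fields[1]),
]

def process(records):
    """Run the action of every matching rule on each record."""
    output = []
    for record in records:
        fields = record.split()  # whitespace-separated fields
        for pattern, action in rules:
            if pattern.search(record):
                output.append(action(fields))
    return output

print(process(["ERROR disk full", "INFO cron started", "WARN net slow"]))
```

Records that match no rule, such as the INFO line, are simply skipped.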
+-----
+See also: File [Glob]s
+
+Tricks and Traps:
+* When specifying regexes on the command line, surround them with single quotes ('), so that the shell does not interpret characters such as *, \ and $ before the command sees them.
+-----
+!!Examples of single-character expressions
+
+
+To match any lowercase vowel:
+/[[aeiou]/
+
+To match any lowercase or uppercase vowel:
+/[[aeiouAEIOU]/
+
+To match any single digit:
+/[[0123456789]/
+
+The same thing:
+/[[0-9]/
+
+Any single digit or minus:
+/[[0-9\-]/
+
+Any lowercase letter:
+/[[a-z]/
+
+The ^ character, placed first inside the brackets, can be used to negate a [] pattern:
+
+To match anything __except__ a lowercase letter:
+/[[^a-z]/
+
+To match anything __except__ a lowercase or uppercase letter, digit or underscore:
+/[[^a-zA-Z0-9_]/
+
+These can be used with * too, so:
+
+/[[0-9]*/
+
+matches any number of digits, including no digits.
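A few of these bracket expressions can be checked with Python's re module. This is a sketch under two assumptions: that the doubled "[[" in the patterns above is the wiki's way of writing a single literal "[", and that the sample text (made up here) is representative.

```python
import re

sample = "Route 66, exit -9a"  # hypothetical test text

vowels = re.findall(r"[aeiou]", sample)             # lowercase vowels
digits = re.findall(r"[0-9]", sample)               # single digits
digits_or_minus = re.findall(r"[0-9\-]", sample)    # digits or a minus sign
not_lowercase = re.findall(r"[^a-z]", sample)       # anything but a-z

# With *, the class may match zero characters, so the match can be empty.
leading_digits = re.match(r"[0-9]*", "123abc").group(0)
no_digits = re.match(r"[0-9]*", "abc").group(0)

print(vowels, digits, digits_or_minus)
print(repr("".join(not_lowercase)), repr(leading_digits), repr(no_digits))
```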
+
+!!Character abbreviations:
+Note: These apply to Perl regular expressions. Other regex implementations such as sed may support only some of them, or give them slightly different meanings, so check your tool's documentation.
+
+To match any digit:
+/[[\d]/
+(Equivalent to /[[0-9]/)
+
+To match any 'word' character:
+/[[\w]/
+(Equivalent to /[[a-zA-Z0-9_]/)
+
+To match any space character:
+/[[\s]/
+(Equivalent to /[[ \r\t\n\f]/)
+
+\D, \W and \S are the negated versions of \d, \w and \s:
+
+/[[\D]/ is equivalent to /[[^0-9]/
+
+/[[\W]/ is equivalent to /[[^a-zA-Z0-9_]/
+
+/[[\S]/ is equivalent to /[[^ \r\t\n\f]/
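The equivalences above can be spot-checked in Python, with some caveats. Python is not Perl: its \d, \w and \s also match non-ASCII characters unless re.ASCII is given, and its \s additionally matches a vertical tab. The sample text and the same_matches helper are made up for this sketch.

```python
import re

def same_matches(short, longhand, text):
    """True if both patterns find exactly the same characters in text."""
    # re.ASCII keeps \d, \w and \s close to the bracket forms listed above.
    return re.findall(short, text, re.ASCII) == re.findall(longhand, text, re.ASCII)

sample = "x_1 = 42;\tok\n"  # hypothetical text without a vertical tab

pairs = [
    (r"\d", r"[0-9]"),
    (r"\w", r"[a-zA-Z0-9_]"),
    (r"\s", r"[ \r\t\n\f]"),
    (r"\D", r"[^0-9]"),
    (r"\W", r"[^a-zA-Z0-9_]"),
    (r"\S", r"[^ \r\t\n\f]"),
]

for short, longhand in pairs:
    print(short, longhand, same_matches(short, longhand, sample))
```

A check like this only shows that the two forms agree on one piece of text; it does not prove they agree on every input.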