Diff: RegularExpression

Differences between version 2 and predecessor to the previous major change of RegularExpression.

Newer page: version 2 Last edited on Monday, March 10, 2003 2:19:00 am by AristotlePagaltzis
Older page: version 1 Last edited on Sunday, March 9, 2003 10:22:04 pm by JohnMcPherson
@@ -1 +1,87 @@
-See RegularExpressions for description and examples
+A RegularExpression is a way of describing search patterns. The letters A-Z and the numbers 0-9 match themselves (case sensitively), and "." matches any single character. "*" matches the previous character zero or more times. "\" prevents the next character from having its special meaning. So:  
+ a*b..e  
+(any number of a's, followed by a b, two characters and an e)  
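+will match the whole of:  
+ aaaaabcde  
+and:  
+ bcde  
+  
+but not the whole of:  
+ axbcde  
+(an unanchored search, such as grep performs, still finds "bcde" inside it)  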
+  
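The difference between matching a whole string and finding a match somewhere inside it can be sketched in Python. This is only an illustration using Python's `re` module as a stand-in; the helper names `matches_whole` and `find_anywhere` are made up for this example.

```python
import re

# The pattern from the text: any number of a's, a b, two characters, an e.
PATTERN = re.compile(r"a*b..e")

def matches_whole(text):
    """Return True when the entire string matches the pattern."""
    return PATTERN.fullmatch(text) is not None

def find_anywhere(text):
    """Return the first substring that matches, or None if there is none."""
    found = PATTERN.search(text)
    return found.group(0) if found else None

# Whole-string matching: the leading "ax" prevents a match.
whole = {s: matches_whole(s) for s in ("aaaaabcde", "axbcde")}

# Unanchored searching: the pattern is still found inside the string.
inside = find_anywhere("axbcde")

print(whole, inside)
```

Tools such as grep search unanchored by default, so whether "axbcde" counts as a match depends on which of the two behaviours is meant.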
+regex(7) explains what you can do with [RegularExpression]s and describes the different types. perlre(1) explains Perl's extended regexes.  
+-----  
+grep(1) is a command that searches files for lines matching a regex, e.g.:  
+ grep 'foo' /tmp/baz.txt  
+will look for the string "foo" in /tmp/baz.txt. More usefully:  
+ grep 'wlug\.linuxcare\.co\.nz' *  
+will search for every occurrence of "wlug.linuxcare.co.nz" in all the files in this directory. The dots are escaped with "\" so that they match only a literal "." rather than any character.  
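Why the dots are escaped can be seen in a small Python sketch of a grep-like line filter. The function `grep_lines` and the sample lines are invented for illustration and are not how grep itself is implemented.

```python
import re

def grep_lines(pattern, lines):
    """Return the lines that contain a match for pattern, like a minimal grep."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Hypothetical sample input, one string per line of a file.
sample = [
    "visit wlug.linuxcare.co.nz today",
    "visit wlugXlinuxcareYcoZnz today",
    "nothing to see here",
]

# Escaped dots match only a literal "."
escaped = grep_lines(r"wlug\.linuxcare\.co\.nz", sample)

# Unescaped dots match any character, so the second line matches too.
unescaped = grep_lines(r"wlug.linuxcare.co.nz", sample)

print(escaped)
print(unescaped)
```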
+-----  
+sed(1) is a "__s__tream __ed__itor" which uses regexes. sed is usually used for its search and replace capability. A simple example:  
+ sed 's/foo/baz/g' <a.txt >b.txt  
+will replace every occurrence of "foo" in a.txt with "baz" and write the result to b.txt.  
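The effect of the trailing "g" can be approximated in Python with `re.sub`. This is a rough sketch, not a reimplementation of sed; `sed_substitute` and its `global_flag` parameter are hypothetical names chosen for this example.

```python
import re

def sed_substitute(lines, pattern, replacement, global_flag=True):
    """Apply a substitution to each line, roughly like sed's s/pattern/replacement/.

    With global_flag=True every match on a line is replaced (like the "g" flag);
    otherwise only the first match on each line is replaced.
    """
    count = 0 if global_flag else 1  # for re.sub, count=0 means "replace all"
    return [re.sub(pattern, replacement, line, count=count) for line in lines]

# Hypothetical contents of a.txt, one string per line.
lines = ["foo and foo", "no match", "food"]

with_g = sed_substitute(lines, r"foo", "baz")
without_g = sed_substitute(lines, r"foo", "baz", global_flag=False)

print(with_g)
print(without_g)
```

Note that "food" becomes "bazd": the pattern matches anywhere in the line, not only whole words.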
+-----  
+awk(1) is a tool for processing record-oriented files. It allows you to specify different actions to perform on the records that match given regexes.  
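The pattern-and-action idea can be sketched in Python as a list of (regex, action) pairs applied to each record. This only illustrates the concept; `run_rules`, the rules and the records are all invented for the example, and real awk offers much more (fields, variables, BEGIN/END blocks).

```python
import re

def run_rules(records, rules):
    """Apply every rule whose regex matches a record and collect the results."""
    output = []
    for record in records:
        for pattern, action in rules:
            if re.search(pattern, record):
                output.append(action(record))
    return output

# Each rule pairs a regex with the action to perform on matching records.
rules = [
    (r"^#", lambda rec: "comment: " + rec[1:].strip()),
    (r"[0-9]", lambda rec: "has digit: " + rec),
]

# Hypothetical input, one record per line.
records = ["# header", "alpha 1", "beta"]

result = run_rules(records, rules)
print(result)
```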
+-----  
+See also: File [Glob]s  
+  
+Tricks and Traps:  
+* When specifying regexes on the command line, surround them in single quotes "'" so that the shell does not interpret characters such as "*" and "\" itself.  
+-----  
+!!Examples of single-character expressions  
+  
+  
+To match any lowercase vowel:  
+/[[aeiou]/  
+  
+To match any lowercase or uppercase vowel:  
+/[[aeiouAEIOU]/  
+  
+To match any single digit:  
+/[[0123456789]/  
+  
+The same thing:  
+/[[0-9]/  
+  
+Any single digit or minus:  
+/[[0-9\-]/  
+  
+Any lowercase letter:  
+/[[a-z]/  
+  
+The ^ character, placed first inside the brackets, negates a [] pattern:  
+  
+To match anything __except__ a lowercase letter:  
+/[[^a-z]/  
+  
+To match anything __except__ a lowercase or uppercase letter, digit or underscore:  
+/[[^a-zA-Z0-9_]/  
+  
+These can be used with * too, so:  
+  
+/[[0-9]*/  
+  
+matches any number of digits, including no digits.  
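The single-character expressions above can be tried out in Python, whose `re` module accepts the same bracket syntax. The sample strings are made up for illustration.

```python
import re

# Lowercase vowels only: the uppercase "E" is not matched.
lower_vowels = re.findall(r"[aeiou]", "Regular Expression")

# Lowercase or uppercase vowels.
any_vowels = re.findall(r"[aeiouAEIOU]", "Regular Expression")

# Any single digit or a minus sign.
digits_or_minus = re.findall(r"[0-9\-]", "a-1 b2")

# Anything except a lowercase letter.
not_lower = re.findall(r"[^a-z]", "aB3_c")

# "*" allows zero repetitions, so an empty string matches as well.
empty_matches = re.fullmatch(r"[0-9]*", "") is not None

# At the start of a string the pattern takes as many digits as it can.
leading_digits = re.match(r"[0-9]*", "2003-03-10").group(0)
no_digits = re.match(r"[0-9]*", "abc").group(0)

print(lower_vowels, any_vowels, digits_or_minus, not_lower)
print(empty_matches, repr(leading_digits), repr(no_digits))
```

Because zero digits is an acceptable match, a pattern like this never fails on its own; it simply matches the empty string when no digit is present.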
+  
+!!Character abbreviations:  
+Note: These apply to Perl regular expressions. Some other regex parsers support some or all of them, but not every tool does (sed and grep, for example, do not support \d by default), and there may be subtle differences.  
+  
+To match any digit:  
+/[[\d]/  
+(Equivalent to /[[0-9]/)  
+  
+To match any 'word' character:  
+/[[\w]/  
+(Equivalent to /[[a-zA-Z0-9_]/)  
+  
+To match any space character:  
+/[[\s]/  
+(Equivalent to /[[ \r\t\n\f]/)  
+  
+\D, \W and \S are the negated versions of \d, \w and \s:  
+  
+/[[\D]/ is equivalent to /[[^0-9]/  
+  
+/[[\W]/ is equivalent to /[[^a-zA-Z0-9_]/  
+  
+/[[\S]/ is equivalent to /[[^ \r\t\n\f]/
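The equivalences above can be checked informally in Python. Two assumptions apply here: Python is used instead of Perl, and the `re.ASCII` flag is set because Python's \d, \w and \s otherwise also match non-ASCII characters. Python's \s additionally matches a vertical tab, which the sample string deliberately avoids. `SAMPLE` and `same_matches` are names invented for this sketch.

```python
import re

# Hypothetical sample containing letters, digits, underscore, spaces and punctuation.
SAMPLE = "ab_1 2\tX-9\n"

def same_matches(short, long):
    """Return True when both patterns find the same characters in SAMPLE."""
    return re.findall(short, SAMPLE, re.ASCII) == re.findall(long, SAMPLE, re.ASCII)

# Each abbreviation paired with the bracket expression it stands for.
checks = {
    r"\d": r"[0-9]",
    r"\w": r"[a-zA-Z0-9_]",
    r"\s": r"[ \r\t\n\f]",
    r"\D": r"[^0-9]",
    r"\W": r"[^a-zA-Z0-9_]",
    r"\S": r"[^ \r\t\n\f]",
}

results = {short: same_matches(short, long) for short, long in checks.items()}
digits = re.findall(r"\d", SAMPLE, re.ASCII)

print(results)
print(digits)
```

Agreement on one sample string is not a proof of equivalence, only a quick sanity check.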