Differences between version 2 and previous revision of perlretut(1).
Newer page: | version 2 | Last edited on Monday, June 3, 2002 6:50:50 pm | by perry |
Older page: | version 1 | Last edited on Monday, June 3, 2002 6:50:50 pm | by perry |
@@ -320,9 +320,9 @@
# non-word char, followed by a word char
/..rt/; # matches any two chars, followed by 'rt'
/end\./; # matches 'end.'
/end[[.]/; # same thing, matches 'end.'
-Because a period is a metacharacter, it needs to be escaped to match as an ordinary period. Because, for example, \d and \w are sets of characters, it is incorrect to think of [[^\d\w] as [[\D\W]; in fact [[^\d\w] is the same as [[^\w], which is the same as [[\W]. Think DeMorgan's laws.
+Because a period is a metacharacter, it needs to be escaped to match as an ordinary period. Because, for example, \d and \w are sets of characters, it is incorrect to think of [[^\d\w] as [[\D\W]; in fact [[^\d\w] is the same as [[^\w], which is the same as [[\W]. Think !DeMorgan's laws.
An anchor useful in basic regexps is the __word anchor__
\b. This matches a boundary between a word character
@@ -1409,10 +1409,10 @@
Here is the association between some Perl named classes and the traditional Unicode classes:
Perl class name Unicode class name or regular expression
- IsAlpha /^[[LM]/
-IsAlnum /^[[LMN]/
+ !IsAlpha /^[[LM]/
+!IsAlnum /^[[LMN]/
IsASCII $code <= 127
You can also use the official Unicode class names with the \p and \P, like \p{L} for Unicode 'letters', or \p{Lu} for uppercase letters, or \P{Nd} for non-digits. If a name is just one letter, the braces can be dropped. For instance, \pM is the character class of Unicode 'marks'.
@@ -1441,16 +1441,16 @@
\w), and blank (a GNU
extension). If utf8 is being used, then these
classes are defined the same as their corresponding perl
Unicode classes: [[:upper:] is the same as
-\p{IsUpper}, etc. The POSIX character
+\p{!IsUpper}, etc. The POSIX character
classes, however, don't require using utf8. The
[[:digit:], [[:word:], and
[[:space:] correspond to the familiar \d,
\w, and \s character classes. To negate a
POSIX class, put a ^ in front of the
name, so that, e.g., [[:^digit:] corresponds to
-\D and under utf8, \P{IsDigit}. The
+\D and under utf8, \P{!IsDigit}. The
Unicode and POSIX character classes can be
used just like \d, both inside and outside of
character classes:
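A minimal Perl sketch of that last point, written as plain Perl rather than with the wiki's doubled opening bracket. The helper name digit_views and the sample strings are hypothetical, chosen only for illustration, and the samples are ASCII-only so that all three notations for "digit" should agree:

```perl
use strict;
use warnings;

# Hypothetical helper (not from perlretut): reports how several
# notations for "digit" see the same string.
sub digit_views {
    my ($s) = @_;
    return {
        shorthand => ($s =~ /\d/                   ? 1 : 0),  # \d outside a bracketed class
        posix     => ($s =~ /[[:digit:]]/          ? 1 : 0),  # POSIX class, only valid inside [...]
        unicode   => ($s =~ /\p{IsDigit}/          ? 1 : 0),  # Perl/Unicode property
        mixed     => ($s =~ /^[abc[:digit:]xyz]+$/ ? 1 : 0),  # POSIX class mixed with literal chars
        negated   => ($s =~ /^[[:^digit:]]+$/      ? 1 : 0),  # non-digits only, like /^\D+$/
    };
}

# Assumed sample inputs, ASCII only.
for my $s ('a1z', 'abc', '42') {
    my $v = digit_views($s);
    printf "%-4s %s\n", $s, join ' ', map { "$_=$v->{$_}" } sort keys %$v;
}
```

The POSIX form only has meaning inside an enclosing bracketed class, which is why it appears as [[:digit:]] in real Perl code, whereas \d and \p{IsDigit} can stand on their own or sit inside brackets.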