Version 1, including all changes.

Rev | Author | # | Line
1 | perry | 1 | GLOB
!!!GLOB
NAME
DESCRIPTION
WILDCARD MATCHING
PATHNAMES
EMPTY LISTS
NOTES
SEE ALSO
----
!!NAME

glob - Globbing pathnames
!!DESCRIPTION

Long ago, in Unix V6, there was a program ''/etc/glob''
that would expand wildcard patterns. Soon afterwards this
became a shell built-in.

These days there is also a library routine glob(3)
that will perform this function for a user program.

The rules are as follows (POSIX 1003.2, 3.13).
!!WILDCARD MATCHING

A string is a wildcard pattern if it contains one of the
characters `?', `*' or `[['. Globbing is the operation that
expands a wildcard pattern into the list of pathnames
matching the pattern. Matching is defined by:

A `?' (not between brackets) matches any single character.

A `*' (not between brackets) matches any string, including
the empty string.
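
The two rules above can be tried without touching the filesystem, because the shell's ''case'' statement applies the same wildcard matching to plain strings. The following is a minimal sketch; the helper name ''matches'' is made up for this illustration.

```shell
# Hypothetical helper: print "yes" if string $2 matches wildcard pattern $1.
# The unquoted $1 in the case pattern keeps its wildcard meaning.
matches() {
    case $2 in
        $1) echo yes ;;
        *)  echo no ;;
    esac
}

matches 'a?c' abc      # yes: `?' stands for exactly one character
matches 'a?c' ac       # no:  `?' cannot match the empty string
matches 'a*c' ac       # yes: `*' may match the empty string
matches 'a*c' abbbc    # yes: `*' matches any string
```

Note that ''case'' matches strings, not pathnames, so the pathname-specific rules for `/' and a leading `.' do not apply there.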

__Character classes__

An expression `[[...]' where the first character after the
leading `[[' is not an `!' matches a single character, namely
any of the characters enclosed by the brackets. The string
enclosed by the brackets cannot be empty; therefore `]' can
be allowed between the brackets, provided that it is the
first character. (Thus, `[[][[!]' matches the three characters
`[[', `]' and `!'.)
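
As a small illustration of the bracket rules above, the sketch below tests single characters against two bracket expressions with the shell's ''case'' statement. The function names are hypothetical and exist only for this example.

```shell
# Hypothetical helper: is $1 one of the characters a, b, c?
is_abc() {
    case $1 in
        [abc]) echo yes ;;
        *)     echo no ;;
    esac
}

# Hypothetical helper: a `]' placed first inside the brackets is literal,
# so this set contains the three characters ] [ and !
is_bracket_or_bang() {
    case $1 in
        [][!]) echo yes ;;
        *)     echo no ;;
    esac
}

is_abc b                  # yes
is_abc d                  # no
is_bracket_or_bang ']'    # yes
is_bracket_or_bang '['    # yes
is_bracket_or_bang '!'    # yes
is_bracket_or_bang x      # no
```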

__Ranges__

There is one special convention: two characters separated by
`-' denote a range. (Thus, `[[A-Fa-f0-9]' is equivalent to
`[[ABCDEFabcdef0123456789]'.) One may include `-' in its
literal meaning by making it the first or last character
between the brackets. (Thus, `[[]-]' matches just the two
characters `]' and `-', and `[[--/]' matches the three
characters `-', `.', `/'.)
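
The range examples above can be checked with a short shell sketch. It assumes the C locale, set explicitly here so that ranges follow ASCII order; the function names are hypothetical.

```shell
# Assumption: force the C locale so that ranges follow ASCII order.
LC_ALL=C
export LC_ALL

# Hypothetical helper: is $1 a single hexadecimal digit?
is_hex_digit() {
    case $1 in
        [A-Fa-f0-9]) echo yes ;;
        *)           echo no ;;
    esac
}

# Hypothetical helper: the range from `-' to `/' covers - . and / in ASCII.
in_dash_to_slash() {
    case $1 in
        [--/]) echo yes ;;
        *)     echo no ;;
    esac
}

is_hex_digit c        # yes
is_hex_digit 7        # yes
is_hex_digit g        # no
in_dash_to_slash .    # yes
in_dash_to_slash ,    # no: `,' sorts just before `-'
```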

__Complementation__

An expression `[[!...]' matches a single character, namely
any character that is not matched by the expression obtained
by removing the first `!' from it. (Thus, `[[!]a-]' matches
any single character except `]', `a' and `-'.)
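
The complemented expression from the example above can be exercised the same way. This is only a sketch, and the function name is invented for it.

```shell
# Hypothetical helper: "yes" for any single character other than ] a and -
not_excluded() {
    case $1 in
        [!]a-]) echo yes ;;
        *)      echo no ;;
    esac
}

not_excluded b      # yes
not_excluded a      # no
not_excluded ']'    # no
not_excluded -      # no
```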

One can remove the special meaning of `?', `*' and `[[' by
preceding them by a backslash, or, in case this is part of a
shell command line, enclosing them in quotes. Between
brackets these characters stand for themselves. Thus,
`[[[[?*\]' matches the four characters `[[', `?', `*' and
`\'.
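
A brief shell sketch of the quoting rules follows. It covers only the backslash and quote forms outside brackets, plus `?' and `*' standing for themselves inside brackets; the function names are hypothetical.

```shell
# Hypothetical helper: the backslash makes `*' literal, so only the
# three-character string a*b matches.
is_literal_star() {
    case $1 in
        a\*b) echo yes ;;
        *)    echo no ;;
    esac
}

# Hypothetical helper: quoting the pattern has the same effect for `?'.
is_literal_question() {
    case $1 in
        'a?b') echo yes ;;
        *)     echo no ;;
    esac
}

# Hypothetical helper: inside brackets, `?' and `*' are ordinary characters.
is_question_or_star() {
    case $1 in
        [?*]) echo yes ;;
        *)    echo no ;;
    esac
}

is_literal_star 'a*b'        # yes
is_literal_star axb          # no
is_literal_question 'a?b'    # yes
is_literal_question axb      # no
is_question_or_star '?'      # yes
is_question_or_star x        # no
```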
!!PATHNAMES

Globbing is applied on each of the components of a pathname
separately. A `/' in a pathname cannot be matched by a `?'
or `*' wildcard, or by a range like `[[.-0]'. A range cannot
contain an explicit `/' character; this would lead to a
syntax error.
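
To see the per-component rule in action, the sketch below builds a scratch directory and expands two patterns in it. The file names are invented, and ''mktemp -d'' is assumed to be available (it is common but not required by POSIX).

```shell
# Scratch directory; mktemp -d is an assumption, not a POSIX guarantee.
dir=$(mktemp -d) || exit 1
mkdir "$dir/sub"
touch "$dir/top.txt" "$dir/sub/nested.txt"

start=$PWD
cd "$dir" || exit 1

set -- *.txt          # `*' never matches `/': only the top-level file
one_level=$*
set -- */*.txt        # each pathname component is globbed separately
two_levels=$*

echo "$one_level"     # top.txt
echo "$two_levels"    # sub/nested.txt

cd "$start" && rm -r "$dir"
```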

If a filename starts with a `.', this character must be
matched explicitly. (Thus, `rm *' will not remove .profile,
and `tar c *' will not archive all your files; `tar c .' is
better.)
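
The leading-dot rule can be observed with a scratch directory, as in this sketch. The file names are made up, ''mktemp -d'' is assumed to be available, and the shell is assumed to run with its default globbing options.

```shell
# Scratch directory; mktemp -d is an assumption, not a POSIX guarantee.
dir=$(mktemp -d) || exit 1
touch "$dir/.profile" "$dir/notes"

start=$PWD
cd "$dir" || exit 1

set -- *              # a leading `.' is not matched by `*'
visible=$*
set -- .[!.]*         # the dot is matched explicitly; skips `.' and `..'
hidden=$*

echo "$visible"       # notes
echo "$hidden"        # .profile

cd "$start" && rm -r "$dir"
```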
!!EMPTY LISTS

The nice and simple rule given above, `expand a wildcard
pattern into the list of matching pathnames', was the
original Unix definition. It allowed one to have patterns
that expand into an empty list, as in

xv -wait 0 *.gif *.jpg

where perhaps no *.gif files are present (and this is not an error). However, POSIX requires that a wildcard pattern be left unchanged when it is syntactically incorrect, or when the list of matching pathnames is empty. With ''bash'' one can force the classical behaviour by setting ''allow_null_glob_expansion=true'' (bash 2.0 and later use ''shopt -s nullglob'' instead).
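
A script can detect the POSIX behaviour by checking whether the first expanded word names an existing file. The sketch below does that in an empty scratch directory (''mktemp -d'' is assumed to be available) and, when run under ''bash'', also shows the ''nullglob'' option.

```shell
# Empty scratch directory; mktemp -d is an assumption, not a POSIX guarantee.
dir=$(mktemp -d) || exit 1
start=$PWD
cd "$dir" || exit 1

set -- *.gif                     # no such files exist here
unexpanded=$1
if [ -e "$1" ]; then
    echo "found $# file(s)"
else
    echo "no match, pattern left as: $1"
fi

# bash only: nullglob restores the classical empty-list behaviour.
if [ -n "${BASH_VERSION:-}" ]; then
    shopt -s nullglob
    set -- *.gif
    null_count=$#                # 0: the pattern expanded to nothing
    shopt -u nullglob
fi

cd "$start" && rmdir "$dir"
```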

(Similar problems occur elsewhere. E.g., where old scripts have

rm `find . -name "*~"`

new scripts require

rm -f nosuchfile `find . -name "*~"`

to avoid error messages from ''rm'' called with an empty argument list.)
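
One alternative, not taken from the text above, is to let ''find'' invoke ''rm'' itself, so that ''rm'' is never called with an empty argument list and file names containing spaces stay intact. A sketch with invented file names, assuming ''mktemp -d'' is available:

```shell
# Scratch directory; mktemp -d is an assumption, not a POSIX guarantee.
dir=$(mktemp -d) || exit 1
touch "$dir/a~" "$dir/b c~" "$dir/keep"

# find runs rm only when something matched, passing names as arguments.
find "$dir" -name '*~' -exec rm -f {} +
remaining=$(ls "$dir")
echo "$remaining"                # keep

# A second run finds nothing; rm is not invoked and no error is reported.
find "$dir" -name '*~' -exec rm -f {} +
status=$?

rm -r "$dir"
```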
!!NOTES

__Regular expressions__

Note that wildcard patterns are not regular expressions,
although they are a bit similar. First of all, they match
filenames, rather than text, and secondly, the conventions
are not the same: e.g., in a regular expression `*' means
zero or more copies of the preceding thing.
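
The difference in the meaning of `*' shows up when the same three characters are used once as a wildcard pattern and once as a regular expression. The sketch below uses ''case'' for the former and ''grep -x'' for the latter; both function names are hypothetical.

```shell
# Hypothetical helper: treat ab* as a wildcard pattern
# ("ab" followed by any string).
glob_match() {
    case $1 in
        ab*) echo yes ;;
        *)   echo no ;;
    esac
}

# Hypothetical helper: treat ab* as a regular expression that must match
# the whole line ("a" followed by zero or more "b").
regex_match() {
    if printf '%s\n' "$1" | grep -qx 'ab*'; then
        echo yes
    else
        echo no
    fi
}

glob_match abxyz     # yes
regex_match abxyz    # no
glob_match a         # no
regex_match a        # yes
glob_match abbb      # yes
regex_match abbb     # yes
```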

Now that regular expressions have bracket expressions where
the negation is indicated by a `^', POSIX has declared the
effect of a wildcard pattern `[[^...]' to be undefined.

__Character classes and Internationalization__

Of course ranges were originally meant to be ASCII ranges,
so that `[[ -%]' stands for `[[ !"#$%]'. POSIX extended the
bracket notation, both for wildcard patterns and for regular
expressions. Besides (i) the negation and (ii) explicit single
characters, both described above, a bracket expression can contain:

(iii) Ranges X-Y comprise all characters that fall between X
and Y (inclusive) in the current collating sequence as
defined by the LC_COLLATE category in the current locale.

(iv) Named character classes, like

[[:alnum:] [[:alpha:] [[:blank:] [[:cntrl:]
[[:digit:] [[:graph:] [[:lower:] [[:print:]
[[:punct:] [[:space:] [[:upper:] [[:xdigit:]

so that one can say `[[[[:lower:]]' instead of `[[a-z]', and have things work in Denmark, too, where there are three letters past `z' in the alphabet. These character classes are defined by the LC_CTYPE category in the current locale.
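
Named classes can be used in any shell pattern, as in the sketch below. It classifies a single character; the function name is hypothetical, and the results shown assume plain ASCII input.

```shell
# Hypothetical helper: classify one character using named character classes.
classify() {
    case $1 in
        [[:lower:]]) echo lower ;;
        [[:upper:]]) echo upper ;;
        [[:digit:]]) echo digit ;;
        *)           echo other ;;
    esac
}

classify q    # lower
classify Q    # upper
classify 7    # digit
classify %    # other
```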

(v) Collating symbols, like `[[.ch.]' or `[[.a-acute.]', where
the string between `[[.' and `.]' is a collating element
defined for the current locale. Note that this may be a
multi-character element.

(vi) Equivalence class expressions, like `[[=a=]', where the
string between `[[=' and `=]' is any collating element from
its equivalence class, as defined for the current locale.
For example, `[[[[=a=]]' might be equivalent to `[[aáàäâ]'.
!!SEE ALSO

sh(1), glob(3), fnmatch(3), locale(7), regex(7)
----