Blame: procmailsc(5)
Annotated edit history of procmailsc(5) version 1, including all changes. View license author blame.
Rev Author # Line
1 perry 1 PROCMAILSC
2 !!!PROCMAILSC
3 NAME
4 SYNOPSIS
5 DESCRIPTION
6 Weighted regular expression conditions
7 Weighted program conditions
8 Weighted length conditions
9 MISCELLANEOUS
10 EXAMPLES
11 CAVEATS
12 SEE ALSO
13 BUGS
14 MISCELLANEOUS
15 NOTES
16 AUTHORS
17 ----
18 !!NAME
19
20
21 procmailsc - procmail weighted scoring technique
22 !!SYNOPSIS
23
24
25 [[__*__] __w^x condition__
26 !!DESCRIPTION
27
28
29 In addition to the traditional true or false conditions you
30 can specify on a recipe, you can use a weighted scoring
31 technique to decide if a certain recipe matches or not. When
32 weighted scoring is used in a recipe, then the final score
33 for that recipe must be positive for it to
34 match.
35
36
37 A certain condition can contribute to the score if you
38 allocate it a `weight' (__w__) and an `exponent'
39 (__x__). You do this by preceding the condition (on the
40 same line) with:
41
42
43 __* w^x__
44 Both __w__ and __x__ are real numbers between -2147483647.0 and 2147483647.0 inclusive.
45 !!Weighted regular expression conditions
46
47
48 The first time the regular expression is found, it will add
49 ''w'' to the score. The second time it is found,
50 ''w*x'' will be added. The third time it is found,
51 ''w*x*x'' will be added. The fourth time ''w*x*x*x''
52 will be added. And so forth.
53
54
55 This can be described by the following concise
56 formula:
57
58
59 $$w \sum_{k=1}^{n} x^{k-1} = w \, \frac{x^{n} - 1}{x - 1}$$
63 It represents the total added score for this condition if __n__ matches are found.
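The formula can be checked numerically. The following Python sketch is only an illustration: the function names `regex_score` and `closed_form` are hypothetical and not part of procmail, and the clamping at plus or minus 2147483647 is ignored. It adds the per-match contributions one at a time and offers the closed form for comparison.

```python
def regex_score(w, x, n):
    """Sum the contributions w, w*x, w*x*x, ... over n matches."""
    total = 0.0
    contribution = w              # the first match adds w
    for _ in range(n):
        total += contribution
        contribution *= x         # each later match adds x times the previous one
    return total


def closed_form(w, x, n):
    """Closed form w * (x**n - 1) / (x - 1); x == 1 needs its own branch."""
    if x == 1:
        return w * n              # the geometric series degenerates to n equal terms
    return w * (x ** n - 1) / (x - 1)
```

For instance, with w=1000, x=0.75 and three matches, both functions give 1000 + 750 + 562.5 = 2312.5.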
64
65
66 Note that the following case distinctions can be
67 made:
68
69
70 x=0
71
72
73 Only the first match will contribute w to the score. Any
74 subsequent matches are ignored.
75
76
77 x=1
78
79
80 Every match will contribute the same w to the score. The
81 score grows linearly with the number of matches
82 found.
83
84
85 0<x<1
86
87
88 Every match will contribute less to the score than the
89 previous one. The score will asymptotically approach a
90 certain value (see the __NOTES__ section
91 below).
92
93
94 1<x
95
96
97 Every match will contribute more to the score than the
98 previous one. The score will grow
99 exponentially.
100
101
102 x<0
103
104
105 Can be utilised to favour odd or even number of
106 matches.
107
108
109 If the regular expression is negated (i.e., matches if it
110 isn't found), then __n__ obviously can either be zero or
111 one.
112 !!Weighted program conditions
113
114
115 If the program returns an exitcode of EXIT_SUCCESS (=0),
116 then the total added score will be __w__. If it returns
117 any other exitcode (indicating failure), the total added
118 score will be __x__.
119
120
121 If the exitcode of the program is negated, then the
122 exitcode will be considered as if it were a virtual number
123 of matches. Calculation of the added score then proceeds as
124 if it had been a normal regular expression with
125 __n=`exitcode'__ matches.
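The two cases can be summarised in a few lines of Python. This is a minimal sketch of the rules as stated above, not procmail's actual code; the function name `program_score` and its `negated` parameter are hypothetical.

```python
def program_score(w, x, exitcode, negated=False):
    """Score added by a weighted program condition (illustrative only)."""
    if not negated:
        # Plain program condition: success adds w, any failure adds x.
        return w if exitcode == 0 else x
    # Negated exitcode: treat the exitcode as a virtual number of matches
    # and score it like a weighted regular expression condition.
    total = 0.0
    contribution = w
    for _ in range(exitcode):
        total += contribution
        contribution *= x
    return total
```

Under these assumptions, a negated condition with w=2, x=1 and an exitcode of 3 adds 6 to the score, while an exitcode of 0 adds nothing.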
126 !!Weighted length conditions
127
128
129 If the length of the actual mail is __M__
130 then:
131
132
133 * w^x > L
134 will generate an additional score of:
135
136
137 $$w \left( \frac{M}{L} \right)^{x}$$
141 And:
142
143
144 * w^x < L
145 will generate an additional score of:
146
147
148 $$w \left( \frac{L}{M} \right)^{x}$$
152
153
154 In both cases, if L=M, this will add w to the score. In the
155 former case however, larger mails will be favoured, in the
156 latter case, smaller mails will be favoured. Although x can
157 be varied to fine-tune the steepness of the function,
158 typical usage sets x=1.
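To get a feel for how steeply the score changes with the mail size, the two formulas can be evaluated directly. The sketch below is illustrative only: `length_score` and its `favour` parameter are hypothetical names, and the sample values w=-100, x=3, L=2000 are assumptions chosen because they reproduce the -100 and -800 figures quoted in the EXAMPLES section.

```python
def length_score(w, x, mail_length, limit, favour="larger"):
    """Score added by a weighted length condition (illustrative only).

    favour="larger"  evaluates w * (M / L) ** x, which grows with the mail.
    favour="smaller" evaluates w * (L / M) ** x, which shrinks with the mail.
    """
    if favour == "larger":
        ratio = mail_length / limit
    elif favour == "smaller":
        ratio = limit / mail_length   # assumes a non-empty mail (M > 0)
    else:
        raise ValueError("favour must be 'larger' or 'smaller'")
    return w * ratio ** x
```

With those assumed values, `length_score(-100, 3, 2000, 2000)` evaluates to -100 and `length_score(-100, 3, 4000, 2000)` to -800: doubling the size multiplies the penalty by 2 cubed.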
159 !!MISCELLANEOUS
160
161
162 You can query the final score of all the conditions on a
163 recipe from the environment variable __$=__. This
164 variable is set ''every'' time just after procmail has
165 parsed all conditions on a recipe (even if the recipe is not
166 being executed).
167 !!EXAMPLES
168
169
170 The following recipe will ditch all mails having more than
171 150 lines in the body. The first condition contains an empty
172 regular expression which, because it always matches, is used
173 to give our score a negative offset. The second condition
174 then matches every line in the mail, and consumes up the
175 previous negative offset we gave (one point per line). In
176 the end, the score will only be positive if the mail
177 contained more than 150 lines.
178
179
180 :0 Bh
181 * -150^0
182 * 1^1 ^.*$
183 /dev/null
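The arithmetic behind this recipe can be traced with a short Python sketch. It is a simplification under stated assumptions: every body line is taken to count as exactly one match of the second condition, and the names `line_count_score` and `is_ditched` are hypothetical.

```python
def line_count_score(n_lines, offset=-150.0, w=1.0, x=1.0):
    """Score of the recipe above for a body of n_lines lines."""
    score = offset                # empty regexp: always matches once, adds -150
    contribution = w
    for _ in range(n_lines):      # one match of ^.*$ per line
        score += contribution
        contribution *= x         # x = 1, so every line adds the same w
    return score


def is_ditched(n_lines):
    """The recipe matches only if the final score is positive."""
    return line_count_score(n_lines) > 0
```

A mail with exactly 150 body lines ends at a score of 0, which is not positive, so it is kept; the 151st line tips the score to 1 and the mail is delivered to /dev/null.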
184 Suppose you have a priority folder which you always read first. The next recipe picks out the priority mail and files it in this special folder. The first condition is a regular one, i.e., it doesn't contribute to the score but simply has to be satisfied. The other conditions express the following preferences: john and claire usually have something important to say; meetings are usually important; replies are favoured a bit; mails about Elvis (this is merely an example :-) are favoured (the more he is mentioned, the more the mail is favoured, but the maximum extra score due to Elvis will be 4000, no matter how often he is mentioned); lots of quoted lines are disliked; smileys are appreciated (the score for those will reach a maximum of 3500); those three people usually don't send interesting mails; and the mails should preferably be small (e.g., 2000 bytes long mails will score -100, 4000 bytes long mails score -800). As you can see, if one of the uninteresting people sends mail, the mail still has a chance of landing in the priority folder, e.g., if it is about a meeting, or if it contains at least two smileys.
185
186
187 :0 HB
188 * !^Precedence:.*(junk|bulk)
189 * 2000^0 ^From:.*(john@home|claire@work)
190 * 2000^0 ^Subject:.*meeting
191 * 300^0 ^Subject:.*Re:
192 * 1000^.75 elvis|presley
193 * -100^1 ^
194 If you are subscribed to a mailing list and would just like to read the quality mails, then the following recipes could do the trick. First we make sure that the mail is coming from the mailing list. Then we check if it is from certain persons whose opinion we value, or about a subject we absolutely want to know everything about. If it is, file it. Otherwise, check if the ratio of quoted lines to original lines is at most 1:2. If it exceeds that, ditch the mail. Everything that survived the previous test is filed.
195
196
197 :0
198 ^From mailinglist-request@some.where
199 {
200 :0:
201 * ^(From:.*(paula|bill)|Subject:.*skiing)
202 mailinglist
203 :0 Bh
204 * 20^1 ^
205 For further examples you should look in the procmailex(5) man page.
206 !!CAVEATS
207
208
209 Because this speeds up the search by an order of magnitude,
210 the procmail internal egrep will always search for the
211 leftmost ''shortest'' match, unless it is determining
212 what to assign to __MATCH__, in which case it searches for
213 the leftmost ''longest'' match. E.g. for the leftmost
214 ''shortest'' match, by itself, the regular
215 expression:
216
217
218 __.*__
219
220
221 will always match a zero length string at the same
222 spot.
223
224
225 __.+__
226
227
228 will always match one character (except newlines of
229 course).
230 !!SEE ALSO
231
232
233 procmail(1), procmailrc(5),
234 procmailex(5), sh(1), csh(1),
235 egrep(1), grep(1)
236 !!BUGS
237
238
239 If, in a length condition, you specify an __x__ that
240 causes an overflow, procmail is at the mercy of the
241 pow(3) function in your mathematical
242 library.
243
244
245 Floating point numbers in `engineering' format (e.g., 12e5)
246 are not accepted.
247 !!MISCELLANEOUS
248
249
250 As soon as `plus infinity' (2147483647) is reached, any
251 subsequent ''weighted'' conditions will simply be
252 skipped.
253
254
255 As soon as `minus infinity' (-2147483647) is reached, the
256 condition will be considered as `no match' and the recipe
257 will terminate early.
258 !!NOTES
259
260
261 If, in a weighted regular expression condition,
262 __0<x<1__, the total added score for this
263 condition will asymptotically approach:
264
265
266 $$\frac{w}{1 - x}$$
269 In order to reach half the maximum value you need
270
271
272 $$n = \frac{-\ln 2}{\ln x}$$
275 matches.
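Both expressions are easy to evaluate when choosing a weight and exponent. The following Python sketch is an illustration only (the names `asymptote` and `matches_for_half` are hypothetical); it assumes 0<x<1 and rejects other exponents.

```python
import math


def asymptote(w, x):
    """Limit w / (1 - x) of the total added score for 0 < x < 1."""
    if not 0 < x < 1:
        raise ValueError("the score only converges for 0 < x < 1")
    return w / (1 - x)


def matches_for_half(x):
    """Number of matches n = -ln 2 / ln x needed to reach half the limit."""
    if not 0 < x < 1:
        raise ValueError("the score only converges for 0 < x < 1")
    return -math.log(2) / math.log(x)
```

For the `1000^.75` condition from the EXAMPLES section, `asymptote(1000, 0.75)` gives the maximum of 4000 mentioned there, and `matches_for_half(0.75)` is roughly 2.4, so about two and a half mentions already earn half of the maximum.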
276 !!AUTHORS
277
278
279 Stephen R. van den Berg
280
281
282
283 Philip A. Guenther
284
285
286
287 ----