Version 3, including all changes. Author: perry.

UPSMON.CONF
!!!UPSMON.CONF
NAME
DESCRIPTION
CONFIGURATION DIRECTIVES
SEE ALSO
----
!!NAME


upsmon.conf - Configuration for Network UPS Tools upsmon
!!DESCRIPTION


This file's primary job is to define the systems that
upsmon(8) will monitor and to tell it how to shut
down the system when necessary. It will contain passwords,
so keep it secure. Ideally, only the upsmon process should be
able to read it.


Additionally, other optional configuration values can be set
in this file.
!!CONFIGURATION DIRECTIVES


DEADTIME ''seconds''


upsmon allows a UPS to go missing for this many seconds
before declaring it dead.


upsmon requires a UPS to provide status information every
few seconds (see POLLFREQ and POLLFREQALERT) to keep things
updated. If the status fetch fails, the UPS is marked stale.
If it stays stale for more than DEADTIME seconds, the UPS is
marked dead.


A dead UPS that was last known to be on battery is assumed
to have changed to a low battery condition. This may force a
shutdown if it is providing a critical amount of power to
your system. This seems disruptive, but the alternative is
barreling ahead into oblivion and crashing when you run out
of power.


Note: DEADTIME should be a multiple of POLLFREQ and
POLLFREQALERT. Otherwise, you'll have


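As a rough illustration of the DEADTIME note, the following fragment keeps DEADTIME a multiple of both polling intervals. The numbers are only an example chosen for this sketch, not a recommendation.

```conf
# Poll every 5 seconds, both normally and while on battery.
POLLFREQ 5
POLLFREQALERT 5
# 15 is a multiple of both values above, so a UPS is declared
# dead after roughly three missed polls.
DEADTIME 15
```
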
FINALDELAY ''seconds''


When running in master mode, upsmon waits this long after
sending the NOTIFY_SHUTDOWN to warn the users. After the
timer elapses, it then runs your SHUTDOWNCMD. By default
this is set to 5 seconds.


If you need to let your users do something in between those
events, increase this number. Remember, at this point your
UPS battery is almost depleted, so don't make this too
big.


Alternatively, you can set this very low so you don't wait
around when it's time to shut down. Some UPSes don't give
much warning for low battery and will require a value of 0
here for a safe shutdown.


Note: If FINALDELAY on the slave is greater than HOSTSYNC on
the master, the master will give up waiting for the slave to
disconnect.


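A minimal sketch of the two extremes described for FINALDELAY. The values are illustrative, and only one FINALDELAY line would be active in a real file.

```conf
# Give users 30 seconds between the shutdown warning and SHUTDOWNCMD.
FINALDELAY 30

# Alternative for a UPS that gives very little low-battery warning:
# FINALDELAY 0
```
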
HOSTSYNC ''seconds''


upsmon will wait up to this many seconds in master mode for
the slaves to disconnect during a shutdown situation. By
default, this is 15 seconds.


When a UPS goes critical (on battery + low battery, or


This value is also used to keep slave systems from getting
stuck if the master fails to respond in time. After a UPS
becomes critical, the slave will wait up to HOSTSYNC seconds
for the master to set the FSD flag. If that timer expires,
the slave will assume that the master is broken and will
shut down anyway.


This keeps the slaves from shutting down during a
short-lived status change to


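The interaction between HOSTSYNC and FINALDELAY can be sketched as a pair of fragments, one per machine. The values are assumptions, chosen so that the slave's FINALDELAY stays below the master's HOSTSYNC.

```conf
# upsmon.conf on the master: wait up to 15 seconds for slaves
# to disconnect before shutting down.
HOSTSYNC 15

# upsmon.conf on a slave: FINALDELAY is shorter than the master's
# HOSTSYNC, so the slave disconnects before the master gives up.
FINALDELAY 5
HOSTSYNC 15
```
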
MINSUPPLIES ''num''


Set the number of power supplies that must be receiving
power to keep this system running. Normal computers have
just one power supply, so the default value of 1 is
acceptable.


Large/expensive server type systems usually have more, and
can run with a few missing. The HP !NetServer LH4 can run
with 2 out of 4, for example, so you'd set it to 2. The idea
is to keep the box running as long as possible,
right?


Obviously you have to put the redundant supplies on
different UPS circuits for this to make sense! See
big-servers.txt in the docs subdirectory for more
information and ideas on how to use this
feature.


Also see the section on
upsmon(8).


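To make the arithmetic concrete, here is a sketch for a hypothetical server with four power supplies, two fed by each of two UPSes. The UPS names, host and password are invented for the example, and the ''upsname@hostname'' form of the identifier is assumed from common NUT usage.

```conf
# Each UPS feeds two of the four supplies (power value 2 each).
MONITOR ups-a@localhost 2 mypassword master
MONITOR ups-b@localhost 2 mypassword master

# The box keeps running as long as two supplies have power.
MINSUPPLIES 2
```

With these lines, one UPS going critical drops the total power value from 4 to 2, which is still not below MINSUPPLIES, so no shutdown happens. If both go critical the total is 0 and the system shuts down.
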
MONITOR ''system powervalue password type''


Each UPS that you need to monitor should have a MONITOR
line. Not all of these need supply power to the system that
is running upsmon. You may monitor other systems if you want
to be able to send notifications about status changes on
them.


You must have at least one MONITOR directive in this
file.


''system'' is a UPS identifier. It is in this
form:


[[


Some examples:


upsd(8) on port
1234.


To use all of the options together:


upsd(8) on port 5678. Phew!


''powervalue'' is an integer representing the number of
power supplies that the UPS feeds on this system. Most
normal computers have one power supply, and the UPS feeds
it, so this value will be 1. You need a very large or
special system to have anything higher here.


You can set the ''powervalue'' to 0 if you want to
monitor a UPS that doesn't actually supply power to this
system. This is useful when you want to have upsmon do
notifications about status changes on a UPS without shutting
down when it goes critical.


The ''password'' on this line must match the ACCESS
definition in your upsd.conf(5).


The ''type'' refers to the relationship with
upsd(8). It can be either master or slave. See
upsmon(8) for more information
on the meaning of these modes. Remember to grant either
LOGIN or MASTER level to this host in your
upsd.conf(5) depending on which type you choose
here.


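A short sketch of two MONITOR lines, following the field order ''system powervalue password type''. The UPS names, host names and passwords are hypothetical, and the ''upsname@hostname'' identifier form is an assumption based on common NUT usage.

```conf
# This UPS feeds the one power supply in this machine, and this
# upsmon is the master for it.
MONITOR myups@localhost 1 mypassword master

# Watched for notifications only: a power value of 0 means this
# system never shuts down because of this UPS.
MONITOR otherups@otherhost 0 otherpassword slave
```
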
NOCOMMWARNTIME ''seconds''


upsmon will trigger a NOTIFY_NOCOMM after this many seconds
if it can't reach any of the UPS entries in this
configuration file. It keeps warning you until the situation
is fixed. By default this is 300 seconds.


NOTIFYCMD ''command''


upsmon calls this to send messages when things
happen.


This command is called with the full text of the message as
one argument. The environment variable NOTIFYTYPE will contain
the type string of whatever caused this event to
happen.


If you need to use upssched(8), then you must make it
your NOTIFYCMD by listing it here.


Note that this is only called for NOTIFY events that have
EXEC set with NOTIFYFLAG. See NOTIFYFLAG below for more
details.


Making this some sort of shell script might not be a bad
idea. For more information and ideas, see pager.txt in the
docs directory.


Remember, this also needs to be one element in the
configuration file, so if your command has spaces, then wrap
it in quotes.


NOTIFYCMD


This script is run in the background - that is, upsmon forks
before it calls out to start it. This means that your
NOTIFYCMD may have multiple instances running simultaneously
if a lot of stuff happens all at once. Keep this in mind
when designing complicated notifiers.


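The quoting rule and the EXEC requirement might look like this in practice. The script path and its option are hypothetical, so substitute whatever notifier you actually use.

```conf
# Quoted because the command contains a space.
NOTIFYCMD "/usr/local/bin/ups-notify --verbose"

# NOTIFYCMD only runs for event types whose NOTIFYFLAG includes EXEC.
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG ONLINE SYSLOG+EXEC
```

Because upsmon forks before starting the command, a script like this should tolerate several copies of itself running at once.
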
NOTIFYMSG ''type message''


upsmon comes with a set of stock messages for various
events. You can change them if you like.


NOTIFYMSG ONLINE


NOTIFYMSG ONBATT


Note that %s is replaced with the identifier of the UPS in
question.


Possible values for ''type'':


ONLINE - UPS is back online


ONBATT - UPS is on battery


LOWBATT - UPS is on battery and has a low battery (is
critical)


FSD - UPS is being shut down by the master (FSD =
Forced Shutdown)


COMMOK - Communications established with the
UPS


COMMBAD - Communications lost to the UPS


SHUTDOWN - The system is being shut down


REPLBATT - The UPS battery is bad and needs to be
replaced


NOCOMM - A UPS is unavailable (can't be contacted for
monitoring)


The message must be one element in the configuration file,
so if it contains spaces, you must wrap it in
quotes.


NOTIFYMSG NOCOMM


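A few customised messages, as a sketch only. The wording is invented for this example and is not the stock text that upsmon ships with. Each message is quoted because it contains spaces, and %s stands for the UPS identifier.

```conf
NOTIFYMSG ONBATT "UPS %s is running on battery"
NOTIFYMSG ONLINE "UPS %s is back on line power"
NOTIFYMSG REPLBATT "UPS %s needs a new battery"
```
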
NOTIFYFLAG ''type flag''[[+''flag''][[+''flag'']...


By default, upsmon sends walls (global messages to all logged
in users) via /bin/wall and writes to the syslog when things
happen. You can change this.


Examples:


NOTIFYFLAG ONLINE SYSLOG


NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC


Possible values for the flags:


SYSLOG - Write the message to the syslog


WALL - Write the message to all users with
/bin/wall


EXEC - Execute NOTIFYCMD (see above) with the
message


IGNORE - Don't do anything


If you use IGNORE, don't use any other flags on the same
line.


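One possible policy, sketched under the assumption that NOTIFYFLAG accepts the same type names listed for NOTIFYMSG. Which events deserve which flags is a local decision.

```conf
# Stay silent when communications come back.
NOTIFYFLAG COMMOK IGNORE

# Log lost communications without bothering logged-in users.
NOTIFYFLAG COMMBAD SYSLOG

# A critical UPS gets the full treatment, including NOTIFYCMD.
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
```
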
POLLFREQ ''seconds''


Normally upsmon polls the upsd(8) server every 5
seconds. If this is flooding your network with activity, you
can make it higher. You can also make it lower to get faster
updates in some cases.


There are some catches. First, if you set the POLLFREQ too
high, you may miss short-lived power events entirely. You
also risk triggering the DEADTIME (see above) if you use a
very large number.


Second, there is a point of diminishing returns if you set
it too low. While upsd normally has all of the data
available to it instantly, most drivers only refresh the UPS
status once every 2 seconds. Polling more often than that
usually doesn't get you the information any
faster.


POLLFREQALERT ''seconds''


This is the interval that upsmon waits between polls if any
of its UPSes are on battery. You can use this along with
POLLFREQ above to slow down polls during normal behavior,
but get quicker updates when something bad
happens.


This should always be equal to or lower than the POLLFREQ
value. By default it is also set to 5 seconds.


The catches about too-high and too-low POLLFREQ values
also apply here.


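A sketch of slow polling in normal operation with faster polling on battery. The numbers are illustrative. DEADTIME is included because it should remain a multiple of both intervals.

```conf
# Quiet network during normal operation.
POLLFREQ 30
# Faster updates while any UPS is on battery; must not exceed POLLFREQ.
POLLFREQALERT 5
# 90 is a multiple of both 30 and 5.
DEADTIME 90
```
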
POWERDOWNFLAG ''filename''


upsmon creates this file when running in master mode when
the UPS needs to be powered off. You should check for this
file in your shutdown scripts and call the shutdown sequence
(-k) in your UPS model driver if it exists.


This is done to forcibly reset the slaves, so they don't get
stuck at the


See the shutdown.txt file in the docs subdirectory for more
information.


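For example, with a hypothetical flag file location (any path your shutdown scripts can still read late in the shutdown sequence should do):

```conf
# upsmon creates this file on the master when the UPS must be
# powered off; the shutdown scripts test for its existence.
POWERDOWNFLAG /etc/killpower
```
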
RBWARNTIME ''seconds''


When a UPS says that it needs to have its battery replaced,
upsmon will generate a NOTIFY_REPLBATT event. By default
this happens every 43200 seconds (12 hours).


If you need another value, set it here.


RUN_AS_USER ''username''


upsmon normally runs the bulk of the monitoring duties under
another user ID after dropping root privileges. On most
systems this means it runs as


The catch is that


The solution is to create a new user just for upsmon, then
make it run as that user. I suggest


Then, tell upsmon to run as that user, and make upsmon.conf
readable by it. Your reloads will work, and your config file
will stay secure.


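A minimal sketch, assuming a dedicated account has already been created. The account name used here is a hypothetical placeholder.

```conf
# Run the monitoring part of upsmon as a dedicated, unprivileged
# user. upsmon.conf must be readable by this account.
RUN_AS_USER nutmon
```
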
SHUTDOWNCMD ''command''


upsmon runs this command when the system needs to be brought
down. If it is a slave, it will do that immediately whenever
the current overall power value drops below the MINSUPPLIES
value above.


When upsmon is a master, it will allow any slaves to log out
before starting the local shutdown procedure.


Note that the command needs to be one element in the config
file. If your shutdown command includes spaces, then put it
in quotes to keep it together, for example:


SHUTDOWNCMD
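

As an illustration of the quoting rule, a complete line might look like the following. It assumes your system's shutdown program lives at /sbin/shutdown and accepts these options, so adjust it to whatever halts your machine.

```conf
# Quoted so the command and its arguments stay one element.
SHUTDOWNCMD "/sbin/shutdown -h +0"
```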
!!SEE ALSO


upsmon(8), upsd(8),
nutupsdrv(8).


__Internet resources:__


The NUT (Network UPS Tools) home page:
http://www.exploits.org/nut/


NUT mailing list archives and information:
http://lists.exploits.org/
----