Blame: ProcessNotes
Annotated edit history of ProcessNotes version 22, including all changes.
Rev Author # Line
16 JohnMcPherson 1 For a good introduction to processes, have a look at the slides on our UnixTutorials page.
2
15 AristotlePagaltzis 3 !! Useful Process Related utilities
4
22 CraigBox 5 ; %%% fuser(1) :Tells you which processes are using a resource, and optionally sends them a [Signal]
6 ; %%% kill(1) :Send a [Signal] to a process by ProcessID
7 ; %%% killall(1) :Send a [Signal] to a process by name
8 ; %%% killall5(8) :Send a [Signal] to all running processes
9 ; %%% lsof(8) :Similar to fuser(1)
10 ; %%% nice(1) :Run a program with modified scheduling priority
11 ; %%% pgrep(1), pkill(1) :Look up or signal processes based on name and other attributes
12 ; %%% pidof(8) :List pid(s) of process(es) by name
13 ; %%% ps(1) :Display process status
14 ; %%% pstree(1) :Display processes as a tree
15 ; %%% top(1) :Display processes sorted by certain criteria (default: [CPU] load)
16 ; %%% vmstat(8) :Show VirtualMemory statistics
15 AristotlePagaltzis 17
18 ----
19
20 !! top
21
22 The 'TIME' column in top is the amount of CPU time the program has spent running, not to be confused with the amount of time since the program was started. For example, a program started a month ago may have only run for 1 minute in total, so its TIME column will only show 1 minute of running time.
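
The difference between the two is easy to see side by side. A minimal sketch, assuming a procps-style ps(1) that understands the etime and time output columns; the helper name show_times is made up for this example:

```shell
# Hypothetical helper: print wall-clock age (ELAPSED) next to the
# CPU time actually consumed (TIME) for one process.
show_times() {
    ps -o pid,etime,time,comm -p "$1"
}

# Inspect the current shell.
show_times "$$"
```

For a long-lived but mostly idle process, ELAPSED keeps growing while TIME barely moves.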
23
24 RSS is the ResidentSetSize, the amount of memory that the program has which is actually *in* memory (not swapped out). Note that this also covers memory which is shared between programs and threads. Mozilla for instance shows as using about 20M in 5 processes, but this doesn't mean it is using 100M in total, it means it's using about 20M in total, shared between 5 processes :)
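
On Linux the same figure can be read straight from /proc. This is only a sketch: it assumes a Linux /proc filesystem, and rss_kb is a name invented for the example:

```shell
# Hypothetical helper: print the resident set size of a process in kB,
# taken from the VmRSS line of /proc/PID/status (Linux only).
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Resident memory of the current shell.
rss_kb "$$"
```

Because shared pages are counted in every process that maps them, adding up these numbers across related processes overstates the real total.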
25
26 Someone was searching for "WCHAN", so here's a definition: when a process is 'sleeping in the kernel' (in the S state), WCHAN is the function inside the kernel it is sleeping in. For instance, init(8) (at least on my machine) is usually blocked inside "select" from select(2).
27
28 top(1)'s summary output:
29 * The top line has the uptime, the number of users logged in (according to utmp(5)) and the LoadAverage (according to uptime(1))
30 * The next line has the number of processes, then a break down of sleeping processes (processes blocked waiting for an event), the number of running processes, the number of [ZombieProcess]es (processes that haven't been cleaned up by their parent process) and the number of stopped processes (processes that are stopped by SIGSTOP)
17 JohnMcPherson 31 * The next line has the CPU states: the amount of CPU used in userspace, the amount used in the system (kernel and device drivers), the amount used by nice processes (processes that have a lower than normal priority) and the amount of CPU time that is idle (spent with the CPU halted). In more recent versions (such as "procps version 3.2.0"), this line gives a summary of all CPUs, for __us__er, __sy__stem, __ni__ce, __id__le, __wa__iting on I/O, __h__ard __I__RQ, and __s__oft __I__RQ. (For this version of top, pressing "1" toggles between one summary for all CPUs and a summary line for each CPU.)
15 AristotlePagaltzis 32 * Then the memory breakdowns:
22 CraigBox 33 * Total amount of physical memory that the kernel knows about
34 * The amount of physical memory that is in use
35 * The amount of physical memory that is not in use (wasted)
36 * The amount of physical memory used for buffers (eg: networking)
15 AristotlePagaltzis 37 * The swap breakdown:
22 CraigBox 38 * Total amount of swap space
39 * Total amount of swap used
40 * Total amount of swap free
41 * Total amount of physical memory being used as disk cache.
15 AristotlePagaltzis 42
43 After this comes the list of processes.
44
45 Some hints:
46 * If you're doing a lot of I/O (or especially older IDE I/O) you will probably see your "System %" increase. This means that your CPU is spending its time talking to the hardware, and perhaps not spending much time doing whatever you want it to. If your system % is high you should perhaps consider upgrading your hardware.
47 * If the number of zombies is high then you possibly have a poorly written program that doesn't clean up zombies. Use pstree(1) to get an idea which process is not cleaning up its children.
48 * See LoadAverage about the Load average and related issues.
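
To go with the zombie hint above, here is a rough sketch that lists zombies together with the parent that has not reaped them. It assumes a procps-style ps(1); list_zombies is a made-up helper name:

```shell
# Hypothetical helper: print every zombie process with its parent PID.
# A state starting with "Z" marks a zombie.
list_zombies() {
    ps -eo pid=,ppid=,stat=,comm= |
        awk '$3 ~ /^Z/ { print "zombie", $1, "parent", $2, $4 }'
}

list_zombies
```

The parent PID is the interesting part, since the parent is the program that needs fixing or restarting.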
49
50 The various states of a process can be:
22 CraigBox 51 <?plugin OldStyleTable
15 AristotlePagaltzis 52 |^ __State__ |^ __Meaning__
53 | S | Sleeping
54 | W | Swapped out
55 | R | Running
56 | D | Blocked in a device driver in the kernel. Unkillable.
57 | < | Process is running with a high priority (nice level <0)
22 CraigBox 58 ?>
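
For a quick overview of how many processes are in each state, something like the following can help. It is a sketch that assumes a Linux /proc filesystem, where each /proc/PID/status file has a State: line; count_states is an invented name:

```shell
# Hypothetical helper: tally processes by state letter (R, S, D, Z, ...).
# Processes that exit while we read them are silently skipped.
count_states() {
    awk '/^State:/ { print $2 }' /proc/[0-9]*/status 2>/dev/null |
        sort | uniq -c
}

count_states
```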
15 AristotlePagaltzis 59
60 ----
61
62 !! nice
63
64 nice(1) lets you make programs "nicer" (ie: have less access to the [CPU] in proportion to other processes). Nice values in Linux range between -20 and +19. The default nice(1) level is "0". Only the root user can lower their niceness level. A higher nice level means a lower priority. A process running at -20 gets the highest priority the normal scheduler offers, but it is not "RealTime" and can still be preempted; real-time scheduling is a separate mechanism.
65
22 CraigBox 66 <pre>
15 AristotlePagaltzis 67 nice -n ''nicelevelchange'' ''program''
22 CraigBox 68 </pre>
15 AristotlePagaltzis 69
70 eg:
71
22 CraigBox 72 <pre>
15 AristotlePagaltzis 73 nice -n 1 ./program OR nice -1 ./program
22 CraigBox 74 </pre>
15 AristotlePagaltzis 75
76 will run ./program with one level higher niceness (ie: *lower* priority compared to other processes).
77
22 CraigBox 78 <pre>
15 AristotlePagaltzis 79 nice --5 ./program
22 CraigBox 80 </pre>
15 AristotlePagaltzis 81
82 will run a process with lower niceness (ie *higher* priority) of negative 5. (Only the root user can do this).
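
A quick way to check what niceness a command actually ends up with: this sketch assumes GNU coreutils nice(1), which prints the current niceness when run without a command.

```shell
# Niceness of the current shell (usually 0).
nice

# Niceness seen by a child started 5 levels nicer: the outer nice
# raises the level, the inner nice just prints it.
nice -n 5 nice
```

The adjustment is relative, so the second command prints the first value plus 5, capped at 19.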
83
84 ----
85
86 !! ps
87
88 If you want to grep for a running process (eg foo) use:
89
22 CraigBox 90 <pre>
91 ps ax | grep ~[f]oo
92 </pre>
15 AristotlePagaltzis 93
94 not:
95
22 CraigBox 96 <pre>
15 AristotlePagaltzis 97 ps ax | grep foo
22 CraigBox 98 </pre>
15 AristotlePagaltzis 99
100 The reason for this is that the latter example will also find the __grep foo__ itself in the process list, while the first one won't.
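
pgrep(1), listed at the top of this page, avoids the problem altogether because it never reports itself. A small sketch, assuming procps pgrep; find_by_name is a hypothetical wrapper and the sleep process is only there to have something to find:

```shell
# Hypothetical helper: print the PIDs of processes whose name matches exactly.
find_by_name() {
    pgrep -x "$1"
}

# Start a throwaway process, give it a moment to exec, then look it up.
sleep 30 &
target=$!
sleep 1
find_by_name sleep
kill "$target"
```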
101
102 The most useful ps(1) command is probably
103
22 CraigBox 104 <pre>
15 AristotlePagaltzis 105 ps auxww
22 CraigBox 106 </pre>
15 AristotlePagaltzis 107
108 This gives a lot more information about each process than you get by default.
109
110 Here is a poor man's Linux-only ps replacement (for when ps(1) just doesn't work
111 or can't be relied upon):
112
22 CraigBox 113 <pre>
15 AristotlePagaltzis 114 #!/bin/bash
22 CraigBox 115 cd /proc && for p in ~[0-9]* ; do
15 AristotlePagaltzis 116 echo -ne "$p\0"
117 tr '\0' ' ' < $p/cmdline
118 echo -ne '\0'
119 done | xargs -r0t printf ' %5g %s\n' | sort -ns
22 CraigBox 120 </pre>
15 AristotlePagaltzis 121
122 ----
123
124 !! Miscellaneous
125
126 If you want to unmount a filesystem but it's in use you can use
127
22 CraigBox 128 <pre>
15 AristotlePagaltzis 129 ps -auxwwe |grep ''mountpoint''
130 lsof | grep ''mountpoint''
131 fuser -vm ''mountpoint''
22 CraigBox 132 </pre>
15 AristotlePagaltzis 133
134 lsof(8) stands for __l__i__s__t of __o__pen __f__iles.
135
136 PerryLorier suggests
137
22 CraigBox 138 <pre>
15 AristotlePagaltzis 139 fuser -k -v -m /mnt/nfs
22 CraigBox 140 </pre>
15 AristotlePagaltzis 141
142 to kill all processes using that mount point.
143
144 If your program says "[Signal] 11", "SegmentationFault", or similar, you can
145 retrieve information about the process when it crashed. First remove the limit
146 on dumping core files (so it will dump core this time around):
147
22 CraigBox 148 <pre>
15 AristotlePagaltzis 149 ulimit -c unlimited
22 CraigBox 150 </pre>
15 AristotlePagaltzis 151
152 Then run the program again. (See builtins(1) and ulimit(3) for more information
153 about this.) This time when it [SegmentationFault]s it will leave a file called
154 "core" which contains the state of the program when it died. This file can be
155 inspected by
156
22 CraigBox 157 <pre>
15 AristotlePagaltzis 158 gdb ''programname'' ''corefilename''
22 CraigBox 159 </pre>
15 AristotlePagaltzis 160
161 To find out where it crashed try __bt full__ at the prompt. You can also print
162 variables to find out what they currently hold, for example __print argc__ will
163 tell you the contents of argc. Of course, __quit__ or [[Ctrl][[d] will exit
164 gdb. For more information about the [GNU] debugger see gdb(1). For more
165 information about this procedure see DeBugging.
18 PerryLorier 166
167 !! Help, I'm running out of file handles, what's using them all?
22 CraigBox 168 <pre>
18 PerryLorier 169 lsof | awk '{print $2}' | uniq -c | sort -n -k 2 | join -1 2 -2 1 - <(ps ax -o pid,command | sort -n) | sort -n -k 2
22 CraigBox 170 </pre>
18 PerryLorier 171 This one-liner lists processes in the form "pid,number of open files,command" with the process using the most files at the end.
172 You can use this to determine which process on your system is leaking file handles, then use strace(1) to figure out why.
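
Once you have a suspect, you can watch its descriptor count directly. This is a sketch assuming a Linux /proc filesystem; count_fds is a made-up helper, and you need to own the process (or be root) to read its fd directory:

```shell
# Hypothetical helper: count the open file descriptors of one process
# by listing /proc/PID/fd (Linux only).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Open descriptors of the current shell.
count_fds "$$"
```

Running it repeatedly against the same PID shows whether the count keeps climbing, which is the usual sign of a leak.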
19 CraigBox 173
22 CraigBox 174 Read about [Zombie processes|Zombie].
20 CraigBox 175
176 !! Saving processes to disk (software suspend)
177
178 Hate having processes die because of kernel upgrades? The thought of losing your irssi scrollback just too much for you?
179
180 Get [Cryopid|http://cryopid.berlios.de/]. Compile it (I needed zlib1g-dev on my Ubuntu machine, which already had build-essential installed) and then run 'freeze' to save processes to disk. You start the process again by executing the file that it saves; there is no 'thaw' utility.
181
182 It can't save a screen session, but you can save the processes inside them and rescreen.
18 PerryLorier 183
184 ----
185 CategoryNotes