PRIO
!!!PRIO
NAME
SYNOPSIS
DESCRIPTION
ALGORITHM
CLASSIFICATION
QDISC PARAMETERS
CLASSES
BUGS
AUTHORS
----
!!NAME

PRIO - Priority qdisc
!!SYNOPSIS

__tc qdisc ... dev__ dev __( parent__ classid __| root) [[ handle__ major: __] prio [[ bands__ bands __] [[ priomap__ band,band,band... __] [[ estimator__ interval timeconstant __]__
!!DESCRIPTION

The PRIO qdisc is a simple classful queueing discipline that contains an arbitrary number of classes of differing priority. The classes are dequeued in descending order of priority, so the lowest-numbered band is always tried first. PRIO is a scheduler and never delays packets. It is a work-conserving qdisc, though the qdiscs contained in the classes may not be.

It is very useful for lowering latency when there is no need to slow down traffic.
!!ALGORITHM

On creation with 'tc qdisc add', a fixed number of bands is created. Each band is a class, but it is not possible to add classes with 'tc class add'. The number of bands to be created must instead be specified on the command line that attaches PRIO to its root.

When dequeueing, band 0 is tried first, and only if it does not deliver a packet does PRIO try band 1, and so on. Maximum reliability packets should therefore go to band 0, minimum delay to band 1 and the rest to band 2.

As the PRIO qdisc itself will have minor number 0, band 0 is actually major:1, band 1 is major:2, etc. For major, substitute the major number assigned to the qdisc on 'tc qdisc add' with the __handle__ parameter.
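
The dequeue order and the band-to-class numbering described above can be illustrated with a small Python model. This is only a sketch: the class name PrioSketch and its methods are hypothetical, packets are plain strings, and nothing here touches the kernel or the tc tool.

```python
from collections import deque
from typing import Deque, List, Optional


class PrioSketch:
    """Toy model of strict-priority dequeueing over a fixed number of bands."""

    def __init__(self, major: int, bands: int = 3) -> None:
        self.major = major
        # One FIFO per band; the number of bands is fixed at creation time.
        self.queues: List[Deque[str]] = [deque() for _ in range(bands)]

    def classid(self, band: int) -> str:
        # The qdisc itself is major:0, so band N is class major:(N+1).
        return f"{self.major:x}:{band + 1:x}"

    def enqueue(self, band: int, packet: str) -> None:
        self.queues[band].append(packet)

    def dequeue(self) -> Optional[str]:
        # Band 0 is tried first; a later band is tried only if all
        # earlier bands have nothing to deliver.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


q = PrioSketch(major=1)
q.enqueue(2, "bulk")
q.enqueue(0, "interactive")
print(q.classid(0))  # 1:1
print(q.dequeue())   # interactive
print(q.dequeue())   # bulk
```

Although "bulk" was queued first, it leaves last, because band 0 is always drained before band 2 is considered.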
!!CLASSIFICATION

Three methods are available to PRIO to determine in which band a packet will be enqueued.

From userspace

A process with sufficient privileges can encode the destination class directly with SO_PRIORITY, see __socket(7)__.

With a tc filter

A tc filter attached to the root qdisc can point traffic directly to a class.

With the priomap

Based on the packet priority, which in turn is derived from the Type of Service assigned to the packet.

Only the priomap is specific to this qdisc.
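
As a rough illustration of the userspace method, the following Python sketch sets SO_PRIORITY on a socket. It assumes a Linux system where Python's socket module exposes SO_PRIORITY; the helper name set_priority is hypothetical. According to socket(7), values 0 to 6 can be set without extra privileges. How a particular value selects a class directly depends on the qdisc handle, which this sketch does not attempt to cover.

```python
import socket
from typing import Optional


def set_priority(sock: socket.socket, priority: int) -> Optional[int]:
    """Set SO_PRIORITY on sock and return the value the kernel reports back.

    Returns None on platforms where SO_PRIORITY is not available.
    """
    opt = getattr(socket, "SO_PRIORITY", None)
    if opt is None:
        return None
    sock.setsockopt(socket.SOL_SOCKET, opt, priority)
    return sock.getsockopt(socket.SOL_SOCKET, opt)
```

A caller would create a socket, call set_priority(sock, 6) before sending, and then let the qdisc decide the band for each outgoing packet.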
!!QDISC PARAMETERS

bands

Number of bands. If changed from the default of 3, __priomap__ must be updated as well.

priomap

The priomap maps the priority of a packet to a class. The priority can either be set directly from userspace, or be derived from the Type of Service of the packet.

It determines how packet priorities, as assigned by the kernel, map to bands. Mapping occurs based on the TOS octet of the packet, which looks like this:

  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
|           |               |   |
|PRECEDENCE |      TOS      |MBZ|
|           |               |   |
+---+---+---+---+---+---+---+---+

The four TOS bits (the 'TOS field') are defined as:

Binary  Decimal  Meaning
-----------------------------------------
1000    8        Minimize delay (md)
0100    4        Maximize throughput (mt)
0010    2        Maximize reliability (mr)
0001    1        Minimize monetary cost (mmc)
0000    0        Normal Service

As there is 1 bit to the right of these four bits, the actual value of the TOS field is double the value of the TOS bits. Running 'tcpdump -v -v' shows you the value of the entire TOS field, not just the four bits. It is the value you see in the first column of this table:

TOS     Bits  Means                    Linux Priority    Band
------------------------------------------------------------
0x0     0     Normal Service           0 Best Effort     1
0x2     1     Minimize Monetary Cost   1 Filler          2
0x4     2     Maximize Reliability     0 Best Effort     1
0x6     3     mmc+mr                   0 Best Effort     1
0x8     4     Maximize Throughput      2 Bulk            2
0xa     5     mmc+mt                   2 Bulk            2
0xc     6     mr+mt                    2 Bulk            2
0xe     7     mmc+mr+mt                2 Bulk            2
0x10    8     Minimize Delay           6 Interactive     0
0x12    9     mmc+md                   6 Interactive     0
0x14    10    mr+md                    6 Interactive     0
0x16    11    mmc+mr+md                6 Interactive     0
0x18    12    mt+md                    4 Int. Bulk       1
0x1a    13    mmc+mt+md                4 Int. Bulk       1
0x1c    14    mr+mt+md                 4 Int. Bulk       1
0x1e    15    mmc+mr+mt+md             4 Int. Bulk       1

The second column contains the value of the relevant four TOS bits, followed by their translated meaning. For example, 15 stands for a packet wanting Minimal Monetary Cost, Maximum Reliability, Maximum Throughput AND Minimum Delay.
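
The relationship between the TOS octet, the four TOS bits and their abbreviations can be checked with a short Python sketch. The function name decode_tos and the returned tuple layout are illustrative choices, not part of any tool. Bit 0 in the diagram is taken to be the most significant bit of the octet.

```python
from typing import Tuple

# Abbreviations for the four TOS bits, in the order used by the table.
TOS_FLAGS = [(1, "mmc"), (2, "mr"), (4, "mt"), (8, "md")]


def decode_tos(tos_octet: int) -> Tuple[int, int, str]:
    """Split a TOS octet into (precedence, tos_bits, meaning)."""
    precedence = (tos_octet >> 5) & 0x7  # the three leftmost bits
    tos_bits = (tos_octet >> 1) & 0xF    # the four TOS bits
    names = [name for bit, name in TOS_FLAGS if tos_bits & bit]
    meaning = "+".join(names) if names else "Normal Service"
    return precedence, tos_bits, meaning


print(decode_tos(0x10))  # (0, 8, 'md')
print(decode_tos(0x1e))  # (0, 15, 'mmc+mr+mt+md')
```

The shift by one bit is what makes the field value in the first column double the value in the second.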

The fourth column lists the way the Linux kernel interprets the TOS bits, by showing to which Priority they are mapped.

The last column shows the result of the default priomap. On the command line, the default priomap looks like this:

1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1

This means that priority 4, for example, gets mapped to band number 1. The priomap also allows you to list higher priorities (> 7) which do not correspond to TOS mappings, but which are set by other means.
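
The priomap lookup can be reproduced with a few lines of Python. This is a sketch built only from the numbers in the table and in the default priomap. The names DEFAULT_PRIOMAP, TOS_BITS_TO_PRIORITY, band_for_priority and band_for_tos are hypothetical, and the range check is a choice made for this example rather than a statement about kernel behaviour.

```python
from typing import Sequence

# The default priomap: index is the Linux priority, value is the band.
DEFAULT_PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

# Linux priority for each value of the four TOS bits (0..15),
# copied from the "Linux Priority" column of the table.
TOS_BITS_TO_PRIORITY = [0, 1, 0, 0, 2, 2, 2, 2, 6, 6, 6, 6, 4, 4, 4, 4]


def band_for_priority(priority: int,
                      priomap: Sequence[int] = DEFAULT_PRIOMAP) -> int:
    """Look up the band for a packet priority in the priomap."""
    if not 0 <= priority < len(priomap):
        raise ValueError("priority outside the priomap")
    return priomap[priority]


def band_for_tos(tos_octet: int,
                 priomap: Sequence[int] = DEFAULT_PRIOMAP) -> int:
    """Derive the band from a TOS octet via the Linux priority."""
    tos_bits = (tos_octet >> 1) & 0xF
    return band_for_priority(TOS_BITS_TO_PRIORITY[tos_bits], priomap)


print(band_for_priority(4))  # 1
print(band_for_tos(0x10))    # 0
```

Passing a different sequence as priomap shows how a custom mapping would change the band chosen for the same TOS value.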

This table from RFC 1349 (read it for more details) explains how applications might very well set their TOS bits:

TELNET                   1000           (minimize delay)
FTP
        Control          1000           (minimize delay)
        Data             0100           (maximize throughput)
TFTP                     1000           (minimize delay)
SMTP
        Command phase    1000           (minimize delay)
        DATA phase       0100           (maximize throughput)
Domain Name Service
        UDP Query        1000           (minimize delay)
        TCP Query        0000
        Zone Transfer    0100           (maximize throughput)
NNTP                     0001           (minimize monetary cost)
ICMP
        Errors           0000
        Requests         0000 (mostly)
        Responses        <same as request> (mostly)
!!CLASSES

PRIO classes cannot be configured further - they are automatically created when the PRIO qdisc is attached. Each class, however, can contain yet a further qdisc.
!!BUGS

Large amounts of traffic in the lower-numbered bands can cause starvation of the higher-numbered bands. This can be prevented by attaching a shaper (for example, __tc-tbf(8)__) to these bands to make sure they cannot dominate the link.
!!AUTHORS

Alexey N. Kuznetsov,
----