version 2, including all changes.
| Rev |
Author |
# |
Line |
| 1 |
perry |
1 |
TCP |
| |
|
2 |
!!!TCP |
| |
|
3 |
NAME |
| |
|
4 |
SYNOPSIS |
| |
|
5 |
DESCRIPTION |
| |
|
6 |
ADDRESS FORMATS |
| |
|
7 |
SYSCTLS |
| |
|
8 |
SOCKET OPTIONS |
| |
|
9 |
IOCTLS |
| |
|
10 |
ERROR HANDLING |
| |
|
11 |
NOTES |
| |
|
12 |
ERRORS |
| |
|
13 |
BUGS |
| |
|
14 |
VERSIONS |
| |
|
15 |
SEE ALSO |
| |
|
16 |
---- |
| |
|
17 |
!!NAME |
| |
|
18 |
|
| |
|
19 |
|
| |
|
20 |
tcp - TCP protocol. |
| |
|
21 |
!!SYNOPSIS |
| |
|
22 |
|
| |
|
23 |
|
| |
|
24 |
__#include <sys/socket.h>__ |
| |
|
25 |
#include <netinet/in.h>__ |
| |
|
26 |
tcp_socket = socket(PF_INET, SOCK_STREAM, |
| |
|
27 |
0);__ |
| |
|
28 |
!!DESCRIPTION |
| |
|
29 |
|
| |
|
30 |
|
| |
|
31 |
This is an implementation of the TCP protocol defined in |
| 2 |
perry |
32 |
RFC793, RFC1122 and RFC2001 with the !NewReno and SACK |
| 1 |
perry |
33 |
extensions. It provides a reliable, stream oriented, full |
| |
|
34 |
duplex connection between two sockets on top of |
| |
|
35 |
ip(7). TCP guarantees that the data arrives in order |
| |
|
36 |
and retransmits lost packets. It generates and checks a per |
| |
|
37 |
packet checksum to catch transmission errors. TCP does not |
| |
|
38 |
preserve record boundaries. |
| |
|
39 |
|
| |
|
40 |
|
| |
|
41 |
A fresh TCP socket has no remote or local address and is not |
| |
|
42 |
fully specified. To create an outgoing TCP connection use |
| |
|
43 |
connect(2) to establish a connection to another TCP |
| |
|
44 |
socket. To receive new incoming connections bind(2) |
| |
|
45 |
the socket first to a local address and port and then call |
| |
|
46 |
listen(2) to put the socket into listening state. |
| |
|
47 |
After that a new socket for each incoming connection can be |
| |
|
48 |
accepted using accept(2). A socket which has had |
| |
|
49 |
__accept__ or __connect__ successfully called on it is |
| |
|
50 |
fully specified and may transmit data. Data cannot be |
| |
|
51 |
transmitted on listening or not yet connected |
| |
|
52 |
sockets. |
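The sequence described above (bind, listen, connect, accept) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the helper name tcp_loopback_pair is hypothetical, both ends live in the same process, and the loopback address with a kernel-chosen port is used so that no fixed port number has to be assumed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build one connected TCP socket pair over loopback.
 * On success returns 0 and stores the connecting end in *client and the
 * accepted end in *server; on failure returns -1. */
int tcp_loopback_pair(int *client, int *server)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd, cfd, sfd;

    /* A fresh socket: no local or remote address yet. */
    lfd = socket(PF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* 0 lets the kernel pick a port */

    /* bind(2) then listen(2), then ask which port was assigned. */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    /* Outgoing side: connect(2) to the listening socket. */
    cfd = socket(PF_INET, SOCK_STREAM, 0);
    if (cfd < 0) {
        close(lfd);
        return -1;
    }
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(cfd);
        close(lfd);
        return -1;
    }

    /* Incoming side: accept(2) returns a new, fully specified socket. */
    sfd = accept(lfd, NULL, NULL);
    close(lfd);                          /* the listener is no longer needed */
    if (sfd < 0) {
        close(cfd);
        return -1;
    }

    *client = cfd;
    *server = sfd;
    return 0;
}
```

Only the two descriptors returned here can carry data; the listening socket itself never does, which matches the rule stated above.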
| |
|
53 |
|
| |
|
54 |
|
| |
|
55 |
Linux 2.2 supports the RFC1323 TCP high performance |
| |
|
56 |
extensions. This includes large TCP windows to support links |
| |
|
57 |
with high latency or bandwidth. In order to make use of |
| |
|
58 |
them, the send and receive buffer sizes must be increased. |
| |
|
59 |
They can be set globally with the |
| |
|
60 |
__net.core.wmem_default__ and |
| |
|
61 |
__net.core.rmem_default__ sysctls, or on individual |
| |
|
62 |
sockets by using the __SO_SNDBUF__ and __SO_RCVBUF__ |
| |
|
63 |
socket options. The maximum sizes for socket buffers are |
| |
|
64 |
limited by the global __net.core.rmem_max__ and |
| |
|
65 |
__net.core.wmem_max__ sysctls. See socket(7) for |
| |
|
66 |
more information. |
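As a rough sketch of the per-socket variant, the following C helper requests new send and receive buffer sizes and reads back what the kernel actually configured. The helper name tcp_set_buffers is hypothetical. The value read back may differ from the request, for example because of the global maximum limits mentioned above.

```c
#include <sys/socket.h>

/* Hypothetical helper: request `bytes` for both socket buffers of fd and
 * report the sizes the kernel settled on. Returns 0 on success, -1 on
 * error. */
int tcp_set_buffers(int fd, int bytes, int *sndbuf, int *rcvbuf)
{
    socklen_t len;

    /* SO_SNDBUF and SO_RCVBUF live at the generic socket level. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;

    /* Read the effective values back instead of trusting the request. */
    len = sizeof(*sndbuf);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, sndbuf, &len) < 0)
        return -1;
    len = sizeof(*rcvbuf);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, rcvbuf, &len) < 0)
        return -1;
    return 0;
}
```

A caller would typically invoke this right after socket(2), for example tcp_set_buffers(fd, 65536, &snd, &rcv), and then inspect snd and rcv.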
| |
|
67 |
|
| |
|
68 |
|
| |
|
69 |
TCP supports urgent data. Urgent data is used to signal the |
| |
|
70 |
receiver that some important message is part of the data |
| |
|
71 |
stream and that it should be processed as soon as possible. |
| |
|
72 |
To send urgent data specify the __MSG_OOB__ option to |
| |
|
73 |
send(2). When urgent data is received, the kernel |
| |
|
74 |
sends a __SIGURG__ signal to the reading process or the |
| |
|
75 |
process or process group that has been set for the socket |
| |
|
76 |
using the __SIOCSPGRP__ or __FIOSETOWN__ ioctls. When |
| |
|
77 |
the __SO_OOBINLINE__ socket option is enabled, urgent |
| |
|
78 |
data is put into the normal data stream (and can be tested |
| |
|
79 |
for by the __SIOCATMARK__ ioctl), otherwise it can be |
| |
|
80 |
received only when the __MSG_OOB__ flag is set for |
| |
|
81 |
recvmsg(2). |
| |
|
82 |
!!ADDRESS FORMATS |
| |
|
83 |
|
| |
|
84 |
|
| |
|
85 |
TCP is built on top of IP (see ip(7)). The address |
| |
|
86 |
formats defined by ip(7) apply to TCP. TCP only |
| |
|
87 |
supports point-to-point communication; broadcasting and |
| |
|
88 |
multicasting are not supported. |
| |
|
89 |
!!SYSCTLS |
| |
|
90 |
|
| |
|
91 |
|
| |
|
92 |
These sysctls can be accessed by the |
| |
|
93 |
__/proc/sys/net/ipv4/*__ files or with the |
| |
|
94 |
sysctl(2) interface. In addition, most IP sysctls |
| |
|
95 |
also apply to TCP; see ip(7). |
| |
|
96 |
|
| |
|
97 |
|
| |
|
98 |
__tcp_window_scaling__ |
| |
|
99 |
|
| |
|
100 |
|
| |
|
101 |
Enable RFC1323 TCP window scaling. |
| |
|
102 |
|
| |
|
103 |
|
| |
|
104 |
__tcp_sack__ |
| |
|
105 |
|
| |
|
106 |
|
| |
|
107 |
Enable RFC2018 TCP Selective Acknowledgements. |
| |
|
108 |
|
| |
|
109 |
|
| |
|
110 |
__tcp_timestamps__ |
| |
|
111 |
|
| |
|
112 |
|
| |
|
113 |
Enable RFC1323 TCP timestamps. |
| |
|
114 |
|
| |
|
115 |
|
| |
|
116 |
__tcp_fin_timeout__ |
| |
|
117 |
|
| |
|
118 |
|
| |
|
119 |
How many seconds to wait for a final FIN packet before the |
| |
|
120 |
socket is forcibly closed. This is strictly a violation of |
| |
|
121 |
the TCP specification, but required to prevent |
| |
|
122 |
denial-of-service attacks. |
| |
|
123 |
|
| |
|
124 |
|
| |
|
125 |
__tcp_keepalive_probes__ |
| |
|
126 |
|
| |
|
127 |
|
| |
|
128 |
Maximum TCP keep-alive probes to send before giving up. |
| |
|
129 |
Keep-alives are only sent when the __SO_KEEPALIVE__ |
| |
|
130 |
socket option is enabled. |
| |
|
131 |
|
| |
|
132 |
|
| |
|
133 |
__tcp_keepalive_time__ |
| |
|
134 |
|
| |
|
135 |
|
| |
|
136 |
The number of seconds after no data has been transmitted |
| |
|
137 |
before a keep-alive will be sent on a connection. The |
| |
|
138 |
default is 10800 seconds (3 hours). |
| |
|
139 |
|
| |
|
140 |
|
| |
|
141 |
__tcp_max_ka_probes__ |
| |
|
142 |
|
| |
|
143 |
|
| |
|
144 |
How many keep-alive probes are sent per slow timer run. To |
| |
|
145 |
prevent bursts, this value should not be set too |
| |
|
146 |
high. |
| |
|
147 |
|
| |
|
148 |
|
| |
|
149 |
__tcp_stdurg__ |
| |
|
150 |
|
| |
|
151 |
|
| |
|
152 |
Enable the strict RFC793 interpretation of the TCP |
| |
|
153 |
urgent-pointer field. The default is to use the |
| |
|
154 |
BSD-compatible interpretation of the urgent-pointer, |
| |
|
155 |
pointing to the first byte after the urgent data. The RFC793 |
| |
|
156 |
interpretation is to have it point to the last byte of |
| |
|
157 |
urgent data. Enabling this option may lead to |
| |
|
158 |
interoperability problems. |
| |
|
159 |
|
| |
|
160 |
|
| |
|
161 |
__tcp_syncookies__ |
| |
|
162 |
|
| |
|
163 |
|
| |
|
164 |
Enable TCP syncookies. The kernel must be compiled with |
| |
|
165 |
__CONFIG_SYN_COOKIES__. Syncookies protect a socket from |
| |
|
166 |
overload when too many connection attempts arrive. Client |
| |
|
167 |
machines may not be able to detect an overloaded machine |
| |
|
168 |
with a short timeout anymore when syncookies are |
| |
|
169 |
enabled. |
| |
|
170 |
|
| |
|
171 |
|
| |
|
172 |
__tcp_max_syn_backlog__ |
| |
|
173 |
|
| |
|
174 |
|
| |
|
175 |
Length of the per-socket backlog queue. As of Linux 2.2, the |
| |
|
176 |
backlog specified in listen(2) only specifies the |
| |
|
177 |
length of the backlog queue of already established sockets. |
| |
|
178 |
The maximum queue of sockets not yet established (in |
| |
|
179 |
__SYN_RECV__ state) per listen socket is set by this |
| |
|
180 |
sysctl. When more connection requests arrive, Linux starts |
| |
|
181 |
to drop packets. When syncookies are enabled the packets are |
| |
|
182 |
still answered and this value is effectively |
| |
|
183 |
ignored. |
| |
|
184 |
|
| |
|
185 |
|
| |
|
186 |
__tcp_retries1__ |
| |
|
187 |
|
| |
|
188 |
|
| |
|
189 |
Defines how many times an answer to a TCP connection request |
| |
|
190 |
is retransmitted before giving up. |
| |
|
191 |
|
| |
|
192 |
|
| |
|
193 |
__tcp_retries2__ |
| |
|
194 |
|
| |
|
195 |
|
| |
|
196 |
Defines how many times a TCP packet is retransmitted in |
| |
|
197 |
established state before giving up. |
| |
|
198 |
|
| |
|
199 |
|
| |
|
200 |
__tcp_syn_retries__ |
| |
|
201 |
|
| |
|
202 |
|
| |
|
203 |
Defines how many times to try to send an initial SYN packet |
| |
|
204 |
to a remote host before giving up and returning an error. Must |
| |
|
205 |
be below 255. This is only the timeout for outgoing |
| |
|
206 |
connections; for incoming connections the number of |
| |
|
207 |
retransmits is defined by __tcp_retries1__. |
| |
|
208 |
|
| |
|
209 |
|
| |
|
210 |
__tcp_retrans_collapse__ |
| |
|
211 |
|
| |
|
212 |
|
| |
|
213 |
Try to send full-sized packets during retransmit. This is |
| |
|
214 |
used to work around TCP bugs in some stacks. |
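The sysctls listed above can be inspected from a program by reading the corresponding file under /proc/sys/net/ipv4/. The following is a minimal sketch, assuming the value is a single integer; the helper name read_ipv4_sysctl is hypothetical, and which files exist depends on the running kernel.

```c
#include <stdio.h>

/* Hypothetical helper: read an integer sysctl such as "tcp_sack" from
 * /proc/sys/net/ipv4/. Returns 0 and stores the value in *value on
 * success, or -1 if the file cannot be opened or parsed. */
int read_ipv4_sysctl(const char *name, int *value)
{
    char path[256];
    FILE *f;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;                       /* name too long for the buffer */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;                       /* unknown sysctl or no /proc */

    ok = (fscanf(f, "%d", value) == 1);  /* first integer in the file */
    fclose(f);
    return ok ? 0 : -1;
}
```

Writing a value works the same way with the file opened for writing, but normally requires root privileges.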
| |
|
215 |
!!SOCKET OPTIONS |
| |
|
216 |
|
| |
|
217 |
|
| |
|
218 |
To set or get a TCP socket option, call getsockopt(2) |
| |
|
219 |
to read or setsockopt(2) to write the option with the |
| |
|
220 |
option level argument set to __SOL_TCP__. In addition, |
| |
|
221 |
most __SOL_IP__ socket options are valid on TCP sockets. |
| |
|
222 |
For more information see ip(7). |
| |
|
223 |
|
| |
|
224 |
|
| |
|
225 |
__TCP_NODELAY__ |
| |
|
226 |
|
| |
|
227 |
|
| |
|
228 |
Turn the Nagle algorithm off. This means that packets are |
| |
|
229 |
always sent as soon as possible and no unnecessary delays |
| |
|
230 |
are introduced, at the cost of more packets in the network. |
| |
|
231 |
Expects an integer boolean flag. |
| |
|
232 |
|
| |
|
233 |
|
| |
|
234 |
__TCP_MAXSEG__ |
| |
|
235 |
|
| |
|
236 |
|
| |
|
237 |
Set or receive the maximum segment size for outgoing TCP |
| |
|
238 |
packets. If this option is set before connection |
| |
|
239 |
establishment, it also changes the MSS value announced to |
| |
|
240 |
the other end in the initial packet. Values greater than the |
| |
|
241 |
interface MTU are ignored and have no effect. |
| |
|
242 |
|
| |
|
243 |
|
| |
|
244 |
__TCP_CORK__ |
| |
|
245 |
|
| |
|
246 |
|
| |
|
247 |
If enabled, do not send out partial frames. All queued partial |
| |
|
248 |
frames are sent when the option is cleared again. This is |
| |
|
249 |
useful for prepending headers before calling |
| |
|
250 |
sendfile(2), or for throughput optimization. This |
| |
|
251 |
option cannot be combined with |
| |
|
252 |
__TCP_NODELAY__. |
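As an illustration of the calling convention for these options, the sketch below sets __TCP_NODELAY__ and reads it back. The helper name tcp_set_nodelay is hypothetical. It passes IPPROTO_TCP as the option level; on Linux this has the same numeric value as __SOL_TCP__, which is an assumption to verify on other systems.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn the Nagle algorithm off (on != 0) or back on
 * (on == 0) for socket fd, then read the flag back.
 * Returns the flag as reported by the kernel (0 or 1), or -1 on error. */
int tcp_set_nodelay(int fd, int on)
{
    int flag = on ? 1 : 0;               /* integer boolean flag */
    socklen_t len = sizeof(flag);

    /* IPPROTO_TCP is assumed to be equivalent to SOL_TCP here. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag)) < 0)
        return -1;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}
```

__TCP_MAXSEG__ and __TCP_CORK__ take an integer argument in the same way, so the same pattern applies with a different option name.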
| |
|
253 |
!!IOCTLS |
| |
|
254 |
|
| |
|
255 |
|
| |
|
256 |
These ioctls can be accessed using ioctl(2). The |
| |
|
257 |
correct syntax is: |
| |
|
258 |
|
| |
|
259 |
|
| |
|
260 |
__int__ ''value''__; |
| |
|
261 |
__''error'' __= ioctl(__''tcp_socket''__,__ ''ioctl_type''__, &__''value''__); |
| |
|
262 |
__ |
| |
|
263 |
|
| |
|
264 |
|
| |
|
265 |
__FIONREAD__ or __TIOCINQ__ |
| |
|
266 |
|
| |
|
267 |
|
| |
|
268 |
Returns the amount of queued unread data in the receive |
| |
|
269 |
buffer. Argument is a pointer to an integer. |
| |
|
270 |
|
| |
|
271 |
|
| |
|
272 |
__SIOCATMARK__ |
| |
|
273 |
|
| |
|
274 |
|
| |
|
275 |
Returns true when all urgent data has already been |
| |
|
276 |
received by the user program. This is used together with |
| |
|
277 |
__SO_OOBINLINE__. Argument is a pointer to an integer |
| |
|
278 |
for the test result. |
| |
|
279 |
|
| |
|
280 |
|
| |
|
281 |
__TIOCOUTQ__ |
| |
|
282 |
|
| |
|
283 |
|
| |
|
284 |
Returns the amount of unsent data in the socket send queue |
| |
|
285 |
in the passed integer value pointer. Unfortunately, the |
| |
|
286 |
implementation of this ioctl is buggy in all known versions |
| |
|
287 |
of Linux and instead returns the free space (effectively |
| |
|
288 |
buffer size minus bytes used including metadata) in the send |
| |
|
289 |
queue. This will be fixed in future Linux versions. If you |
| |
|
290 |
use __TIOCOUTQ__, please include a runtime test for both |
| |
|
291 |
behaviors for correct function on future releases and other |
| |
|
292 |
Unixes. |
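A minimal sketch of the __FIONREAD__ call, passing a pointer to an integer as described above. The helper name read_queued is hypothetical: it asks how many bytes are queued and then reads at most that many, so the recv(2) call is not expected to block waiting for more data.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read whatever is already queued on socket fd,
 * up to cap bytes. Returns the number of bytes read, 0 if nothing is
 * queued, or -1 on error. */
ssize_t read_queued(int fd, char *buf, size_t cap)
{
    int queued = 0;

    /* FIONREAD stores the amount of unread data in `queued`. */
    if (ioctl(fd, FIONREAD, &queued) < 0)
        return -1;
    if (queued <= 0 || cap == 0)
        return 0;

    /* Never ask for more than the caller's buffer can hold. */
    if ((size_t)queued > cap)
        queued = (int)cap;

    return recv(fd, buf, (size_t)queued, 0);
}
```

Given the caveat about __TIOCOUTQ__ above, the send-side counterpart is deliberately not shown here.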
| |
|
293 |
!!ERROR HANDLING |
| |
|
294 |
|
| |
|
295 |
|
| |
|
296 |
When a network error occurs, TCP tries to resend the packet. |
| |
|
297 |
If it doesn't succeed after some time, either |
| |
|
298 |
__ETIMEDOUT__ or the last received error on this |
| |
|
299 |
connection is reported. |
| |
|
300 |
|
| |
|
301 |
|
| |
|
302 |
Some applications require a quicker error notification. This |
| |
|
303 |
can be enabled with the __SOL_IP__ level |
| |
|
304 |
__IP_RECVERR__ socket option. When this option is |
| |
|
305 |
enabled, all incoming errors are immediately passed to the |
| |
|
306 |
user program. Use this option with care - it makes TCP less |
| |
|
307 |
tolerant to routing changes and other normal network |
| |
|
308 |
conditions. |
| |
|
309 |
!!NOTES |
| |
|
310 |
|
| |
|
311 |
|
| |
|
312 |
When an error occurs during a connection setup occurring in a |
| |
|
313 |
socket write __SIGPIPE__ is only raised when the |
| |
|
314 |
__SO_KEEPALIVE__ socket option is set. |
| |
|
315 |
|
| |
|
316 |
|
| |
|
317 |
TCP has no real out-of-band data; it has urgent data. In |
| |
|
318 |
Linux this means if the other end sends newer out-of-band |
| |
|
319 |
data the older urgent data is inserted as normal data into |
| |
|
320 |
the stream (even when __SO_OOBINLINE__ is not set). This |
| |
|
321 |
differs from BSD-based stacks. |
| |
|
322 |
|
| |
|
323 |
|
| |
|
324 |
Linux uses the BSD-compatible interpretation of the urgent |
| |
|
325 |
pointer field by default. This violates RFC1122, but is |
| |
|
326 |
required for interoperability with other stacks. It can be |
| |
|
327 |
changed by the __tcp_stdurg__ sysctl. |
| |
|
328 |
!!ERRORS |
| |
|
329 |
|
| |
|
330 |
|
| |
|
331 |
__EPIPE__ |
| |
|
332 |
|
| |
|
333 |
|
| |
|
334 |
The other end closed the socket unexpectedly or a read is |
| |
|
335 |
executed on a shut down socket. |
| |
|
336 |
|
| |
|
337 |
|
| |
|
338 |
__ETIMEDOUT__ |
| |
|
339 |
|
| |
|
340 |
|
| |
|
341 |
The other end didn't acknowledge retransmitted data after |
| |
|
342 |
some time. |
| |
|
343 |
|
| |
|
344 |
|
| |
|
345 |
__EAFNOTSUPPORT__ |
| |
|
346 |
|
| |
|
347 |
|
| |
|
348 |
Passed socket address type in ''sin_family'' was not |
| |
|
349 |
__AF_INET__. |
| |
|
350 |
|
| |
|
351 |
|
| |
|
352 |
Any errors defined for ip(7) or the generic socket |
| |
|
353 |
layer may also be returned for TCP. |
| |
|
354 |
!!BUGS |
| |
|
355 |
|
| |
|
356 |
|
| |
|
357 |
Not all errors are documented. |
| |
|
358 |
|
| |
|
359 |
|
| |
|
360 |
IPv6 is not described. |
| |
|
361 |
|
| |
|
362 |
|
| |
|
363 |
Transparent proxy options are not described. |
| |
|
364 |
!!VERSIONS |
| |
|
365 |
|
| |
|
366 |
|
| |
|
367 |
The sysctls are new in Linux 2.2. __IP_RECVERR__ is a new |
| |
|
368 |
feature in Linux 2.2. __TCP_CORK__ is new in |
| |
|
369 |
2.2. |
| |
|
370 |
!!SEE ALSO |
| |
|
371 |
|
| |
|
372 |
|
| |
|
373 |
socket(7), socket(2), ip(7), |
| |
|
374 |
sendmsg(2), recvmsg(2) |
| |
|
375 |
RFC793 for the TCP specification. |
| |
|
376 |
RFC1122 for the TCP requirements and a description of the |
| |
|
377 |
Nagle algorithm. |
| |
|
378 |
RFC2581 for some TCP algorithms. |
| |
|
379 |
---- |