Version 2, including all changes.
Rev |
Author |
# |
Line |
1 |
perry |
1 |
TCP |
|
|
2 |
!!!TCP |
|
|
3 |
NAME |
|
|
4 |
SYNOPSIS |
|
|
5 |
DESCRIPTION |
|
|
6 |
ADDRESS FORMATS |
|
|
7 |
SYSCTLS |
|
|
8 |
SOCKET OPTIONS |
|
|
9 |
IOCTLS |
|
|
10 |
ERROR HANDLING |
|
|
11 |
NOTES |
|
|
12 |
ERRORS |
|
|
13 |
BUGS |
|
|
14 |
VERSIONS |
|
|
15 |
SEE ALSO |
|
|
16 |
---- |
|
|
17 |
!!NAME |
|
|
18 |
|
|
|
19 |
|
|
|
20 |
tcp - TCP protocol. |
|
|
21 |
!!SYNOPSIS |
|
|
22 |
|
|
|
23 |
|
|
|
24 |
__#include __ |
|
|
25 |
#include __ |
|
|
26 |
tcp_socket = socket(PF_INET, SOCK_STREAM, |
|
|
27 |
0);__ |
|
|
28 |
!!DESCRIPTION |
|
|
29 |
|
|
|
30 |
|
|
|
31 |
This is an implementation of the TCP protocol defined in |
2 |
perry |
32 |
RFC793, RFC1122 and RFC2001 with the !NewReno and SACK |
1 |
perry |
33 |
extensions. It provides a reliable, stream-oriented, full |
|
|
34 |
duplex connection between two sockets on top of |
|
|
35 |
ip(7). TCP guarantees that the data arrives in order |
|
|
36 |
and retransmits lost packets. It generates and checks a per |
|
|
37 |
packet checksum to catch transmission errors. TCP does not |
|
|
38 |
preserve record boundaries. |
|
|
39 |
|
|
|
40 |
|
|
|
41 |
A fresh TCP socket has no remote or local address and is not |
|
|
42 |
fully specified. To create an outgoing TCP connection use |
|
|
43 |
connect(2) to establish a connection to another TCP |
|
|
44 |
socket. To receive new incoming connections bind(2) |
|
|
45 |
the socket first to a local address and port and then call |
|
|
46 |
listen(2) to put the socket into listening state. |
|
|
47 |
After that a new socket for each incoming connection can be |
|
|
48 |
accepted using accept(2). A socket which has had |
|
|
49 |
__accept__ or __connect__ successfully called on it is |
|
|
50 |
fully specified and may transmit data. Data cannot be |
|
|
51 |
transmitted on listening or not yet connected |
|
|
52 |
sockets. |
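
The two ways of making a socket fully specified can be sketched in C roughly as follows. This is a minimal illustration, not a definitive implementation; the helper names `tcp_listen_on` and `tcp_connect_to` are hypothetical and do not come from this page.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill in an IPv4 socket address from host-order values. */
static void fill_addr(struct sockaddr_in *sa, uint32_t addr, uint16_t port)
{
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_addr.s_addr = htonl(addr);
    sa->sin_port = htons(port);
}

/* Hypothetical helper: bind(2) to addr:port, then listen(2).
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_on(uint32_t addr, uint16_t port, int backlog)
{
    struct sockaddr_in sa;
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    fill_addr(&sa, addr, port);
    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: connect(2) to addr:port.
 * Returns the connected descriptor, or -1 on error. */
int tcp_connect_to(uint32_t addr, uint16_t port)
{
    struct sockaddr_in sa;
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    fill_addr(&sa, addr, port);
    if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would then call `accept(lfd, NULL, NULL)` on the descriptor returned by `tcp_listen_on` to obtain one new socket per incoming connection.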
|
|
53 |
|
|
|
54 |
|
|
|
55 |
Linux 2.2 supports the RFC1323 TCP high performance |
|
|
56 |
extensions. This includes large TCP windows to support links |
|
|
57 |
with high latency or bandwidth. In order to make use of |
|
|
58 |
them, the send and receive buffer sizes must be increased. |
|
|
59 |
They can be set globally with the |
|
|
60 |
__net.core.wmem_default__ and |
|
|
61 |
__net.core.rmem_default__ sysctls, or on individual |
|
|
62 |
sockets by using the __SO_SNDBUF__ and __SO_RCVBUF__ |
|
|
63 |
socket options. The maximum sizes for socket buffers are |
|
|
64 |
limited by the global __net.core.rmem_max__ and |
|
|
65 |
__net.core.wmem_max__ sysctls. See socket(7) for |
|
|
66 |
more information. |
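
As a rough sketch of the per-socket route, the buffer sizes can be requested with setsockopt(2) and read back with getsockopt(2). The helper names below are hypothetical, and the value read back may differ from the one requested because the kernel applies its own limits and accounting.

```c
#include <sys/socket.h>

/* Hypothetical helper: request send and receive buffers of `bytes` each.
 * Returns 0 on success, -1 on error. */
int tcp_set_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: read back the size the kernel applied for
 * SO_SNDBUF or SO_RCVBUF. Returns the size, or -1 on error. */
int tcp_get_buffer(int fd, int optname)
{
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, SOL_SOCKET, optname, &value, &len) < 0)
        return -1;
    return value;
}
```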
|
|
67 |
|
|
|
68 |
|
|
|
69 |
TCP supports urgent data. Urgent data is used to signal the |
|
|
70 |
receiver that some important message is part of the data |
|
|
71 |
stream and that it should be processed as soon as possible. |
|
|
72 |
To send urgent data specify the __MSG_OOB__ option to |
|
|
73 |
send(2). When urgent data is received, the kernel |
|
|
74 |
sends a __SIGURG__ signal to the reading process or the |
|
|
75 |
process or process group that has been set for the socket |
|
|
76 |
using the __SIOCSPGRP__ or __FIOSETOWN__ ioctls. When |
|
|
77 |
the __SO_OOBINLINE__ socket option is enabled, urgent |
|
|
78 |
data is put into the normal data stream (and can be tested |
|
|
79 |
for by the __SIOCATMARK__ ioctl), otherwise it can be |
|
|
80 |
received only when the __MSG_OOB__ flag is set for |
|
|
81 |
recvmsg(2). |
|
|
82 |
!!ADDRESS FORMATS |
|
|
83 |
|
|
|
84 |
|
|
|
85 |
TCP is built on top of IP (see ip(7)). The address |
|
|
86 |
formats defined by ip(7) apply to TCP. TCP only |
|
|
87 |
supports point-to-point communication; broadcasting and |
|
|
88 |
multicasting are not supported. |
|
|
89 |
!!SYSCTLS |
|
|
90 |
|
|
|
91 |
|
|
|
92 |
These sysctls can be accessed by the |
|
|
93 |
__/proc/sys/net/ipv4/*__ files or with the |
|
|
94 |
sysctl(2) interface. In addition, most IP sysctls |
|
|
95 |
also apply to TCP; see ip(7). |
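
A program can read one of these values through the /proc files, for example as sketched below. The helper name `read_ipv4_sysctl` is hypothetical, and the sketch assumes the sysctl holds a single integer.

```c
#include <stdio.h>

/* Hypothetical helper: read an integer sysctl from /proc/sys/net/ipv4/.
 * Returns 0 and stores the value on success, -1 on any failure. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *f;
    int ok;
    int n = snprintf(path, sizeof path, "/proc/sys/net/ipv4/%s", name);

    if (n < 0 || (size_t)n >= sizeof path)
        return -1;              /* name too long for the path buffer */
    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such sysctl, or /proc not mounted */
    ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For instance, `read_ipv4_sysctl("tcp_sack", &v)` would store the current setting of the __tcp_sack__ sysctl in `v` if the file is readable.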
|
|
96 |
|
|
|
97 |
|
|
|
98 |
__tcp_window_scaling__ |
|
|
99 |
|
|
|
100 |
|
|
|
101 |
Enable RFC1323 TCP window scaling. |
|
|
102 |
|
|
|
103 |
|
|
|
104 |
__tcp_sack__ |
|
|
105 |
|
|
|
106 |
|
|
|
107 |
Enable RFC2018 TCP Selective Acknowledgements. |
|
|
108 |
|
|
|
109 |
|
|
|
110 |
__tcp_timestamps__ |
|
|
111 |
|
|
|
112 |
|
|
|
113 |
Enable RFC1323 TCP timestamps. |
|
|
114 |
|
|
|
115 |
|
|
|
116 |
__tcp_fin_timeout__ |
|
|
117 |
|
|
|
118 |
|
|
|
119 |
How many seconds to wait for a final FIN packet before the |
|
|
120 |
socket is forcibly closed. This is strictly a violation of |
|
|
121 |
the TCP specification, but required to prevent |
|
|
122 |
denial-of-service attacks. |
|
|
123 |
|
|
|
124 |
|
|
|
125 |
__tcp_keepalive_probes__ |
|
|
126 |
|
|
|
127 |
|
|
|
128 |
Maximum TCP keep-alive probes to send before giving up. |
|
|
129 |
Keep-alives are only sent when the __SO_KEEPALIVE__ |
|
|
130 |
socket option is enabled. |
|
|
131 |
|
|
|
132 |
|
|
|
133 |
__tcp_keepalive_time__ |
|
|
134 |
|
|
|
135 |
|
|
|
136 |
The number of seconds after no data has been transmitted |
|
|
137 |
before a keep-alive will be sent on a connection. The |
|
|
138 |
default is 10800 seconds (3 hours). |
|
|
139 |
|
|
|
140 |
|
|
|
141 |
__tcp_max_ka_probes__ |
|
|
142 |
|
|
|
143 |
|
|
|
144 |
How many keep-alive probes are sent per slow timer run. To |
|
|
145 |
prevent bursts, this value should not be set too |
|
|
146 |
high. |
|
|
147 |
|
|
|
148 |
|
|
|
149 |
__tcp_stdurg__ |
|
|
150 |
|
|
|
151 |
|
|
|
152 |
Enable the strict RFC793 interpretation of the TCP |
|
|
153 |
urgent-pointer field. The default is to use the |
|
|
154 |
BSD-compatible interpretation of the urgent-pointer, |
|
|
155 |
pointing to the first byte after the urgent data. The RFC793 |
|
|
156 |
interpretation is to have it point to the last byte of |
|
|
157 |
urgent data. Enabling this option may lead to |
|
|
158 |
interoperability problems. |
|
|
159 |
|
|
|
160 |
|
|
|
161 |
__tcp_syncookies__ |
|
|
162 |
|
|
|
163 |
|
|
|
164 |
Enable TCP syncookies. The kernel must be compiled with |
|
|
165 |
__CONFIG_SYN_COOKIES__. Syncookies protect a socket from |
|
|
166 |
overload when too many connection attempts arrive. Client |
|
|
167 |
machines may not be able to detect an overloaded machine |
|
|
168 |
with a short timeout anymore when syncookies are |
|
|
169 |
enabled. |
|
|
170 |
|
|
|
171 |
|
|
|
172 |
__tcp_max_syn_backlog__ |
|
|
173 |
|
|
|
174 |
|
|
|
175 |
Length of the per-socket backlog queue. As of Linux 2.2, the |
|
|
176 |
backlog specified in listen(2) only specifies the |
|
|
177 |
length of the backlog queue of already established sockets. |
|
|
178 |
The maximum queue of sockets not yet established (in |
|
|
179 |
__SYN_RECV__ state) per listen socket is set by this |
|
|
180 |
sysctl. When more connection requests arrive, Linux starts |
|
|
181 |
to drop packets. When syncookies are enabled the packets are |
|
|
182 |
still answered and this value is effectively |
|
|
183 |
ignored. |
|
|
184 |
|
|
|
185 |
|
|
|
186 |
__tcp_retries1__ |
|
|
187 |
|
|
|
188 |
|
|
|
189 |
Defines how many times an answer to a TCP connection request |
|
|
190 |
is retransmitted before giving up. |
|
|
191 |
|
|
|
192 |
|
|
|
193 |
__tcp_retries2__ |
|
|
194 |
|
|
|
195 |
|
|
|
196 |
Defines how many times a TCP packet is retransmitted in |
|
|
197 |
established state before giving up. |
|
|
198 |
|
|
|
199 |
|
|
|
200 |
__tcp_syn_retries__ |
|
|
201 |
|
|
|
202 |
|
|
|
203 |
Defines how many times to try to send an initial SYN packet |
|
|
204 |
to a remote host before giving up and returning an error. Must |
|
|
205 |
be below 255. This is only the timeout for outgoing |
|
|
206 |
connections; for incoming connections the number of |
|
|
207 |
retransmits is defined by __tcp_retries1__. |
|
|
208 |
|
|
|
209 |
|
|
|
210 |
__tcp_retrans_collapse__ |
|
|
211 |
|
|
|
212 |
|
|
|
213 |
Try to send full-sized packets during retransmit. This is |
|
|
214 |
used to work around TCP bugs in some stacks. |
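
The keep-alive sysctls above only take effect on sockets that have __SO_KEEPALIVE__ enabled. A minimal sketch of turning it on, with a hypothetical helper name:

```c
#include <sys/socket.h>

/* Hypothetical helper: enable keep-alive probes on one socket.
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}
```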
|
|
215 |
!!SOCKET OPTIONS |
|
|
216 |
|
|
|
217 |
|
|
|
218 |
To set or get a TCP socket option, call getsockopt(2) |
|
|
219 |
to read or setsockopt(2) to write the option with the |
|
|
220 |
option level argument set to __SOL_TCP__. In addition, |
|
|
221 |
most __SOL_IP__ socket options are valid on TCP sockets. |
|
|
222 |
For more information see ip(7). |
|
|
223 |
|
|
|
224 |
|
|
|
225 |
__TCP_NODELAY__ |
|
|
226 |
|
|
|
227 |
|
|
|
228 |
Turn the Nagle algorithm off. This means that packets are |
|
|
229 |
always sent as soon as possible and no unnecessary delays |
|
|
230 |
are introduced, at the cost of more packets in the network. |
|
|
231 |
Expects an integer boolean flag. |
|
|
232 |
|
|
|
233 |
|
|
|
234 |
__TCP_MAXSEG__ |
|
|
235 |
|
|
|
236 |
|
|
|
237 |
Set or receive the maximum segment size for outgoing TCP |
|
|
238 |
packets. If this option is set before connection |
|
|
239 |
establishment, it also changes the MSS value announced to |
|
|
240 |
the other end in the initial packet. Values greater than the |
|
|
241 |
interface MTU are ignored and have no effect. |
|
|
242 |
|
|
|
243 |
|
|
|
244 |
__TCP_CORK__ |
|
|
245 |
|
|
|
246 |
|
|
|
247 |
If enabled, don't send out partial frames. All queued partial |
|
|
248 |
frames are sent when the option is cleared again. This is |
|
|
249 |
useful for prepending headers before calling |
|
|
250 |
sendfile(2), or for throughput optimization. This |
|
|
251 |
option cannot be combined with |
|
|
252 |
__TCP_NODELAY__. |
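
The options above could be set and read roughly as in this sketch. The helper names are hypothetical, and the fallback definition of __SOL_TCP__ is an assumption for systems whose headers do not provide it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SOL_TCP
#define SOL_TCP IPPROTO_TCP   /* assumption: same protocol number */
#endif

/* Hypothetical helper: set an integer boolean TCP option such as
 * TCP_NODELAY or TCP_CORK. Returns 0 on success, -1 on error. */
int set_tcp_flag(int fd, int optname, int on)
{
    return setsockopt(fd, SOL_TCP, optname, &on, sizeof on);
}

/* Hypothetical helper: read an integer TCP option such as TCP_MAXSEG. */
int get_tcp_int(int fd, int optname, int *value)
{
    socklen_t len = sizeof *value;
    return getsockopt(fd, SOL_TCP, optname, value, &len);
}

/* Hypothetical helper: write a header and a body with TCP_CORK set, so
 * that partial frames are held back until the option is cleared. */
int send_corked(int fd, const void *hdr, size_t hlen,
                const void *body, size_t blen)
{
    int ok;

    if (set_tcp_flag(fd, TCP_CORK, 1) < 0)
        return -1;
    ok = write(fd, hdr, hlen) == (ssize_t)hlen &&
         write(fd, body, blen) == (ssize_t)blen;
    if (set_tcp_flag(fd, TCP_CORK, 0) < 0)   /* flush queued frames */
        return -1;
    return ok ? 0 : -1;
}
```

`send_corked` expects a connected socket; `set_tcp_flag` and `get_tcp_int` can also be used before connecting.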
|
|
253 |
!!IOCTLS |
|
|
254 |
|
|
|
255 |
|
|
|
256 |
These ioctls can be accessed using ioctl(2). The |
|
|
257 |
correct syntax is: |
|
|
258 |
|
|
|
259 |
|
|
|
260 |
__int__ ''value''__; |
|
|
261 |
__''error'' __= ioctl(__''tcp_socket''__,__ ''ioctl_type''__, &__''value''__); |
|
|
262 |
__ |
|
|
263 |
|
|
|
264 |
|
|
|
265 |
__FIONREAD__ or __TIOCINQ__ |
|
|
266 |
|
|
|
267 |
|
|
|
268 |
Returns the amount of queued unread data in the receive |
|
|
269 |
buffer. Argument is a pointer to an integer. |
|
|
270 |
|
|
|
271 |
|
|
|
272 |
__SIOCATMARK__ |
|
|
273 |
|
|
|
274 |
|
|
|
275 |
Returns true when all urgent data has already been |
|
|
276 |
received by the user program. This is used together with |
|
|
277 |
__SO_OOBINLINE__. Argument is a pointer to an integer |
|
|
278 |
for the test result. |
|
|
279 |
|
|
|
280 |
|
|
|
281 |
__TIOCOUTQ__ |
|
|
282 |
|
|
|
283 |
|
|
|
284 |
Returns the amount of unsent data in the socket send queue |
|
|
285 |
in the passed integer value pointer. Unfortunately, the |
|
|
286 |
implementation of this ioctl is buggy in all known versions |
|
|
287 |
of Linux and instead returns the free space (effectively |
|
|
288 |
buffer size minus bytes used including metadata) in the send |
|
|
289 |
queue. This will be fixed in future Linux versions. If you |
|
|
290 |
use __TIOCOUTQ__, please include a runtime test for both |
|
|
291 |
behaviors for correct function on future releases and other |
|
|
292 |
Unixes. |
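
As an illustration of the calling convention, the __FIONREAD__ query can be wrapped as below. The helper name `bytes_unread` is hypothetical. The sketch covers only __FIONREAD__ and does not attempt the runtime test suggested for __TIOCOUTQ__.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: number of queued unread bytes in the receive
 * buffer of `fd`, or -1 if the ioctl fails. */
int bytes_unread(int fd)
{
    int n = 0;                       /* the ioctl takes a pointer to int */
    if (ioctl(fd, FIONREAD, &n) < 0)
        return -1;
    return n;
}
```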
|
|
293 |
!!ERROR HANDLING |
|
|
294 |
|
|
|
295 |
|
|
|
296 |
When a network error occurs, TCP tries to resend the packet. |
|
|
297 |
If it doesn't succeed after some time, either |
|
|
298 |
__ETIMEDOUT__ or the last received error on this |
|
|
299 |
connection is reported. |
|
|
300 |
|
|
|
301 |
|
|
|
302 |
Some applications require a quicker error notification. This |
|
|
303 |
can be enabled with the __SOL_IP__ level |
|
|
304 |
__IP_RECVERR__ socket option. When this option is |
|
|
305 |
enabled, all incoming errors are immediately passed to the |
|
|
306 |
user program. Use this option with care - it makes TCP less |
|
|
307 |
tolerant to routing changes and other normal network |
|
|
308 |
conditions. |
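
Enabling the quicker notification might look like the following sketch. The helper name is hypothetical, and the fallback definition of __SOL_IP__ is an assumption for headers that lack it.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef SOL_IP
#define SOL_IP IPPROTO_IP   /* assumption: same level number */
#endif

/* Hypothetical helper: ask for incoming errors to be passed to the
 * program immediately. Returns 0 on success, -1 on error. */
int enable_recverr(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_IP, IP_RECVERR, &on, sizeof on);
}
```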
|
|
309 |
!!NOTES |
|
|
310 |
|
|
|
311 |
|
|
|
312 |
When an error occurs during connection setup in a |
|
|
313 |
socket write, __SIGPIPE__ is raised only when the |
|
|
314 |
__SO_KEEPALIVE__ socket option is set. |
|
|
315 |
|
|
|
316 |
|
|
|
317 |
TCP has no real out-of-band data; it has urgent data. In |
|
|
318 |
Linux this means if the other end sends newer out-of-band |
|
|
319 |
data the older urgent data is inserted as normal data into |
|
|
320 |
the stream (even when __SO_OOBINLINE__ is not set). This |
|
|
321 |
differs from BSD-based stacks. |
|
|
322 |
|
|
|
323 |
|
|
|
324 |
Linux uses the BSD-compatible interpretation of the urgent |
|
|
325 |
pointer field by default. This violates RFC1122, but is |
|
|
326 |
required for interoperability with other stacks. It can be |
|
|
327 |
changed by the __tcp_stdurg__ sysctl. |
|
|
328 |
!!ERRORS |
|
|
329 |
|
|
|
330 |
|
|
|
331 |
__EPIPE__ |
|
|
332 |
|
|
|
333 |
|
|
|
334 |
The other end closed the socket unexpectedly or a read is |
|
|
335 |
executed on a shut down socket. |
|
|
336 |
|
|
|
337 |
|
|
|
338 |
__ETIMEDOUT__ |
|
|
339 |
|
|
|
340 |
|
|
|
341 |
The other end didn't acknowledge retransmitted data after |
|
|
342 |
some time. |
|
|
343 |
|
|
|
344 |
|
|
|
345 |
__EAFNOTSUPPORT__ |
|
|
346 |
|
|
|
347 |
|
|
|
348 |
Passed socket address type in ''sin_family'' was not |
|
|
349 |
__AF_INET__. |
|
|
350 |
|
|
|
351 |
|
|
|
352 |
Any errors defined for ip(7) or the generic socket |
|
|
353 |
layer may also be returned for TCP. |
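
A caller might tell these errors apart as in the sketch below. The enum and function names are hypothetical. The use of __MSG_NOSIGNAL__ is also an assumption of this sketch, not something this page describes; it makes a broken connection show up as __EPIPE__ in errno instead of as a signal.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for one send attempt. */
enum send_result {
    SEND_OK,
    SEND_BROKEN_PIPE,    /* EPIPE */
    SEND_TIMED_OUT,      /* ETIMEDOUT */
    SEND_OTHER_ERROR     /* anything else, e.g. from ip(7) or socket(7) */
};

/* Hypothetical helper: send a buffer and classify the outcome. */
enum send_result classify_send(int fd, const void *buf, size_t len)
{
    if (send(fd, buf, len, MSG_NOSIGNAL) >= 0)
        return SEND_OK;
    switch (errno) {
    case EPIPE:
        return SEND_BROKEN_PIPE;
    case ETIMEDOUT:
        return SEND_TIMED_OUT;
    default:
        return SEND_OTHER_ERROR;
    }
}
```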
|
|
354 |
!!BUGS |
|
|
355 |
|
|
|
356 |
|
|
|
357 |
Not all errors are documented. |
|
|
358 |
|
|
|
359 |
|
|
|
360 |
IPv6 is not described. |
|
|
361 |
|
|
|
362 |
|
|
|
363 |
Transparent proxy options are not described. |
|
|
364 |
!!VERSIONS |
|
|
365 |
|
|
|
366 |
|
|
|
367 |
The sysctls are new in Linux 2.2. __IP_RECVERR__ is a new |
|
|
368 |
feature in Linux 2.2. __TCP_CORK__ is new in |
|
|
369 |
2.2. |
|
|
370 |
!!SEE ALSO |
|
|
371 |
|
|
|
372 |
|
|
|
373 |
socket(7), socket(2), ip(7), |
|
|
374 |
sendmsg(2), recvmsg(2) |
|
|
375 |
RFC793 for the TCP specification. |
|
|
376 |
RFC1122 for the TCP requirements and a description of the |
|
|
377 |
Nagle algorithm. |
|
|
378 |
RFC2581 for some TCP algorithms. |
|
|
379 |
---- |