Blame: FailoverInternetConnection
Annotated edit history of FailoverInternetConnection version 6, including all changes.
Rev Author # Line
1 DanielLawson 1 I'll describe a simple way of setting up a redundant internet connection.
2
2 CraigBox 3 My setup involves my main server being the 'router' for the network. I have another machine elsewhere on the network that has a radio card in it. All traffic is directed to the main server, which then routes it on to the radio machine, which does the [NAT]. I also have another connection - a so-called "Internet Hub" which takes ethernet in and has a serial line out, connected to a modem. This does dial on demand and NAT.
1 DanielLawson 4
5 Radio machine IP: 10.0.0.254 (name radio)
6 Internet Hub IP: 10.0.0.253 (name modem)
7 Server IP: 10.0.0.1
8
9 So I can change which internet connection I use simply by changing the default route on the server.
10 What I do, is set up my default gateway as being radio, with a metric of 0
11 route add default gw radio metric 0
12 I then add the backup route in as well, with a higher metric
2 CraigBox 13 route add default gw modem metric 10
1 DanielLawson 14
15 The higher metric makes this route less preferred. As I don't have load balancing enabled, it isn't used at all while the metric 0 route exists.
16
17 So, if I discover that the radio connection has gone down, I can manually fail over by issuing
18 route del default gw radio
19 When it comes back up, I can reinstate it as the default link by doing
20 route add default gw radio metric 0
21
22
23 The simplest way of knowing if the link has gone down is to ping a remote host. However, I need to make sure I am pinging said host via the radio link, or else when I drop the default route, the ping will succeed via the backup route.
24 So I add a route to the next-hop after the radio link.
25 route add ip.of.next.hop gw radio
26 I can then ping ip.of.next.hop - if it succeeds, the radio link is up. If it fails, the radio link is down.
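The check described above can be sketched as a small shell function. This is only an illustration, not part of the original setup: it assumes the host route to ip.of.next.hop via radio is already in place, and PING_CMD is a hypothetical override added here so the function can be tried without touching the network.

```shell
#!/bin/sh
# Sketch: report whether the primary link is up by pinging a host that is
# only reachable through it. PING_CMD is a hypothetical override (not in
# the original page) so the function can be exercised with a stub.
PING_CMD=${PING_CMD:-ping}

link_is_up() {
    target=$1
    # -c 5 sends five probes, -i 1 spaces them one second apart.
    # An exit status of 0 from ping means replies came back.
    if $PING_CMD -c 5 -i 1 "$target" > /dev/null 2>&1; then
        echo up
    else
        echo down
    fi
}
```

Calling `link_is_up ip.of.next.hop` prints `up` or `down`, which a wrapper script can act on.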
27
28
29 I can automate this in a number of ways. The one I settled on was a cron script that runs every minute.
30
2 CraigBox 31 :/usr/local/sbin# cat check-internet.pl
1 DanielLawson 32
33 #!/usr/bin/perl
34
35 $primary="radio";
36 $backup="modem";
37 $target="next-hop";
38 $emailaddr="daniel";
39
40 $state_file="/var/state/route-checks";
41 $tolerance="3";
42
43 $retval=system("ping $target -i 1 -c 5 > /dev/null");
44 $retval = $retval / 256;
45 $cur_state=`cat $state_file`;
46
47 if ($retval > 0) {
48 $next_state = $cur_state + 1;
49
50 if ($cur_state < $tolerance) {
51 system("echo $next_state > $state_file");
52 } elsif ($cur_state == $tolerance) {
53 system("echo $next_state > $state_file");
54 system("route del default gw $primary");
55 system("echo \"Dropping default gw ($primary)\" | mail -s \"Route Change (Down)\" $emailaddr");
56 } elsif ($cur_state > $tolerance ) {
57 #do nothing, already dropped the route!
58 }
59 } else {
60 $next_state = 0;
61
62 if ($cur_state == 0) {
63 #do nothing
64 } elsif ($cur_state <= $tolerance) {
65 # reset the counter
66 system("echo $next_state > $state_file");
67 } else {
68 #reset counter and bring up default route;
69 system("echo $next_state > $state_file");
70 system("route add default gw $primary");
71 system("echo \"Restoring default gw ($primary)\" | mail -s \"Route Change (UP)\" $emailaddr");
72 }
73 }
74
75
76 Ignore the sloppy coding (the empty if and else {} blocks were left in intentionally, as I was adding logging to the script and wanted to do some other things with it).
77
78 What this does is ping the next hop 5 times. Perl's system() returns ping's exit status multiplied by 256, which is why the script divides by 256. If ping exits with 0, the link is fine. With iputils ping, a return value of 256 (exit status 1) means no replies were received at all, and 512 (exit status 2) means some other error.
79 Every time we get a non-zero retval, we increment the counter stored in /var/state/route-checks.
80 When this counter reaches the tolerance value, we increment the counter again and drop the default route.
81 If we continue to get bad results, we ignore them.
82
83 If we get a 0 retval and the counter is less than or equal to the tolerance level, we reset the counter to 0.
84 If it's above the tolerance level, it means we've dropped the default route, so we reset the counter and bring the default route back up.
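The counter logic described above can be isolated into a pure function, which makes it easy to check without touching the routing table. The sketch below is a shell rendering of the same decisions the Perl script makes. The function name `decide` and its output format are made up for illustration.

```shell
#!/bin/sh
# decide COUNT PING_FAILED TOLERANCE
# Prints "<new count> <action>", where action is drop, restore or none.
# Hypothetical helper mirroring the branches of check-internet.pl.
decide() {
    count=$1 failed=$2 tolerance=$3
    if [ "$failed" -ne 0 ]; then
        if [ "$count" -lt "$tolerance" ]; then
            echo "$((count + 1)) none"     # still within tolerance
        elif [ "$count" -eq "$tolerance" ]; then
            echo "$((count + 1)) drop"     # threshold reached: drop primary route
        else
            echo "$count none"             # route already dropped, nothing to do
        fi
    else
        if [ "$count" -gt "$tolerance" ]; then
            echo "0 restore"               # link is back: reinstate primary route
        else
            echo "0 none"                  # reset (or keep) the counter at 0
        fi
    fi
}
```

Starting from a counter of 0 with a tolerance of 3, four consecutive failed checks are needed before `decide` reports `drop`, and a single successful check after that reports `restore`.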
85
86
2 CraigBox 87 That's pretty simple, and it works. Although with a tolerance of 3, as shown, it takes 4 minutes or so to work out the link is down and to bring the connection back up via the backup route. Cron can only schedule tasks every minute, so that's the finest granularity you'll get.
1 DanielLawson 88
89
2 CraigBox 90 Another method is to use [Nagios]. I have an event handler that I wrote for Nagios, but I never got it working nicely, hence writing my own above. I can provide it as a sample; however, I think the issues were due to the link going to a hard critical state straight away, and not being in a warning state for any length of time.
3 CraigBox 91
92 -- DanielLawson
4 EeroVolotinen 93
94
95 -
96
97 Just a little note from Eero. I don't even know if this is correct, but:
98
99 http://www.pcquest.com/content/linux/103091901.asp says something about gc_timeout?
100
101 If this is correct, I only need to set up 2 gateway cards and enter something like this:
102
103 # route add default gw 192.168.1.2 dev eth0 metric 0
104 # route add default gw 192.168.2.2 dev eth1 metric 10
6 EeroVolotinen 105 # echo 300 > /proc/sys/net/ipv4/route/gc_timeout (or: sysctl -w net.ipv4.route.gc_timeout=300)
5 EeroVolotinen 106
107 Usually the best way is to put sysctl settings in /etc/sysctl.conf
4 EeroVolotinen 108
109 and the kernel automagically monitors the links and uses the better one? I haven't tested this yet, but
110 I will soon.
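For the /etc/sysctl.conf suggestion, here is a minimal sketch of making the setting persistent. It says nothing about whether gc_timeout actually gives the failover behaviour hoped for above, which is still untested. The helper name `persist_sysctl` is made up for illustration, and the optional file argument exists only so it can be tried on a scratch file first.

```shell
#!/bin/sh
# persist_sysctl KEY VALUE [FILE]
# Append "KEY = VALUE" to FILE (default /etc/sysctl.conf) unless a line
# for KEY is already present. Hypothetical helper, not a standard tool.
persist_sysctl() {
    key=$1 value=$2 file=${3:-/etc/sysctl.conf}
    # Escape the dots in KEY so grep matches them literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^[[:space:]]*$pattern[[:space:]]*=" "$file" 2>/dev/null; then
        echo "already set"
    else
        echo "$key = $value" >> "$file"
        echo "added"
    fi
}
```

Running `persist_sysctl net.ipv4.route.gc_timeout 300` as root and then `sysctl -p` would load the value from /etc/sysctl.conf.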
111
112
113
114
115 -- Eero Volotinen (eero@jlug.fi)
3 CraigBox 116
117 -----
118 CategoryNetworking
The following authors of this page have not agreed to the WlugWikiLicense. As such copyright to all content on this page is retained by the original authors.
  • EeroVolotinen
The following authors of this page have agreed to the WlugWikiLicense.
