A simple introduction (with a nice easy example) to source based routing

On standard Internet systems, when you receive a packet and decide where to route it to, that decision is made only based on the destination of the packet.

For example:

 crb@firewall:~$ /sbin/route -n
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
                                                 UH    0      0        0 ppp0
                                                 U     0      0        0 eth1
                                                 U     0      0        0 eth0
                                                 UG    0      0        0 ppp0

In this simple routing table for a firewall, traffic for one local network is routed out eth1, traffic for the other is routed out eth0, and everything else is routed out ppp0 to the Internet.

However, let's deal with the situation where we have two interfaces, ppp0 and ppp1 (a dual-homed situation, with two Internet providers). We will call the IP address on ppp0 $P0 and the one on ppp1 $P1.

You end up with a routing table that looks like this:

 crb@firewall:~$ /sbin/route -n
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
                                                 UH    0      0        0 ppp1
                                                 UH    0      0        0 ppp0
                                                 U     0      0        0 eth1
                                                 U     0      0        0 eth0
                                                 UG    0      0        0 ppp0

If traffic for your machine comes in on ppp1 from $OUTSIDE, the machine will receive the packets and generate a reply, so the system now has a packet from $P1 destined for $OUTSIDE. Because the system only looks at destination IP addresses, the packet will be routed out the default gateway, ppp0. Even if you disable ReversePathFiltering to allow this kind of traffic on all your interfaces, chances are high that your ISP will be using it. (For example, TelstraClear does not allow any traffic on its network that originates from another network's IP.)

So, this is where source based routing comes in. We need to take any traffic that originates from $P1 (replies to traffic that came in ppp1), and route it back out through ppp1.

To do this we need the iproute2 package, which provides the command /sbin/ip and gives you much finer-grained control over routing. If you don't have the /sbin/ip command, install an iproute package (debian: apt-get install iproute). The route command cannot handle multiple routing tables.

You also need to have a couple of kernel options enabled: they are CONFIG_IP_ADVANCED_ROUTER (Networking/IP: Advanced Router) and CONFIG_IP_MULTIPLE_TABLES (Networking/IP: policy routing).

Next, create another routing table by editing /etc/iproute2/rt_tables. In my example I wish to create routes for a jetstream connection, so I have called the table 'jetstream' <footnote 1> by adding the line

 100     jetstream
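If the table is registered from a script, it helps to make the step safe to repeat. The following is a minimal sketch, not part of the original setup: add_rt_table is a hypothetical helper name, and the demonstration writes to a scratch file rather than the real /etc/iproute2/rt_tables.

```shell
#!/bin/sh
# Register a named routing table unless a line for that name already exists.
# Usage: add_rt_table FILE ID NAME   (add_rt_table is a hypothetical helper)
add_rt_table() {
    file=$1
    id=$2
    name=$3
    # Skip if a non-comment line "<number> <name>" is already present.
    if grep -Eq "^[[:space:]]*[0-9]+[[:space:]]+${name}[[:space:]]*\$" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s\n' "$id" "$name" >> "$file"
}

# Demonstration against a scratch file; point RT_TABLES at
# /etc/iproute2/rt_tables (as root) to apply it for real.
RT_TABLES=${RT_TABLES:-$(mktemp)}
add_rt_table "$RT_TABLES" 100 jetstream
add_rt_table "$RT_TABLES" 100 jetstream   # second call changes nothing
cat "$RT_TABLES"
```

Running it twice in a row leaves a single jetstream line in the file.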

Now, you can create a rule that dictates what routing table to look at.

 ip rule add from $P1 table jetstream

Look at the rules with ip rule list to get an idea of what happens when a packet is to be routed. The important bit is the from $P1. If you forget it, depending on the priority of your rule, you could send all traffic to that table by default. Now, when routing, a packet that comes from the IP address $P1 will be passed to the routing table 'jetstream' instead of the main routing table.

Populate this table with a new default route, and simple routes for the rest of your local interfaces:

 ip route add dev eth0 table jetstream
 ip route add dev eth1 table jetstream
 ip route add dev lo table jetstream
 ip route add default via table jetstream

And you're done. In my case, I'm doing this on a ppp interface, so I only need the routes to exist when the interface is up; I've therefore added scripts for this to /etc/ppp/ip-up.d/ (ip-down.d contains ip rule del; I leave the table there - it's no harm if it's not called, but you could remove it with ip route del).
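As a rough sketch of what such an ip-up.d script could look like: the file name, the UPLINK and DRY_RUN variables and the run helper are all assumptions made for illustration, and it relies on Debian's pppd wrapper exporting PPP_IFACE, PPP_LOCAL and PPP_REMOTE. It only adds the rule and the default route; the routes for the local interfaces are left out. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of /etc/ppp/ip-up.d/jetstream (hypothetical file name).
# Assumes PPP_IFACE, PPP_LOCAL and PPP_REMOTE are set in the environment,
# as Debian's /etc/ppp/ip-up does before running the ip-up.d scripts.
TABLE=jetstream
UPLINK=${UPLINK:-ppp1}   # interface that should use the extra table (assumed)
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands; 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

jetstream_up() {
    # Do nothing when some other ppp interface comes up.
    [ "$PPP_IFACE" = "$UPLINK" ] || return 0
    # Replies sourced from this link's address consult the extra table...
    run ip rule add from "$PPP_LOCAL" table "$TABLE"
    # ...whose default route points back out of the same link.
    run ip route add default via "$PPP_REMOTE" dev "$PPP_IFACE" table "$TABLE"
}

jetstream_up
```

The matching ip-down.d script would mirror the first command with ip rule del from "$PPP_LOCAL" table "$TABLE".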

Thanks to PerryLorier for explaining this all to me, and to the Linux Advanced Routing and Traffic Control HOWTO for filling in the detail; specifically Routing for multiple uplinks/providers.

Dual-Homed Setup using a single interface

This covers the situation where your network has multiple routes out to the Internet, but you want to determine the path of traffic at the local box rather than at the edge.

To do this, assign the local box two IP addresses in the local subnet.

 whisky:~# ip addr add brd dev eth0
 whisky:~# ip -4 addr list dev eth0
 2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
     inet brd scope global eth0
     inet brd scope global secondary eth0

Now add a rule to the RPDB (the routing policy database):

 whisky:~# ip rule add prio 200 from lookup ORCON
 whisky:~# ip rule
 0:      from all lookup local
 200:    from lookup ORCON   --- Footnote 2
 32766:  from all lookup main
 32767:  from all lookup default

Now all packets with that source IP address will use the ORCON routing table, and all other packets will use the 'main' routing table.

 whisky:~# ip ro list table main
 dev eth0  proto kernel  scope link  src
 default via dev eth0
 whisky:~# ip ro list table ORCON
 dev eth0  scope link  src
 default via dev eth0  src

Now, if you have a program which does not allow you to set a bind IP address which it will use for connections, you can use the iptables mangle table to MARK the packets you wish to route differently.

I will be doing this for a specific uid, so that all locally generated traffic owned by uid 1004 will be routed using the ORCON routing table.

Firstly, create a rule in the OUTPUT chain of the mangle table:

 whisky:~# iptables -t mangle -A OUTPUT -o eth0 -m owner --uid-owner 1004 -j MARK --set-mark 0x1
 whisky:~# iptables -t mangle -nvL OUTPUT
 Chain OUTPUT (policy ACCEPT 25152 packets, 6543K bytes)
  pkts bytes target     prot opt in     out     source           destination
     0     0 MARK       all  --  *      eth0           OWNER UID match 1004 MARK set 0x1

Secondly, create a rule in the RPDB specifying that packets MARKed 0x1 should use the ORCON routing table.

 whisky:~# ip rule add prio 199 fwmark 0x1 lookup ORCON

You can use any iptables match (within reason) to modify the routing table which will be used. Anything which will not match every single packet in a connection will not work and will break the end-to-end nature of TCP.

1 - It's added at entry 100 so that you can add other tables either side of it later if you want. This isn't a priority. The only value which affects packets which match multiple rules is the prio, as in:

 ip rule add prio 300 from $IP lookup $TABLE

2 - I have added a line in /etc/iproute2/rt_tables containing '100 ORCON'

This works fine for small data packets but doesn't seem to match on follow-on packets. To handle this you need CONNMARK tracking and matching as well:

 iptables -t mangle -A OUTPUT -o eth0 -j CONNMARK --restore-mark
 iptables -t mangle -A OUTPUT -o eth0 -m mark ! --mark 0 -j RETURN
 iptables -t mangle -A OUTPUT -o eth0 -m owner --uid-owner 1004 -j MARK --set-mark 0x1
 iptables -t mangle -A OUTPUT -o eth0 -m mark ! --mark 0 -j CONNMARK --save-mark

This will look up the current packet in the connection tracking table and restore the mark that was initially assigned to this connection. If this provides a mark value, no further mangling is done. If not, the uid-owner match is tested and, if it succeeds, the mark is set and then saved to the connection tracking table.
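To make the order of those four rules easier to follow, here is a toy model of them in plain shell. It touches no real packets or netfilter state; process_packet, CONN_MARK and PKT_MARK are names invented for this illustration, and the argument stands in for whether the owner match succeeds for a given packet.

```shell
#!/bin/sh
# Toy model of the four mangle rules above; purely illustrative.
CONN_MARK=0   # mark remembered for the connection (connection tracking)
PKT_MARK=0    # mark on the packet currently being processed

# process_packet OWNER_MATCHES
# OWNER_MATCHES is 1 if the uid-owner match succeeds for this packet, else 0.
process_packet() {
    PKT_MARK=$CONN_MARK                            # rule 1: CONNMARK --restore-mark
    [ "$PKT_MARK" -ne 0 ] && return 0              # rule 2: already marked, RETURN
    [ "$1" -eq 1 ] && PKT_MARK=1                   # rule 3: owner match sets MARK 0x1
    [ "$PKT_MARK" -ne 0 ] && CONN_MARK=$PKT_MARK   # rule 4: CONNMARK --save-mark
    return 0
}

process_packet 1; echo "first packet:     mark=$PKT_MARK"
process_packet 0; echo "follow-on packet: mark=$PKT_MARK"
```

Both lines report mark=1: the second packet fails the owner match in this model, but it inherits the mark saved on the connection by the first, so it is routed through the same table.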