ipip: Fix handling of DF packets when pmtudisc is OFF
Herbert Xu [Fri, 6 Nov 2009 10:37:41 +0000 (10:37 +0000)]
RFC 2003 requires the outer header to have DF set if DF is set
on the inner header, even when PMTU discovery is off for the
tunnel.  Our implementation does exactly that.

For this to work properly the IPIP gateway also needs to engate
in PMTU when the inner DF bit is set.  As otherwise the original
host would not be able to carry out its PMTU successfully since
part of the path is only visible to the gateway.

Unfortunately when the tunnel PMTU discovery setting is off, we
do not collect the necessary soft state, resulting in blackholes
when the original host tries to perform PMTU discovery.

This problem is not reproducible on the IPIP gateway itself as
the inner packet usually has skb->local_df set.  This is not
correctly cleared (an unrelated bug) when the packet passes
through the tunnel, which allows fragmentation to occur.  For
hosts behind the IPIP gateway it is readily visible with a simple
ping.

This patch fixes the problem by performing PMTU discovery for
all packets with the inner DF bit set, regardless of the PMTU
discovery setting on the tunnel itself.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv4/ipip.c

index 08ccd34..ae40ed1 100644 (file)
@@ -438,25 +438,27 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
                goto tx_error;
        }
 
-       if (tiph->frag_off)
+       df |= old_iph->frag_off & htons(IP_DF);
+
+       if (df) {
                mtu = dst_mtu(&rt->u.dst) - sizeof(struct iphdr);
-       else
-               mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
 
-       if (mtu < 68) {
-               stats->collisions++;
-               ip_rt_put(rt);
-               goto tx_error;
-       }
-       if (skb_dst(skb))
-               skb_dst(skb)->ops->update_pmtu(skb_dst(skb), mtu);
+               if (mtu < 68) {
+                       stats->collisions++;
+                       ip_rt_put(rt);
+                       goto tx_error;
+               }
 
-       df |= (old_iph->frag_off&htons(IP_DF));
+               if (skb_dst(skb))
+                       skb_dst(skb)->ops->update_pmtu(skb_dst(skb), mtu);
 
-       if ((old_iph->frag_off&htons(IP_DF)) && mtu < ntohs(old_iph->tot_len)) {
-               icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
-               ip_rt_put(rt);
-               goto tx_error;
+               if ((old_iph->frag_off & htons(IP_DF)) &&
+                   mtu < ntohs(old_iph->tot_len)) {
+                       icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+                                 htonl(mtu));
+                       ip_rt_put(rt);
+                       goto tx_error;
+               }
        }
 
        if (tunnel->err_count > 0) {