Latest working kernel version: 2.6.26.2
Earliest failing kernel version: 2.6.27-rc2 (maybe earlier)
Distribution: Ubuntu
Hardware Environment: x86_64
Software Environment: 32-bit userspace/64-bit kernel

Problem Description: When using iptables to intercept addr:port and reroute
through an ssh tunnel, I see a huge performance hit on the 2.6.27-rc series
relative to 2.6.26 (34KB/s vs 1+MB/s).

Steps to reproduce:

Set up an ssh tunnel to one of the kernel.org servers using a system on your
local network:

ssh -L 8888:204.152.191.37:80 <local system>

Leave the ssh session running.  In a new terminal (on your local system),
verify the performance of direct access versus the tunnel:

wget -O /dev/null http://204.152.191.37/pub/linux/kernel/v2.6/linux-2.6.26.2.tar.bz2
wget -O /dev/null http://127.0.0.1:8888/pub/linux/kernel/v2.6/linux-2.6.26.2.tar.bz2

These should be roughly the same.  Now set up iptables so that when you try to
access 204.152.191.37:80 you'll automatically be redirected to the ssh tunnel:

sudo iptables -t nat -N bug
sudo iptables -t nat -I OUTPUT 1 -j bug
sudo iptables -t nat -A bug -d 204.152.191.37 -p tcp --dport 80 -j DNAT --to-destination 127.0.0.1:8888

Repeat the performance test:

wget -O /dev/null http://204.152.191.37/pub/linux/kernel/v2.6/linux-2.6.26.2.tar.bz2
wget -O /dev/null http://127.0.0.1:8888/pub/linux/kernel/v2.6/linux-2.6.26.2.tar.bz2

On 2.6.27-rc2+, my rate quickly drops to ~34KB/s using the iptables-NAT'd
wget (204.152.191.37), while the ssh tunnel still runs at 1+MB/s.  On 2.6.26
I get similar performance for both paths.
Reply-To: akpm@linux-foundation.org

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 12 Aug 2008 22:04:41 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11316
>
>            Summary: severe performance regression for iptables nat routing
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.27-rc3
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Netfilter/Iptables
>         AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
>         ReportedBy: alex.williamson@hp.com
>
> [full report quoted above; snipped]
git bisect traced the problem back to this changeset:

commit e5a4a72d4f88f4389e9340d383ca67031d1b8536
Author: Lennert Buytenhek <buytenh@marvell.com>
Date:   Sun Aug 3 01:23:10 2008 -0700

    net: use software GSO for SG+CSUM capable netdevices

I've verified that I can toggle the slowness by reverting this patch on
top of 8d0968ab (current head).  The problem is readily reproducible
using Ubuntu Hardy in a KVM VM with an upstream, defconfig kernel.

On Tue, 2008-08-12 at 22:12 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> [full report quoted above; snipped]
From: Alex Williamson <alex.williamson@hp.com>
Date: Wed, 13 Aug 2008 20:08:20 -0600

> git bisect traced the problem back to this changeset:
>
> commit e5a4a72d4f88f4389e9340d383ca67031d1b8536
> Author: Lennert Buytenhek <buytenh@marvell.com>
> Date:   Sun Aug 3 01:23:10 2008 -0700
>
>     net: use software GSO for SG+CSUM capable netdevices
>
> I've verified that I can toggle the slowness by reverting this patch on
> top of 8d0968ab (current head).  The problem is readily reproducible
> using Ubuntu Hardy in a KVM VM with an upstream, defconfig kernel.

Patrick, I wonder if there is a case where iptables NAT will COW the
packet when it really doesn't need to.

It seems, if anything, using GSO should make things go a little bit
faster, not slower...  Hmmm...

Anyways, if we can't figure this one out soon we can easily revert.
David Miller wrote:
> From: Alex Williamson <alex.williamson@hp.com>
> Date: Wed, 13 Aug 2008 20:08:20 -0600
>
>> git bisect traced the problem back to this changeset:
>>
>> commit e5a4a72d4f88f4389e9340d383ca67031d1b8536
>> Author: Lennert Buytenhek <buytenh@marvell.com>
>> Date:   Sun Aug 3 01:23:10 2008 -0700
>>
>>     net: use software GSO for SG+CSUM capable netdevices
>>
>> I've verified that I can toggle the slowness by reverting this patch on
>> top of 8d0968ab (current head).  The problem is readily reproducible
>> using Ubuntu Hardy in a KVM VM with an upstream, defconfig kernel.
>
> Patrick, I wonder if there is a case where iptables NAT will COW the
> packet when it really doesn't need to.

I don't think so; it's using skb_make_writable everywhere, which checks
skb_clone_writable, which should usually avoid COWing local TCP packets.
It would also be unlikely to have that much of a performance impact
(1+MB/s -> 34KB/s).

> It seems, if anything, using GSO should make things go a little bit
> faster, not slower...  Hmmm...

Alex, could you post a tcpdump from both loopback and the outgoing
device on the machine you're doing NAT on?
On Thu, 2008-08-14 at 13:04 +0200, Patrick McHardy wrote:
> I don't think so; it's using skb_make_writable everywhere, which checks
> skb_clone_writable, which should usually avoid COWing local TCP packets.
> It would also be unlikely to have that much of a performance impact
> (1+MB/s -> 34KB/s).
>
> Alex, could you post a tcpdump from both loopback and the outgoing
> device on the machine you're doing NAT on?

Attached; let me know if you want more options, this is just -vv -n.
The NAT'ing system is at 10.0.2.15 and the ssh tunnel target is
192.168.1.60.  Thanks,

	Alex
From: Patrick McHardy <kaber@trash.net>
Date: Thu, 14 Aug 2008 13:04:25 +0200

> David Miller wrote:
>> Patrick, I wonder if there is a case where iptables NAT will COW the
>> packet when it really doesn't need to.
>
> I don't think so; it's using skb_make_writable everywhere, which checks
> skb_clone_writable, which should usually avoid COWing local TCP packets.
> It would also be unlikely to have that much of a performance impact
> (1+MB/s -> 34KB/s).

I think he is NAT'ing locally generated traffic; look at the bugzilla
entry.

He has two cases of the same wget transfer: one is direct, and another
uses a 127.0.0.1:XXXX URL that does the transfer over an SSH tunnel.
Normally they go at roughly the same rate.

Then he adds iptables NAT entries that redirect the first transfer over
the SSH tunnel addr/port, and it is this case that degrades in
performance with the GSO changeset.

So it is locally generated TCP traffic, NAT'd to another port and IP
address (specifically, redirected to 127.0.0.1:8888).

Perhaps the problem has something to do with the fact that, as far as
TCP is concerned, the destination device can do SG and CSUM and thus
GSO.  But then iptables NATs this traffic to loopback.  I think that is
what leads to some kind of slowpath.
David Miller <davem@davemloft.net> wrote:
>
> Patrick, I wonder if there is a case where iptables NAT will COW the
> packet when it really doesn't need to.

This doesn't make sense.  He's downloading from a remote host, so GSO
shouldn't even come into play.

Cheers,
Alex Williamson <alex.williamson@hp.com> wrote:
>
> Attached; let me know if you want more options, this is just -vv -n.
> The NAT'ing system is at 10.0.2.15 and the ssh tunnel target is
> 192.168.1.60.  Thanks,

Right, the underlying TCP connection is going well, but the NATed
connection is getting checksum errors.  Please send us the raw packet
dump on lo (tcpdump -s 1600 -w file) so we can see what's wrong.

Actually, I think I know what's going on, but a raw packet dump should
confirm whether we're getting a partial checksum.

Thanks,
On Fri, 2008-08-15 at 14:44 +1000, Herbert Xu wrote:
> Right, the underlying TCP connection is going well, but the NATed
> connection is getting checksum errors.  Please send us the raw packet
> dump on lo (tcpdump -s 1600 -w file) so we can see what's wrong.

Here it is.  Thanks,

	Alex
On Fri, Aug 15, 2008 at 02:44:26PM +1000, Herbert Xu wrote:
>
> Actually, I think I know what's going on, but a raw packet dump should
> confirm whether we're getting a partial checksum.

Never mind, I think I've found the problem.

loopback: Drop obsolete ip_summed setting

Now that the network stack can handle inbound packets with partial
checksums, we should no longer clobber the ip_summed field in the
loopback driver.  This is because CHECKSUM_UNNECESSARY implies that
the checksum field is actually valid, which is not true for loopback
packets, since it's only partial (and thus complemented).

This allows packets from lo to then be SNATed to an external source
while still preserving the checksum's validity.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 49f6bc0..810e292 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -137,9 +137,6 @@ static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_orphan(skb);
 
 	skb->protocol = eth_type_trans(skb,dev);
-#ifndef LOOPBACK_MUST_CHECKSUM
-	skb->ip_summed = CHECKSUM_UNNECESSARY;
-#endif
 
 #ifdef LOOPBACK_TSO
 	if (skb_is_gso(skb)) {
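[Editor's note: to see why labeling a partial checksum as CHECKSUM_UNNECESSARY
is wrong, here is a userspace sketch (Python, not kernel code; the pseudo-header
bytes are invented) of the RFC 1071 one's-complement checksum.  A
CHECKSUM_PARTIAL packet carries only the pseudo-header sum, which a device is
expected to complete later, so it does not verify as a final checksum:]

```python
def csum16(data: bytes, init: int = 0) -> int:
    """RFC 1071 one's-complement sum over 16-bit big-endian words."""
    s = init
    if len(data) % 2:
        data = data + b"\x00"          # pad odd-length input
    for i in range(0, len(data), 2):
        s += (data[i] << 8) | data[i + 1]
        s = (s & 0xFFFF) + (s >> 16)   # end-around carry
    return s

# Invented pseudo-header (src/dst 127.0.0.1, proto 6, length 20) and payload.
pseudo_hdr = b"\x7f\x00\x00\x01\x7f\x00\x00\x01\x00\x06\x00\x14"
payload = b"hello, loopback!"

# CHECKSUM_PARTIAL: only the pseudo-header has been summed so far; the
# device (or a software fallback) must still fold in the payload.
partial = csum16(pseudo_hdr)

# A valid final transport checksum is the complement of the sum over
# pseudo-header plus payload.
final = ~csum16(payload, init=partial) & 0xFFFF

# A receiver verifies by summing everything including the checksum field:
# only the genuinely final value folds to 0xFFFF.
assert csum16(payload + final.to_bytes(2, "big"), init=partial) == 0xFFFF
# The partial value is NOT valid as-is, so marking such a packet
# CHECKSUM_UNNECESSARY ("already verified") mislabels it.
assert csum16(payload + partial.to_bytes(2, "big"), init=partial) != 0xFFFF
```

The bytes and constants here are illustrative only; the real decision lives in
the kernel's skb->ip_summed handling.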
On Fri, 2008-08-15 at 15:35 +1000, Herbert Xu wrote:
> Never mind, I think I've found the problem.
>
> loopback: Drop obsolete ip_summed setting
>
> [...]

Nope, that doesn't fix it.  NAT'd throughput remains about the same.
Thanks,

	Alex
Alex Williamson <alex.williamson@hp.com> wrote:
>
> Nope, that doesn't fix it.  NAT'd throughput remains about the same.

Please take the raw packet dump on lo then.

Thanks,
On Thu, Aug 14, 2008 at 11:30:37PM -0600, Alex Williamson wrote:
>
> Here it is.  Thanks,

Can you also post all your netfilter rules (filter + NAT), please?

Thanks,
On Fri, Aug 15, 2008 at 05:33:43PM +1000, Herbert Xu wrote:
> Can you also post all your netfilter rules (filter + NAT), please?

It's OK, I can reproduce it now.

Cheers,
On Fri, Aug 15, 2008 at 06:14:42PM +1000, Herbert Xu wrote:
>
> It's OK, I can reproduce it now.

This fixes it for me.

loopback: Enable TSO

This patch enables TSO since the loopback device is naturally
capable of handling packets of any size.  This also means that
we won't enable GSO on lo, which is good until GSO is fixed to
preserve netfilter state, as netfilter treats loopback packets
in a special way.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I'll work on the netfilter state preservation next.

Cheers,
--
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 49f6bc0..c11e621 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -234,9 +231,7 @@ static void loopback_setup(struct net_device *dev)
 	dev->type		= ARPHRD_LOOPBACK;	/* 0x0001*/
 	dev->flags		= IFF_LOOPBACK;
 	dev->features		= NETIF_F_SG | NETIF_F_FRAGLIST
-#ifdef LOOPBACK_TSO
 				  | NETIF_F_TSO
-#endif
 				  | NETIF_F_NO_CSUM
 				  | NETIF_F_HIGHDMA
 				  | NETIF_F_LLTX
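[Editor's note: why does advertising TSO change anything?  A toy model
(Python; the constants and path names are invented, this is not the actual
kernel logic) of the transmit-path decision the bisected changeset altered:]

```python
# Invented feature bits standing in for netdev feature flags.
NETIF_F_SG, NETIF_F_CSUM, NETIF_F_TSO = 1, 2, 4

def xmit_path(features: int, gso_size: int) -> str:
    """Pick how a large TCP skb leaves a device (illustrative sketch)."""
    if gso_size == 0:
        return "plain"              # small packet, nothing to segment
    if features & NETIF_F_TSO:
        return "passthrough"        # device segments; skb stays whole
    if (features & NETIF_F_SG) and (features & NETIF_F_CSUM):
        return "software-gso"       # stack segments via skb_gso_segment
    return "early-segment"          # TCP must segment before transmit

# Before the fix: lo advertised SG+CSUM but not TSO, so large loopback
# skbs went through software GSO -- which, pre-__copy_skb_header,
# dropped netfilter state and broke the NATed connection.
assert xmit_path(NETIF_F_SG | NETIF_F_CSUM, gso_size=16384) == "software-gso"
# After the fix: lo also advertises TSO, so the skb passes through intact.
assert xmit_path(NETIF_F_SG | NETIF_F_CSUM | NETIF_F_TSO, gso_size=16384) == "passthrough"
```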
On Fri, Aug 15, 2008 at 08:32:35PM +1000, Herbert Xu wrote:
>
> I'll work on the netfilter state preservation next.

Here it is:

net: Preserve netfilter attributes in skb_gso_segment using __copy_skb_header

skb_gso_segment didn't preserve some attributes in the original skb,
such as the netfilter fields.  This was harmless until they were used,
which is the case for packets going through lo.

This patch makes it call __copy_skb_header, which also picks up some
other missing attributes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 8464017..ca1ccdf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2256,14 +2256,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 			segs = nskb;
 		tail = nskb;
 
-		nskb->dev = skb->dev;
-		skb_copy_queue_mapping(nskb, skb);
-		nskb->priority = skb->priority;
-		nskb->protocol = skb->protocol;
-		nskb->vlan_tci = skb->vlan_tci;
-		nskb->dst = dst_clone(skb->dst);
-		memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
-		nskb->pkt_type = skb->pkt_type;
+		__copy_skb_header(nskb, skb);
 		nskb->mac_len = skb->mac_len;
 
 		skb_reserve(nskb, headroom);
@@ -2274,6 +2267,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 			skb_copy_from_linear_data(skb, skb_put(nskb, doffset),
 						  doffset);
 		if (!sg) {
+			nskb->ip_summed = CHECKSUM_NONE;
 			nskb->csum = skb_copy_and_csum_bits(skb, offset,
 							    skb_put(nskb, len),
 							    len, 0);
@@ -2283,8 +2277,6 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features)
 		frag = skb_shinfo(nskb)->frags;
 		k = 0;
 
-		nskb->ip_summed = CHECKSUM_PARTIAL;
-		nskb->csum = skb->csum;
 		skb_copy_from_linear_data_offset(skb, offset,
 						 skb_put(nskb, hsize), hsize);

Cheers,
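[Editor's note: the effect of switching skb_segment over to __copy_skb_header
can be sketched in miniature.  Python, illustrative only; 'mark' and 'nfct'
are stand-ins for the netfilter fields, not real skb members:]

```python
MSS = 4  # toy segment size

def segment(pkt: dict, copy_header) -> list:
    """Split one oversized 'packet' into MSS-sized segments, using
    copy_header() to carry metadata onto each segment (analogous to
    skb_segment calling __copy_skb_header on every nskb)."""
    data = pkt["payload"]
    segs = []
    for off in range(0, len(data), MSS):
        seg = copy_header(pkt)
        seg["payload"] = data[off:off + MSS]
        segs.append(seg)
    return segs

pkt = {"proto": "tcp", "mark": 0x1, "nfct": "conn#42",
       "payload": b"abcdefghij"}

# Buggy copier: copies only a few fields (the pre-fix behaviour).
partial_copy = lambda p: {"proto": p["proto"]}
# Fixed copier: copies every header field (the __copy_skb_header fix).
full_copy = lambda p: {k: v for k, v in p.items() if k != "payload"}

broken = segment(pkt, partial_copy)
fixed = segment(pkt, full_copy)
assert "nfct" not in broken[0]           # conntrack state lost per segment
assert fixed[0]["nfct"] == "conn#42"     # state preserved per segment
assert len(fixed) == 3 and fixed[-1]["payload"] == b"ij"
```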
Handled-By : Herbert Xu <herbert@gondor.apana.org.au>
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=11316#c15
Patch      : http://bugzilla.kernel.org/show_bug.cgi?id=11316#c16
On Fri, 2008-08-15 at 20:53 +1000, Herbert Xu wrote:
> On Fri, Aug 15, 2008 at 08:32:35PM +1000, Herbert Xu wrote:
>>
>> I'll work on the netfilter state preservation next.
>
> Here it is:

Confirmed, these patches solve the problem.  Thanks Herbert.
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 Aug 2008 20:32:35 +1000

> loopback: Enable TSO
>
> This patch enables TSO since the loopback device is naturally
> capable of handling packets of any size.  This also means that
> we won't enable GSO on lo, which is good until GSO is fixed to
> preserve netfilter state, as netfilter treats loopback packets
> in a special way.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

This, effectively, "enables" LRO on loopback.

And sure, it's pretty obscure to shape, NAT, and end up forwarding
loopback-received packets, but do you want to be the user trying to do
something like that and trying to find this particular patch which is
causing it to not work? :-)

I really don't know whether it's worth worrying about; I just wanted
to mention it.
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 Aug 2008 20:32:35 +1000

> loopback: Enable TSO
>
> [...]
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Meanwhile, I applied this and took the liberty of applying the
following right afterwards:

loopback: Remove rest of LOOPBACK_TSO code.

It hasn't been enabled for a long time, and the generic GSO engine is
better documentation of what is expected of a device implementing TSO.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/loopback.c |   62 ------------------------------------------------
 1 files changed, 0 insertions(+), 62 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 46e87cc..489d53b 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -64,68 +64,6 @@ struct pcpu_lstats {
 	unsigned long bytes;
 };
 
-/* KISS: just allocate small chunks and copy bits.
- *
- * So, in fact, this is documentation, explaining what we expect
- * of largesending device modulo TCP checksum, which is ignored for loopback.
- */
-
-#ifdef LOOPBACK_TSO
-static void emulate_large_send_offload(struct sk_buff *skb)
-{
-	struct iphdr *iph = ip_hdr(skb);
-	struct tcphdr *th = (struct tcphdr *)(skb_network_header(skb) +
-					      (iph->ihl * 4));
-	unsigned int doffset = (iph->ihl + th->doff) * 4;
-	unsigned int mtu = skb_shinfo(skb)->gso_size + doffset;
-	unsigned int offset = 0;
-	u32 seq = ntohl(th->seq);
-	u16 id = ntohs(iph->id);
-
-	while (offset + doffset < skb->len) {
-		unsigned int frag_size = min(mtu, skb->len - offset) - doffset;
-		struct sk_buff *nskb = alloc_skb(mtu + 32, GFP_ATOMIC);
-
-		if (!nskb)
-			break;
-		skb_reserve(nskb, 32);
-		skb_set_mac_header(nskb, -ETH_HLEN);
-		skb_reset_network_header(nskb);
-		iph = ip_hdr(nskb);
-		skb_copy_to_linear_data(nskb, skb_network_header(skb),
-					doffset);
-		if (skb_copy_bits(skb,
-				  doffset + offset,
-				  nskb->data + doffset,
-				  frag_size))
-			BUG();
-		skb_put(nskb, doffset + frag_size);
-		nskb->ip_summed = CHECKSUM_UNNECESSARY;
-		nskb->dev = skb->dev;
-		nskb->priority = skb->priority;
-		nskb->protocol = skb->protocol;
-		nskb->dst = dst_clone(skb->dst);
-		memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
-		nskb->pkt_type = skb->pkt_type;
-
-		th = (struct tcphdr *)(skb_network_header(nskb) + iph->ihl * 4);
-		iph->tot_len = htons(frag_size + doffset);
-		iph->id = htons(id);
-		iph->check = 0;
-		iph->check = ip_fast_csum((unsigned char *) iph, iph->ihl);
-		th->seq = htonl(seq);
-		if (offset + doffset + frag_size < skb->len)
-			th->fin = th->psh = 0;
-		netif_rx(nskb);
-		offset += frag_size;
-		seq += frag_size;
-		id++;
-	}
-
-	dev_kfree_skb(skb);
-}
-#endif /* LOOPBACK_TSO */
-
 /*
  * The higher levels take care of making this non-reentrant (it's
  * called with bh's disabled).
From: Alex Williamson <alex.williamson@hp.com>
Date: Fri, 15 Aug 2008 09:34:47 -0600

> On Fri, 2008-08-15 at 20:53 +1000, Herbert Xu wrote:
>> On Fri, Aug 15, 2008 at 08:32:35PM +1000, Herbert Xu wrote:
>>>
>>> I'll work on the netfilter state preservation next.
>>
>> Here it is:
>
> Confirmed, these patches solve the problem.  Thanks Herbert.

Thanks for your report and for testing the fix, Alex.
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 Aug 2008 20:53:18 +1000

> net: Preserve netfilter attributes in skb_gso_segment using __copy_skb_header
>
> skb_gso_segment didn't preserve some attributes in the original skb,
> such as the netfilter fields.  This was harmless until they were used,
> which is the case for packets going through lo.
>
> This patch makes it call __copy_skb_header, which also picks up some
> other missing attributes.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 15 Aug 2008 15:35:48 +1000

> loopback: Drop obsolete ip_summed setting
>
> Now that the network stack can handle inbound packets with partial
> checksums, we should no longer clobber the ip_summed field in the
> loopback driver.  This is because CHECKSUM_UNNECESSARY implies that
> the checksum field is actually valid, which is not true for loopback
> packets, since it's only partial (and thus complemented).
>
> This allows packets from lo to then be SNATed to an external source
> while still preserving the checksum's validity.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

I've applied this one too, let me know if I should not have :)
On Fri, Aug 15, 2008 at 01:58:51PM -0700, David Miller wrote:
>
> This, effectively, "enables" LRO on loopback.
>
> And sure, it's pretty obscure to shape, NAT, and end up forwarding
> loopback-received packets, but do you want to be the user trying to do
> something like that and trying to find this particular patch which is
> causing it to not work? :-)
>
> I really don't know whether it's worth worrying about; I just wanted
> to mention it.

Well, the same code path is also used by Xen and virtio (apart from the
netfilter bits, which caused this particular bug), so we should be
pretty safe here.

Cheers,
Verified fixed in 2.6.27-rc4