Bug 29252
Summary: | IPv6 doesn't work in a kvm guest. | ||
---|---|---|---|
Product: | Networking | Reporter: | Kusanagi Kouichi (slash) |
Component: | IPV6 | Assignee: | Hideaki YOSHIFUJI (yoshfuji) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | florian, maciej.rutecki, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.38-rc5 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 27352 |
Description
Kusanagi Kouichi
2011-02-16 13:03:11 UTC
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Wed, 16 Feb 2011 13:03:15 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=29252 > > Summary: IPv6 doesn't work in a kvm guest. > Product: Networking > Version: 2.5 > Kernel Version: 2.6.38-rc5 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: IPV6 > AssignedTo: yoshfuji@linux-ipv6.org > ReportedBy: slash@ac.auone-net.jp > Regression: Yes > > > apt-get and ping6 don't work in a kvm guest. It appears that neighbor > discovery > is not working. I'm using tap, virtio-net and vhost. I did bisect and results > is > > d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 is the first bad commit > commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 > Author: David S. Miller <davem@davemloft.net> > Date: Mon Jan 24 16:01:58 2011 -0800 > > ipv6: Always clone offlink routes. > > Do not handle PMTU vs. route lookup creation any differently > wrt. offlink routes, always clone them. > > Reported-by: PK <runningdoglackey@yahoo.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > > :040000 040000 > > c4a34e9e36bd8cc81e898738f2d4344f6647d472f0fcd4b093ec22784798f3951203391f4567b278 > M net > > reverting this commit for guest kernel fixes the probrem. > From: Andrew Morton <akpm@linux-foundation.org> Date: Thu, 17 Feb 2011 14:25:17 -0800 >> https://bugzilla.kernel.org/show_bug.cgi?id=29252 ... >> apt-get and ping6 don't work in a kvm guest. It appears that neighbor >> discovery >> is not working. I'm using tap, virtio-net and vhost. I did bisect and >> results >> is >> >> d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 is the first bad commit >> commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 >> Author: David S. Miller <davem@davemloft.net> >> Date: Mon Jan 24 16:01:58 2011 -0800 >> >> ipv6: Always clone offlink routes. >> >> Do not handle PMTU vs. route lookup creation any differently >> wrt. offlink routes, always clone them. >> >> Reported-by: PK <runningdoglackey@yahoo.com> >> Signed-off-by: David S. Miller <davem@davemloft.net> >> >> :040000 040000 >> >> c4a34e9e36bd8cc81e898738f2d4344f6647d472f0fcd4b093ec22784798f3951203391f4567b278 >> M net >> >> reverting this commit for guest kernel fixes the probrem. Thanks for the report, I'll look into this. v2.6.38-rc7-163-gfb62c00 doesn't work. Incidentally, I found that this bug affects only 32bit guest. Ok, the following should address both bugs, #29252 and #30462, please give it some testing. -------------------- ipv6: Don't create clones of nonexthop routes forever. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252 Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462 In commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 ("ipv6: Always clone offlink routes.") we forced the kernel to always clone offlink routes. The reason we do that is to make sure we never bind an inetpeer to a prefixed route. The logic turned on here has existed in the tree for many years, but was always off due to a protecting CPP define. So perhaps it's no surprise that there is a logic bug here. The issue is that there is nothing that stops the cloning process. Every route call will find the same route in the same condition, reclone it, and insert. This causes two problems: 1) If the in6_rt_ins() call succeeds, we get tons of crap in the routing tree. 2) If the int6_rt_ins() call fails, we get no networking to that destination. We've had reports of both cases. End the loop and the failures by keying on the RTF_CACHE flag which will only be set in the clone routes. Reported-by: slash@ac.auone-net.jp Reported-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 904312e..8963635 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -739,8 +739,10 @@ restart: if (!rt->rt6i_nexthop && !(rt->rt6i_flags & RTF_NONEXTHOP)) nrt = rt6_alloc_cow(rt, &fl->fl6_dst, &fl->fl6_src); - else + else if (!(rt->rt6i_flags & RTF_CACHE)) nrt = rt6_alloc_clone(rt, &fl->fl6_dst); + else + goto out2; dst_release(&rt->dst); rt = nrt ? : net->ipv6.ip6_null_entry; From: David Miller <davem@davemloft.net> Date: Wed, 09 Mar 2011 15:58:18 -0800 (PST) > Ok, the following should address both bugs, #29252 and #30462, please > give it some testing. > > -------------------- > ipv6: Don't create clones of nonexthop routes forever. Nevermind, this patch has problems, I'm still debugging and trying to come up with a proper fix. Thanks in advance for your patience. From: David Miller <davem@davemloft.net> Date: Wed, 09 Mar 2011 19:20:12 -0800 (PST) > From: David Miller <davem@davemloft.net> > Date: Wed, 09 Mar 2011 15:58:18 -0800 (PST) > >> Ok, the following should address both bugs, #29252 and #30462, please >> give it some testing. >> >> -------------------- >> ipv6: Don't create clones of nonexthop routes forever. > > Nevermind, this patch has problems, I'm still debugging and trying to > come up with a proper fix. > > Thanks in advance for your patience. Ok, I'm more confident in this version of the fix. It passes all of my tests, and I've added instrumentation to make sure various cases are performing the operations the way I expect them to. -------------------- ipv6: Don't create clones of host routes. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252 Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462 In commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 ("ipv6: Always clone offlink routes.") we forced the kernel to always clone offlink routes. The reason we do that is to make sure we never bind an inetpeer to a prefixed route. The logic turned on here has existed in the tree for many years, but was always off due to a protecting CPP define. So perhaps it's no surprise that there is a logic bug here. The problem is that we canot clone a route that is already a host route (ie. has DST_HOST set). Because if we do, an identical entry already exists in the routing tree and therefore the ip6_rt_ins() call is going to fail. This sets off a series of failures and high cpu usage, because when ip6_rt_ins() fails we loop retrying this operation a few times in order to handle a race between two threads trying to clone and insert the same host route at the same time. Fix this by simply using the route as-is when DST_HOST is set. Reported-by: slash@ac.auone-net.jp Reported-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> --- net/ipv6/route.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 904312e..e7db701 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -739,8 +739,10 @@ restart: if (!rt->rt6i_nexthop && !(rt->rt6i_flags & RTF_NONEXTHOP)) nrt = rt6_alloc_cow(rt, &fl->fl6_dst, &fl->fl6_src); - else + else if (!(rt->dst.flags & DST_HOST)) nrt = rt6_alloc_clone(rt, &fl->fl6_dst); + else + goto out2; dst_release(&rt->dst); rt = nrt ? : net->ipv6.ip6_null_entry; Your patch fixes the problem. Thanks. Hi, this patch solved the issue I reported. Applied the patch on -rc8. Regards //Ernst On Thu, Mar 10, 2011 at 05:04, David Miller <davem@davemloft.net> wrote: > From: David Miller <davem@davemloft.net> > Date: Wed, 09 Mar 2011 19:20:12 -0800 (PST) > >> From: David Miller <davem@davemloft.net> >> Date: Wed, 09 Mar 2011 15:58:18 -0800 (PST) >> >>> Ok, the following should address both bugs, #29252 and #30462, please >>> give it some testing. >>> >>> -------------------- >>> ipv6: Don't create clones of nonexthop routes forever. >> >> Nevermind, this patch has problems, I'm still debugging and trying to >> come up with a proper fix. >> >> Thanks in advance for your patience. > > Ok, I'm more confident in this version of the fix. It passes all of > my tests, and I've added instrumentation to make sure various cases > are performing the operations the way I expect them to. > > -------------------- > ipv6: Don't create clones of host routes. > MIME-Version: 1.0 > Content-Type: text/plain; charset=UTF-8 > Content-Transfer-Encoding: 8bit > > Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252 > Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462 > > In commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 ("ipv6: Always > clone offlink routes.") we forced the kernel to always clone offlink > routes. > > The reason we do that is to make sure we never bind an inetpeer to a > prefixed route. > > The logic turned on here has existed in the tree for many years, > but was always off due to a protecting CPP define. So perhaps > it's no surprise that there is a logic bug here. > > The problem is that we canot clone a route that is already a > host route (ie. has DST_HOST set). Because if we do, an identical > entry already exists in the routing tree and therefore the > ip6_rt_ins() call is going to fail. > > This sets off a series of failures and high cpu usage, because when > ip6_rt_ins() fails we loop retrying this operation a few times in > order to handle a race between two threads trying to clone and insert > the same host route at the same time. > > Fix this by simply using the route as-is when DST_HOST is set. > > Reported-by: slash@ac.auone-net.jp > Reported-by: Ernst Sjöstrand <ernstp@gmail.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > --- > net/ipv6/route.c | 4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) > > diff --git a/net/ipv6/route.c b/net/ipv6/route.c > index 904312e..e7db701 100644 > --- a/net/ipv6/route.c > +++ b/net/ipv6/route.c > @@ -739,8 +739,10 @@ restart: > > if (!rt->rt6i_nexthop && !(rt->rt6i_flags & RTF_NONEXTHOP)) > nrt = rt6_alloc_cow(rt, &fl->fl6_dst, &fl->fl6_src); > - else > + else if (!(rt->dst.flags & DST_HOST)) > nrt = rt6_alloc_clone(rt, &fl->fl6_dst); > + else > + goto out2; > > dst_release(&rt->dst); > rt = nrt ? : net->ipv6.ip6_null_entry; > -- > 1.7.4.1 > > From: Ernst Sjöstrand <ernstp@gmail.com> Date: Fri, 11 Mar 2011 20:44:39 +0100 > this patch solved the issue I reported. Applied the patch on -rc8. Thank you for testing. There was a commit merged for 2.6.38-rc9 (or final) which mentions this bug report: commit 7343ff31ebf01691ea4515d3126467434b9d22d6 Author: David S. Miller <davem@davemloft.net> Date: Wed Mar 9 19:55:25 2011 -0800 ipv6: Don't create clones of host routes. |