Latest working kernel version: 2.6.26-rc4
Earliest failing kernel version: 2.6.26
Distribution: NA
Hardware Environment: x86
Software Environment: ip
Problem Description:

$ ip -f inet6 route get fec0::1

Produces this oops -

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<c0369b85>] rt6_fill_node+0x175/0x3b0
*pdpt = 0000000036466001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in: pcnet32 smsc47m192 i2c_i801 i2c_dev i2c_core r8169 coretemp it87 hwmon_vid lcm e1000e
Pid: 3033, comm: ip Not tainted (2.6.26.2 #1)
EIP: 0060:[<c0369b85>] EFLAGS: 00010246 CPU: 1
EIP is at rt6_fill_node+0x175/0x3b0
EAX: 00000000 EBX: f7115bbc ECX: 00000000 EDX: f7115c60
ESI: f7c1f100 EDI: f7548f00 EBP: f7115bdc ESP: f7115ba4
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process ip (pid: 3033, ti=f7114000 task=f64cbc50 task.ti=f7114000)
Stack: f7115bbc 00000000 f7115c54 f7115bc0 f7115c60 f6d75078 00000000 f7115bdc
       c036a5f0 c036b360 00000000 f75487a0 00000000 f7548f00 f7115c9c c036c30e
       f7115c70 00000000 00000018 00000bd9 489b2024 00000000 00000000 00000000
Call Trace:
 [<c036a5f0>] ? ip6_route_output+0x50/0xa0
 [<c036b360>] ? ip6_pol_route_output+0x0/0x20
 [<c036c30e>] ? inet6_rtm_getroute+0x16e/0x200
 [<c036c1a0>] ? inet6_rtm_getroute+0x0/0x200
 [<c030ef19>] ? rtnetlink_rcv_msg+0x1b9/0x1f0
 [<c030ed60>] ? rtnetlink_rcv_msg+0x0/0x1f0
 [<c031426d>] ? netlink_rcv_skb+0x8d/0xb0
 [<c030ed57>] ? rtnetlink_rcv+0x17/0x20
 [<c031402d>] ? netlink_unicast+0x23d/0x270
 [<c030162a>] ? memcpy_fromiovec+0x4a/0x70
 [<c0314811>] ? netlink_sendmsg+0x1c1/0x290
 [<c02fa165>] ? sock_sendmsg+0xc5/0xf0
 [<c01363a0>] ? autoremove_wake_function+0x0/0x50
 [<c01363a0>] ? autoremove_wake_function+0x0/0x50
 [<c02fa165>] ? sock_sendmsg+0xc5/0xf0
 [<c0217f37>] ? copy_from_user+0x37/0x70
 [<c03018ec>] ? verify_iovec+0x2c/0x90
 [<c02fa29a>] ? sys_sendmsg+0x10a/0x220
 [<c015ab08>] ? __inc_zone_page_state+0x18/0x20
 [<c01642ed>] ? __page_set_anon_rmap+0x2d/0x40
 [<c0164325>] ? page_add_new_anon_rmap+0x25/0x30
 [<c015eda6>] ? handle_mm_fault+0x606/0x750
 [<c0160f5e>] ? vma_adjust+0xfe/0x410
 [<c0113156>] ? do_page_fault+0x126/0x830
 [<c02fb343>] ? sys_socketcall+0x233/0x260
 [<c0102f39>] ? sysenter_past_esp+0x6a/0x91
 =======================
Code: 62 01 00 00 c6 43 01 80 8b 45 0c 85 c0 0f 85 13 02 00 00 8b 45 d8 85 c0 74 3c 8b 86 88 00 00 00 8d 5d e0 31 c9 89 1c 24 8b 55 d8 <8b> 00 e8 d4 e3 ff ff 85 c0 75 20 b9 10 00 00 00 ba 07 00 00 00
EIP: [<c0369b85>] rt6_fill_node+0x175/0x3b0 SS:ESP 0068:f7115ba4
---[ end trace e9f2563374550ae8 ]---

Steps to reproduce:
$ ip -f inet6 route get fec0::1
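For anyone reading the trace: the byte in angle brackets on the Code: line is where EIP pointed when the fault hit. The following is only a sketch, not part of the report, and the helper names are made up for illustration. It finds the marked byte and splits the ModRM byte that follows it. Read that way, `8b 00` decodes as `mov (%eax),%eax`, which with EAX = 00000000 matches the reported fault address, and the earlier `8b 86 88 00 00 00` suggests (assuming nothing else touched EAX in between) that EAX was loaded from offset 0x88 of whatever ESI pointed to.

```python
# Sketch only: helper names are illustrative, not from any kernel tool.
CODE_LINE = (
    "Code: 62 01 00 00 c6 43 01 80 8b 45 0c 85 c0 0f 85 13 02 00 00 8b 45 d8 "
    "85 c0 74 3c 8b 86 88 00 00 00 8d 5d e0 31 c9 89 1c 24 8b 55 d8 <8b> 00 "
    "e8 d4 e3 ff ff 85 c0 75 20 b9 10 00 00 00 ba 07 00 00 00"
)


def parse_oops_code(line):
    """Return (byte_values, fault_index) for an oops 'Code:' line."""
    tokens = line.split()
    if tokens and tokens[0] == "Code:":
        tokens = tokens[1:]
    values = []
    fault_index = None
    for index, token in enumerate(tokens):
        # The faulting byte is the one wrapped in angle brackets.
        if token.startswith("<") and token.endswith(">"):
            fault_index = index
            token = token[1:-1]
        values.append(int(token, 16))
    return values, fault_index


def split_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0x3, (byte >> 3) & 0x7, byte & 0x7


values, fault = parse_oops_code(CODE_LINE)
opcode, modrm = values[fault], values[fault + 1]
print("faulting opcode: %#04x, ModRM fields: %s" % (opcode, split_modrm(modrm)))
```

A ModRM of (0, 0, 0) after opcode 8b means "load EAX from the address in EAX" in 32-bit mode; (2, 0, 6) for the earlier 0x86 byte means "load EAX from ESI plus a 32-bit displacement".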
Subject    : OOPS, ip -f inet6 route get fec0::1, linux-2.6.26, ip6_route_output, rt6_fill_node+0x175
Submitter  : John Gumb <john.gumb@tandberg.com>
Date       : 2008-08-07 17:00:56 GMT
References : http://article.gmane.org/gmane.linux.kernel/718101
Handled-By : Brian Haley <brian.haley@hp.com>
Patch      : http://article.gmane.org/gmane.linux.network/102189
Patch not in today's Linus git - will close after it lands there.
Not sure the patch is completely correct - it prevents the oops but seems to introduce a new regression: where 2.6.24 produces some output for the command in question, with the patch applied all I get is an EOF -

parag@parag-desktop:~$ ip -f inet6 route get fec0::1
EOF on netlink

Reopening.
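For context on what the command sends: `ip route get` boils down to a single RTM_GETROUTE request over a NETLINK_ROUTE socket, which is the path visible in the call trace (netlink_sendmsg, rtnetlink_rcv_msg, inet6_rtm_getroute). Below is a minimal sketch of building such a request in Python. The function names are hypothetical, the layout follows the structs in <linux/netlink.h> and <linux/rtnetlink.h>, and the dst_len of 128 is an assumption about what a host lookup would use, not something taken from iproute2. As far as I know, "EOF on netlink" is what ip prints when a read on that socket returns zero bytes, but treat that as an assumption too.

```python
import socket
import struct

# Constants from <linux/netlink.h> and <linux/rtnetlink.h>.
NLM_F_REQUEST = 0x1
RTM_GETROUTE = 26
RTA_DST = 1


def build_getroute_request(dst, seq=1):
    """Build an RTM_GETROUTE message asking for the IPv6 route to dst."""
    addr = socket.inet_pton(socket.AF_INET6, dst)
    # struct rtattr: u16 len, u16 type, then the 16-byte address.
    attr = struct.pack("=HH", 4 + len(addr), RTA_DST) + addr
    # struct rtmsg: family, dst_len, src_len, tos, table, protocol,
    # scope, type (all u8), then u32 flags.
    rtmsg = struct.pack("=BBBBBBBBI", socket.AF_INET6, 128, 0, 0, 0, 0, 0, 0, 0)
    payload = rtmsg + attr
    # struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
    header = struct.pack("=IHHII", 16 + len(payload), RTM_GETROUTE,
                         NLM_F_REQUEST, seq, 0)
    return header + payload


def query_route(dst):
    """Send the request and return the raw reply (Linux only)."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as sock:
        sock.bind((0, 0))
        sock.send(build_getroute_request(dst))
        return sock.recv(65536)


request = build_getroute_request("fec0::1")
print(len(request), "byte request")
```

Only the request is built when the block runs. Calling query_route("fec0::1") on an affected, unpatched kernel would presumably hit the same oops as the ip command, so try it only on a test box.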
Reply-To: akpm@linux-foundation.org

On Sat, 9 Aug 2008 17:26:17 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11297
>
>            Summary: OOPS in rt6_fill_node
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.27-rc2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV6
>         AssignedTo: yoshfuji@linux-ipv6.org
>         ReportedBy: parag.warudkar@gmail.com
>
>
> Latest working kernel version: 2.6.26-rc4
> Earliest failing kernel version: 2.6.26

Can you please confirm the version numbers here? 2.6.26-rc4 was OK, but 2.6.26 and 2.6.27-rc2 are busted?

Brian had a patch but apparently things still aren't right (see the full bugzilla report for details).

> Distribution: NA
> Hardware Environment: x86
> Software Environment: ip
> Problem Description:
>
> $ ip -f inet6 route get fec0::1
>
> Produces this oops -
> BUG: unable to handle kernel NULL pointer dereference at 00000000
> IP: [<c0369b85>] rt6_fill_node+0x175/0x3b0
> [...]
>
> Steps to reproduce:
> $ ip -f inet6 route get fec0::1
On Sun, Aug 10, 2008 at 12:31 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > Can you please confirm the version numbers here? 2.6.26-rc4 was OK, > but 2.6.26 and 2.6.27-rc2 are busted? > John Gumb's original report (referenced in bugzilla) says the issue is happening since 2.6.26-rc4 and that 2.6.26.2 also had it. Current mainline also has this problem (tested on everything past 2.6.27-rc2). Parag
Reply-To: brian.haley@hp.com

Andrew Morton wrote:
> Can you please confirm the version numbers here? 2.6.26-rc4 was OK,
> but 2.6.26 and 2.6.27-rc2 are busted?

This commit would have broken this, which git-whatchanged shows as being in 2.6.24-rc4:

commit 5e5f3f0f801321078c897a5de0b4b4304f234da0
Author: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date:   Mon Mar 3 21:44:34 2008 +0900

    [IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().

    Since most users of ipv6_get_saddr() pass non-NULL as dst argument,
    use ipv6_dev_get_saddr() directly.

    Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Most people have default IPv6 routes, which is probably why it hasn't been seen.

> Brian had a patch but apparently things still aren't right (see the
> full bugzilla report for details).

John sent me email off-line saying the patch fixed the problem, but he'd do more testing over the weekend.

-Brian
On Sun, Aug 10, 2008 at 8:30 PM, Brian Haley <brian.haley@hp.com> wrote:
> John sent me email off-line saying the patch fixed the problem, but he'd do
> more testing over the weekend.

I did test the patch and it now spits out "EOF on netlink" as mentioned in bugzilla, which is not consistent with the earlier behavior, which was to output -

unreachable fec0::1 dev lo table unspec proto none src ::1 metric -1 error -101 hoplimit 255

Parag
Actually I'm curious why rt6i_idev is still NULL there... I think that is the problem, but okay, I have to agree with Brian's patch. The output difference might be another problem. Need more analysis.
From: "Parag Warudkar" <parag.warudkar@gmail.com> Date: Sun, 10 Aug 2008 21:05:07 -0400 > On Sun, Aug 10, 2008 at 8:30 PM, Brian Haley <brian.haley@hp.com> wrote: > > > John sent me email off-line saying the patch fixed the problem, but he'd do > > more testing over the weekend. > > I did test the patch and it now spits out "EOF on netlink" like > mentioned in bugzilla, which is not consistent with earlier behavior > which was to output - > unreachable fec0::1 dev lo table unspec proto none src ::1 metric > -1 error -101 hoplimit 255 . We need to resolve this, Brian?
Reply-To: brian.haley@hp.com David Miller wrote: > From: "Parag Warudkar" <parag.warudkar@gmail.com> > Date: Sun, 10 Aug 2008 21:05:07 -0400 > >> On Sun, Aug 10, 2008 at 8:30 PM, Brian Haley <brian.haley@hp.com> wrote: >> >>> John sent me email off-line saying the patch fixed the problem, but he'd do >>> more testing over the weekend. >> I did test the patch and it now spits out "EOF on netlink" like >> mentioned in bugzilla, which is not consistent with earlier behavior >> which was to output - >> unreachable fec0::1 dev lo table unspec proto none src ::1 metric >> -1 error -101 hoplimit 255 . > > We need to resolve this, Brian? I don't see "EOF on netlink" on 2.6.27-rc2. I can dig-up a 2.6.24 kernel, I'm just wondering if that EOF happens when the IPv6 module isn't loaded or something? -Brian
On Mon, Aug 11, 2008 at 7:44 PM, Brian Haley <brian.haley@hp.com> wrote: > I don't see "EOF on netlink" on 2.6.27-rc2. I can dig-up a 2.6.24 kernel, > I'm just wondering if that EOF happens when the IPv6 module isn't loaded or > something? With ipv6 not loaded it returns not supported or something similar - correct of course. What output do you see with your patch? Parag
Reply-To: brian.haley@hp.com

Parag Warudkar wrote:
>
> What output do you see with your patch?

# ip -f inet6 route get fec0::1
unreachable fec0::1 dev lo table unspec proto none src 2001:1890:1109:a10:218:feff:fe7f:49c8 metric -1 error -101 hoplimit 255

And if I down eth0 I get:

# ip -f inet6 route get fec0::1
unreachable fec0::1 dev lo table unspec proto none src ::1 metric -1 error -101 hoplimit 255

This is 2.6.27-rc2; like I said, I'm building a 2.6.24 kernel now.

-Brian
Grrr. It looks like I was bitten by the infamous Netlink "No buffer space available" error, which I had somehow overlooked. Applying your patch to a kernel that boots without the Netlink buffer space error shows the right output. Closing. Thanks.
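Both error indications in this thread are plain errno values: the "error -101" field in the route output, and the "No buffer space available" text, which is the usual message for ENOBUFS. A small sketch for looking them up follows. The helper name is made up, and the numeric values are the x86 Linux ones, so other platforms may differ.

```python
import errno
import os


def describe_errno(value):
    """Map a (possibly negated) errno value to its symbolic name and message."""
    code = abs(value)
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


# "error -101" from the route output, plus the code behind ENOBUFS.
for value in (-101, -errno.ENOBUFS):
    name, message = describe_errno(value)
    print(value, name, message)
```

On Linux, -101 maps to ENETUNREACH, which is consistent with the "unreachable" route type shown for fec0::1.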
Reply-To: eugeneteo@kernel.sg On Tue, Aug 12, 2008 at 8:41 AM, Brian Haley <brian.haley@hp.com> wrote: > Parag Warudkar wrote: >> >> What output do you see with your patch? > > # ip -f inet6 route get fec0::1 > unreachable fec0::1 dev lo table unspec proto none src > 2001:1890:1109:a10:218:feff:fe7f:49c8 metric -1 error -101 hoplimit 255 Hmm, I tried it on an older kernel that doesn't have Yoshfuji-san's ipv6_get_saddr() changes, and it should display the output with the loopback MAC address instead of ethX MAC address. Correct me if I am wrong. Thanks, Eugene
From: "Eugene Teo" <eugeneteo@kernel.sg> Date: Tue, 12 Aug 2008 09:28:05 +0800 > On Tue, Aug 12, 2008 at 9:10 AM, Eugene Teo <eugeneteo@kernel.sg> wrote: > > On Tue, Aug 12, 2008 at 8:41 AM, Brian Haley <brian.haley@hp.com> wrote: > >> Parag Warudkar wrote: > >>> > >>> What output do you see with your patch? > >> > >> # ip -f inet6 route get fec0::1 > >> unreachable fec0::1 dev lo table unspec proto none src > >> 2001:1890:1109:a10:218:feff:fe7f:49c8 metric -1 error -101 hoplimit 255 > > > > Hmm, I tried it on an older kernel that doesn't have Yoshfuji-san's > > ipv6_get_saddr() changes, > > and it should display the output with the loopback MAC address instead > > of ethX MAC address. > > Correct me if I am wrong. > > Evidence of me still sleepy. Not the MAC address but the ipv6 address... Hmmm... from what I understand so far based upon Parag's most recent reply, Brian's patch should be OK. Does everyone else agree?
Reply-To: eugeneteo@kernel.sg On Tue, Aug 12, 2008 at 9:10 AM, Eugene Teo <eugeneteo@kernel.sg> wrote: > On Tue, Aug 12, 2008 at 8:41 AM, Brian Haley <brian.haley@hp.com> wrote: >> Parag Warudkar wrote: >>> >>> What output do you see with your patch? >> >> # ip -f inet6 route get fec0::1 >> unreachable fec0::1 dev lo table unspec proto none src >> 2001:1890:1109:a10:218:feff:fe7f:49c8 metric -1 error -101 hoplimit 255 > > Hmm, I tried it on an older kernel that doesn't have Yoshfuji-san's > ipv6_get_saddr() changes, > and it should display the output with the loopback MAC address instead > of ethX MAC address. > Correct me if I am wrong. Evidence of me still sleepy. Not the MAC address but the ipv6 address... Eugene
Reply-To: brian.haley@hp.com

David Miller wrote:
> Hmmm... from what I understand so far based upon Parag's most
> recent reply, Brian's patch should be OK.
>
> Does everyone else agree?

Just an fyi, I think part of the confusion was the output I posted:

# ip -f inet6 route get fec0::1
unreachable fec0::1 dev lo table unspec proto none src 2001:1890:1109:a10:218:feff:fe7f:49c8 metric -1 error -101 hoplimit 255

On my system I have a global address on eth0, so that's printed in my output. Others don't have a global, so see ::1, which is expected.

I see the same behavior on my Debian Lenny 2.6.18 box as 2.6.27, so my patch doesn't seem to have changed anything.

-Brian