Since kernel 2.6.18, sunhme gives me an "irq nobody cared" error and transmit timeouts. 2.6.17 worked fine. The NIC is a Sun HappyMeal Quad-Port and my architecture is x86.

The same problem is described at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=397460 and it seems the problem was introduced by http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.18.y.git;a=commit;h=050bbb196392b9c178f82b1205a23dd2f915ee93

Relevant dmesg part:

Oct 27 01:12:27 merkur eth0: Link is up using internal transceiver at 100Mb/s, Full Duplex.
Oct 27 01:12:29 merkur irq 19: nobody cared (try booting with the "irqpoll" option)
Oct 27 01:12:29 merkur handlers:
Oct 27 01:12:29 merkur [<0031333e>] (0x31333e)
Oct 27 01:12:29 merkur Disabling IRQ #19
Oct 27 01:12:32 merkur eth1: Link is up using internal transceiver at 100Mb/s, Full Duplex.
Oct 27 01:14:03 merkur NETDEV WATCHDOG: eth1: transmit timed out
Oct 27 01:14:03 merkur eth1: transmit timed out, resetting
Oct 27 01:14:03 merkur eth1: Happy Status 03030000 TX[000003ff:00000301]
On Sun, 12 Nov 2006 16:48:39 -0800 bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7502
>
>            Summary: sunhme not working with 2.6.18 on x86
>     Kernel Version: 2.6.18
>             Status: NEW
>           Severity: normal
>              Owner: jgarzik@pobox.com
>          Submitter: jasmin-bugs@pacifica.ch
>
> [...]

I'd have expected this to be caused by i386 platform borkage. But in the Debian bug report, Michal has fingered davem's http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.18.y.git;a=commit;h=050bbb196392b9c178f82b1205a23dd2f915ee93 as the cause.

Michal, what makes you believe that this particular patch is to blame?
Reply-To: mpokrywka@hoga.pl

> Michal, what makes you believe that this particular patch is to
> blame?

Sorry, I'm not a git expert. I only browsed the sunhme.c changes through the HTTP interface and downloaded all driver revisions between what I considered the 2.6.17 driver (by date) and the latest revision. I compiled all revisions by hand and insmoded the modules one after another until the driver worked. I probably didn't mention it in the bug report, but after loading a working driver, with the kernel assigning correct interrupts as seen in dmesg, all driver revisions loaded afterwards worked, even the bad ones. From my point of view the driver is broken, because the older version works with 2.6.18.

Regards
Michal Pokrywka
On Tue, 14 Nov 2006 02:50:22 +0100 "Michal Pokrywka" <mpokrywka@hoga.pl> wrote:

> I compiled all revisions by hand and insmoded all these modules
> until driver worked.
> [...]
> From my pov driver is broken, because older version works with
> 2.6.18.

OK, thanks. That method works ;)

The usual technique is git-bisect. That's briefly described in Documentation/BUG-HUNTING, but that just sends you to the git-bisect manpage. Perhaps we need a super-simple kernel-specific document.

But anyway. That's for next time.
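For anyone who has not used it, the idea behind git-bisect is a binary search over the revision history instead of testing every revision in turn. The sketch below only illustrates that idea; it is not how git implements it. The revision labels are made up, and is_bad() is a hypothetical stand-in for "build this revision, load the driver, check whether the NIC works". As Michal's report shows, each test should start from a clean state (for example a fresh boot), because loading a working driver first made the bad revisions work too.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad revision.

    Assumes revisions are ordered oldest to newest, the first one is
    known good, the last one is known bad, and every revision after
    the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi]


# Hypothetical revision labels, not real commit ids.
revisions = ["r0", "r1", "r2", "r3", "r4", "r5", "r6"]
tested = []


def is_bad(rev: str) -> bool:
    # Stand-in for a real build-and-test cycle; pretend r4 broke the driver.
    tested.append(rev)
    return revisions.index(rev) >= 4


culprit = first_bad(revisions, is_bad)
print(culprit, "found after testing", len(tested), "revisions")
```

With seven candidate revisions this needs two test cycles rather than up to five when walking the history one revision at a time, which matters when every test means a rebuild and a reboot.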
Created attachment 9609
Patch to re-introduce pci_enable_device() and pci_set_master()

This problem appears to be caused by missing pci_enable_device() and pci_set_master() calls, which disappeared during code refactoring between 2.6.17 and 2.6.18. The attached patch re-adds the calls to happy_meal_pci_probe(). Michal Pokrywka has confirmed that it fixes the problems with sunhme on his box.
The patch has been merged into mainline as ef9467f8f0803881d6b20ad6f0f770fc39bcc2c2. It has been included in mainline since v2.6.20-rc1.

-- 
Ueimor