Bug 7431

Summary: ohci1394 Oops after a rmmod/modprobe cycle
Product: Platform Specific/Hardware
Component: PPC-32
Reporter: Gioele Barabucci (dev)
Assignee: Stefan Richter (stefanr)
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P2
CC: stefanr
Hardware: i386
OS: Linux
Kernel Version: 2.6.18 (gentoo-sources-2.6.18)
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: add platform code to ohci1394's pci_driver.probe()

Description Gioele Barabucci 2006-10-29 17:11:23 UTC
Distribution: Gentoo
Hardware Environment: PPC iBook G3 (PowerBook4,3)
Problem Description:

When I run 'rmmod ohci1394' and then 'modprobe ohci1394', modprobe prints a
"Bus error" message and my backlight is set to full brightness.
dmesg reveals that an oops occurred. Here is the dmesg output after 'modprobe
ohci1394':

| Machine check in kernel mode.
| Caused by (from SRR1=49030): Transfer error ack signal
| Oops: Machine check, sig: 7 [#1]
|
| Modules linked in: ohci1394 ipv6 af_packet radeon drm snd_seq snd_seq_device snd_powermac ide_cd cdrom snd_aoa_i2sbus snd_pcm airport snd_timer snd_page_alloc snd orinoco soundcore hermes snd_aoa_soundbus sungem ieee1394 ohci_hcd usbcore uninorth_agp sungem_phy agpgart evdev unix
| NIP: EA08D29C LR: EA091618 CTR: C002CCB0
| REGS: cd533c80 TRAP: 0200   Not tainted  (2.6.18-gentoo)
| MSR: 00049030 <EE,ME,IR,DR>  CR: 24004482  XER: 00000000
| TASK = cd947850[8127] 'modprobe' THREAD: cd532000
| GPR00: FFFFFFFF CD533D30 CD947850 CB2B9E68 EA08E094 CB2B9F44 CC1BD7FC 00000000
| GPR08: 0000001F EA084000 000007C0 EA084000 84004488 1001F288 0000001D EA08B13C
| GPR16: E7454AE0 EA0886D4 00000124 00000000 C0045298 EA088148 EA088724 EA090000
| GPR24: CB2B8000 CB2B9EE8 CB2B9FA0 CB2B9E8C CB2B9F44 C79BA000 CB2B9E68 00000000
| NIP [EA08D29C] ohci_soft_reset+0x50/0x78 [ohci1394]
| LR [EA091618] ohci1394_pci_probe+0x2f4/0xb90 [ohci1394]
| Call Trace:
| [CD533D30] [C79BA000] 0xc79ba000 (unreliable)
| [CD533D40] [EA091618] ohci1394_pci_probe+0x2f4/0xb90 [ohci1394]
| [CD533DA0] [C00F755C] pci_device_probe+0x84/0xa4
| [CD533DC0] [C014EE88] driver_probe_device+0x60/0x118
| [CD533DE0] [C014F0C0] __driver_attach+0xcc/0xf8
| [CD533E00] [C014E764] bus_for_each_dev+0x58/0x94
| [CD533E30] [C014ED9C] driver_attach+0x24/0x34
| [CD533E40] [C014E2CC] bus_add_driver+0x88/0x164
| [CD533E60] [C014F36C] driver_register+0x70/0xb8
| [CD533E70] [C00F7378] __pci_register_driver+0x4c/0x8c
| [CD533E80] [E900D020] ohci1394_init+0x20/0x60 [ohci1394]
| [CD533E90] [C0046214] sys_init_module+0x170/0x1544
| [CD533F40] [C000F2FC] ret_from_syscall+0x0/0x38
| --- Exception: c01 at 0xff3caac
|    LR = 0x10002e84
| Instruction dump:
| 7c0004ac 7d20052c 3be00000 48000014 4800570d 2f9f0063 3bff0001 419e0028
| 813e0008 38090050 7c0004ac 7c00042c <0c000000> 4c00012c 74090001 386003e8
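
A note on reading the instruction dump above: the word in angle brackets is the
instruction at NIP. The following is a minimal sketch that maps the sixteen
words of the dump to PowerPC mnemonics. The opcode tables are deliberately
incomplete and cover only the encodings that occur in this dump; the helper
names (mnemonic, decode) are made up for this illustration. The bracketed word
decodes to twi, directly after a sync/lwbrx pair and before an isync. That
pattern looks like a byte-reversed MMIO register read, which would fit a
machine check raised while ohci_soft_reset reads a controller register, but
this reading is an inference and is not stated in the report.

```python
# Primary opcodes (top 6 bits of the word) that occur in this dump.
PRIMARY = {3: "twi", 11: "cmpi", 14: "addi", 16: "bc", 18: "b",
           29: "andis.", 32: "lwz"}
# Extended opcodes (10 bits above the lowest bit), keyed by (primary, extended).
EXTENDED = {(19, 150): "isync", (31, 534): "lwbrx",
            (31, 598): "sync", (31, 662): "stwbrx"}

DUMP = ("7c0004ac 7d20052c 3be00000 48000014 4800570d 2f9f0063 3bff0001 419e0028 "
        "813e0008 38090050 7c0004ac 7c00042c <0c000000> 4c00012c 74090001 386003e8")


def mnemonic(word):
    """Return the mnemonic for a 32-bit instruction word, or '?' if unknown."""
    primary = word >> 26
    if primary in (19, 31):
        return EXTENDED.get((primary, (word >> 1) & 0x3FF), "?")
    return PRIMARY.get(primary, "?")


def decode(dump):
    """Return (mnemonic, at_nip) pairs; the oops brackets the word at NIP."""
    result = []
    for token in dump.split():
        at_nip = token.startswith("<")
        result.append((mnemonic(int(token.strip("<>"), 16)), at_nip))
    return result


if __name__ == "__main__":
    for name, at_nip in decode(DUMP):
        print(("-> " if at_nip else "   ") + name)
```

Operands are not decoded; for a full disassembly, running objdump on the
module is the more reliable route.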

Steps to reproduce:
* rmmod ohci1394
* modprobe ohci1394

In addition, the ohci1394 module can no longer be removed:
| # rmmod ohci1394
| ERROR: Module ohci1394 is in use
Comment 1 Stefan Richter 2006-11-05 02:31:40 UTC
There are also problems on older PowerBook G3 (Pismo).
https://bugzilla.novell.com/show_bug.cgi?id=115228
I don't know if these are related.
Comment 2 Stefan Richter 2006-11-05 03:24:05 UTC
Created attachment 9407 [details]
add platform code to ohci1394's pci_driver.probe()
Comment 3 Stefan Richter 2006-11-05 03:49:15 UTC
The above patch is entirely untested since I don't have a PPC_PMAC machine.
Discussion of the patch starts at
http://ozlabs.org/pipermail/linuxppc-dev/2006-November/027611.html
Comment 4 Gioele Barabucci 2006-11-09 00:42:37 UTC
I tested this patch over the last two days with linux-2.6.18-gentoo-r1.
It survived various power-on/power-off cycles, rmmod/modprobe cycles and
suspend-to-ram/resume cycles without problems.

My external ieee1394 box is dead, so I could not test with a device attached.

Anyway, this solved the backlight problem. Is that just an accidental side
effect?

Comment 5 Stefan Richter 2006-11-09 08:27:08 UTC
Thanks for your tests.
I don't understand the last question. What could be a side effect of what? :-)
Comment 6 Gioele Barabucci 2006-11-10 00:50:36 UTC
I mean, is this patch supposed to address the oops only or also the 
interaction of the ieee1394 module with the LCD backlight?
Comment 7 Stefan Richter 2006-11-10 09:11:44 UTC
It is supposed to prevent both: the Machine Check exception and the backlight
interaction. I should have mentioned that in the comment to the patch.

Fine, so this bug is resolved. I will close the bug once the patch has gone into
Linus' tree. I plan to submit it after Linux 2.6.19 has been released, i.e. for
inclusion into 2.6.20. That way, more PPC_PMACs will be tested with the patch.
The slightly different problem of the Pismo mentioned in Novell's bugzilla is
not fixed by the patch, but at least the patch doesn't seem to make it worse on
that Pismo.