Distribution: Fedora devel
Hardware Environment: Athlon 64, Ti FireWire + Via VT6307 FireWire controllers
Problem Description: general protection fault encountered loading and unloading modules.

Steps to reproduce:

[root@xantham ~]# while :
> do
> rmmod firewire-sbp2
> rmmod firewire-ohci
> modprobe firewire-ohci
> rmmod firewire-sbp2
> rmmod firewire-ohci
> rmmod firewire-core
> modprobe firewire-ohci
> sleep 10
> done

After a few loops, I got the following:

general protection fault: 0000 [1] SMP
CPU 0
Modules linked in: firewire_sbp2 firewire_ohci firewire_core radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge rfcomm l2cap bluetooth autofs4 sunrpc nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand dm_multipath parport_pc parport snd_intel8x0 snd_ac97_codec floppy ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event serio_raw snd_seq pcspkr snd_seq_device snd_pcm_oss crc_itu_t snd_mixer_oss k8temp hwmon snd_pcm snd_timer snd soundcore snd_page_alloc forcedeth i2c_nforce2 i2c_core button sr_mod sg cdrom pata_amd dm_snapshot dm_zero dm_mirror dm_mod shpchp sata_sil libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 6, comm: events/0 Not tainted 2.6.24-9.fc9 #1
RIP: 0010:[<ffffffff8818e2c0>] [<ffffffff8818e2c0>] :firewire_core:fw_card_bm_work+0x1b9/0x292
RSP: 0018:ffff81003fa1dd30 EFLAGS: 00010002
RAX: 0000000000000001 RBX: ffff8100368e7518 RCX: 000000010009faa3
RDX: 0000000000000001 RSI: 000000010009837e RDI: 0000000000020340
RBP: ffff81003f9a7e18 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffff8818e132 R11: ffffffff8818f382 R12: ffff8100368e7000
R13: ffff8100368e74a8 R14: 0000000000000286 R15: 6b6b6b6b6b6b6b6b
FS: 00002aaaaaaca7b0(0000) GS:ffffffff813ee000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaad5c000 CR3: 000000003a9b3000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/0 (pid: 6, threadinfo ffff81003fa1c000, task ffff81003fa1a000)
Stack: ffffffff810a23bd ffff81003fa1a000 ffffffff81483cd8 0000000000000046
 ffffffff81483cc0 ffffe200016cb240 ffff81003fa1a8b0 0000000100006b6b
 ffffffff00000002 ffffffff81055985 ffffffff81483cc0 ffff81003fa1a000
Call Trace:
 [<ffffffff810a23bd>] add_partial_tail+0x12/0x34
 [<ffffffff81055985>] mark_held_locks+0x49/0x67
 [<ffffffff810a44a3>] kfree+0xe1/0xec
 [<ffffffff81055b36>] trace_hardirqs_on+0x107/0x12a
 [<ffffffff81053c99>] lock_release_holdtime+0x27/0x49
 [<ffffffff8818e107>] :firewire_core:fw_card_bm_work+0x0/0x292
 [<ffffffff8818e107>] :firewire_core:fw_card_bm_work+0x0/0x292
 [<ffffffff81046cbe>] run_workqueue+0xdf/0x1df
 [<ffffffff81047793>] worker_thread+0x0/0xe7
 [<ffffffff81047870>] worker_thread+0xdd/0xe7
 [<ffffffff8104ad0a>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8104abea>] kthread+0x47/0x75
 [<ffffffff81275fb5>] trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8100cde8>] child_rip+0xa/0x12
 [<ffffffff8100c4ff>] restore_args+0x0/0x30
 [<ffffffff8104aba3>] kthread+0x0/0x75
 [<ffffffff8100cdde>] child_rip+0x0/0x12
Code: 41 83 3f 01 75 72 49 8b 87 a0 03 00 00 8b 5c 24 38 f6 40 0b
RIP [<ffffffff8818e2c0>] :firewire_core:fw_card_bm_work+0x1b9/0x292
RSP <ffff81003fa1dd30>
---[ end trace 4c9652622aa156e6 ]---

A bit after that, a spinlock lockup warning:

BUG: spinlock lockup on CPU#0, events/0/6, ffff8100368e74a8 (Tainted: G D)
Pid: 6, comm: events/0 Tainted: G D 2.6.24-9.fc9 #1
Call Trace:
 <IRQ> [<ffffffff8113213c>] _raw_spin_lock+0xd7/0xfe
 [<ffffffff8818e08b>] :firewire_core:flush_timer_callback+0x0/0x5
 [<ffffffff812769e8>] _spin_lock_irqsave+0x4e/0x5e
 [<ffffffff8818f23e>] :firewire_core:fw_flush_transactions+0x23/0xb2
 [<ffffffff8103fd6b>] run_timer_softirq+0x166/0x1da
 [<ffffffff8103d23d>] __do_softirq+0x5e/0xe0
 [<ffffffff8100d0cc>] call_softirq+0x1c/0x28
 [<ffffffff8100e6ce>] do_softirq+0x31/0x86
 [<ffffffff8103d19b>] irq_exit+0x4e/0x92
 [<ffffffff8101d23b>] smp_apic_timer_interrupt+0x3f/0x54
 [<ffffffff8103b239>] do_exit+0x208/0x7c8
 [<ffffffff8100cc0b>] apic_timer_interrupt+0x6b/0x70
 <EOI> [<ffffffff810648b9>] acct_collect+0xa4/0x18f
 [<ffffffff810648b9>] acct_collect+0xa4/0x18f
 [<ffffffff8103b239>] do_exit+0x208/0x7c8
 [<ffffffff812768ea>] _spin_unlock_irq+0x26/0x27
 [<ffffffff8103b239>] do_exit+0x208/0x7c8
 [<ffffffff8100d6e6>] kernel_math_error+0x0/0x71
 [<ffffffff81276c1d>] error_exit+0x0/0xa9
 [<ffffffff8818f382>] :firewire_core:transmit_complete_callback+0x0/0x56
 [<ffffffff8818e132>] :firewire_core:fw_card_bm_work+0x2b/0x292
 [<ffffffff8818e2c0>] :firewire_core:fw_card_bm_work+0x1b9/0x292
 [<ffffffff810a23bd>] add_partial_tail+0x12/0x34
 [<ffffffff81055985>] mark_held_locks+0x49/0x67
 [<ffffffff810a44a3>] kfree+0xe1/0xec
 [<ffffffff81055b36>] trace_hardirqs_on+0x107/0x12a
 [<ffffffff81053c99>] lock_release_holdtime+0x27/0x49
 [<ffffffff8818e107>] :firewire_core:fw_card_bm_work+0x0/0x292
 [<ffffffff8818e107>] :firewire_core:fw_card_bm_work+0x0/0x292
 [<ffffffff81046cbe>] run_workqueue+0xdf/0x1df
 [<ffffffff81047793>] worker_thread+0x0/0xe7
 [<ffffffff81047870>] worker_thread+0xdd/0xe7
 [<ffffffff8104ad0a>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8104abea>] kthread+0x47/0x75
 [<ffffffff81275fb5>] trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8100cde8>] child_rip+0xa/0x12
 [<ffffffff8100c4ff>] restore_args+0x0/0x30
 [<ffffffff8104aba3>] kthread+0x0/0x75
 [<ffffffff8100cdde>] child_rip+0x0/0x12
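
A note on reading the first oops: R15 holds 6b6b6b6b6b6b6b6b, and the first bytes of the Code: line (41 83 3f 01) appear to decode as a 32-bit compare against memory addressed by R15. 0x6b is the byte the slab allocator writes over freed objects when slab debugging is on (POISON_FREE in include/linux/poison.h), and an address made of that byte is non-canonical on x86-64, which would produce a general protection fault rather than a page fault. That hints at a pointer loaded from already-freed memory, though it does not prove which object was freed. A minimal userland sketch of the two checks, with all helper names being hypothetical and not kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Byte written over freed slab objects when slab debugging is enabled
 * (POISON_FREE in include/linux/poison.h). */
#define POISON_FREE_BYTE 0x6bu

/* True if every byte of the value equals the freed-object poison byte.
 * Hypothetical helper for reading oops register dumps. */
static bool looks_like_freed_slab(uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        if (((value >> (8 * i)) & 0xffu) != POISON_FREE_BYTE)
            return false;
    }
    return true;
}

/* True if the address is canonical under 48-bit virtual addressing,
 * i.e. bits 63..47 are all zeros or all ones. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1ffffu;
}

/* Applies both checks to two register values from the oops above. */
static void check_oops_registers(void)
{
    const uint64_t r15 = 0x6b6b6b6b6b6b6b6bULL; /* R15 in the oops */
    const uint64_t rbx = 0xffff8100368e7518ULL; /* RBX in the oops */

    assert(looks_like_freed_slab(r15));
    assert(!is_canonical_48(r15));  /* dereference would raise #GP */
    assert(!looks_like_freed_slab(rbx));
    assert(is_canonical_48(rbx));   /* ordinary kernel pointer */
}
```

Calling check_oops_registers() from a test program runs the checks; the helpers only classify values and make no claim about which structure in firewire-core was freed.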
Of possible relevance is that when things *don't* lock up, I'm usually seeing a "firewire_core: BM lock failed, making local node (ffc0) root." message for the Via controller.
Yeesh. Same box on reboot, moving the hub and iidc camera over to the Via controller from the Ti, and the dv camera from the Ti over to the Via:

kernel BUG at lib/list_debug.c:33!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge rfcomm l2cap bluetooth autofs4 sunrpc nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand dm_multipath firewire_sbp2 parport_pc parport floppy snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq pcspkr firewire_ohci firewire_core snd_seq_device serio_raw snd_pcm_oss crc_itu_t snd_mixer_oss k8temp hwmon snd_pcm snd_timer snd soundcore snd_page_alloc forcedeth i2c_nforce2 i2c_core button sr_mod sg cdrom pata_amd dm_snapshot dm_zero dm_mirror dm_mod shpchp sata_sil libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 0, comm: swapper Not tainted 2.6.24-9.fc9 #1
RIP: 0010:[<ffffffff8113259a>] [<ffffffff8113259a>] __list_add+0x47/0x5b
RSP: 0018:ffffffff8155adf0 EFLAGS: 00010086
RAX: 0000000000000079 RBX: ffff81003e9cc6e8 RCX: ffff81003d53c198
RDX: ffffffff813a37a0 RSI: 0000000000000001 RDI: ffffffff813a92a0
RBP: ffff81003d53c198 R08: ffffffff813a92c0 R09: ffff810001005900
R10: 000000000000a7b9 R11: 0000000000000000 R12: ffff81003e0e0000
R13: 0000000000000000 R14: 000000000000003f R15: ffff81003d53c220
FS: 0000000040a00950(0000) GS:ffffffff813ee000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000647000 CR3: 000000003c844000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81498000, task ffffffff813a37a0)
Stack: ffff81003e0e0000 ffffffff88193fbb ffffffff813a4008 ffffffff00000002
 ffff81003d53c1a8 ffff81003e0e0648 0000000100012a4e ffff81003e0e064c
 ffff81003e0e0640 0000000300000000 ffff81003e0e04a8 0000000000000246
Call Trace:
 <IRQ> [<ffffffff88193fbb>] :firewire_core:fw_core_handle_bus_reset+0x67a/0x758
 [<ffffffff81276950>] _spin_unlock_irqrestore+0x3e/0x44
 [<ffffffff8103d32c>] tasklet_action+0x2e/0xb0
 [<ffffffff8103d35b>] tasklet_action+0x5d/0xb0
 [<ffffffff8103d23d>] __do_softirq+0x5e/0xe0
 [<ffffffff81053c99>] lock_release_holdtime+0x27/0x49
 [<ffffffff8100d0cc>] call_softirq+0x1c/0x28
 [<ffffffff8100e6ce>] do_softirq+0x31/0x86
 [<ffffffff8103d19b>] irq_exit+0x4e/0x92
 [<ffffffff8100e861>] do_IRQ+0x13e/0x161
 [<ffffffff8100b066>] default_idle+0x0/0x51
 [<ffffffff8100b066>] default_idle+0x0/0x51
 [<ffffffff8100c456>] ret_from_intr+0x0/0xf
 <EOI> [<ffffffff8101ce78>] lapic_next_event+0x0/0xa
 [<ffffffff8100b066>] default_idle+0x0/0x51
 [<ffffffff8100b09d>] default_idle+0x37/0x51
 [<ffffffff8100b09b>] default_idle+0x35/0x51
 [<ffffffff8100b155>] cpu_idle+0x9e/0xc6
 [<ffffffff814a2b2b>] start_kernel+0x301/0x30d
 [<ffffffff814a211d>] _sinittext+0x11d/0x124
Code: 0f 0b eb fe 48 89 7e 08 48 89 37 48 89 57 08 48 89 3a 5a c3
RIP [<ffffffff8113259a>] __list_add+0x47/0x5b
RSP <ffffffff8155adf0>
---[ end trace d4a4f763a33e3c92 ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!

I'm going to go out on a limb here and say we don't like Via ohci 1.0 controllers very well... ;)
The bug in comment #2 seems quite different. The bug in the description has similarities to bug 8906.
Reply-To: stefanr@s5r6.in-berlin.de

bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9870
> ------- Comment #1 from jwilson@redhat.com 2008-02-01 11:29 -------
> Of possible relevance is that when things *don't* lock up, I'm usually seeing
> a "firewire_core: BM lock failed, making local node (ffc0) root." message for
> the Via controller.

So then the time spent in the bus manager workqueue job is different and avoids whatever race condition caused the GPF.
The two bugs in the description may be fixed by patches posted today: http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11617 Also available in patchkit v646 and later at http://me.in-berlin.de/~s5r6/linux1394/updates/
Okay, so the original spew from loading and unloading modules does seem to be resolved by the patchset in comment #5, but the spew in comment #2 just happened again, with a slight twist. Previously, I saw this while booting the machine up. This time, I moved the hub and camera over to the Via controller after the system was already booted. Looks like I first started getting never-ending bus resets (phy config line printing over and over on the console), and when I told the box to reboot, I hit the same list_debug.c BUG as in comment #2 -- didn't panic, just hung.

Oh, back to the original spew for a sec... Looks like this patchset makes it impossible to rmmod firewire-sbp2 if there's a drive plugged in. Is this intentional? This is the case even if the drive isn't actually in use, so reloading the firewire-sbp2 module requires unplugging or powering off all sbp2 devices.
Ah, and I do still get the same panic behavior as comment #2 when freshly booted with the hub and iidc camera hooked to the via controller.
> Looks like this patchset makes it impossible to rmmod firewire-sbp2
> if there's a drive plugged in. Is this intentional?

No, it is a bug.
> Ah, and I do still get the same panic behavior as comment #2 when
> freshly booted with the hub and iidc camera hooked to the via controller.

I will start using CONFIG_DEBUG_LIST.
>> Looks like this patchset makes it impossible to rmmod firewire-sbp2
>> if there's a drive plugged in. Is this intentional?
>
> No, it is a bug.

The original module unloading bug makes it quite time-consuming to bisect the patch series for the patch where this new bug went in... Stay tuned.
It is the patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device" which keeps the refcount of firewire-sbp2 one up. It's so obvious in hindsight. Now back to the drawing board for a better fix of the scsi_remove_device bug.
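
For readers following along, the effect of a refcount that stays "one up" can be sketched in a few lines of userland C. This is only an illustration under stated assumptions: struct module_sketch and the sketch_* helpers are hypothetical names, and nothing here is the actual fw-sbp2 or module-loader code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module's use count; not a kernel structure. */
struct module_sketch {
    int refcount;
};

static void sketch_get(struct module_sketch *m) { m->refcount++; }
static void sketch_put(struct module_sketch *m) { m->refcount--; }

/* Mimics the rmmod rule: unloading is refused while references remain. */
static bool sketch_can_unload(const struct module_sketch *m)
{
    return m->refcount == 0;
}

/* Every reference taken is dropped again, so the count returns to zero. */
static bool balanced_cycle(void)
{
    struct module_sketch m = { 0 };

    sketch_get(&m); /* reference taken for a device */
    sketch_put(&m); /* matching release */
    return sketch_can_unload(&m);
}

/* One extra reference is taken that no code path ever drops. */
static bool leaky_cycle(void)
{
    struct module_sketch m = { 0 };

    sketch_get(&m); /* reference taken for a device */
    sketch_get(&m); /* extra reference added by a hypothetical fix */
    sketch_put(&m); /* only one release happens */
    return sketch_can_unload(&m);
}

/* Runs both cycles and checks the expected outcome of each. */
static void refcount_selfcheck(void)
{
    assert(balanced_cycle());  /* count is 0, unload allowed */
    assert(!leaky_cycle());    /* count is stuck at 1, unload refused */
}
```

The point of the sketch is only that a single unmatched get is enough to make the unload check fail indefinitely, which matches the symptom reported above.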
firewire-sbp2 unloading brought back in http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11631 Also available in patchkit v647 and later at http://me.in-berlin.de/~s5r6/linux1394/updates/
> Ah, and I do still get the same panic behavior as comment #2 when
> freshly booted with the hub and iidc camera hooked to the via controller.

Could you test with vanilla 2.6.25-rc3 plus linux1394-2.6.git master pulled in, or simply with Linus' current git? This is only to eliminate a bug by some unrelated Fedora kernel patches.

If mainline has the bug too, please narrow down where in fw_core_handle_bus_reset the bug happens, e.g. insert a few printk()s. Your log didn't show a more specific place due to function inlining by the compiler. build_tree() could be a suspect. for_each_fw_node() too but that is less likely to be automatically inlined. update_tree() might be another candidate.
(In reply to comment #12)
> firewire-sbp2 unloading brought back in
> http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11631

Excellent, will test that out in the morning. Just as an fyi, '[PATCH 5/5] firewire: refactor fw_unit reference counting' requires some minor rediffing to get applied with this in place. (May already be fixed in your patchkit, dunno, already fixed things up for my local build).
(In reply to comment #13)
> > Ah, and I do still get the same panic behavior as comment #2 when
> > freshly booted with the hub and iidc camera hooked to the via controller.
>
> Could you test with vanilla 2.6.25-rc3 plus linux1394-2.6.git master pulled
> in, or simply with Linus' current git? This is only to eliminate a bug by
> some unrelated Fedora kernel patches.
>
> If mainline has the bug too, please narrow down where in
> fw_core_handle_bus_reset the bug happens, e.g. insert a few printk()s. Your
> log didn't show a more specific place due to function inlining by the
> compiler. build_tree() could be a suspect. for_each_fw_node() too but that
> is less likely to be automatically inlined. update_tree() might be another
> candidate.

I'll see what I can do with this tomorrow as well...
> Just as an fyi, '[PATCH 5/5] firewire: refactor fw_unit reference counting'
> requires some minor rediffing to get applied with this in place.

Right, there was "fuzz 2". (Quilt is configured here to accept fuzz but I always check the result after patching with "fuzz".) Would a duplicate of my quilt trees in git be useful?
> Would a duplicate of my quilt trees in git be useful?

The quilt bits could be useful, but istr git is set up to ignore a patches/ folder, and we actually reject patches with fuzz greater than 1 (iirc) in the fedora rpm spec patch application section as an extra safeguard against mis-merging something after a rebase.
(In reply to comment #14)
> > firewire-sbp2 unloading brought back in
> > http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11631
>
> Excellent, will test that out in the morning.

Well, I was permitted to unload the module, but when I did, I got a general protection fault.

general protection fault: 0000 [1] SMP DEBUG_PAGEALLOC
CPU 0
Modules linked in: radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge rfcomm l2cap bluetooth autofs4 sunrpc nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 dm_multipath firewire_sbp2(-) parport_pc parport snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy floppy snd_seq_oss snd_seq_midi_event serio_raw snd_seq pcspkr firewire_ohci snd_seq_device firewire_core snd_pcm_oss crc_itu_t snd_mixer_oss k8temp snd_pcm hwmon snd_timer snd soundcore snd_page_alloc forcedeth i2c_nforce2 i2c_core button sg sr_mod cdrom pata_amd dm_snapshot dm_zero dm_mirror dm_mod shpchp sata_sil libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table]
Pid: 2601, comm: rmmod Not tainted 2.6.25-0.70.rc3.git1.fc9.fw #1
RIP: 0010:[<ffffffff8822adef>] [<ffffffff8822adef>] :firewire_sbp2:sbp2_release_target+0xea/0x103
RSP: 0018:ffff810032d7bdc8 EFLAGS: 00010286
RAX: 6b6b6b6b6b6b6b6b RBX: ffff81003f031090 RCX: ffff810032d7bd28
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff810032d88000
RBP: ffff810032d7bdf8 R08: ffff810032d7bd58 R09: ffffe20001908300
R10: ffffffff8805393b R11: ffff810032d7bbc8 R12: ffff81003f0306b0
R13: ffff81003f0306a0 R14: ffff81003f0306b0 R15: ffff81003f030000
FS: 00007f7f4546e6f0(0000) GS:ffffffff81416000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fddb34e73c0 CR3: 0000000032d4a000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 2601, threadinfo ffff810032d7a000, task ffff810032d88000)
Stack: ffff81003f0306b8 ffff81003f0306a0 ffffffff8822ad05 ffff81003d8cdb40
 ffffffff8822d740 0000000000000880 ffff810032d7be18 ffffffff8113589a
 ffff81003dcca240 ffffffff8822d740 ffff810032d7be28 ffffffff8822a130
Call Trace:
 [<ffffffff8822ad05>] ? :firewire_sbp2:sbp2_release_target+0x0/0x103
 [<ffffffff8113589a>] kref_put+0x43/0x4f
 [<ffffffff8822a130>] :firewire_sbp2:sbp2_target_put+0x10/0x12
 [<ffffffff8822a142>] :firewire_sbp2:sbp2_remove+0x10/0x14
 [<ffffffff811b43a4>] __device_release_driver+0x76/0x9a
 [<ffffffff811b490e>] driver_detach+0xe3/0x125
 [<ffffffff811b3c5b>] bus_remove_driver+0x86/0xa8
 [<ffffffff811b49b3>] driver_unregister+0x36/0x3b
 [<ffffffff8822bb3c>] :firewire_sbp2:sbp2_cleanup+0x10/0x1e
 [<ffffffff8105ccd0>] sys_delete_module+0x18e/0x1d6
 [<ffffffff81014768>] ? syscall_trace_enter+0xb0/0xb5
 [<ffffffff8100c137>] tracesys+0xdc/0xe1
Code: c9 9b e2 ff 49 8b 75 10 48 c7 c7 ed bd 22 88 31 c0 e8 df a1 e0 f8 49 8b 7d 08 e8 42 6a f8 f8 4c 89 ff e8 d5 93 e2 ff 49 8b 45 08 <48> 8b b8 a8 01 00 00 e8 2a 6a f8 f8 41 58 5b 41 5c 41 5d 41 5e
RIP [<ffffffff8822adef>] :firewire_sbp2:sbp2_release_target+0xea/0x103
RSP <ffff810032d7bdc8>
---[ end trace b854416fd93a1b7c ]---
I enabled DEBUG_PAGEALLOC now as well. The general protection fault happens here too, either when unloading firewire-sbp2 or when unplugging the disk.
I bisected my patch series and found that patch "firewire: fix crash in automatic module unloading" introduced the unable-to-handle-kernel-paging-request bug.
> found that patch "firewire: fix crash in automatic module unloading"
> introduced the unable-to-handle-kernel-paging-request bug.

Yeah, but only after I updated it. I think I see the problem.
Fixed in http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11631/focus=11639 and in patchkit v647a on my site.
(In reply to comment #13)
> > Ah, and I do still get the same panic behavior as comment #2 when
> > freshly booted with the hub and iidc camera hooked to the via controller.
>
> Could you test with vanilla 2.6.25-rc3 plus linux1394-2.6.git master pulled
> in, or simply with Linus' current git? This is only to eliminate a bug by
> some unrelated Fedora kernel patches.

With 2.6.25-rc3-git1 + linux1394-2.6.git + firewire patches under review, the panic still happens. Looks like it happens with a hub between the iidc camera and the via controller, but not with the hub removed. i.e., if I plug the camera in directly to the controller, all is well, but if I insert the hub between the camera and the controller, I hit the panic. And thus far, this is *only* with the Via VT6307 ohci 1.0 controller in this box. If I move the hub and camera to the Ti controller in the same box, no problems.

Hm... I should try to reproduce it with another hub and on my other box w/a VT6307 ohci 1.0 controller...

> If mainline has the bug too, please narrow down where in
> fw_core_handle_bus_reset the bug happens, e.g. insert a few printk()s. Your
> log didn't show a more specific place due to function inlining by the
> compiler. build_tree() could be a suspect. for_each_fw_node() too but that
> is less likely to be automatically inlined. update_tree() might be another
> candidate.

Starting to prod right now...
(In reply to comment #23)
> Hm... I should try to reproduce it with another hub

Oh fun. Doesn't happen if I replace the iogear hub that was there with a kensington one (both bus-powered, fwiw). I'll see what I can see on my other VT6307 ohci 1.0 box, but at the moment, this appears to be specific to the combination of a VT6307 controller and this iogear hub. (iogear usb 2.0 & Firewire Combo Hub, Model# GUH420)
(In reply to comment #22)
> Fixed in
> http://thread.gmane.org/gmane.linux.kernel.firewire.devel/11631/focus=11639
> and in patchkit v647a on my site.

Can successfully rmmod firewire-sbp2 and no more oops here either.
(In reply to comment #24)
> (In reply to comment #23)
> > Hm... I should try to reproduce it with another hub
>
> Oh fun. Doesn't happen if I replace the iogear hub that was there with a
> kensington one (both bus-powered, fwiw). I'll see what I can see on my other
> VT6307 ohci 1.0 box, but at the moment, this appears to be specific to the
> combination of a VT6307 controller and this iogear hub. (iogear usb 2.0 &
> Firewire Combo Hub, Model# GUH420)

Oh yeah, and with *just* the hub plugged in, or the hub + an sbp2 hard disk (complete with dd'ing its block device to /dev/null), no panic. So it seems the iidc camera is also required in the above setup (unibrain fire-i in this case).
Maybe what gets into the selfID buffer looks very very special in that one case.
(In reply to comment #13)
> > Ah, and I do still get the same panic behavior as comment #2 when
> > freshly booted with the hub and iidc camera hooked to the via controller.
>
> Could you test with vanilla 2.6.25-rc3 plus linux1394-2.6.git master pulled
> in, or simply with Linus' current git? This is only to eliminate a bug by
> some unrelated Fedora kernel patches.
>
> If mainline has the bug too, please narrow down where in
> fw_core_handle_bus_reset the bug happens, e.g. insert a few printk()s. Your
> log didn't show a more specific place due to function inlining by the
> compiler. build_tree() could be a suspect. for_each_fw_node() too but that
> is less likely to be automatically inlined. update_tree() might be another
> candidate.

Looks like update_tree() is the culprit. The crash seems to happen in here:

    for (i = 0; i < node0->port_count; i++) {
            if (node0->ports[i] && node1->ports[i]) {
                    /*
                     * This port didn't change, queue the
                     * connected node for further
                     * investigation.
                     */
                    if (node0->ports[i]->color == card->color)
                            continue;
                    list_add_tail(&node0->ports[i]->link, &list0);
                    list_add_tail(&node1->ports[i]->link, &list1);

I get a slew of bus resets that all go through that code okay for a number of iterations, but it finally gives up the ghost around one of those list_add_tail() calls. Out of time for this one tonight, gotta head homeward...
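
As background for the list_debug.c BUG, here is a minimal userland sketch of how a sanity-checked list insertion can trip. It is an assumption-laden illustration, not the kernel's lib/list_debug.c and not a diagnosis of update_tree(): struct list_node, list_init() and checked_add_tail() are hypothetical names, and the checks only mirror the general idea of verifying neighbour pointers before linking. In this sketch, queueing the same node twice goes unnoticed at first and is only caught on the following insertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node, in the style of list_head. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Appends node before head after checking that the current tail and head
 * still point at each other. Returns false where a debugging kernel would
 * report list corruption (hypothetical helper, not kernel API).
 */
static bool checked_add_tail(struct list_node *node, struct list_node *head)
{
    struct list_node *prev = head->prev;
    struct list_node *next = head;

    if (next->prev != prev || prev->next != next)
        return false; /* neighbours disagree: the list is corrupted */

    next->prev = node;
    node->next = next;
    node->prev = prev;
    prev->next = node;
    return true;
}

/* Queues node a twice, then node b, and reports where the check fires. */
static void double_add_demo(void)
{
    struct list_node head, a, b;

    list_init(&head);
    assert(checked_add_tail(&a, &head));  /* first add is fine */
    assert(checked_add_tail(&a, &head));  /* double add passes the check... */
    assert(a.next == &a);                 /* ...but leaves a linked to itself */
    assert(!checked_add_tail(&b, &head)); /* the next add detects the damage */
}
```

If something similar were happening here, the insertion that reports the corruption would not necessarily be the one that caused it, which is one reason to log every node being queued rather than only the failing call.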
"kernel BUG at lib/list_debug.c:33" moved over to bug 10128. Closing this bug since module unloading finally works.
The fix for the problem described in the initial report has been merged in Linux 2.6.25-rc4.