Bug 8623 - (fw_core) Kernel oops and system freeze when trying to access a Sony HandyCam using a VIA chipset FireWire card
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: IEEE1394
Hardware: All Linux
Importance: P1 blocking
Assignee: drivers_ieee1394
Reported: 2007-06-13 11:27 UTC by Mauro M.
Modified: 2007-09-04 08:28 UTC

Kernel Version: 2.6.21-1.3194.fc7


Attachments
firewire: Only set client->iso_context if allocation was successful. (1.67 KB, patch)
2007-06-20 15:25 UTC, Stefan Richter

Description Mauro M. 2007-06-13 11:27:44 UTC
Most recent kernel where this bug did not occur: not tried with more recent kernel
Distribution: Fedora Core 7
Hardware Environment: Arima HDAMB Dual Opteron; 2 x Opteron 246-HE Stepping 0a; FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46); Sony HandyCam DCR-HC35 (I do not have any other FireWire devices available to test with). The FireWire adapter is on a PCI card because the version of this motherboard in my possession does not have an on-board chip.
Software Environment: kino, dvgrab
Problem Description: When trying to capture from the camera, the following is logged and the system freezes, requiring a hard reset to reboot:

[...]
Jun 13 18:21:02 mars kernel: Unable to handle kernel paging request at ffffffffffffffea RIP: 
Jun 13 18:21:02 mars kernel:  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:21:02 mars kernel: PGD 203027 PUD 2c0e067 PMD 0 
Jun 13 18:21:02 mars kernel: Oops: 0000 [1] SMP 
Jun 13 18:21:02 mars kernel: last sysfs file: /devices/platform/i2c-9191/9191-0600/in8_input
Jun 13 18:21:02 mars kernel: CPU 0 
Jun 13 18:21:02 mars kernel: Modules linked in: fw_sbp2 ipt_MASQUERADE iptable_nat nf_nat bridge autofs4 pc87360 hwmon_vid i2c_isa eeprom nfs lockd nfs_acl vmnet(P)(U) vmblock(P)(U) vmmon(P)(U) sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables radeon drm ipv6 xfs dm_mirror dm_mod video sbs i2c_ec button dock battery ac lp loop sr_mod cdrom snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss ata_generic snd_seq_midi_event snd_seq snd_seq_device fw_ohci snd_pcm_oss 3c59x fw_core tg3 snd_mixer_oss mii snd_pcm i2c_amd756 snd_timer i2c_core floppy snd parport_pc soundcore k8temp hwmon k8_edac edac_mc serio_raw parport snd_page_alloc pata_amd amd_rng st sg pcspkr aic7xxx scsi_transport_spi shpchp sata_sil libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Jun 13 18:21:02 mars kernel: Pid: 3967, comm: kino Tainted: P       2.6.21-1.3194.fc7 #1
Jun 13 18:21:02 mars kernel: RIP: 0010:[<ffffffff881edbec>]  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:21:02 mars kernel: RSP: 0018:ffff81004e35bee0  EFLAGS: 00010282
Jun 13 18:21:02 mars kernel: RAX: ffffffff881eeec9 RBX: 0000000000000008 RCX: 000000000000000c
Jun 13 18:21:02 mars kernel: RDX: 0000000000000012 RSI: ffff81005e713ec0 RDI: ffffffffffffffea
Jun 13 18:21:02 mars kernel: RBP: ffff81004e247e80 R08: 00000000ffffffff R09: ffff81007db1a118
Jun 13 18:21:02 mars kernel: R10: ffff81005e713ec0 R11: 0000000000000202 R12: ffff81007db1a118
Jun 13 18:21:02 mars kernel: R13: ffff81007db1a118 R14: ffff81007ff54180 R15: ffff81007ce10a40
Jun 13 18:21:02 mars kernel: FS:  0000000041e02950(0063) GS:ffffffff8059c000(0000) knlGS:0000000000000000
Jun 13 18:21:02 mars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 18:21:02 mars kernel: CR2: ffffffffffffffea CR3: 0000000050ce9000 CR4: 00000000000006e0
Jun 13 18:21:02 mars kernel: Process kino (pid: 3967, threadinfo ffff81004e35a000, task ffff810058dd3100)
Jun 13 18:21:02 mars kernel: Stack:  ffffffff881eeefc 0000000000000008 0000000000000008 ffff81005e713ec0
Jun 13 18:21:02 mars kernel:  ffffffff8021140b 0000000000000000 ffff81005e713ec0 ffff81005863d080
Jun 13 18:21:02 mars kernel:  0000000000000000 00000000000003e8 000000000000003f 00000000ffffffff
Jun 13 18:21:02 mars kernel: Call Trace:
Jun 13 18:21:02 mars kernel:  [<ffffffff881eeefc>] :fw_core:fw_device_op_release+0x33/0xd0
Jun 13 18:21:02 mars kernel:  [<ffffffff8021140b>] __fput+0xc2/0x191
Jun 13 18:21:02 mars kernel:  [<ffffffff80222848>] filp_close+0x5d/0x65
Jun 13 18:21:02 mars kernel:  [<ffffffff8021c581>] sys_close+0x8c/0xc9
Jun 13 18:21:02 mars kernel:  [<ffffffff8025729c>] tracesys+0xdc/0xe1
Jun 13 18:21:02 mars kernel: 
Jun 13 18:21:02 mars kernel: 
Jun 13 18:21:02 mars kernel: Code: 48 8b 07 48 8b 00 4c 8b 58 50 41 ff e3 48 8b 07 48 8b 00 4c 
Jun 13 18:21:02 mars kernel: RIP  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:21:02 mars kernel:  RSP <ffff81004e35bee0>
Jun 13 18:21:02 mars kernel: CR2: ffffffffffffffea
Jun 13 18:23:18 mars kernel: fw_core: created new fw device fw1 (0 config rom retries)
Jun 13 18:23:18 mars kernel: fw_core: phy config: card 0, new root=ffc1, gap_count=5
Jun 13 18:23:49 mars last message repeated 27 times
Jun 13 18:23:54 mars last message repeated 4 times
Jun 13 18:23:55 mars kernel: fw_core: Unsolicited response (source ffc0, tlabel f)
Jun 13 18:23:55 mars kernel: fw_core: phy config: card 0, new root=ffc1, gap_count=5
Jun 13 18:24:11 mars last message repeated 14 times
Jun 13 18:24:11 mars kernel: Unable to handle kernel paging request at ffffffffffffffea RIP: 
Jun 13 18:24:11 mars kernel:  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:24:11 mars kernel: PGD 203027 PUD 2c0e067 PMD 0 
Jun 13 18:24:11 mars kernel: Oops: 0000 [2] SMP 
Jun 13 18:24:11 mars kernel: last sysfs file: /devices/platform/i2c-9191/9191-0600/in8_input
Jun 13 18:24:11 mars kernel: CPU 0 
Jun 13 18:24:11 mars kernel: Modules linked in: fw_sbp2 ipt_MASQUERADE iptable_nat nf_nat bridge autofs4 pc87360 hwmon_vid i2c_isa eeprom nfs lockd nfs_acl vmnet(P)(U) vmblock(P)(U) vmmon(P)(U) sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables radeon drm ipv6 xfs dm_mirror dm_mod video sbs i2c_ec button dock battery ac lp loop sr_mod cdrom snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss ata_generic snd_seq_midi_event snd_seq snd_seq_device fw_ohci snd_pcm_oss 3c59x fw_core tg3 snd_mixer_oss mii snd_pcm i2c_amd756 snd_timer i2c_core floppy snd parport_pc soundcore k8temp hwmon k8_edac edac_mc serio_raw parport snd_page_alloc pata_amd amd_rng st sg pcspkr aic7xxx scsi_transport_spi shpchp sata_sil libata sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd
Jun 13 18:24:11 mars kernel: Pid: 4153, comm: kino Tainted: P       2.6.21-1.3194.fc7 #1
Jun 13 18:24:11 mars kernel: RIP: 0010:[<ffffffff881edbec>]  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:24:11 mars kernel: RSP: 0018:ffff8100617bdee0  EFLAGS: 00010282
Jun 13 18:24:11 mars kernel: RAX: ffffffff881eeec9 RBX: 0000000000000008 RCX: 000000000000000f
Jun 13 18:24:12 mars kernel: RDX: 0000000000000002 RSI: ffff8100512920c0 RDI: ffffffffffffffea
Jun 13 18:24:12 mars kernel: RBP: ffff81006adc3180 R08: 00000000ffffffff R09: ffff81007db1a118
Jun 13 18:24:12 mars kernel: R10: ffff8100512920c0 R11: 0000000000000202 R12: ffff81007db1a118
Jun 13 18:24:12 mars kernel: R13: ffff81007db1a118 R14: ffff81007ff54180 R15: ffff81007ce10a40
Jun 13 18:24:12 mars kernel: FS:  0000000041e02950(0063) GS:ffffffff8059c000(0000) knlGS:0000000000000000
Jun 13 18:24:12 mars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 18:24:12 mars kernel: CR2: ffffffffffffffea CR3: 000000004e715000 CR4: 00000000000006e0
Jun 13 18:24:12 mars kernel: Process kino (pid: 4153, threadinfo ffff8100617bc000, task ffff810058d267e0)
Jun 13 18:24:12 mars kernel: Stack:  ffffffff881eeefc 0000000000000008 0000000000000008 ffff8100512920c0
Jun 13 18:24:12 mars kernel:  ffffffff8021140b 0000000000000000 ffff8100512920c0 ffff810050c299c0
Jun 13 18:24:12 mars kernel:  0000000000000000 00000000000003e8 000000000000003f 00000000ffffffff
Jun 13 18:24:12 mars kernel: Call Trace:
Jun 13 18:24:12 mars kernel:  [<ffffffff881eeefc>] :fw_core:fw_device_op_release+0x33/0xd0
Jun 13 18:24:12 mars kernel:  [<ffffffff8021140b>] __fput+0xc2/0x191
Jun 13 18:24:12 mars kernel:  [<ffffffff80222848>] filp_close+0x5d/0x65
Jun 13 18:24:12 mars kernel:  [<ffffffff8021c581>] sys_close+0x8c/0xc9
Jun 13 18:24:12 mars kernel:  [<ffffffff8025729c>] tracesys+0xdc/0xe1
Jun 13 18:24:12 mars kernel: 
Jun 13 18:24:12 mars kernel: 
Jun 13 18:24:12 mars kernel: Code: 48 8b 07 48 8b 00 4c 8b 58 50 41 ff e3 48 8b 07 48 8b 00 4c 
Jun 13 18:24:12 mars kernel: RIP  [<ffffffff881edbec>] :fw_core:fw_iso_context_destroy+0x0/0xd
Jun 13 18:24:12 mars kernel:  RSP <ffff8100617bdee0>
Jun 13 18:24:12 mars kernel: CR2: ffffffffffffffea
Jun 13 18:24:12 mars kernel: fw_core: phy config: card 0, new root=ffc1, gap_count=5
Jun 13 18:24:43 mars last message repeated 27 times
Jun 13 18:51:51 mars syslogd 1.4.2: restart.
Jun 13 18:51:51 mars kernel: klogd 1.4.2, log source = /proc/kmsg started.
Jun 13 18:51:51 mars kernel: Linux version 2.6.21-1.3194.fc7 (kojibuilder@hammer2.fedora.redhat.com) (gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)) #1 SMP Wed May 23 22:47:07 EDT 2007
[...]
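
A note on reading the two oopses above: in both, the faulting address (CR2) and the first-argument register (RDI) hold ffffffffffffffea, and the fault is at offset +0x0 of fw_iso_context_destroy. Read as a signed 64-bit number, that value is -22, which is the value of -EINVAL on Linux. This is an interpretation, not something stated in the report, but it suggests that an error code stored in a pointer was dereferenced. The small userland sketch below shows the arithmetic; the helper name fault_addr_to_errno is hypothetical, and MAX_ERRNO mirrors the bound used by the kernel's <linux/err.h>.

```c
#include <stdint.h>

/* Same upper bound for errno values that the kernel's <linux/err.h> uses. */
#define MAX_ERRNO 4095

/*
 * Hypothetical helper, for illustration only.
 * If addr lies in the top MAX_ERRNO values of the 64-bit address space,
 * return it as a negative errno; otherwise return 0.
 */
static int fault_addr_to_errno(uint64_t addr)
{
    /* (uint64_t)-MAX_ERRNO is 0xfffffffffffff001. */
    if (addr >= (uint64_t)-MAX_ERRNO) {
        uint64_t magnitude = 0 - addr;  /* two's-complement negation */
        return -(int)magnitude;
    }
    return 0;
}
```

Calling fault_addr_to_errno(0xffffffffffffffeaULL) yields -22, while an ordinary kernel address such as the RSP value ffff81004e35bee0 yields 0.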

Steps to reproduce: use kino or dvgrab to capture from firewire camera on the hardware configuration above.

Further information:

# lspci
00:01.0 Host bridge: Advanced Micro Devices [AMD] AMD-8151 System Controller (rev 13)
00:02.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8151 AGP Bridge (rev 13)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
00:07.5 Multimedia audio controller: Advanced Micro Devices [AMD] AMD-8111 AC97 Audio (rev 03)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200] (rev 01)
01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200] (Secondary) (rev 01)
02:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
02:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
02:02.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U
02:03.0 SCSI storage controller: Adaptec AHA-2944UW / AIC-7884U (rev 01)
02:04.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
02:06.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5702X Gigabit Ethernet (rev 02)
02:08.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
02:08.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
02:08.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
02:09.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
02:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)

Please, let me know should you require further information.
Comment 1 Stefan Richter 2007-06-13 12:38:16 UTC
(Hmm, I sent an e-mail reply; I wonder if it will ever show up here.  What I wrote was:)

This Fedora kernel bug is tracked at
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240771 .  You could
add your Cc there to get notified when it is fixed in a Fedora update.

If you know how to configure and install kernels from source, please
also check if the latest 2.6.22-rc kernel from kernel.org is affected.
When you configure the new kernel, you can select the new experimental
FireWire drivers to test for presence of the bug, and after that you can
select the old drivers to (hopefully) get back to a working setup.
Comment 2 Mauro M. 2007-06-13 13:04:43 UTC
OK, I will rebuild a custom kernel with the old driver stack, thank you. I have also seen the bug filed with Red Hat. However, I reckon that this bug report should be left open until resolved, because kernel drivers that cause the system to freeze have to be either fixed or removed.
Comment 3 Stefan Richter 2007-06-13 15:05:22 UTC
What I hoped you could confirm is whether vanilla 2.6.22-rc has the bug too.  But there is a posting on linux1394-devel just now saying that Fedora's libraw1394 won't actually work with the slightly newer fw-core in kernel.org's sources.
Comment 4 Mauro M. 2007-06-17 03:23:50 UTC
For those who need to capture from their camera and do not want to wait for
Fedora, here is a fix that restores the former, working FireWire stack in
the kernel, the libraries and kino:

http://www.ezplanetone.com/xwiki/bin/view/KnowledgeBase/BrokenFC7FireWire
Comment 5 Stefan Richter 2007-06-17 04:15:36 UTC
About your replacement Fedora packages:  I don't know precisely what you and Fedora have in your respective packages, but I suspect that Fedora 7's kino and libavc1394 do not have any updates regarding the new kernel drivers, and that Fedora 7's libraw1394 works with both the old and the new drivers.

Also, the intro text is not 100% precise:  The new FireWire drivers have been available in Linus' kernel since linux-2.6.22-rc1.  One difference from Fedora 7's kernel is that Linus does not distribute prebuilt kernels, so the user can choose between the old stack, the new stack, or both when configuring the kernel.

I spent a little time inspecting the code regarding the bug but could not identify a cause.  My problem is that I don't have any AV/C device to reproduce the bug, but I plan to get one sooner or later.

If you are somewhat familiar with C, know how to build kernels from source, and are willing to spend the spare time and downtime to crash kernels, you could help to get closer to the cause of the bug.  As a primitive debugging aid, you can for example add
    printk(KERN_INFO "format string\n", ...);
at interesting places in the driver sources to get messages out via dmesg.  The entry point for userland calls to the new FireWire drivers is linux/drivers/firewire/fw-cdev.c.
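
To make the printk suggestion concrete, here is a minimal sketch of what such a trace line might look like. The helper name trace_pointer, the message text, and the choice to log the iso_context pointer are all assumptions for illustration and are not taken from the driver sources. The #ifndef block is only a userland shim so that the sketch compiles outside a kernel tree; inside the kernel, printk and KERN_INFO come from <linux/kernel.h>.

```c
#ifndef __KERNEL__
/* Userland shim (assumption): lets the sketch build outside the kernel. */
#include <stdio.h>
#define KERN_INFO ""
#define printk(...) printf(__VA_ARGS__)
#endif

/*
 * Hypothetical trace helper: log which function ran and the pointer
 * value it was about to use, so the value shows up in dmesg.
 */
static int trace_pointer(const char *where, const void *ptr)
{
    return printk(KERN_INFO "fw-cdev debug: %s: iso_context=%p\n",
                  where, ptr);
}
```

A call such as trace_pointer(__func__, client->iso_context) placed at the top of a release or ioctl handler would show whether the pointer looks like a valid address, NULL, or something else before it is used.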

PS:  Confirmation whether this bug is in Linus' tree would still be appreciated, because this is bugzilla.kernel.org, not bugzilla.redhat.com.

PPS:  Also, a test with an untainted kernel would be nice.  Some maintainers delete bug reports against tainted kernels right away.
Comment 6 Stefan Richter 2007-06-20 15:25:05 UTC
Created attachment 11834 [details]
firewire: Only set client->iso_context if allocation was successful.

Does this fix it?
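
The attachment itself is not reproduced in this thread. Going only by its title and by the faulting address in the oops, the failure mode might look like the userland sketch below: a creation call returns an error encoded as a pointer, the caller stores it in the client before checking it, and the release path later treats the non-NULL value as a real context. All struct and function names here are simplified, hypothetical stand-ins; ERR_PTR, PTR_ERR and IS_ERR imitate the kernel helpers of the same names from <linux/err.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userland stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095
static void *ERR_PTR(long err)      { return (void *)(intptr_t)err; }
static long PTR_ERR(const void *p)  { return (long)(intptr_t)p; }
static int IS_ERR(const void *p)    { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

/* Hypothetical, heavily simplified types. */
struct iso_context { int channel; };
struct client { struct iso_context *iso_context; };

/* Stand-in allocator: returns an error pointer when told to fail. */
static struct iso_context *context_create(int fail)
{
    struct iso_context *ctx;

    if (fail)
        return ERR_PTR(-EINVAL);
    ctx = calloc(1, sizeof(*ctx));
    return ctx ? ctx : ERR_PTR(-ENOMEM);
}

static void context_destroy(struct iso_context *ctx)
{
    assert(!IS_ERR(ctx));  /* an error pointer must never reach this point */
    free(ctx);
}

/* Problematic pattern: the field keeps the error value after a failure. */
static int client_create_context_buggy(struct client *c, int fail)
{
    c->iso_context = context_create(fail);
    if (IS_ERR(c->iso_context))
        return PTR_ERR(c->iso_context);
    return 0;
}

/* Pattern matching the patch title: assign only after success. */
static int client_create_context_fixed(struct client *c, int fail)
{
    struct iso_context *ctx = context_create(fail);

    if (IS_ERR(ctx))
        return PTR_ERR(ctx);
    c->iso_context = ctx;
    return 0;
}

/* Release path: a plain non-NULL test cannot spot an error pointer. */
static void client_release(struct client *c)
{
    if (c->iso_context)
        context_destroy(c->iso_context);
    c->iso_context = NULL;
}
```

With the first variant, a failed creation leaves the value -EINVAL in client->iso_context, so a later client_release would pass it on to the destroy function. With the second variant the field stays NULL and the release path skips the destroy call.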
Comment 7 Stefan Richter 2007-07-03 05:14:54 UTC
The patch in comment #6 has been merged to Linus' tree.  If you know how to configure and install kernels from source, try the latest 2.6.22-rc (-rc7 at the moment) or 2.6.22 which will be released soon.
Comment 8 Stefan Richter 2007-07-03 05:17:53 UTC
Fedora users: see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=240771#c14
Comment 9 Stefan Richter 2007-09-04 08:28:34 UTC
Isochronous I/O is still not implemented in firewire-ohci for OHCI 1.0 controllers such as VIA VT6306 rev 46.  Iso I/O requires an OHCI 1.1 controller.  Kristian plans to implement an OHCI 1.0 compatible mode.

However, I assume that at least the oops was fixed in kernel.org's 2.6.22.  Please reopen this bug if you still get the oops, but please test 2.6.23-rc first.  Open a new bug if you need tracking for the OHCI 1.0 compatibility feature.  Thanks.
