Bug 15077
Summary: | firewire-net: panic in prio_tree_left (was in fwnet_write_complete) | ||
---|---|---|---|
Product: | Drivers | Reporter: | Stefan Richter (stefanr) |
Component: | IEEE1394 | Assignee: | drivers_ieee1394 |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | basinilya |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
screenshot of panic in fwnet_write_complete
new screenshot
not a patch
dmesg with comments |
Description
Stefan Richter
2010-01-17 15:11:59 UTC
Created attachment 24608
new screenshot
Created attachment 24609
not a patch

Hi. Thanks for creating the bug ticket. I followed your recommendation partially. Instead of INIT_LIST_HEAD(&ptask->pt_link) I initialize prev and next to 0, so I know the ptask is broken. Note, the attached patch fixes nothing, just shows what is wrong. See the new screenshot:
http://bugzilla.kernel.org/attachment.cgi?id=24608

Proposed patch: http://lkml.org/lkml/2010/1/18/438

Tried the patch. Problem stays, symptoms changed:
1) At first I started rsync on tty1. All worked fine until I switched to X. It showed me the desktop and the system hung. Keyboard leds not blinking.
2) started rsync in gnome-terminal. All worked fine until I switched to tty1. It flooded several screens with call traces and hung.
3) After the 3rd restart /sys/class/firewire0 not created and dmesg says nothing, although firewire_net loaded with no error messages.
...still playing.

From the beginning there was another problem; I planned to report it separately, but maybe they're connected: very often firewire_net not auto-loaded and when I try to modprobe it, or to ifconfig firewire0 or to ping, dmesg says:
firewire_ohci: isochronous cycle inconsistent
or
firewire_core: giving up on config rom for node id ffc1
and the only solution is to reboot and try again.

Reply-To: stefanr@s5r6.in-berlin.de

bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=15077
>
> --- Comment #4 from leniviy <basinilya@gmail.com> 2010-01-19 19:20:24 ---
> Tried the patch. Problem stays, symptoms changed:
> 1)
> At first I started rsync on tty1. All worked fine until I switched to X. It
> showed me the desktop and the system hung. Keyboard leds not blinking.
>
> 2) started rsync in gnome-terminal. All worked fine until I switched to tty1.
> It flooded several screens with call traces and hung.

Could both be the same kmemcache corruption bug which I saw, or could be anything else.

> 3) After the 3rd restart /sys/class/firewire0 not created and dmesg says
> nothing, although firewire_net loaded with no error messages.

Most likely unrelated.

> ...still playing.
>
> From the beginning there was another problem; I planned to report it
> separately, but maybe they're connected: very often firewire_net not
> auto-loaded and when I try to modprobe it, or to ifconfig firewire0 or to
> ping,
> dmesg says:
> firewire_ohci: isochronous cycle inconsistent

Shouldn't be an issue if it only happens on bus topology changes. It is known to happen when the cycle master changes. It should not affect firewire-net.

> or
> firewire_core: giving up on config rom for node id ffc1
> and the only solution is to reboot and try again.

In this situation,
# modprobe -r firewire-ohci
# modprobe firewire-ohci debug=7
could give more information.

This could be unreliable hardware. The PC which gave up had the Agere FW322/323, right? What controller is on the peer?

Created attachment 24691
dmesg with comments

I think http://lkml.org/lkml/2010/1/18/438 fixes this very bug. But you were probably right about memory corruption. Fixing this bug just revealed that bug.

fwnet_write_complete fix from comment 3 was merged into 2.6.33. Renaming this bug to the one in prio_tree_left per comment 4/comment 6 (attachment 24691). Also note the new crash in cache_free_debugcheck per comment 5 (http://lkml.org/lkml/2010/1/18/488).

Candidate fixes by Clemens Ladisch: http://thread.gmane.org/gmane.linux.kernel.firewire.devel/14502

The patches from comment 8 work for me. firewire-net is still not performing very well and sometimes connections break down entirely (workaround: reload firewire-ohci), but there are no crashes anymore.

Fixes merged into mainline, to appear in 2.6.37-rc2, also submitted for inclusion into currently active stable series.
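
Side note on the debugging trick from the "not a patch" comment (setting prev and next to 0 instead of calling INIT_LIST_HEAD): a rough userspace sketch of why that makes a broken ptask visible. This is not firewire-net code. The struct list_head below is a simplified stand-in for the kernel's, and poison_link(), checked_list_del() and the struct ptask layout are hypothetical names made up for illustration. The idea is that a node initialized to point at itself can be unlinked without any visible error even if it was never queued, while a node with NULL links can be told apart from a queued one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Same idea as the kernel's INIT_LIST_HEAD(): an empty list points at itself. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Hypothetical helper: mark a node as "not on any list" with NULL links. */
static void poison_link(struct list_head *node)
{
    node->next = NULL;
    node->prev = NULL;
}

/* Insert node right after head. */
static void list_add(struct list_head *node, struct list_head *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

/*
 * Hypothetical checked unlink: returns 0 on success, or -1 if the node
 * has NULL links, i.e. it was never queued or was already removed.
 * In the kernel, an unchecked list_del() on such a node would
 * dereference NULL, which makes the broken object easy to spot.
 */
static int checked_list_del(struct list_head *node)
{
    if (node->next == NULL || node->prev == NULL)
        return -1;
    node->next->prev = node->prev;
    node->prev->next = node->next;
    poison_link(node);
    return 0;
}

/* Hypothetical stand-in for the driver's packet task. */
struct ptask {
    int id;
    struct list_head pt_link;
};
```

With links set up by init_list_head(), unlinking a never-queued node only rewrites the node's own pointers, so a double completion goes unnoticed. With NULL links, the second removal is detectable.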