|Summary:||eth1394 stops, keyboard hangs|
|Product:||Drivers||Reporter:||Stefan Richter (stefanr)|
|Component:||IEEE1394||Assignee:||Stefan Richter (stefanr)|
|Kernel Version:||2.6.22 - 2.6.25-rc||Tree:||Mainline|
|Bug Depends on:|
Description Stefan Richter 2008-03-22 03:46:15 UTC
Latest working kernel version: 2.6.21 (had other eth1394 bugs)
Earliest failing kernel version: 2.6.22
Software Environment: uniprocessor PREEMPT_NONE kernel

- The tlabel consumer eth1394 (IPv4 over FireWire) grabs lots of tlabels in soft IRQ context.
- The tlabel recycler khpsbpkt (a kthread of ieee1394) sleeps even though it could start putting tlabels back into the pool.
- eth1394 can't get tlabels anymore, stops the transmit queue, and schedules a workqueue job.
- eth1394's workqueue job (run by the events kthread) tries to acquire a tlabel. It does so in non-atomic context and hence sleeps in hpsb_get_tlabel() until the tlabel pool is nonempty again. It would then wake up the eth1394 transmit queue again.
- Normally, khpsbpkt would have been woken up by now and would have released a lot of now unused tlabels back into the pool. However, on UP PREEMPT_NONE kernels, khpsbpkt continues to sleep. (The 1394 stack's lower level, running in IRQ context or perhaps tasklet context, wakes up khpsbpkt.)
- Since it doesn't get a tlabel, eth1394's workqueue job sleeps forever as well.

Result: none of the other tasks on the shared workqueue can be serviced (notably, the keyboard is stuck), and the eth1394 connection breaks down.

CONFIG_PREEMPT=y avoids the problem.

Reported by andrey.aleksandrovich at googlemail
http://thread.gmane.org/gmane.linux.kernel.firewire.user/3144
"eth1394, Connection between PC and Laptop disrupts"

Caused by:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7a97bc03e089d1a75dc533f0fe69ec8dac672916
"ieee1394: eth1394: handle tlabel exhaustion"
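The starvation described above can be illustrated with a small user-space model. This is only a sketch, not the ieee1394 code: the struct and all function names (tlabel_try_get, tlabel_complete, tlabel_recycle, worker_can_progress) are hypothetical, and scheduling is reduced to a flag that says whether the recycler got to run. The only fact taken from IEEE 1394 is that a transaction label is 6 bits wide, so the model's pool holds 64 labels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TLABELS 64 /* IEEE 1394 transaction labels are 6 bits wide */

/* Hypothetical model of a tlabel pool; not the kernel's data structure. */
struct tlabel_pool {
    uint64_t in_use;    /* bit n set: tlabel n is held by a request */
    uint64_t completed; /* bit n set: transaction finished, not yet recycled */
};

/* Non-blocking allocation: returns a free tlabel, or -1 if the pool is empty. */
static int tlabel_try_get(struct tlabel_pool *p)
{
    for (int i = 0; i < NUM_TLABELS; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if (!(p->in_use & bit)) {
            p->in_use |= bit;
            return i;
        }
    }
    return -1;
}

/* Models the lower level (IRQ/tasklet) noting that a transaction is done.
 * The tlabel is NOT returned to the pool here; that is the recycler's job. */
static void tlabel_complete(struct tlabel_pool *p, int tl)
{
    assert(tl >= 0 && tl < NUM_TLABELS);
    p->completed |= (uint64_t)1 << tl;
}

/* Models what the recycler thread does once it is scheduled:
 * returns every completed tlabel to the pool. Returns the number recycled. */
static int tlabel_recycle(struct tlabel_pool *p)
{
    int n = 0;
    for (int i = 0; i < NUM_TLABELS; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if ((p->completed & bit) && (p->in_use & bit)) {
            p->in_use &= ~bit;
            p->completed &= ~bit;
            n++;
        }
    }
    return n;
}

/* Models the workqueue job: it can only proceed if a tlabel is available.
 * recycler_ran == false stands for the case where the recycler never gets
 * the CPU, so completed tlabels stay unavailable and the job would sleep. */
static bool worker_can_progress(struct tlabel_pool *p, bool recycler_ran)
{
    if (recycler_ran)
        tlabel_recycle(p);
    return tlabel_try_get(p) >= 0;
}
```

In this model, once all 64 tlabels are taken and marked completed, worker_can_progress() returns false unless the recycler has run, even though every tlabel is in principle reusable. That mirrors the report: the tlabels are free in principle, but the thread that would put them back never gets scheduled.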
Comment 1 Stefan Richter 2008-03-24 14:41:42 UTC
Tested with ftp between a Core2Duo i686 SMP PREEMPT machine and a Core2Duo x86-64 *UP* PREEMPT_NONE machine; was not able to reproduce the bug this way. Need to try a different test PC, and maybe scp instead of ftp.
Comment 2 Stefan Richter 2008-12-13 03:13:51 UTC
This should be fixed eventually by providing an IP over 1394 implementation in the new firewire driver stack.