Latest working kernel version: 2.6.21 (had other eth1394 bugs)
Earliest failing kernel version: 2.6.22
Software Environment: uniprocessor PREEMPT_NONE kernel
- tlabel consumer eth1394 (IPv4 over FireWire) grabs lots of tlabels
in soft IRQ context.
- tlabel recycler khpsbpkt (a kthread of ieee1394) sleeps even though
it could start putting tlabels back into the pool.
- eth1394 can't get tlabels anymore, stops the transmit queue,
schedules a workqueue job.
- eth1394's workqueue job (run by the events kthread) tries to acquire
a tlabel. It does so in non-atomic context and hence sleeps in
hpsb_get_tlabel() until the tlabel pool is nonempty again. It would
then wake up the eth1394 transmit queue again.
- Normally, khpsbpkt would have been woken up by now and would have
released a lot of now unused tlabels back into the pool again.
However, on UP PREEMPT_NONE kernels, khpsbpkt continues to sleep.
(The 1394 stack's lower level, running in IRQ context or perhaps
tasklet context, is what wakes up khpsbpkt.)
- Since it doesn't get a tlabel, eth1394's workqueue job sleeps
forever as well.
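The sequence above can be sketched as a toy model. This is only an illustration of the exhaustion logic, not driver code: TlabelPool, its methods, and worker_gets_tlabel() are hypothetical names, and the single pool of 64 labels is an assumption based on the tlabel field being 6 bits wide. It does not try to explain why khpsbpkt stays asleep; it only shows that the worker cannot make progress while the recycler does not run.

```python
from collections import deque

# Assumption: one pool of 64 labels (the tlabel field is 6 bits wide).
NUM_TLABELS = 64


class TlabelPool:
    """Toy model of a tlabel pool; not the ieee1394 driver's data structure."""

    def __init__(self, size=NUM_TLABELS):
        self.free = deque(range(size))  # labels available to consumers
        self.pending = deque()          # finished, waiting for the recycler

    def get(self):
        """Return a free tlabel, or None if the pool is exhausted."""
        return self.free.popleft() if self.free else None

    def complete(self, tlabel):
        """A transaction finished; queue its label for the recycler."""
        self.pending.append(tlabel)

    def recycle(self):
        """What the recycler does when it runs: refill the pool."""
        released = len(self.pending)
        self.free.extend(self.pending)
        self.pending.clear()
        return released


def worker_gets_tlabel(recycler_runs):
    """Replay the scenario; report whether the workqueue job gets a label."""
    pool = TlabelPool()

    # The consumer (eth1394 in the report) grabs every available label.
    grabbed = []
    while True:
        tlabel = pool.get()
        if tlabel is None:
            break
        grabbed.append(tlabel)

    # All transactions finish, but the labels are only queued for recycling.
    for tlabel in grabbed:
        pool.complete(tlabel)

    # The recycler (khpsbpkt in the report) either runs or keeps sleeping.
    if recycler_runs:
        pool.recycle()

    # The workqueue job now asks for a label.
    return pool.get() is not None
```

In this model, worker_gets_tlabel(False) returns False (the job would sleep forever, as in the failing case), while worker_gets_tlabel(True) returns True.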
The result is that no other work on the shared workqueue can be serviced (notably, the keyboard is stuck) and the eth1394 connection breaks down.
CONFIG_PREEMPT=y avoids the problem.
Reported by andrey.aleksandrovich at googlemail
"eth1394, Connection between PC and Laptop disrupts"
"ieee1394: eth1394: handle tlabel exhaustion"
Tested with ftp between a Core2Duo i686 SMP PREEMPT machine and a Core2Duo x86-64 *UP* PREEMPT_NONE machine, but wasn't able to reproduce the bug. Need to try a different test PC, and maybe scp instead of ftp.
This should be fixed eventually by providing an IP over 1394 implementation in the new firewire driver stack.