Bug 83661
Summary: | CPU hangs on pppd disconnect | ||
---|---|---|---|
Product: | Networking | Reporter: | Alexander Kurilo (alex) |
Component: | Other | Assignee: | Stephen Hemminger (stephen) |
Status: | NEW --- | ||
Severity: | high | CC: | asilva, bastienphilbert, szg00000, vimusov |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 3.16.1-1-ARCH | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | First attempt to fix the bug |
Description
Alexander Kurilo
2014-09-01 09:43:26 UTC
I think I've got some useful information about this bug. In my case it occurs when the user cancels an L2TP/IPsec connection at the login/password verification step. I did some debugging and found that the problem is a spinlock being taken twice. Here is the code:

=[begin]=
drivers/net/ppp_generic.c:
void ppp_unregister_channel(struct ppp_channel *chan)
{
	...
	down_write(&pch->chan_sem);
	spin_lock_bh(&pch->downl); /* <== that's the point! */
=[end]=

Kernel version: 2.6.32-504.8.1.el6 (from CentOS 6.6 updates).

This is how it works, in my opinion and according to the backtrace:
1) Function `ppp_channel_push' locks the spinlock `downl'.
2) At the same time function `ppp_unregister_channel' locks that spinlock one more time.

I'm going to try to fix this issue and publish a patch here.

There is also one more bug. When an L2TP/IPsec connection is cancelled, function `ppp_unregister_channel' is called twice, and the first time it goes here:

=[begin]=
if (!pch)
	return; /* should never happen */
=[end]=

Created attachment 198371 [details]
First attempt to fix the bug
First attempt to fix the bug (for kernel 2.6.32-504.8.1). It looks very ugly, but it works. I couldn't make it better because I don't completely understand how the ppp subsystem works. I hope this patch is useful.
The patch also applies to kernel 3.16.1, but I didn't test it with that kernel version.
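For illustration only, the double-lock scenario reported above (`downl' taken in `ppp_channel_push' and then again in `ppp_unregister_channel') can be modelled in userspace. This is a minimal sketch, not kernel code: every `toy_*` name is hypothetical, the lock is a C11 `atomic_flag` rather than a real `spinlock_t`, and a try-lock is used so that the second acquisition reports the problem instead of spinning forever as `spin_lock_bh()` would. It assumes the second acquisition happens while the first holder cannot release the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel spinlock (hypothetical, NOT spinlock_t). */
struct toy_spinlock {
	atomic_flag held;
};

/* Hypothetical, stripped-down channel: only the `downl` lock is modelled. */
struct toy_channel {
	struct toy_spinlock downl;
};

static void toy_channel_init(struct toy_channel *pch)
{
	assert(pch != NULL);
	atomic_flag_clear(&pch->downl.held);
}

/* Returns true if the lock was acquired, false if it was already held.
 * A real spin_lock() would keep spinning in the "false" case. */
static bool toy_spin_trylock(struct toy_spinlock *lock)
{
	return !atomic_flag_test_and_set(&lock->held);
}

static void toy_spin_unlock(struct toy_spinlock *lock)
{
	atomic_flag_clear(&lock->held);
}

/* Models the unregister path: it needs `downl`.
 * Returns true if a blocking spin_lock() would hang at this point. */
static bool toy_unregister_would_hang(struct toy_channel *pch)
{
	if (!toy_spin_trylock(&pch->downl))
		return true;	/* lock already held: this is the CPU hang */
	toy_spin_unlock(&pch->downl);
	return false;
}

/* Models the push path holding `downl` while the unregister path runs. */
static bool toy_push_then_unregister(struct toy_channel *pch)
{
	bool hang;

	if (!toy_spin_trylock(&pch->downl))
		return true;
	hang = toy_unregister_would_hang(pch);	/* second acquisition */
	toy_spin_unlock(&pch->downl);
	return hang;
}
```

Calling `toy_unregister_would_hang()` on a freshly initialised channel returns false, while `toy_push_then_unregister()` returns true, which corresponds to the hang seen in the backtrace.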
It seems the locking could be changed to use write_lock_bh(&pch->upl)/write_unlock_bh(&pch->upl) around the critical region: reading seems OK, but writing to the same critical region at this time is dangerous, and this change may stop the deadlock. Can you try using the write lock rather than the bottom-half spinlock embedded in the structure for ppp channels supported by the hardware?