Bug 99301
Summary: | socket shutdown of L2CAP ERTM channel causes hung tasks when S or I frame ACK is pending | ||
---|---|---|---|
Product: | Drivers | Reporter: | Dean Jenkins (Dean_Jenkins) |
Component: | Bluetooth | Assignee: | linux-bluetooth (linux-bluetooth) |
Status: | NEW --- | ||
Severity: | blocking | CC: | Dean_Jenkins, szg00000 |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | verified on kernels 3.8 to 4.0.4 and bluetooth-next | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Avoids hung task based on kernel 3.18.14 (RPi)
Uses hold and put functions to protect sk and chan structures based on 3.18.14 (RPi)
l2cap_ertm_shutdown_deadlock_patches.tgz |
Description
Dean Jenkins
2015-06-01 08:19:13 UTC
Usual case:

Unit under test:                          Mobile Phone

Establish ERTM channel
...
I frame: Media Browsing request --->
                                     <--- S (or I) frame ACK
...
                                     <--- I frame: Media data
S (or I) frame ACK --->
...
Close ERTM channel

Normally, the service will wait for the media data to arrive before closing the channel is requested. This case is OK.

Failure case:

Unit under test:                          Mobile phone

Establish ERTM channel
...
I frame: Media Browsing request --->
Close ERTM channel
                                     <--- S (or I) frame ACK

The failure occurs when the userland decides to close the ERTM channel before the S (or I) frame ACK is processed. This means that userland is performing a scenario that causes the mediaplayer browsing request to be aborted.

So you can see that the window for failure is small as timing is critical. Processor loading probably influences the size of the failure window.

The l2test testcase mimics the failure case by closing the channel after sending some data and not waiting for the ACK; no return data is used.

The failure is repeatable 100% so far using l2test as there is a design flaw in the architecture of the locking mechanism used in the L2CAP ERTM channel closure procedure via sock shutdown.

Created attachment 178541
Avoids hung task based on kernel 3.18.14 (RPi)
This is a proposal, but it has side effects: memory could potentially be freed whilst the locks are not held.
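
The failure case from the description, and the general idea of avoiding the hang, can be illustrated with a small toy model. This is only a sketch under stated assumptions: it assumes the hang comes from the shutdown path waiting for outstanding ACKs while holding a lock that the receive path also needs, and that the proposal avoids the hang by not holding that lock during the wait. The names `toy_chan`, `toy_rx_ack` and `toy_shutdown` are hypothetical and are not kernel functions; the real locking in `l2cap_sock_shutdown()` is more involved.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an ERTM channel; not a kernel structure. */
struct toy_chan {
    bool lock_held;      /* stand-in for the lock shared by shutdown and RX */
    int  unacked_frames; /* I frames sent but not yet acknowledged */
};

/* RX path: process an incoming S (or I) frame ACK. Needs the lock. */
static bool toy_rx_ack(struct toy_chan *c)
{
    if (c->lock_held)
        return false;    /* blocked behind the shutdown path */
    if (c->unacked_frames > 0)
        c->unacked_frames--;
    return true;
}

/*
 * Shutdown path: wait until all frames are acknowledged, polling at most
 * max_polls times. Returns true if the wait finished, false if it would
 * never finish (the hung-task situation in this toy model).
 */
static bool toy_shutdown(struct toy_chan *c, bool keep_lock_while_waiting,
                         int max_polls)
{
    c->lock_held = true;
    for (int i = 0; i < max_polls && c->unacked_frames > 0; i++) {
        if (!keep_lock_while_waiting)
            c->lock_held = false; /* assumption: proposal drops the lock here */
        toy_rx_ack(c);            /* the peer's ACK arrives during the wait */
        c->lock_held = true;
    }
    bool drained = (c->unacked_frames == 0);
    c->lock_held = false;
    return drained;
}
```

In this model, closing with one unacknowledged frame while keeping the lock never completes, whereas closing after the ACK has already been processed (the usual case) or dropping the lock during the wait does complete. Dropping the lock is also what opens the window in which another path could free the structures.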
Created attachment 178551
Uses hold and put functions to protect sk and chan structures based on 3.18.14 (RPi)
This patch improves the solution by preventing sk and chan structures from being freed during the execution of l2cap_sock_shutdown().
However, the conn structure is still at potential risk of being freed by l2cap_conn_del(), which needs further analysis.
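
The hold and put idea in this comment is ordinary reference counting: take an extra reference before entering a region where the locks may be dropped, and release it afterwards. The userspace sketch below illustrates the pattern only. `toy_obj`, `toy_obj_hold`, `toy_obj_put` and `toy_guarded_shutdown` are hypothetical names. The kernel provides helpers such as `sock_hold()`/`sock_put()` and `l2cap_chan_hold()`/`l2cap_chan_put()` for this purpose, but whether the attached patch uses exactly those calls is an assumption here.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Counts how many objects have actually been freed (for illustration). */
static int toy_release_count;

/* Hypothetical reference-counted object standing in for sk or chan. */
struct toy_obj {
    atomic_int refcnt;
};

static struct toy_obj *toy_obj_create(void)
{
    struct toy_obj *o = malloc(sizeof(*o));
    if (o)
        atomic_init(&o->refcnt, 1); /* creator owns the first reference */
    return o;
}

static void toy_obj_hold(struct toy_obj *o)
{
    atomic_fetch_add(&o->refcnt, 1);
}

static void toy_obj_put(struct toy_obj *o)
{
    /* fetch_sub returns the old value; 1 means this was the last reference */
    if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
        free(o);
        toy_release_count++;
    }
}

/*
 * Shutdown sketch: pin the object, run a step during which another path
 * (passed in as while_unlocked) may drop its own reference, then unpin.
 * Returns 1 if the object was still alive after that step.
 */
static int toy_guarded_shutdown(struct toy_obj *o,
                                void (*while_unlocked)(struct toy_obj *))
{
    toy_obj_hold(o);                 /* extra reference for this function */
    while_unlocked(o);               /* e.g. teardown path drops its ref */
    int alive = atomic_load(&o->refcnt) > 0; /* safe: we still hold a ref */
    toy_obj_put(o);                  /* may free the object here */
    return alive;
}
```

Calling `toy_guarded_shutdown(obj, toy_obj_put)` on a freshly created object models another path dropping the last outside reference in the middle of shutdown: the object stays valid until the final put inside the function. The sketch does not cover the conn structure, which the comment notes is still at risk.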
Created attachment 178881
l2cap_ertm_shutdown_deadlock_patches.tgz
Based on bluetooth-next, these are 7 patches proposed to fix the L2CAP ERTM shutdown deadlock issue. These patches are forward-ported and untested.
Testing is ongoing on kernel 3.8, and so far the deadlock has not occurred.
I have put the proposed patches into bugzilla so that people are aware that we are working on a fix.
A v2 proposal for a solution has now been posted to http://marc.info/?l=linux-bluetooth&m=143507884817418&w=2