Bug 15994

Summary: Kernel OOPS with Null pointer dereference after closing rfcomm connection to device which had gone away
Product: Drivers Reporter: vlad (vladisslav)
Component: BluetoothAssignee: drivers_bluetooth (drivers_bluetooth)
Status: CLOSED CODE_FIX    
Severity: normal CC: 8an, florian, vladisslav
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.32-2.6.34 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: revert commit 9e726b17422bade75fba94e625cd35fd1353e682

Description vlad 2010-05-18 08:53:11 UTC
Created attachment 26420 [details]
revert commit 9e726b17422bade75fba94e625cd35fd1353e682

How to reproduce:
1. Connect to mobile phone via bluetooth DUN service using wvdial
2. Move mobile phone away from bluetooth dongle receiver range and wait till it reports that connection was closed.
3. wvdial will hand with output like this:

--> WvDial: Internet dialer version 1.60
--> Cannot get information for serial port.
--> Initializing modem.
--> Sending: ATZ
ATZ
OK
--> Sending: at+cgdcont=1,"IP","internet.beeline.ru"
at+cgdcont=1,"IP","internet.beeline.ru"
OK
--> Modem initialized.
--> Sending: ATDT*99***1#
--> Waiting for carrier.
ATDT*99***1#
CONNECT
~[7f]}#@!}!} } }2}#}$@#}!}$}%\}"}&} }*} } g}%~
--> Carrier detected.  Waiting for prompt.
~[7f]}#@!}!} } }2}#}$@#}!}$}%\}"}&} }*} } g}%~
--> PPP negotiation detected.
--> Starting pppd at Tue May 18 12:15:15 2010
--> Warning: Could not modify /etc/ppp/pap-secrets: Permission denied
--> --> PAP (Password Authentication Protocol) may be flaky.
--> Warning: Could not modify /etc/ppp/chap-secrets: Permission denied
--> --> CHAP (Challenge Handshake) may be flaky.
--> Pid of pppd: 6714
--> Using interface ppp0
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> local  IP address 10.198.21.112
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> remote IP address 10.6.6.6
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> primary   DNS address 217.118.66.243
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> secondary DNS address 217.118.66.244
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> Connect time 0.1 minutes.
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> pppd: �k[06][08]�k[06][08]ligent PPP dialer.[0b]Optional SECTION arguments refer to sections in configuration file (usually)
--> /etc [01]
--> Disconnecting at Tue May 18 12:15:25 2010
--> The PPP daemon has died: A modem hung up the phone (exit code = 16)
--> man pppd explains pppd error codes in more detail.
--> Try again and look into /var/log/messages and the wvdial and pppd man pages for more information.
--> Auto Reconnect will be attempted in 5 seconds
--> Cannot get information for serial port.
--> Initializing modem.
--> Sending: ATZ
ATZ
OK
--> Sending: at+cgdcont=1,"IP","internet.beeline.ru"
at+cgdcont=1,"IP","internet.beeline.ru"
OK
--> Modem initialized.
--> Cannot open /dev/rfcomm0: Input/output error

4. Kill wvdial by pressing Ctrl+C
5. Kernel will OOPS with NULL pointer dereference

[ 1113.789211] BUG: unable to handle kernel NULL pointer dereference at 00000038
[ 1113.789239] IP: [<c0485dd0>] sock_sendmsg+0xb0/0xf0
[ 1113.789271] *pde = 00000000 
[ 1113.789287] Oops: 0000 [#1] SMP 
[ 1113.789301] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/ieee80211/phy0/rfkill1/state
[ 1113.789319] Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc i915 drm_kms_helper drm i2c_algo_bit binfmt_misc bridge stp llc bnep rfcomm sco l2cap crc16 sbs sbshc power_meter container acpi_cpufreq cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative dm_crypt dm_mod arc4 ecb snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event ath5k snd_seq snd_timer snd_seq_device mac80211 uvcvideo ath snd videodev v4l1_compat video backlight soundcore rtc_cmos cfg80211 intel_agp btusb rtc_core rtc_lib output serio_raw i2c_core evdev processor snd_page_alloc battery joydev ac button pcspkr agpgart bluetooth led_class sg sd_mod ata_generic ata_piix libata scsi_mod ide_pci_generic ehci_hcd sky2 uhci_hcd usbcore thermal fan thermal_sys fuse
[ 1113.789648] 
[ 1113.789668] Pid: 8335, comm: wvdial Not tainted 2.6.34-vanilla #3 NC10                       /NC10                       
[ 1113.789688] EIP: 0060:[<c0485dd0>] EFLAGS: 00210282 CPU: 0
[ 1113.789706] EIP is at sock_sendmsg+0xb0/0xf0
[ 1113.789717] EAX: f2105c30 EBX: 00000000 ECX: f2105d04 EDX: f256ec40
[ 1113.789727] ESI: f2105d04 EDI: f2105c30 EBP: f6640308 ESP: f2105c2c
[ 1113.789738]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 1113.789751] Process wvdial (pid: 8335, ti=f2104000 task=f2d19560 task.ti=f2104000)
[ 1113.789760] Stack:
[ 1113.789767]  00000004 c2015450 c201544c 00000000 00000001 ffffffff 00000000 00000000
[ 1113.789790] <0> 00000000 00000000 00000000 f2d19560 00000000 00000000 c014a06b 00000000
[ 1113.789814] <0> f2105ca0 c2015440 f3afc0b8 00200296 f2105cc0 c014a4a5 f3afc0c8 c06d9a04
[ 1113.789841] Call Trace:
[ 1113.789861]  [<c014a06b>] ? insert_work+0x7b/0x90
[ 1113.789877]  [<c014a4a5>] ? __queue_work+0x25/0x40
[ 1113.789892]  [<c0140ef8>] ? run_timer_softirq+0x168/0x270
[ 1113.789910]  [<c0179754>] ? __rcu_process_callbacks+0x44/0x2b0
[ 1113.789925]  [<c01799f3>] ? rcu_process_callbacks+0x33/0x40
[ 1113.789941]  [<c01399c3>] ? __do_softirq+0x103/0x180
[ 1113.789955]  [<c0485e3f>] ? kernel_sendmsg+0x2f/0x50
[ 1113.789980]  [<f85d412a>] ? rfcomm_send_frame+0x3a/0x40 [rfcomm]
[ 1113.790001]  [<f85d4188>] ? rfcomm_send_disc+0x58/0x60 [rfcomm]
[ 1113.790020]  [<f85d480a>] ? __rfcomm_dlc_close+0x1fa/0x230 [rfcomm]
[ 1113.790026]  [<c01d47d8>] ? fasync_helper+0x28/0x110
[ 1113.790026]  [<f85d4d51>] ? rfcomm_dlc_close+0x21/0x40 [rfcomm]
[ 1113.790026]  [<f85d8d23>] ? rfcomm_tty_close+0x43/0x180 [rfcomm]
[ 1113.790026]  [<c0427694>] ? tty_release+0x174/0x600
[ 1113.790026]  [<f85d90b9>] ? rfcomm_tty_open+0x259/0x370 [rfcomm]
[ 1113.790026]  [<c012f190>] ? default_wake_function+0x0/0x10
[ 1113.790026]  [<c0428382>] ? tty_open+0x512/0x590
[ 1113.790026]  [<c01cb316>] ? chrdev_open+0x126/0x1d0
[ 1113.790026]  [<c01c67a5>] ? __dentry_open+0xf5/0x2b0
[ 1113.790026]  [<c01d0627>] ? generic_permission+0x17/0xb0
[ 1113.790026]  [<c01c6a47>] ? nameidata_to_filp+0x47/0x50
[ 1113.790026]  [<c01cb1f0>] ? chrdev_open+0x0/0x1d0
[ 1113.790026]  [<c01d24d7>] ? do_last+0x397/0x630
[ 1113.790026]  [<c01d43a4>] ? do_filp_open+0x1d4/0x560
[ 1113.790026]  [<c01d38ed>] ? do_unlinkat+0x4d/0x160
[ 1113.790026]  [<c01c651d>] ? do_sys_open+0x6d/0x140
[ 1113.790026]  [<c01c666c>] ? sys_open+0x2c/0x40
[ 1113.790026]  [<c0102bd0>] ? sysenter_do_call+0x12/0x26
[ 1113.790026] Code: 44 24 30 00 00 00 00 c7 44 24 34 00 00 00 00 c7 84 24 8c 00 00 00 00 00 00 00 89 b4 24 90 00 00 00 8b 58 24 89 f8 89 0c 24 89 f1 <ff> 53 38 3d ef fd ff ff 74 1e 8b 9c 24 b4 00 00 00 8b b4 24 b8 
[ 1113.790026] EIP: [<c0485dd0>] sock_sendmsg+0xb0/0xf0 SS:ESP 0068:f2105c2c
[ 1113.790026] CR2: 0000000000000038
[ 1113.790548] ---[ end trace a45d69946fff6e7a ]---

After some investigation I have found that after reverting commit 9e726b17422bade75fba94e625cd35fd1353e682 kernel Oopses are gone and wvdial starts working as expected (reconnects when device goes back in range or exit after some timeout expired and device did not go back)
Patch included.
Comment 1 Lukáš Turek 2010-12-29 23:43:07 UTC
I seem to have a similar problem while synchronizing calendar in my phone (in 2.6.36). The attached patch fixes that too.

BUG: unable to handle kernel NULL pointer dereference at 00000018
IP: [<c117e40e>] sock_sendmsg+0x97/0xba
*pde = 00000000 
Oops: 0000 [#1] 
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.0/usb2/2-2/2-2:1.0/bluetooth/hci0/uevent
Modules linked in: ipv6 lm85 hwmon_vid hwmon rfcomm l2cap crc16 arc4 ecb crypto_blkcipher cryptomgr aead crypto_algapi ath9k ath5k ath9k_common ath9k_hw mac80211 ath i2c_i801 btusb cfg80211 e100 bluetooth led_class mii i2c_core

Pid: 5227, comm: syncml-ds-tool Not tainted 2.6.36-gentoo-r5 #1 D865PERL                       /        
EIP: 0060:[<c117e40e>] EFLAGS: 00010282 CPU: 0
EIP is at sock_sendmsg+0x97/0xba
EAX: 00000000 EBX: cb215c7c ECX: 00000004 EDX: cb215d54
ESI: cb215000 EDI: cb215d70 EBP: cb215d34 ESP: cb215c7c
 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process syncml-ds-tool (pid: 5227, ti=cb215000 task=df848d40 task.ti=cb215000)
Stack:
 00000000 00000000 00000000 00000001 ffffffff 00000000 00000000 00000000
<0> 00000000 00000000 df848d40 00000000 00000000 00000000 00000000 cb215cec
<0> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<c117e459>] ? kernel_sendmsg+0x28/0x37
 [<e0dc80e9>] ? rfcomm_send_frame+0x2f/0x37 [rfcomm]
 [<e0dc8144>] ? rfcomm_send_disc+0x53/0x55 [rfcomm]
 [<e0dc84de>] ? __rfcomm_dlc_close+0x50/0x1ba [rfcomm]
 [<e0dc89c3>] ? rfcomm_dlc_close+0x20/0x32 [rfcomm]
 [<e0dca852>] ? __rfcomm_sock_close+0x5e/0x6a [rfcomm]
 [<e0dca889>] ? rfcomm_sock_shutdown+0x2b/0x5e [rfcomm]
 [<e0dca8d4>] ? rfcomm_sock_release+0x18/0x51 [rfcomm]
 [<c117f1d4>] ? sock_release+0x14/0x5a
 [<c117f236>] ? sock_close+0x1c/0x20
 [<c1064f42>] ? fput+0xcb/0x170
 [<c1062d8a>] ? filp_close+0x51/0x5b
 [<c101fa53>] ? put_files_struct+0x57/0x9b
 [<c101fab5>] ? exit_files+0x1e/0x20
 [<c1020b3e>] ? do_exit+0x15a/0x4eb
 [<c102722a>] ? dequeue_signal+0xa1/0xf0
 [<c1020f11>] ? do_group_exit+0x42/0x63
 [<c1027e21>] ? get_signal_to_deliver+0x282/0x292
 [<c1001a2c>] ? do_signal+0x58/0x668
 [<c1011ab1>] ? lapic_next_event+0x16/0x1a
 [<c1037094>] ? clockevents_program_event+0xc6/0xd8
 [<c1037c52>] ? tick_dev_program_event+0x2e/0x10e
 [<c10345a7>] ? ktime_get_ts+0xba/0xc4
 [<c100205f>] ? do_notify_resume+0x23/0x42
 [<c11d3e86>] ? work_notifysig+0x13/0x19
Code: ff ff 00 00 00 00 c7 85 6c ff ff ff 00 00 00 00 c7 85 74 ff ff ff 00 00 00 00 c7 85 78 ff ff ff 00 00 00 00 c7 45 d0 00 00 00 00 <8b> 70 18 51 89 d1 89 c2 89 d8 ff 56 38 5e 3d ef fd ff ff 75 07 
EIP: [<c117e40e>] sock_sendmsg+0x97/0xba SS:ESP 0068:cb215c7c
CR2: 0000000000000018
---[ end trace 80e8fb7d0121785e ]---
Comment 2 Lukáš Turek 2010-12-30 23:36:04 UTC
It looks like a bug in rfcomm_session reference counting, because it is used after it was freed (and something probably overwrote the socket pointer):

rfcomm_process_dlcs: session de266600 state 8
rfcomm_session_put: session de266600 socket df0cc080 state 8
rfcomm_session_del: session de266600 socket df0cc080 state 8
rfcomm_session_clear_timer: session de266600 socket df0cc080 state 8
rfcomm_session_del: session de266600 was freed!
rfcomm_l2state_change: dc844800 state 9, session 000000ff socket 000000ff
rfcomm_sock_release: sock df0cab00, sk de0c7000, session de266600 socket   (null)
rfcomm_sock_shutdown: sock df0cab00, sk de0c7000
__rfcomm_sock_close: sk de0c7000 state 5 socket df0cab00
__rfcomm_dlc_close: dlc dc84a980 state 7 dlci 8 err 0 session de266600 socket   (null)
Comment 3 Florian Mickler 2011-01-29 14:18:02 UTC
This bugreport got mentioned in the changelog of a commit merged for 2.6.38-rc3: 

commit 683d949a7fbf33c244670e34d35c460e0d6558cb
Author: Lukáš Turek <8an@praha12.net>
Date:   Wed Jan 5 02:43:59 2011 +0100

    Bluetooth: Never deallocate a session when some DLC points to it