Bug 109431 - Crash at USB audio device removal
Summary: Crash at USB audio device removal
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Sound(ALSA) (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Jaroslav Kysela
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-15 17:47 UTC by Christian Casteyde
Modified: 2016-07-10 11:21 UTC (History)
1 user (show)

See Also:
Kernel Version: 4.3.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Test fix patch (401 bytes, patch)
2016-01-12 08:58 UTC, Takashi Iwai
Details | Diff
Patch to add timer refcount (1.73 KB, patch)
2016-01-12 14:33 UTC, Takashi Iwai
Details | Diff
Fix patch #2 (6.96 KB, patch)
2016-01-21 16:33 UTC, Takashi Iwai
Details | Diff
Revised fix patch #2 (6.18 KB, patch)
2016-01-21 16:50 UTC, Takashi Iwai
Details | Diff

Description Christian Casteyde 2015-12-15 17:47:12 UTC
I have a USB audio stick.
I was playing a CD on this audio device with vlc. I removed the device while it was opened by VLC.
a. VLC was freezed.
b. Kill -9 on it leaves it "Zl" (zombi).
c. I closed the X session.
d. I noticed the following crash dump on the console:
usb 1-2: ep 0x3 - rounding interval to 64 microframes, ep desc says 80 microframes
input: SYNIC SYNIC Wireless Audio as /devices/pci0000:00/0000:00:1c.3/0000:05:00.0/usb1/1-2/1-2:1.3/0003:1A1D:0002.0006/input/
input18
hid-generic 0003:1A1D:0002.0006: input,hidraw4: USB HID v1.00 Device [SYNIC SYNIC Wireless Audio] on usb-0000:05:00.0-2/input3
usb 1-2: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000190
IP: [<ffffffff815ae7ef>] usb_autopm_put_interface+0xf/0x30
PGD 0 
Oops: 0002 [#1] PREEMPT SMP 
Modules linked in:
CPU: 6 PID: 8207 Comm: vlc Not tainted 4.3.3 #10
Hardware name: Acer Aspire 7750G/JE70_HR, BIOS V1.07 03/02/2011
task: ffff880070901700 ti: ffff8800712c8000 task.ti: ffff8800712c8000
RIP: 0010:[<ffffffff815ae7ef>]  [<ffffffff815ae7ef>] usb_autopm_put_interface+0xf/0x30
RSP: 0018:ffff8800712cbb10  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880191123a18 RCX: 000000018040003b
RDX: 0000000100721458 RSI: 0000000000000001 RDI: ffff8801c6fcc400
RBP: ffff8800712cbb10 R08: 0000000000000000 R09: ffff8801cf5974c0
R10: ffffea00060c3fc0 R11: ffff8801c8803b00 R12: ffff880191123a10
R13: ffff880071026200 R14: ffff8801c7af7cd0 R15: ffff8800aaaf56e8
FS:  00007fe0647d7700(0000) GS:ffff8801cf580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000190 CR3: 0000000001c80000 CR4: 00000000000406e0
Stack:
 ffff8800712cbb20 ffffffff8173bd94 ffff8800712cbb40 ffffffff817461bc
 ffff8801c6fc9000 ffff8800aaaf5000 ffff8800712cbb50 ffffffff8174621f
 ffff8800712cbb70 ffffffff816ef58a ffff8801c6fc9000 ffff8800aaaf5000
Call Trace:
 [<ffffffff8173bd94>] snd_usb_autosuspend+0x14/0x20
 [<ffffffff817461bc>] snd_usb_pcm_close.isra.14+0x5c/0x90
 [<ffffffff8174621f>] snd_usb_playback_close+0xf/0x20
 [<ffffffff816ef58a>] snd_pcm_release_substream.part.36+0x3a/0x90
 [<ffffffff816ef6b3>] snd_pcm_release+0xa3/0xb0
 [<ffffffff816debb0>] snd_disconnect_release+0xd0/0xe0
 [<ffffffff8114d417>] __fput+0x97/0x1d0
 [<ffffffff8114d589>] ____fput+0x9/0x10
 [<ffffffff8109e452>] task_work_run+0x72/0x90
 [<ffffffff81088510>] do_exit+0x280/0xa80
 [<ffffffff8108996a>] do_group_exit+0x3a/0xa0
 [<ffffffff8109261f>] get_signal+0x1df/0x540
 [<ffffffff81040903>] do_signal+0x23/0x620
 [<ffffffff8114c128>] ? do_readv_writev+0x128/0x200
 [<ffffffff810012e1>] prepare_exit_to_usermode+0x91/0xd0
 [<ffffffff810013ba>] syscall_return_slowpath+0x9a/0x120
 [<ffffffff817587cd>] ? __sys_recvmsg+0x5d/0x70
 [<ffffffff810d2765>] ? ktime_get_ts64+0x45/0xe0
 [<ffffffff8115dea0>] ? SyS_poll+0x60/0xf0
 [<ffffffff818d2327>] int_ret_from_sys_call+0x25/0x8f
Code: 90 00 00 00 48 89 e5 e8 70 b7 f3 ff 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8b 15 18 f8 6c 00 48 8b 47 30 48 89 e5 <48> 89 90 90 01 00 00 f0 ff 8f e0 02 00 00 be 04 00 00 00 48 83 
RIP  [<ffffffff815ae7ef>] usb_autopm_put_interface+0xf/0x30
 RSP <ffff8800712cbb10>
CR2: 0000000000000190
---[ end trace 5475a2d2d7ca39fd ]---
Fixing recursive fault but reboot is needed!
Comment 1 Takashi Iwai 2016-01-12 08:57:56 UTC
Does the patch below fix the issue?
Comment 2 Takashi Iwai 2016-01-12 08:58:24 UTC
Created attachment 199421 [details]
Test fix patch
Comment 3 Christian Casteyde 2016-01-12 12:11:43 UTC
I've managed to reproduce once with Kernel 4.4.0 without the patch, and I haven't managed to reproduce with the attached patch after 10 usb disconnect.
So the patch seems to fix the problem.
Vlc still do not shutdown correctly after disconnection, and when killing vlc after disconnect I sometimes get the following log in dmesg:

ALSA: timer ffff8800a89d7200 is busy?

but this may be correct.
Comment 4 Takashi Iwai 2016-01-12 14:32:40 UTC
You're using dmix?  This might be because we haven't do a proper refcount for timer device access.  Could you try the patch below in addition?

Meanwhile I submitted the USB-audio fix patch.  It'll be likely included in the next pull request and will be backported eventually.
Comment 5 Takashi Iwai 2016-01-12 14:33:09 UTC
Created attachment 199431 [details]
Patch to add timer refcount
Comment 6 Christian Casteyde 2016-01-13 20:44:32 UTC
I don't know if I'm really using dmix. I do not have ALSA config file (neither in /etc or in home), but I believed dmix was always on by default now.

Anyway, with the refcount patch, I do not get the timer is busy warning, but vlc stays in zombie state forever now. After that, it is impossible to restart another one, and I cannot kill its parent either (kde4 init). Even reboot fails (/home and / not properly unmounted because "in use"), and reboot blocks at last until hard reset.

It seems the process is definitively blocked, so it's worse as far as the user is concerned.
Comment 7 Takashi Iwai 2016-01-16 17:37:12 UTC
OK, then could you tell the exact way how to reproduce the issue?  Which application, how it is started, and how is it setup?  It's pretty important to know how the audio device is accessed, either dmix, PA, direct hw, or whatever.
Comment 8 Christian Casteyde 2016-01-17 14:41:37 UTC
OK, I'll describe the exact manipulation.
I'm running Slackware 14.1 64bits, which doesn't use any fancy things such as dmix, etc.
I have no /etc/asound.conf nor ~/.asoundrc file, so I'm using ALSA with default configuration.
I'm running KDE, and playing sounds with VLC.

The way to reproduce the problem is:
1) Plug in the USB audio transmitter.
2) Insert a CD in the drive.
3) Starts playing the CD with VLC (from the KDE notification popup menu). This starts vlc fiving it the CD device to read.
4) VLC starts playing the first title, on the laptop soundcard.
5) I switch the audio output to the USB card, using VLC menu "Audio | Audio device" to select the USB devince (default).
VLC is configured to do "SRC resampling" with "Sinc function (best quality)" since ALSA do not resample correctly (cracks and cute sounds) to the USB card sample rate (48kHz whereas the CD is at 44kHz).
6) After playing the CD a while, I simply disconnect the USB drive, ***before*** closing VLC.
7) Then I try to close VLC: its windows is indeed closed, but VLC remains in the task list.
8) If I try to restart it, it doesn't since it's also configured as single instance (option "Interface|Use only one instance when started from file manager"). So I try to kill it with "killall -9 vlc".
9) VLC is then Zombie, with some KDE process as parent, but actually the kernel is already in a mess. Closing KDE session takes a while, reboot is endless: after a warning saying home and / partition are in use, the kernel freezes.
It seems VLC is stucked while sending to the removed interface (in kernel space).

Please note that sometimes, it works. But I can reproduce say 1 over 3 tries.
Of course, closing VLC before removing the USB sticks works fine.
Comment 9 Takashi Iwai 2016-01-21 16:33:00 UTC
Thanks, I could manage to reproduce it locally, too.

Below is the fix patch.  It works on my machine.  VLC keeps spewing some error messages after disconnection, but it's a problem in the application side, basically.
Comment 10 Takashi Iwai 2016-01-21 16:33:31 UTC
Created attachment 200711 [details]
Fix patch #2
Comment 11 Takashi Iwai 2016-01-21 16:50:53 UTC
Created attachment 200741 [details]
Revised fix patch #2
Comment 12 Christian Casteyde 2016-07-10 11:21:01 UTC
Has been fixed but not closed, so closing.

Note You need to log in before you can comment on or make changes to this bug.