Bug 194819 - rtsx_usb_ms prevents my computer from resuming from suspend - Dell Inspiron 5558
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: Intel Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Duplicates: 119401
Depends on:
Blocks:
 
Reported: 2017-03-08 19:57 UTC by Diego Viola
Modified: 2017-04-03 13:09 UTC
CC List: 4 users

See Also:
Kernel Version: 4.9.11-1-ARCH
Subsystem:
Regression: No
Bisected commit-id:


Attachments
lspci (1.23 KB, text/plain), 2017-03-08 19:58 UTC, Diego Viola
dmesg 1 (12.70 KB, text/plain), 2017-03-13 23:34 UTC, Diego Viola
dmesg 2 (14.84 KB, text/plain), 2017-03-13 23:34 UTC, Diego Viola
trace after the hang with no_console_suspend=1 (2.42 MB, image/jpeg), 2017-03-13 23:39 UTC, Diego Viola
trace after the hang with no_console_suspend=1 (2.12 MB, image/jpeg), 2017-03-13 23:41 UTC, Diego Viola
Picture of my desktop (1.61 MB, image/jpeg), 2017-03-13 23:53 UTC, Diego Viola
Uninterruptible processes after booting system (2.31 MB, image/jpeg), 2017-03-14 17:35 UTC, Diego Viola
lsusb (632 bytes, text/plain), 2017-03-14 21:34 UTC, Diego Viola
dmidecode (22.63 KB, text/plain), 2017-03-14 21:34 UTC, Diego Viola
lsusb -t (833 bytes, text/plain), 2017-03-16 15:37 UTC, Diego Viola
lsusb -t with USB 3.0 disabled on BIOS (730 bytes, text/plain), 2017-03-16 15:50 UTC, Diego Viola
netconsole dmesg capture with xhci_hcd.dyndbg no_console_suspend=1 (597.93 KB, text/plain), 2017-03-17 15:44 UTC, Diego Viola
/sys/kernel/debug/tracing/trace (1.19 MB, text/plain), 2017-03-20 15:33 UTC, Diego Viola
/sys/kernel/debug/tracing/trace with successful suspend/resume (703.64 KB, text/plain), 2017-03-20 18:21 UTC, Diego Viola
dmesg with ftrace at the time of the crash. (829.45 KB, text/plain), 2017-03-21 22:47 UTC, Diego Viola

Description Diego Viola 2017-03-08 19:57:08 UTC
My Dell Inspiron 5558 hangs on resume from suspend if I have USB 3.0 enabled in the BIOS. It works fine with USB 2.0 (ehci_hcd).

The way I reproduce the problem is with this command:

$ i3lock && systemctl suspend

This is what I see on the screen when it hangs:

https://dl.dropboxusercontent.com/u/6005119/dell/IMG_20170308_095000.jpg
https://dl.dropboxusercontent.com/u/6005119/dell/IMG_20170307_133928.jpg

Some logs:

https://dl.dropboxusercontent.com/u/6005119/dell/dmesg1.txt
https://dl.dropboxusercontent.com/u/6005119/dell/dmesg2.txt

I'm on Arch Linux x86_64, kernel 4.9.11-1-ARCH.

I also tried Linux 4.10.1 and I could reproduce this problem there as well.

Please let me know if I can provide more info.

Thanks,
Diego
Comment 1 Diego Viola 2017-03-08 19:58:39 UTC
Created attachment 255137 [details]
lspci
Comment 2 Chen Yu 2017-03-13 10:29:49 UTC
Could you please upload the JPG images and dmesg logs to Bugzilla as attachments? Thanks.
Comment 3 Diego Viola 2017-03-13 23:34:31 UTC
Created attachment 255227 [details]
dmesg 1
Comment 4 Diego Viola 2017-03-13 23:34:51 UTC
Created attachment 255229 [details]
dmesg 2
Comment 5 Diego Viola 2017-03-13 23:39:31 UTC
Created attachment 255231 [details]
trace after the hang with no_console_suspend=1
Comment 6 Diego Viola 2017-03-13 23:41:36 UTC
Created attachment 255233 [details]
trace after the hang with no_console_suspend=1
Comment 7 Diego Viola 2017-03-13 23:42:21 UTC
(In reply to Chen Yu from comment #2)
> could you please upload the jpg and dmesg onto bugzilla as attachment, thx.

Done.
Comment 8 Diego Viola 2017-03-13 23:47:59 UTC
Please note that I'm not sure USB 3.0 is the actual cause. As I said earlier, disabling USB 3.0 makes the problem go away, but I also noticed that the problem goes away after I disable HDMI (the external monitor I'm using).

So I suspect the problem could be related to power or I/O in some way. The laptop has a 5400 RPM hard drive, and I'm not sure whether the hang is caused by heavy I/O that the drive or machine cannot handle.

I'm not sure how to debug this further; any advice is welcome.
Comment 9 Diego Viola 2017-03-13 23:53:20 UTC
Created attachment 255235 [details]
Picture of my desktop

Here is a picture of my desktop setup. I'm using the external monitor over HDMI.
Comment 10 Diego Viola 2017-03-14 00:08:49 UTC
I also tried Linux 4.4.52 and the problem is still there, which makes me suspect a hardware problem.
Comment 11 Diego Viola 2017-03-14 17:24:26 UTC
I've found something interesting that seems to be the cause of my problem.

As soon as I boot my system, I can see the following process in the D state:

[root@myhost ~]# ps aux | grep " D"
root       269  0.0  0.0      0     0 ?        D    14:11   0:00 [rtsx_usb_ms_2]
root      1490  0.0  0.0  10788  2200 pts/2    S+   14:23   0:00 grep  D
[root@myhost ~]# 

I'm not exactly sure why that process is in the D state, but if I do a "modinfo rtsx_pci_ms" the problem is gone. I have suspended and resumed ~40 times since disabling that module, and the hang has not come back.
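
A side note on the check above: grep " D" matches any line that contains a space followed by "D", including the grep process itself. A slightly stricter sketch filters on the state column instead. The helper name list_dstate is made up for illustration, and the usage line assumes the procps version of ps.

```shell
# Print only the tasks in uninterruptible sleep (state D).
# Reads "PID STAT COMMAND" lines on stdin and keeps those whose
# STAT column starts with D. list_dstate is a hypothetical helper name.
list_dstate() {
    awk '$2 ~ /^D/'
}

# Typical use (assumes procps ps; "=" suppresses the column headers):
#   ps -eo pid=,stat=,comm= | list_dstate
```

Filtering on the STAT column avoids false matches on command lines that merely contain " D".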
Comment 12 Diego Viola 2017-03-14 17:25:43 UTC
(In reply to Diego Viola from comment #11)
> I've found something interesting and what it seems to be the cause of my
> problem.
> 
> As soon as I boot my system I can see the following process being in the
> D-state:
> 
> [root@myhost ~]# ps aux | grep " D"
> root       269  0.0  0.0      0     0 ?        D    14:11   0:00
> [rtsx_usb_ms_2]
> root      1490  0.0  0.0  10788  2200 pts/2    S+   14:23   0:00 grep  D
> [root@myhost ~]# 
> 
> I'm not exactly sure why that process is in the D-state, but if I do a
> "modinfo rtsx_pci_ms" the problem is gone, I already tried
> suspending/resuming ~40 times after I disabled that module and the
> suspend/resume problem is gone.

Correction: I meant rmmod rtsx_pci_ms, not modinfo. Sorry.
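
If unloading the module really avoids the hang, the manual rmmod could be automated around suspend. The following is only a sketch of a systemd sleep hook body, not a tested workaround. It assumes the module to unload is rtsx_usb_ms (taken from the bug title; adjust if a different module is involved), and the names sleep_hook, MODULE and MODPROBE are illustrative.

```shell
# Sketch of a systemd sleep hook. systemd-sleep runs executables placed in
# /usr/lib/systemd/system-sleep/ with $1 = pre|post and $2 = the sleep type.
# MODULE and MODPROBE are illustrative variables; rtsx_usb_ms is an
# assumption based on the bug title.
MODULE=${MODULE:-rtsx_usb_ms}
MODPROBE=${MODPROBE:-modprobe}

sleep_hook() {
    case "$1" in
        pre)  "$MODPROBE" -r "$MODULE" ;;  # unload before suspending
        post) "$MODPROBE" "$MODULE" ;;     # load again after resume
    esac
}

# In an installed hook script the last line would be:
#   sleep_hook "$@"
```

Keeping the modprobe command in a variable makes it possible to dry-run the hook without touching any module.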
Comment 13 Diego Viola 2017-03-14 17:35:06 UTC
Created attachment 255245 [details]
Uninterruptible processes after booting system

After booting the system, those appear as D-state processes. Only [rtsx_usb_ms_2] remains there, and it seems to be the cause of my problem.
Comment 14 Diego Viola 2017-03-14 21:34:00 UTC
Created attachment 255251 [details]
lsusb
Comment 15 Diego Viola 2017-03-14 21:34:53 UTC
Created attachment 255253 [details]
dmidecode
Comment 16 Diego Viola 2017-03-15 14:01:59 UTC
According to this document:

http://downloads.dell.com/manuals/all-products/esuprt_laptop/esuprt_inspiron_laptop/inspiron-15-5558-laptop_reference%20guide_en-us.pdf

My computer only has an SD card slot and no MEMSTICK slot.

lsusb says this though:

Bus 001 Device 005: ID 0bda:0129 Realtek Semiconductor Corp. RTS5129 Card Reader Controller

Maybe the driver gets locked up looking for the MEMSTICK slot?
Comment 17 Chen Yu 2017-03-15 14:38:22 UTC
(In reply to Diego Viola from comment #13)
> Created attachment 255245 [details]
> Uninterruptible processes after booting system
> 
> After booting the system those appear as D-state processes, only
> [rtsx_usb_ms_2] remains there and it seems to be the cause of my problem.
Nice catch. Although I cannot open your attachment, I suggest checking the kernel stack with 'cat /proc/269/stack'. This will tell us where the code is blocked in kernel space and give us a clue.
Comment 18 Chen Yu 2017-03-15 14:39:33 UTC
(In reply to Chen Yu from comment #17)
> (In reply to Diego Viola from comment #13)
> > Created attachment 255245 [details]
> > Uninterruptible processes after booting system
> > 
> > After booting the system those appear as D-state processes, only
> > [rtsx_usb_ms_2] remains there and it seems to be the cause of my problem.
> Nice catch. Although I can not open your attachment, I suggest you can check
> the kernel stack via 'cat /proc/269/stack, this will tell us where the code
> was blocked at in kernel space, thus we can get some clue.
269 is the pid of the thread rtsx_usb_ms_2
Comment 19 Diego Viola 2017-03-15 14:47:03 UTC
[root@myhost ~]# ps aux | grep " D"
root       214  0.0  0.0      0     0 ?        D    10:03   0:02 [kworker/0:3]
root      8579  0.0  0.0      0     0 ?        D    11:45   0:00 [rtsx_usb_ms_2]
root      8615  0.0  0.0  10788  2220 pts/8    S+   11:46   0:00 grep  D
[root@myhost ~]# cat /proc/8579/stack 
[<ffffffffa062039e>] rtsx_usb_detect_ms_card+0x6e/0x120 [rtsx_usb_ms]
[<ffffffff810a3361>] kthread+0x101/0x140
[<ffffffff8162eb9c>] ret_from_fork+0x2c/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
[root@myhost ~]#
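
The same check can be scripted for every thread with a matching name. This is a rough sketch: the helper names top_frame and blocked_in are hypothetical, and it assumes root access and a kernel that exposes /proc/<pid>/stack.

```shell
# Print the symbol name of the first frame of a /proc/<pid>/stack dump
# read from stdin, e.g. "rtsx_usb_detect_ms_card".
top_frame() {
    sed -n '1s/^\[<[^>]*>] \([^+ ]*\).*/\1/p'
}

# Print "pid: top frame" for every thread whose name matches $1.
# Needs root, since /proc/<pid>/stack is not readable by ordinary users.
blocked_in() {
    for pid in $(pgrep "$1"); do
        printf '%s: %s\n' "$pid" "$(top_frame < "/proc/$pid/stack")"
    done
}

# Typical use:
#   blocked_in rtsx_usb_ms
```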
Comment 20 Diego Viola 2017-03-15 14:48:32 UTC
I'm confused as to why rtsx_usb_ms is being loaded at all if I don't have a MEMSTICK slot.
Comment 21 Diego Viola 2017-03-15 15:21:02 UTC
(In reply to Diego Viola from comment #19)
> [root@myhost ~]# ps aux | grep " D"
> root       214  0.0  0.0      0     0 ?        D    10:03   0:02
> [kworker/0:3]
> root      8579  0.0  0.0      0     0 ?        D    11:45   0:00
> [rtsx_usb_ms_2]
> root      8615  0.0  0.0  10788  2220 pts/8    S+   11:46   0:00 grep  D
> [root@myhost ~]# cat /proc/8579/stack 
> [<ffffffffa062039e>] rtsx_usb_detect_ms_card+0x6e/0x120 [rtsx_usb_ms]
> [<ffffffff810a3361>] kthread+0x101/0x140
> [<ffffffff8162eb9c>] ret_from_fork+0x2c/0x40
> [<ffffffffffffffff>] 0xffffffffffffffff
> [root@myhost ~]#

Please note that I ran this command while using the system normally (before suspend/resume).

I can't do this after the system hangs because the machine stops responding and I can no longer connect via ssh.
Comment 22 Chen Yu 2017-03-15 15:27:10 UTC
I think it is blocked here:
 mutex_lock(&ucr->dev_mutex);
but I don't know why. Maybe we should let an expert deal with it;
please send an email to Roger Tseng <rogerable@realtek.com>.
Comment 23 Diego Viola 2017-03-15 20:59:09 UTC
(In reply to Chen Yu from comment #22)
> I think it blocked here:
>  mutex_lock(&ucr->dev_mutex);
> but I don't know why, maybe we should let expert to deal with it,
> please help send an email to Roger Tseng <rogerable@realtek.com>

Done.
Comment 24 Diego Viola 2017-03-16 15:37:54 UTC
Created attachment 255301 [details]
lsusb -t
Comment 25 Diego Viola 2017-03-16 15:50:06 UTC
Created attachment 255303 [details]
lsusb -t with USB 3.0 disabled on BIOS
Comment 26 Diego Viola 2017-03-16 16:07:12 UTC
I've noticed rtsx_usb_ms is still loaded after USB 3.0 is disabled in the BIOS, and in that configuration the hang doesn't occur on resume from suspend.
Comment 27 Oleksandr Natalenko 2017-03-16 19:56:41 UTC
// responding here as per author email request

I cannot confirm the issue with my hardware:

Bus 001 Device 003: ID 0bda:0129 Realtek Semiconductor Corp. RTS5129 Card Reader Controller
Device Descriptor:
...
  bcdUSB               2.00
...
  idVendor           0x0bda Realtek Semiconductor Corp.
  idProduct          0x0129 RTS5129 Card Reader Controller
...

rtsx_usb_ms_%d poller kthread is OK to be in D state. Mine:

  416 ?        D      0:56 [rtsx_usb_ms_1]

It sleeps on schedule_timeout_idle(HZ), which is kinda normal sleep.

Suspend works for me just fine.

Please note my card reader reports USB 2.0 as highest supportable, so I assume there is some issue with USB 3.0 sleep on your system.
Comment 28 Diego Viola 2017-03-17 04:42:18 UTC
(In reply to Oleksandr Natalenko from comment #27)
> // responding here as per author email request
> 
> I cannot confirm the issue with my hardware:
> 
> Bus 001 Device 003: ID 0bda:0129 Realtek Semiconductor Corp. RTS5129 Card
> Reader Controller
> Device Descriptor:
> ...
>   bcdUSB               2.00
> ...
>   idVendor           0x0bda Realtek Semiconductor Corp.
>   idProduct          0x0129 RTS5129 Card Reader Controller
> ...
> 
> rtsx_usb_ms_%d poller kthread is OK to be in D state. Mine:
> 
>   416 ?        D      0:56 [rtsx_usb_ms_1]
> 
> It sleeps on schedule_timeout_idle(HZ), which is kinda normal sleep.
> 
> Suspend works for me just fine.
> 
> Please note my card reader reports USB 2.0 as highest supportable, so I
> assume there is some issue with USB 3.0 sleep on your system.

I can only make it hang with rtsx_usb_ms + USB 3.0 (xhci_hcd) and not with rtsx_usb_ms + USB 2.0 (ehci_hcd).
Comment 29 Diego Viola 2017-03-17 15:44:40 UTC
Created attachment 255309 [details]
netconsole dmesg capture with xhci_hcd.dyndbg no_console_suspend=1
Comment 30 Diego Viola 2017-03-17 16:22:58 UTC
(In reply to Diego Viola from comment #29)
> Created attachment 255309 [details]
> netconsole dmesg capture with xhci_hcd.dyndbg no_console_suspend=1

In this attachment I ran a few suspend/resume cycles with i3lock, e.g.

$ i3lock && systemctl suspend

It hung after the third or so attempt at resuming from suspend.
Comment 31 Joost van Zwieten 2017-03-20 12:30:18 UTC
*** Bug 119401 has been marked as a duplicate of this bug. ***
Comment 32 Diego Viola 2017-03-20 15:33:40 UTC
Created attachment 255367 [details]
/sys/kernel/debug/tracing/trace

I built Linux 4.11.0-rc3-ARCH and captured this trace with:

mount -t debugfs none /sys/kernel/debug
echo xhci-hcd >> /sys/kernel/debug/tracing/set_event

cat /sys/kernel/debug/tracing/trace

USB mouse/keyboard was unplugged before booting the machine.

This trace was requested by the USB 3 maintainer on the LKML (linux-kernel, linux-usb lists).
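
For repeated captures, the steps above can be wrapped in two small helpers. This is a sketch only: trace_enable and trace_save are hypothetical names, the tracing directory is passed in as a parameter, and root is assumed when it points at /sys/kernel/debug/tracing.

```shell
# Clear the trace buffer and enable one event group.
# $1 = tracing directory, $2 = event group (e.g. xhci-hcd).
trace_enable() {
    : > "$1/trace"                 # writing to "trace" clears the ring buffer
    echo "$2" >> "$1/set_event"    # append, keeping events already enabled
}

# Copy the current trace buffer to a file.
# $1 = tracing directory, $2 = output file.
trace_save() {
    cat "$1/trace" > "$2"
}

# Typical use (as root, with debugfs mounted):
#   trace_enable /sys/kernel/debug/tracing xhci-hcd
#   ... suspend and resume ...
#   trace_save /sys/kernel/debug/tracing /tmp/xhci-trace.txt
```

Clearing the buffer first keeps events from earlier suspend/resume cycles out of the saved file. Saving only works if the machine survives the resume.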
Comment 33 Diego Viola 2017-03-20 18:21:32 UTC
Created attachment 255369 [details]
/sys/kernel/debug/tracing/trace with successful suspend/resume
Comment 34 Diego Viola 2017-03-21 22:47:34 UTC
Created attachment 255419 [details]
dmesg with ftrace at the time of the crash.

kernel parameters used for getting the trace:

hung_task_panic=1 no_console_suspend=1 ftrace_dump_on_oops

USB keyboard and mouse were plugged.

The module netconsole was loaded/configured after the system was already running.
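
Before reproducing the hang it can help to confirm that the parameters above actually reached the running kernel. A minimal sketch, with has_param as a made-up helper name:

```shell
# Succeed if the kernel command line on stdin contains parameter $1,
# either bare ("ftrace_dump_on_oops") or with a value ("hung_task_panic=1").
# has_param is a hypothetical helper name.
has_param() {
    tr ' ' '\n' | grep -E "^$1(=|\$)" > /dev/null
}

# Typical use:
#   has_param no_console_suspend < /proc/cmdline && echo "set"
```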
Comment 35 Diego Viola 2017-03-25 14:47:19 UTC
Fixed with this patch:

https://patchwork.kernel.org/patch/9642795/
Comment 36 Diego Viola 2017-04-03 13:09:51 UTC
The patch has been merged upstream and is included in Linux 4.11-rc5.
