Created attachment 287591 [details]
Dmesg log when disconnecting dock

Hello,

I have a Lenovo C940 running Manjaro with unstable packages and kernel 5.6-rc2, using an OWC Thunderbolt 3 dock. If I connect the dock before boot, it works correctly. However, if I connect or disconnect it after boot, it is not recognized and the system crashes soon afterwards: the GNOME icons start to disappear, and commands run in a terminal fail with input/output errors.

I found another bug which seems related to mine, but in my case it leads to a system crash: https://bugzilla.kernel.org/show_bug.cgi?id=206459.

I am also encountering another problem: a USB-C drive on the Thunderbolt 3 ports works if connected before boot and not otherwise, but without leading to a system crash: https://bugzilla.kernel.org/show_bug.cgi?id=206649

I attached the dmesg log, where the disconnection starts at time 185, with some errors occurring just after, such as:

[ 184.038519] usb usb6: USB disconnect, device number 1
[ 184.038523] usb 6-3: USB disconnect, device number 2
[ 184.167256] usb 5-4: Not enough bandwidth for altsetting 1
[ 184.167260] usb 5-4: 1:1: usb_set_interface failed (-19)

journalctl also shows some errors:

fév 23 17:55:06 pc-de-user kernel: usb usb6: USB disconnect, device number 1
fév 23 17:55:06 pc-de-user kernel: usb 6-3: USB disconnect, device number 2
fév 23 17:55:06 pc-de-user kernel: usb 5-4: Not enough bandwidth for altsetting 1
fév 23 17:55:06 pc-de-user kernel: usb 5-4: 1:1: usb_set_interface failed (-19)

Then the system crashes, showing "input/output failures" on simple commands like "ls" or "cat".
Created attachment 287593 [details] Journalctl when unplug of the dock
Hi, There seem to be a bunch of issues with these recent Lenovos. Can you attach full dmesg when you boot without device connected and then plug a device (assuming it does not crash)?
(In reply to Mika Westerberg from comment #2)
> Hi,
>
> There seem to be a bunch of issues with these recent Lenovos. Can you attach
> full dmesg when you boot without device connected and then plug a device
> (assuming it does not crash)?

Hello Mika, thanks for your reply.

I took three dmesg logs:
- after boot without any device connected -> boot-without-dock-5.6-rc3.log
- just after connecting the dock -> connecting-dock-5.6-rc3.log
- just before the system crash (GNOME unresponsive with icons disappearing, input/output failure messages in the terminal, system freeze) -> dock-connected-before-crash.log

I also put the output of lspci -vnnt in the lscpi.txt file. Hope it helps.
Created attachment 287609 [details] Dmesg once booted without any device connected
Created attachment 287611 [details] Dmesg just after dock connected
Created attachment 287613 [details] Dmesg with dock connected before OS crashes
Created attachment 287615 [details] lspci -vnnt
Created attachment 287617 [details]
Dmesg once booted with kernel 5.4.22 with dock connected

Dmesg once booted with kernel 5.4.22 with dock connected from start and working.
Thanks for the logs. This looks like a similar issue to the one in bug 206459. The PCIe resource allocation fails for the dock and then bad things start to happen. There is a temporary hack patch in bug 206459, comment 46. I wonder if you could try it out as well?
Thanks a lot! It seems it did the trick. Here are the dmesg logs:
- boot-without-dock-5.6-rc3-patched1.log: the log after boot without dock.
- dock-connected-5.6-rc3-patched1.log: the log a few minutes after having connected the dock
- dock-reconnection-5.6-rc3-patched1.log: the log after disconnection and reconnection of the dock.

It is working, but several errors occurred:

[ 393.595374] usb 5-4: Not enough bandwidth for altsetting 1
[ 393.595378] usb 5-4: 1:1: usb_set_interface failed (-19)
[ 393.705539] xhci_hcd 0000:03:00.0: Host halt failed, -19
[ 393.705542] xhci_hcd 0000:03:00.0: Host not accessible, reset failed.
[ 393.706487] pcieport 0000:02:00.0: can't change power state from D0 to D3hot (config space inaccessible)
[ 393.706498] pcieport 0000:02:00.0: PME# disabled
[ 393.706502] pcieport 0000:02:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[ 393.706755] pcieport 0000:01:00.0: PME# disabled
[ 505.290804] pci 0000:02:01.0: BAR 13: no space for [io size 0x1000]
[ 505.290805] pci 0000:02:01.0: BAR 13: failed to assign [io size 0x1000]
[ 505.290806] pci 0000:02:02.0: BAR 13: no space for [io size 0x1000]
[ 505.290806] pci 0000:02:02.0: BAR 13: failed to assign [io size 0x1000]
[ 505.290807] pci 0000:02:03.0: BAR 13: no space for [io size 0x1000]
[ 505.290807] pci 0000:02:03.0: BAR 13: failed to assign [io size 0x1000]
[ 505.290808] pci 0000:02:04.0: BAR 13: no space for [io size 0x1000]
[ 505.290808] pci 0000:02:04.0: BAR 13: failed to assign [io size 0x1000]
Created attachment 287645 [details] the log after boot without dock
Created attachment 287647 [details] the log a few minutes after having connected the log
Created attachment 287649 [details] the log after disconnection and reconnection of the dock.
Looks like you don't have TBT driver enabled, or at least I don't see any messages emitted by it. Can you check if you have "CONFIG_USB4" set to y/m in your .config and if not please enable it. While there check also "CONFIG_PCI_DEBUG" and enable it as well. It looks like the TBT link goes down which could indicate either an issue on PM side or firmware.
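The config check requested above can be scripted. The following is only a minimal sketch: the helper names `kernel_config_options` and `read_kernel_config` are made up for illustration, and the sample text is a hypothetical excerpt, not the reporter's actual .config.

```python
import gzip
import re
from pathlib import Path


def kernel_config_options(config_text, names):
    """Return {name: value} for the requested kernel config options.

    The value is "y", "m", or another literal. None means the option is
    absent or only present as a "# CONFIG_FOO is not set" comment.
    """
    values = {name: None for name in names}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in values:
            values[match.group(1)] = match.group(2)
    return values


def read_kernel_config(path):
    """Read a kernel config file, decompressing it if it is gzip data."""
    raw = Path(path).read_bytes()
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


# Hypothetical excerpt of a config where neither option is enabled.
sample = "CONFIG_PCI=y\n# CONFIG_USB4 is not set\n# CONFIG_PCI_DEBUG is not set\n"
print(kernel_config_options(sample, ["CONFIG_USB4", "CONFIG_PCI_DEBUG"]))
```

On a real system the text would come from the .config in the kernel build tree, or from a distribution-provided copy; where that copy lives depends on the distribution and is an assumption to verify.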
Indeed, the Manjaro 5.6-rc3 kernel does not have CONFIG_USB4 at all. Should it be reported to the Manjaro kernel maintainers, or is it something to handle upstream?
It was changed from CONFIG_THUNDERBOLT -> CONFIG_USB4 in v5.6-rc1. Distro builders should see a prompt asking about it but I guess they missed it. If you know how to contact the Manjaro people then it definitely makes sense to mention this one.

BTW, I attached a slightly better patch for the resource allocation on that other bug; if you have time please try it out as well (and drop the hack patch).
I rebuilt the kernel with CONFIG_USB4 and PCI_DEBUG and the patch. I attached the dmesg logs taken when connecting the dock after boot and also on disconnection. I still get these messages:

[ 91.235272] pci 0000:02:00.0: BAR 13: assigned [io 0x4000-0x4fff]
[ 91.235273] pci 0000:02:01.0: BAR 13: no space for [io size 0x1000]
[ 91.235274] pci 0000:02:01.0: BAR 13: failed to assign [io size 0x1000]
[ 91.235276] pci 0000:02:02.0: BAR 13: no space for [io size 0x1000]
[ 91.235276] pci 0000:02:02.0: BAR 13: failed to assign [io size 0x1000]
[ 91.235278] pci 0000:02:03.0: BAR 13: no space for [io size 0x1000]
[ 91.235279] pci 0000:02:03.0: BAR 13: failed to assign [io size 0x1000]
[ 91.235280] pci 0000:02:04.0: BAR 13: no space for [io size 0x1000]
[ 91.235281] pci 0000:02:04.0: BAR 13: failed to assign [io size 0x1000]

On disconnection:

[ 329.891837] pcieport 0000:02:03.0: can't change power state from D0 to D3hot (config space inaccessible)
[ 329.891901] pcieport 0000:02:03.0: PME# disabled
[ 329.891905] pcieport 0000:02:03.0: can't change power state from D3cold to D0 (config space inaccessible)
[ 329.892276] xhci_hcd 0000:05:00.0: PME# disabled

And a large bunch of these messages:

[ 493.181713] usb 5-4: Not enough bandwidth for altsetting 1
[ 493.181714] usb 5-4: 2:1: usb_set_interface failed (-19)
[ 493.181736] usb 5-4: Not enough bandwidth for altsetting 1
[ 493.181737] usb 5-4: 2:1: usb_set_interface failed (-19)
[ 493.181798] usb 5-4: Not enough bandwidth for altsetting 1
[ 493.181799] usb 5-4: 2:1: usb_set_interface failed (-19)
[ 493.181865] systemd-journald[331]: /dev/kmsg buffer overrun, some messages lost.
[ 493.181866] usb 5-4: Not enough bandwidth for altsetting 1
[ 493.181867] usb 5-4: 2:1: usb_set_interface failed (-19)

I think these come from the USB drive disconnection, but quite a lot of them are generated, which I believe is not normal.

Otherwise the dock seems to work perfectly! Thanks!
Created attachment 287677 [details] Dmesg just after dock connected
Created attachment 287679 [details] Dmesg log when disconnecting dock
I just encountered several successive unexpected disconnections/reconnections of the dock interface while writing this message. I attached two dmesg logs taken right after the disconnection/reconnection; hopefully you will find some hints about what happened.
Created attachment 287681 [details] Connection to dock dropped and recovered
Created attachment 287683 [details] Second successive connection to dock dropped and recovered
(In reply to A from comment #17)
> I rebuilt the kernel with CONFIG_USB4 and PCI_DEBUG and the patch. I
> attached the dmesg logs taken when connecting the dock after boot and also
> on disconnection. I still get these messages:
>
> [ 91.235272] pci 0000:02:00.0: BAR 13: assigned [io 0x4000-0x4fff]
> [ 91.235273] pci 0000:02:01.0: BAR 13: no space for [io size 0x1000]
> [ 91.235274] pci 0000:02:01.0: BAR 13: failed to assign [io size 0x1000]
> [ 91.235276] pci 0000:02:02.0: BAR 13: no space for [io size 0x1000]
> [ 91.235276] pci 0000:02:02.0: BAR 13: failed to assign [io size 0x1000]
> [ 91.235278] pci 0000:02:03.0: BAR 13: no space for [io size 0x1000]
> [ 91.235279] pci 0000:02:03.0: BAR 13: failed to assign [io size 0x1000]
> [ 91.235280] pci 0000:02:04.0: BAR 13: no space for [io size 0x1000]
> [ 91.235281] pci 0000:02:04.0: BAR 13: failed to assign [io size 0x1000]

These are fine. I/O space is not expected to be used with PCIe devices and there is no room in the root port either.

> On disconnection:
> [ 329.891837] pcieport 0000:02:03.0: can't change power state from D0 to D3hot (config space inaccessible)
> [ 329.891901] pcieport 0000:02:03.0: PME# disabled
> [ 329.891905] pcieport 0000:02:03.0: can't change power state from D3cold to D0 (config space inaccessible)
> [ 329.892276] xhci_hcd 0000:05:00.0: PME# disabled

You mean when you unplug the device? Yes, these are expected: as the PCIe devices disappear, the PCI stack cannot really do much except log these and tear down the stack.

> And a large bunch of these messages:
> [ 493.181713] usb 5-4: Not enough bandwidth for altsetting 1
> [ 493.181714] usb 5-4: 2:1: usb_set_interface failed (-19)
> [ 493.181736] usb 5-4: Not enough bandwidth for altsetting 1
> [ 493.181737] usb 5-4: 2:1: usb_set_interface failed (-19)
> [ 493.181798] usb 5-4: Not enough bandwidth for altsetting 1
> [ 493.181799] usb 5-4: 2:1: usb_set_interface failed (-19)
> [ 493.181865] systemd-journald[331]: /dev/kmsg buffer overrun, some messages lost.
> [ 493.181866] usb 5-4: Not enough bandwidth for altsetting 1
> [ 493.181867] usb 5-4: 2:1: usb_set_interface failed (-19)
>
> I think these come from the USB drive disconnection, but quite a lot of
> them are generated, which I believe is not normal.

Are these coming from a certain USB device? Do you see these if you connect it directly to the host (not to the dock)? In any case this is a different issue, most likely related to USB, not PCI/TBT.
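One way to see whether repeated kernel messages come from a single device is to count dmesg lines per device prefix. This is only a rough sketch: the function name `count_by_device` and the regular expression are assumptions that fit the "driver device: message" lines quoted in this report, not a general dmesg parser.

```python
import re
from collections import Counter

# Matches lines such as "[ 493.181713] usb 5-4: Not enough bandwidth ...".
# Assumption: the timestamp is followed by "<prefix> <device>: <message>".
DEVICE_LINE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\S+) (\S+): (.*)$")


def count_by_device(dmesg_text):
    """Count messages per (prefix, device); lines in another shape are skipped."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
[ 493.181713] usb 5-4: Not enough bandwidth for altsetting 1
[ 493.181714] usb 5-4: 2:1: usb_set_interface failed (-19)
[ 493.181865] systemd-journald[331]: /dev/kmsg buffer overrun, some messages lost.
[ 329.891901] pcieport 0000:02:03.0: PME# disabled
"""
for (prefix, device), n in count_by_device(sample).most_common():
    print(prefix, device, n)
```

If one (prefix, device) pair dominates the counts, the flood most likely originates from that device.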
Hmm, I still don't see any messages from Thunderbolt driver in your latest logs. It should at least log the connect/disconnect but I don't see them. Can you double check that you have the driver loaded? You can run for example 'lspci -k' and it shows the drivers bound to the PCI devices.
Here is the result of the lspci -k command:

00:00.0 Host bridge: Intel Corporation Device 8a12 (rev 03)
        Subsystem: Lenovo Device 3801
        Kernel driver in use: icl_uncore
00:02.0 VGA compatible controller: Intel Corporation Iris Plus Graphics G7 (rev 07)
        Subsystem: Lenovo Iris Plus Graphics G7
        Kernel driver in use: i915
        Kernel modules: i915
00:04.0 Signal processing controller: Intel Corporation Device 8a03 (rev 03)
        Subsystem: Lenovo Device 3802
        Kernel driver in use: proc_thermal
        Kernel modules: processor_thermal_device
00:07.0 PCI bridge: Intel Corporation Ice Lake Thunderbolt 3 PCI Express Root Port #0 (rev 03)
        Kernel driver in use: pcieport
00:07.1 PCI bridge: Intel Corporation Ice Lake Thunderbolt 3 PCI Express Root Port #1 (rev 03)
        Kernel driver in use: pcieport
00:0d.0 USB controller: Intel Corporation Ice Lake Thunderbolt 3 USB Controller (rev 03)
        Subsystem: Lenovo Ice Lake Thunderbolt 3 USB Controller
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:0d.2 System peripheral: Intel Corporation Ice Lake Thunderbolt 3 NHI #0 (rev 03)
        Kernel driver in use: thunderbolt
        Kernel modules: thunderbolt
00:12.0 Serial controller: Intel Corporation Device 34fc (rev 30)
        Subsystem: Lenovo Device 384d
        Kernel driver in use: intel_ish_ipc
        Kernel modules: intel_ish_ipc
00:14.0 USB controller: Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller (rev 30)
        Subsystem: Lenovo Ice Lake-LP USB 3.1 xHCI Host Controller
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
00:14.2 RAM memory: Intel Corporation Device 34ef (rev 30)
        Subsystem: Lenovo Device 3846
00:14.3 Network controller: Intel Corporation Killer Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW) (rev 30)
        Subsystem: Intel Corporation Killer Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW)
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
00:15.0 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #0 (rev 30)
        Subsystem: Lenovo Ice Lake-LP Serial IO I2C Controller
        Kernel driver in use: intel-lpss
        Kernel modules: intel_lpss_pci
00:15.1 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #1 (rev 30)
        Subsystem: Lenovo Ice Lake-LP Serial IO I2C Controller
        Kernel driver in use: intel-lpss
        Kernel modules: intel_lpss_pci
00:15.2 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP Serial IO I2C Controller #2 (rev 30)
        Subsystem: Lenovo Ice Lake-LP Serial IO I2C Controller
        Kernel driver in use: intel-lpss
        Kernel modules: intel_lpss_pci
00:16.0 Communication controller: Intel Corporation Management Engine Interface (rev 30)
        Subsystem: Lenovo Management Engine Interface
        Kernel driver in use: mei_me
        Kernel modules: mei_me
00:1d.0 PCI bridge: Intel Corporation Ice Lake-LP PCI Express Root Port #9 (rev 30)
        Kernel driver in use: pcieport
00:1f.0 ISA bridge: Intel Corporation Ice Lake-LP LPC Controller (rev 30)
        Subsystem: Lenovo Ice Lake-LP LPC Controller
00:1f.3 Multimedia audio controller: Intel Corporation Smart Sound Technology Audio Controller (rev 30)
        Subsystem: Lenovo Smart Sound Technology Audio Controller
        Kernel driver in use: sof-audio-pci
        Kernel modules: snd_hda_intel, snd_sof_pci
00:1f.4 SMBus: Intel Corporation Ice Lake-LP SMBus Controller (rev 30)
        Subsystem: Lenovo Ice Lake-LP SMBus Controller
        Kernel driver in use: i801_smbus
        Kernel modules: i2c_i801
00:1f.5 Serial bus controller [0c80]: Intel Corporation Ice Lake-LP SPI Controller (rev 30)
        Subsystem: Lenovo Ice Lake-LP SPI Controller
        Kernel driver in use: intel-spi
        Kernel modules: intel_spi_pci
01:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
02:00.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
02:01.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
02:02.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
02:03.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
02:04.0 PCI bridge: Intel Corporation JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016] (rev 02)
        Kernel driver in use: pcieport
03:00.0 USB controller: Fresco Logic FL1100 USB 3.0 Host Controller (rev 10)
        Subsystem: Device 1c7a:0018
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
04:00.0 USB controller: ASMedia Technology Inc. ASM1142 USB 3.1 Host Controller
        Subsystem: Device 1c7a:0017
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
05:00.0 USB controller: Fresco Logic FL1100 USB 3.0 Host Controller (rev 10)
        Subsystem: Device 1c7a:0016
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
06:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
        Subsystem: Device 1c7a:0019
        Kernel driver in use: igb
        Kernel modules: igb
07:00.0 USB controller: Intel Corporation JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016] (rev 02)
        Subsystem: Device 1c7a:0015
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
55:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
        Kernel driver in use: nvme
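Output like the listing above can be scanned for the driver bound to each device, for example to confirm that the NHI at 00:0d.2 is handled by the thunderbolt driver. The following is a minimal sketch; the function name `drivers_in_use` is made up for illustration, and it assumes the usual `lspci -k` layout of one header line per device followed by indented detail lines.

```python
import re

# A device header starts at column 0 with "bus:device.function",
# optionally prefixed by a PCI domain such as "0000:".
HEADER = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")


def drivers_in_use(lspci_k_output):
    """Map each PCI address to its bound driver, or None if none is listed."""
    drivers = {}
    current = None
    for line in lspci_k_output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            drivers[current] = None
        elif current and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


# Two entries copied from the listing in this report.
sample = """\
00:0d.2 System peripheral: Intel Corporation Ice Lake Thunderbolt 3 NHI #0 (rev 03)
\tKernel driver in use: thunderbolt
\tKernel modules: thunderbolt
00:14.2 RAM memory: Intel Corporation Device 34ef (rev 30)
\tSubsystem: Lenovo Device 3846
"""
print(drivers_in_use(sample))
```

A device that maps to None, like 00:14.2 here, has no "Kernel driver in use" line, which is what a missing Thunderbolt driver would look like for the NHI entry.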
(In reply to Mika Westerberg from comment #24)
> Hmm, I still don't see any messages from Thunderbolt driver in your latest
> logs. It should at least log the connect/disconnect but I don't see them.
> Can you double check that you have the driver loaded? You can run for
> example 'lspci -k' and it shows the drivers bound to the PCI devices.

Here is another dmesg from 5.6-rc3. I recompiled the kernel using the configuration GUI instead of directly editing the config files. If that is not OK, I am not sure how else to do it... But it shows the Thunderbolt driver enabled.

Also, today I encountered a lot of connections/disconnections of the dock. The dmesg contains one sequence of first connection/disconnection/connection, but it continues like that several times per minute.
Created attachment 287751 [details] Kernel 5.6-rc3 patched for thunderbolt.
Created attachment 287753 [details] Kernel 5.6-rc3 patched for thunderbolt.
Now I can see the Thunderbolt driver messages as well. What I was looking for is a disconnect from the Thunderbolt driver along with the PCIe hot-removal, and that seems to be the case here.

I can see three possible reasons for this:

1. The type-C cable is wrong/broken
2. The dock (PD) firmware has an issue
3. Power management breaks things

Since the issue still happened when you did not have the driver loaded, it is unlikely to be 3, because the type-C subsystem never enters D3cold if the Thunderbolt driver is not loaded.

Are you able to try this on Windows and see if the disconnect/connect happens there as well?
I am using Windows 10, and Manjaro only for testing right now. Under Windows I have no issue at all with the dock, the USB-C connector, or the USB3 drive connected to the dock. Speed and stability are very good.

I will have to test these connections/disconnections more, because during the first attempt with the patch I do not think I had so many disconnections: more like one every several minutes, while now it is several per minute.
If it works under Windows then the firmware is likely OK. You may also try to disable runtime PM of the root ports, xHCI and TBT:

# echo on > /sys/bus/pci/devices/0000:00:07.0/power/control
# echo on > /sys/bus/pci/devices/0000:00:07.1/power/control
# echo on > /sys/bus/pci/devices/0000:00:0d.0/power/control
# echo on > /sys/bus/pci/devices/0000:00:0d.2/power/control

and see if that helps at all.
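The four writes above can also be scripted so that the previous settings are recorded and can be restored after the test. This is only a sketch: the function name `disable_runtime_pm` and the `sysfs_root` parameter are illustrative choices, and the device addresses are the ones from this report. Writing to the real sysfs files requires root.

```python
from pathlib import Path

# Root ports, xHCI and NHI addresses taken from the commands above.
DEVICES = ["0000:00:07.0", "0000:00:07.1", "0000:00:0d.0", "0000:00:0d.2"]


def disable_runtime_pm(devices, sysfs_root="/sys/bus/pci/devices"):
    """Write "on" to power/control of each device and return the old values.

    "on" keeps the device powered (runtime PM disabled); "auto" lets the
    kernel suspend it at runtime.
    """
    previous = {}
    for address in devices:
        control = Path(sysfs_root) / address / "power" / "control"
        previous[address] = control.read_text().strip()
        control.write_text("on\n")
    return previous


def restore_runtime_pm(previous, sysfs_root="/sys/bus/pci/devices"):
    """Write back the values returned by disable_runtime_pm()."""
    for address, value in previous.items():
        control = Path(sysfs_root) / address / "power" / "control"
        control.write_text(value + "\n")
```

Run as root, `saved = disable_runtime_pm(DEVICES)` applies the change and `restore_runtime_pm(saved)` undoes it; the `sysfs_root` parameter exists only so the functions can be tried against a scratch directory first.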
Thanks, Mika, for your suggestions, but unfortunately neither worked. However, going back to the beginning: if I start the laptop with the dock already plugged in, no connectivity problem arises, even when disconnecting/reconnecting the dock several times. But if I start with no dock, connections/disconnections occur continuously...
Tested again: it was a bad cable connection, as you correctly suggested... I suppose it was just luck that it worked during my tests under Windows... Thanks for your help, I will test under rc4 once it is released for Manjaro.
Can you elaborate a bit? You tried with another cable and it did not help, or it did? How about Windows? I'm asking because if it is a firmware issue on the dock side, then you should see the same problem under Windows, and then it might help to upgrade the dock firmware.
Yes, sorry, my explanation was too short. When the first series of connections/disconnections occurred under Linux, I did not notice that my phone was connected to the dock. I had the same problem when I restarted under Windows and saw the same connections/disconnections happening. They stopped when I disconnected the phone. I retested under Linux with no phone: no connection/disconnection problem. When I plugged the phone back in, the connection/disconnection problem occurred again. I checked, and it turns out my phone has a bad connector socket. So I think this part is OK, or should I test anything else?
OK, so no disconnect/connect problem under Linux or Windows if the phone is not connected? Then the only problem is the resource issue that is fixed by the patch attached to bug 206657. I submitted it upstream but it is not yet merged to the mainline.
Yes, exactly: no disconnect/connect problem under Linux or Windows while the phone is not connected. OK, super nice, I will test that once it is merged!

I have identified another bug, most probably related to Thunderbolt, which prevents Linux from rebooting correctly. It seems Linux shuts down correctly until the journal daemon is stopped, but then the screen stays black. I was able to restart after a REISUB.
This is a bit weird and I still need to investigate. But to summarize:
It happens when I plug in the dock (I still need to test with other Thunderbolt/USB-C devices). If I start with nothing connected it is OK.

However, the problem persists across reboots/shutdowns. It even persists, or starts, if I boot Windows with no dock, plug in the dock, unplug it under Windows, reboot, start Linux without the dock, and reboot: it gets stuck.

The only way I found to get out of this loop is to plug in my USB-C laptop charger at some point. I think the steps were: force the laptop off, plug in the charger, start the laptop, boot Linux; then a reboot should work.

I will probably open a new bug report for that once I have more consistent elements. Unless you prefer to continue on this thread?
(In reply to A from comment #37)
> Yes, exactly: no disconnect/connect problem under Linux or Windows while the
> phone is not connected. OK, super nice, I will test that once it is merged!

Great :)

> I have identified another bug, most probably related to Thunderbolt, which
> prevents Linux from rebooting correctly. It seems Linux shuts down correctly
> until the journal daemon is stopped, but then the screen stays black. I was
> able to restart after a REISUB.
> This is a bit weird and I still need to investigate. But to summarize:
> It happens when I plug in the dock (I still need to test with other
> Thunderbolt/USB-C devices). If I start with nothing connected it is OK.
>
> However, the problem persists across reboots/shutdowns. It even persists, or
> starts, if I boot Windows with no dock, plug in the dock, unplug it under
> Windows, reboot, start Linux without the dock, and reboot: it gets stuck.
>
> The only way I found to get out of this loop is to plug in my USB-C laptop
> charger at some point. I think the steps were: force the laptop off, plug in
> the charger, start the laptop, boot Linux; then a reboot should work.
>
> I will probably open a new bug report for that once I have more consistent
> elements. Unless you prefer to continue on this thread?

Yes, please open another bug report about that one.