Bug 200587

Summary: [Intel GFX CI] xhci_hcd 0000:6c:00.0: Host halt failed, -19
Product: Drivers Reporter: Martin Peres (martin.peres)
Component: USB Assignee: Greg Kroah-Hartman (greg)
Status: NEW ---    
Severity: normal CC: lakshminarayana.vudum, mathias.nyman
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.18.0-rc4 Subsystem:
Regression: No Bisected commit-id:

Description Martin Peres 2018-07-18 13:52:17 UTC
While executing suspend tests on a CFL-u platform, the system generated the following WARNING:

[  231.883370] xhci_hcd 0000:6c:00.0: Host halt failed, -19
[  231.883400] xhci_hcd 0000:6c:00.0: Host not accessible, reset failed.
[  231.885188] ------------[ cut here ]------------
[  231.885190] xhci_hcd 0000:6c:00.0: disabling already-disabled device
[  231.885202] WARNING: CPU: 3 PID: 2841 at drivers/pci/pci.c:1658 pci_disable_device+0x90/0xb0
[  231.885203] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 btusb btrtl btbcm x86_pkg_temp_thermal intel_powerclamp btintel coretemp crct10dif_pclmul crc32_pclmul bluetooth snd_hda_intel ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e ecdh_generic mei_me mei prime_numbers
[  231.885238] CPU: 3 PID: 2841 Comm: kworker/u8:17 Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4500+ #1
[  231.885239] Hardware name: Intel Corporation NUC8i3BEH/NUC8BEB, BIOS BECFL357.86A.0037.2018.0614.2204 06/14/2018
[  231.885242] Workqueue: pciehp-8 pciehp_power_thread
[  231.885246] RIP: 0010:pci_disable_device+0x90/0xb0
[  231.885247] Code: c6 05 da 1f e0 00 01 48 85 ed 74 34 48 8d bb a0 00 00 00 e8 d2 1a 12 00 48 89 ea 48 89 c6 48 c7 c7 c0 c9 0d 82 e8 f0 d7 b7 ff <0f> 0b eb 84 48 89 df e8 f4 fe ff ff 80 a3 a1 07 00 00 f7 5b 5d c3 
[  231.885317] RSP: 0000:ffffc90000577d08 EFLAGS: 00010282
[  231.885320] RAX: 0000000000000000 RBX: ffff8802b333a2a8 RCX: 0000000000000001
[  231.885321] RDX: 0000000080000001 RSI: ffffffff82127c93 RDI: 00000000ffffffff
[  231.885322] RBP: ffff8802b33129a8 R08: 0000000068f6c5d4 R09: 0000000000000000
[  231.885324] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff822c05a0
[  231.885325] R13: ffffffff822c0610 R14: 0000000000000060 R15: 0000000000000000
[  231.885327] FS:  0000000000000000(0000) GS:ffff8802bdd80000(0000) knlGS:0000000000000000
[  231.885328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  231.885330] CR2: 0000000000000000 CR3: 0000000005210001 CR4: 00000000003606e0
[  231.885331] Call Trace:
[  231.885335]  pci_device_remove+0x36/0xb0
[  231.885339]  device_release_driver_internal+0x185/0x250
[  231.885342]  pci_stop_bus_device+0x78/0xa0
[  231.885345]  pci_stop_bus_device+0x26/0xa0
[  231.885348]  pci_stop_bus_device+0x26/0xa0
[  231.885351]  pci_stop_and_remove_bus_device+0x9/0x20
[  231.885353]  pciehp_unconfigure_device+0xb6/0x160
[  231.885356]  pciehp_disable_slot+0x52/0xe0
[  231.885360]  pciehp_power_thread+0x86/0xa0
[  231.885364]  process_one_work+0x248/0x6c0
[  231.885370]  worker_thread+0x37/0x380
[  231.885374]  ? process_one_work+0x6c0/0x6c0
[  231.885376]  kthread+0x119/0x130
[  231.885378]  ? kthread_flush_work_fn+0x10/0x10
[  231.885383]  ret_from_fork+0x3a/0x50
[  231.885390] irq event stamp: 9518
[  231.885393] hardirqs last  enabled at (9517): [<ffffffff810f9e7c>] vprintk_emit+0x4bc/0x4d0
[  231.885396] hardirqs last disabled at (9518): [<ffffffff81a0111c>] error_entry+0x7c/0x100
[  231.885398] softirqs last  enabled at (8500): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  231.885401] softirqs last disabled at (8493): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  231.885403] WARNING: CPU: 3 PID: 2841 at drivers/pci/pci.c:1658 pci_disable_device+0x90/0xb0
[  231.885404] ---[ end trace 3a8bd570d8b6d4f6 ]---

The suspend itself succeeded, but the USB subsystem may not have recovered from this failure.

Boot logs: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4500/fi-cfl-8109u/boot0.log
Logs during the execution of the tests: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4500/fi-cfl-8109u/dmesg0.log
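
For anyone triaging the linked dmesg logs, here is a minimal sketch of how the "Host halt failed" lines could be extracted and the error code decoded. The helper name decode_halt_failures and the regular expression are illustrative, not part of any CI tooling, and the decoding assumes the script runs on Linux, where userspace errno values match the kernel's. There, 19 is ENODEV ("No such device"), which would be consistent with the controller having already gone away when pciehp removed it.

```python
import errno
import os
import re

# Matches the xhci_hcd failure line as it appears in the trace above.
# The pattern is an assumption based on that single log line.
HALT_RE = re.compile(
    r"xhci_hcd (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Host halt failed, (?P<err>-?\d+)"
)


def decode_halt_failures(dmesg_text):
    """Return (pci_device, errno_name, description) for each matching line."""
    results = []
    for line in dmesg_text.splitlines():
        match = HALT_RE.search(line)
        if match is None:
            continue
        code = abs(int(match.group("err")))
        name = errno.errorcode.get(code, "UNKNOWN")
        results.append((match.group("dev"), name, os.strerror(code)))
    return results


sample = "[  231.883370] xhci_hcd 0000:6c:00.0: Host halt failed, -19"
for dev, name, text in decode_halt_failures(sample):
    print(f"{dev}: {name} ({text})")
```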
Comment 1 Greg Kroah-Hartman 2018-07-18 14:21:00 UTC
On Wed, Jul 18, 2018 at 01:52:17PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=200587
> 
>             Bug ID: 200587
>            Summary: [Intel GFX CI] xhci_hcd 0000:6c:00.0: Host halt
>                     failed, -19
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 4.18.0-rc4

All USB bugs should be sent to the linux-usb@vger.kernel.org mailing
list, and not entered into bugzilla.  Please bring this issue up there,
if it is still a problem in the latest kernel release.
Comment 2 Martin Peres 2019-03-08 17:05:43 UTC
Not really happening anymore. Waiting for a couple more CI runs before closing (drmtip_297).
Comment 3 Lakshminarayana Vudum 2019-06-14 12:26:42 UTC
Current drmtip is 305.
On average, this failure used to happen once every 11 drmtip runs, and it was last seen in drmtip 184. This bug can be closed as WORKSFORME.
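
As a rough sanity check on closing, the figures in this comment can be turned into a probability. The sketch below assumes each drmtip run is independent and fails with a fixed probability of 1/11; that is a simplification, not something the CI data confirms. Under that assumption, the chance of seeing no failure by luck alone over the runs since drmtip 184 comes out on the order of 1e-5.

```python
# Figures taken from the comment above.
failure_rate = 1 / 11      # historical rate: once every 11 drmtip runs
last_seen_run = 184        # last drmtip run that showed the failure
current_run = 305          # current drmtip run

# Number of consecutive runs without the failure.
clean_runs = current_run - last_seen_run

# Probability of that many clean runs in a row if the bug were still
# present at the old rate (assumes independent runs, fixed rate).
p_no_failure = (1 - failure_rate) ** clean_runs

print(f"{clean_runs} clean runs, p(no failure by chance) = {p_no_failure:.2e}")
```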