Bug 99841

Summary: thunderbolt hot unplug goes haywire
Product: Drivers Reporter: Jarod Wilson (jarod)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: NEEDINFO ---    
Severity: normal CC: bjorn, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.1-rc7 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg following unplug of thunderbolt NIC
normal working hot unplug, 4.1-rc7
lspci -vvv with no adapter connected
lspci -vvv with tbolt ethernet adapter connected
debug patch and message improvements
hot unplug dmesg with debugging and ~0 detection patch

Description Jarod Wilson 2015-06-11 20:01:20 UTC
Created attachment 179691 [details]
dmesg following unplug of thunderbolt NIC

With commit e705c2959b from the pci tree's pci/hotplug branch added on top of 4.1-rc7, hot unplug of a thunderbolt network adapter goes haywire, usually leading to a kernel panic, due to random memory corruption, I believe. So while this patch fixed expresscard hotplug detection on my hp zbook g2, its actually somehow destabilized thunderbolt on the same system.
Comment 1 Jarod Wilson 2015-06-11 20:38:40 UTC
Building with these tacked on, to see if they help:

a5dd4b4 PCI: pciehp: Wait for hotplug command completion where necessary
0824965 PCI: Propagate the "ignore hotplug" setting to parent
d41be34 ACPI / hotplug / PCI: Check ignore_hotplug for all downstream devices
2af31f4 PCI: pciehp: Drop pointless label from pciehp_probe()

The log on unplug contains this, just before things get really weird:

[   89.276805] pciehp 0000:06:00.0:pcie24: Timeout on hotplug command 0x1038 (issued 18276 msec ago)

...which makes me think maybe Alex's patch might have some impact here.

The changes to trim_stale_devices() also look possibly relevant, since its stale devices here that we're trying to reactivate or something.

(Really just shooting in the dark right now, I don't see yet how the expresscard fix is causing this).
Comment 2 Jarod Wilson 2015-06-11 20:58:54 UTC
The situation does not improve with the patches mentioned in comment #1.

[   56.247267] pciehp 0000:03:03.0:pcie24: pcie_isr: intr_loc 108
[   56.247279] pciehp 0000:03:03.0:pcie24: Presence/Notify input change
[   56.247286] pciehp 0000:03:03.0:pcie24: Card not present on Slot(3)
[   56.247300] pciehp 0000:03:03.0:pcie24: Data Link Layer State change
[   56.247309] pciehp 0000:03:03.0:pcie24: slot(3): Link Down event
[   56.247361] pciehp 0000:03:03.0:pcie24: Surprise Removal
[   56.247370] pciehp 0000:03:03.0:pcie24: Link Down event ignored on slot(3): already powering off
[   56.247413] pciehp 0000:03:03.0:pcie24: Disabling domain:bus:device=0000:05:00
[   56.247418] pciehp 0000:03:03.0:pcie24: pciehp_unconfigure_device: domain:bus:dev = 0000:05:00
[   57.467619] tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
[   57.682996] pciehp 0000:06:00.0:pcie24: unloading service driver pciehp
[   57.683001] pciehp 0000:06:00.0:pcie24: release_slot: physical_slot = 9
[   57.683922] pciehp 0000:06:00.0:pcie24: Timeout on hotplug command 0x1038 (issued 13114 msec ago)
[   57.683930] pciehp 0000:06:00.0:pcie24: pcie_disable_notification: SLOTCTRL d8 write cmd 0
[   57.683938] pciehp 0000:06:00.0:pcie24: pcie_isr: intr_loc 11f
[   57.683939] pciehp 0000:06:00.0:pcie24: Switch interrupt received
[   57.683942] pciehp 0000:06:00.0:pcie24: Latch open on Slot(\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff)
[   57.683944] pciehp 0000:06:00.0:pcie24: Attention button interrupt received
[   57.683945] pciehp 0000:06:00.0:pcie24: Button pressed on Slot(\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff)
[   57.683947] pciehp 0000:06:00.0:pcie24: Presence/Notify input change
[   57.683949] pciehp 0000:06:00.0:pcie24: Card present on Slot(\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff)
[   57.683951] pciehp 0000:06:00.0:pcie24: Power fault interrupt received
[   57.683952] pciehp 0000:06:00.0:pcie24: Power fault on slot \xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff
[   57.683985] pciehp 0000:06:00.0:pcie24: Power fault bit 0 set
[   57.683986] pciehp 0000:06:00.0:pcie24: Data Link Layer State change
[   57.683989] pciehp 0000:06:00.0:pcie24: pciehp_check_link_active: lnk_status = ffff
[   57.683990] pciehp 0000:06:00.0:pcie24: slot(\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff): Link Up event
[   57.683998] pciehp 0000:06:00.0:pcie24: pciehp_get_power_status: SLOTCTRL d8 value read ffff
[   57.683999] pciehp 0000:06:00.0:pcie24: PCI slot #\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff - powering on due to button press
[   57.684001] pciehp 0000:06:00.0:pcie24: Surprise Insertion
[   57.684004] pciehp 0000:06:00.0:pcie24: Link Up event ignored on slot(\xffffffb8%ut^D\xffffff88\xffffffff\xffffffff\xffffffc0%ut^D\xffffff88\xffffffff\xffffffff\xffffffe0%ut^D\xffffff88\xffffffff\xffffffff\xfffffff0%ut^D\xffffff88\xffffffff\xffffffff\xffffffc8&ut^D\xffffff88\xffffffff\xffffffff\xffffffe8%ut^D\xffffff88\xffffffff\xffffffff): already powering on
[   57.684008] pciehp 0000:06:00.0:pcie24: Enabling domain:bus:device=0000:07:00
[   57.684013] pciehp 0000:06:00.0:pcie24: pciehp_check_link_active: lnk_status = ffff
[   58.836290] pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[   58.836298] pciehp 0000:06:00.0:pcie24: pciehp_check_link_status: lnk_status = ffff
[   58.836300] pciehp 0000:06:00.0:pcie24: Link Training Error occurs
[   58.836325] pciehp 0000:06:00.0:pcie24: Failed to check link status
[   58.848427] pci_bus 0000:07: busn_res: [bus 07] is released
[   58.848483] pci_bus 0000:06: busn_res: [bus 06-07] is released
[   58.848642] pci_bus 0000:30: [bus 30-38] partially hidden behind bridge 0000:03 [bus 03-0a]
[   58.848652] pci_bus 0000:39: [bus 39] partially hidden behind bridge 0000:03 [bus 03-0a]
[   58.848661] pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:03 [bus 03-0a]
[   58.848697] pci_bus 0000:03: Allocating resources

<panic, NULL pointer in get_next_timer_interrupt()>
Comment 3 Rafael J. Wysocki 2015-06-11 21:02:25 UTC
It looks like now PCIeHP tries to operate the Thunderbolt bridge too.

Can you please add a dmesg for the working case too?
Comment 4 Rafael J. Wysocki 2015-06-11 21:08:34 UTC
Also the output of "lspci -vvv" from the affected machine might help.
Comment 5 Rafael J. Wysocki 2015-06-11 21:13:51 UTC
(In reply to Jarod Wilson from comment #1)
> Building with these tacked on, to see if they help:

I really doubt it.

[cut]

> (Really just shooting in the dark right now, I don't see yet how the
> expresscard fix is causing this).

Well, we removed code that prevented pciehp_probe() from succeeding in some cases, so now it may be trying to handle things it shouldn't touch.
Comment 6 Jarod Wilson 2015-06-11 22:07:11 UTC
Created attachment 179731 [details]
normal working hot unplug, 4.1-rc7
Comment 7 Jarod Wilson 2015-06-11 22:09:02 UTC
Created attachment 179741 [details]
lspci -vvv with no adapter connected
Comment 8 Jarod Wilson 2015-06-11 22:09:28 UTC
Created attachment 179751 [details]
lspci -vvv with tbolt ethernet adapter connected
Comment 9 Jarod Wilson 2015-06-15 15:25:13 UTC
So I see pciehp binding to 0000:03:00.0, 0000:03:03.0, 0000:03:04.0, 0000:03:05.0, 0000:03:06.0 and 0000:06:00.0 with the patch, where without it, they aren't. These devices disappear in a hurry when nothing is plugged into the thunderbolt port, and we're still trying to read from them.
Comment 10 Jarod Wilson 2015-06-15 18:38:54 UTC
A bit more thorough testing shows that things actually behave the same way without the patch and pciehp_detect_mode=pcie, so it isn't really this patch causing any regression, its just more directly exposing another bug.
Comment 11 Rafael J. Wysocki 2015-06-15 22:10:32 UTC
Well, pciehp_detect_mode=pcie does the same thing as the patch. :-)

Can you please try booting with pcie_ports=compat in the kernel command line (with the patch applied) and see if (a) Thunderbolt works then and (b) the expresscard slot is functional (my bet is that it isn't, but please confirm).
Comment 12 Bjorn Helgaas 2015-06-15 23:17:36 UTC
Created attachment 180031 [details]
debug patch and message improvements

Hi Jarod, per Rafael, it sounds like we're not supposed to use pciehp for Thunderbolt devices.  But I don't think we know how pciehp is supposed to know that it shouldn't claim them.

In the meantime, until that gets figured out, I want to tidy up some of the pciehp messages, and maybe we can make pciehp at least notice when it reads ~0 from config space because the bridge has gone away.

Would you mind trying this patch and attaching the dmesg log?  This isn't in final form yet, but I'd like to see if the messages are more intelligible and get an idea of whether it's worth changing pcie_isr() to look for the ~0.
Comment 13 Jarod Wilson 2015-06-15 23:22:30 UTC
(In reply to Rafael J. Wysocki from comment #11)
> Well, pciehp_detect_mode=pcie does the same thing as the patch. :-)
> 
> Can you please try booting with pcie_ports=compat in the kernel command line
> (with the patch applied) and see if (a) Thunderbolt works then and (b) the
> expresscard slot is functional (my bet is that it isn't, but please confirm).

Yeah, I figured that *should* be the case, and I swore I'd tried that before, but I think I was completely ignoring the fact the expresscard wasn't working, only focusing on the thunderbolt port. Trying with pcie_ports=compat now...
Comment 14 Jarod Wilson 2015-06-16 05:14:10 UTC
(In reply to Rafael J. Wysocki from comment #11)
> Can you please try booting with pcie_ports=compat in the kernel command line
> (with the patch applied) and see if (a) Thunderbolt works then and (b) the
> expresscard slot is functional (my bet is that it isn't, but please confirm).

Confirmed. Thunderbolt works, expresscard doesn't.
Comment 15 Jarod Wilson 2015-06-16 12:48:57 UTC
Created attachment 180061 [details]
hot unplug dmesg with debugging and ~0 detection patch

Here's dmesg from a 4.1-rc8 kernel build with Bjorn's "PCI: pciehp: Clean up debug logging" patch (which also looks for ~0 in pcie_isr()). Things actually seem to behave fairly sanely here. I still see "Timeout on hotplug command ..." messages for 06:00.0, but also "no response from device", and things appear to get cleaned up correctly.

That said, after a couple of plug/unplug cycles of both the thunderbolt ethernet adapter and a usb3 expresscard, I'm hitting some spew, but its from sysfs, warning about creating duplicate filenames.
Comment 16 Jarod Wilson 2015-06-16 13:08:59 UTC
This repeats a few times over, was triggered by at least this sequence:
1) plug thunderbolt ethernet
2) plug usb3 expresscard
3) unplug usb3 expresscard
4) unplug thunderbolt ethernet
5) plug usb3 expresscard
6) plug thunderbolt ethernet

[ 1542.244597] ------------[ cut here ]------------
[ 1542.244611] WARNING: CPU: 0 PID: 254 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80()
[ 1542.244615] sysfs: cannot create duplicate filename '/class/pci_bus/0000:08'
[ 1542.244618] Modules linked in: tg3 xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnep tpm_infineon snd_hda_codec_hdmi intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp kvm_intel kvm iTCO_wdt ppdev iTCO_vendor_support snd_hda_codec_realtek hp_wmi snd_hda_codec_generic sparse_keymap snd_hda_intel crct10dif_pclmul snd_hda_controller crc32_pclmul snd_hda_codec crc32c_intel iwlwifi snd_hda_core snd_hwdep btusb snd_seq btbcm btintel
[ 1542.244722]  snd_seq_device bluetooth ghash_clmulni_intel cfg80211 uvcvideo snd_pcm videobuf2_vmalloc videobuf2_core videobuf2_memops v4l2_common videodev serio_raw snd_timer snd parport_pc rtsx_pci_ms media memstick i2c_i801 lpc_ich rfkill soundcore hp_accel mei_me parport tpm_tis lis3lv02d mei tpm input_polldev ie31200_edac shpchp edac_core hp_wireless xfs libcrc32c nouveau rtsx_pci_sdmmc mmc_core i2c_algo_bit drm_kms_helper ttm drm mxm_wmi e1000e rtsx_pci mfd_core ptp pps_core video wmi
[ 1542.244803] CPU: 0 PID: 254 Comm: kworker/0:2 Tainted: G        W       4.1.0-0.rc8.git0.1.pciehp.el7.x86_64 #1
[ 1542.244808] Hardware name: Hewlett-Packard HP ZBook 15 G2/2253, BIOS M70 Ver. 01.07 02/26/2015
[ 1542.244818] Workqueue: pciehp-0 pciehp_power_thread
[ 1542.244823]  0000000000000000 00000000103da9e7 ffff8804879e39b8 ffffffff817891cf
[ 1542.244831]  0000000000000000 ffff8804879e3a10 ffff8804879e39f8 ffffffff810a08ca
[ 1542.244838]  00007356879e39f8 ffff8804890bc000 ffff880474794258 ffff88048b310000
[ 1542.244845] Call Trace:
[ 1542.244857]  [<ffffffff817891cf>] dump_stack+0x45/0x57
[ 1542.244865]  [<ffffffff810a08ca>] warn_slowpath_common+0x8a/0xc0
[ 1542.244872]  [<ffffffff810a0955>] warn_slowpath_fmt+0x55/0x70
[ 1542.244881]  [<ffffffff8129dbd8>] ? kernfs_path+0x48/0x60
[ 1542.244889]  [<ffffffff812a13c8>] sysfs_warn_dup+0x68/0x80
[ 1542.244897]  [<ffffffff812a1734>] sysfs_do_create_link_sd.isra.2+0xb4/0xc0
[ 1542.244905]  [<ffffffff812a1765>] sysfs_create_link+0x25/0x50
[ 1542.244913]  [<ffffffff814d9bb4>] device_add+0x2c4/0x650
[ 1542.244919]  [<ffffffff814d9f5e>] device_register+0x1e/0x30
[ 1542.244928]  [<ffffffff813e6516>] pci_add_new_bus+0x176/0x430
[ 1542.244936]  [<ffffffff813ef59f>] ? pci_find_next_bus+0x4f/0x80
[ 1542.244944]  [<ffffffff813e7481>] pci_scan_bridge+0x5b1/0x6a0
[ 1542.244952]  [<ffffffff813e7626>] pci_scan_child_bus+0xb6/0x160
[ 1542.244959]  [<ffffffff813e73f6>] pci_scan_bridge+0x526/0x6a0
[ 1542.244969]  [<ffffffff8178ec86>] ? down_read+0x16/0x30
[ 1542.244977]  [<ffffffff81406eef>] pci_hp_add_bridge+0x6f/0xc0
[ 1542.244984]  [<ffffffff814035fb>] pciehp_configure_device+0x7b/0x140
[ 1542.244990]  [<ffffffff81403083>] pciehp_enable_slot+0x193/0x2c0
[ 1542.244997]  [<ffffffff814032bb>] pciehp_power_thread+0x3b/0xc0
[ 1542.245009]  [<ffffffff810b9ac8>] process_one_work+0x148/0x3f0
[ 1542.245015]  [<ffffffff810ba1ab>] worker_thread+0x11b/0x470
[ 1542.245022]  [<ffffffff810ba090>] ? rescuer_thread+0x320/0x320
[ 1542.245030]  [<ffffffff810bf838>] kthread+0xd8/0xf0
[ 1542.245039]  [<ffffffff810bf760>] ? kthread_create_on_node+0x1b0/0x1b0
[ 1542.245045]  [<ffffffff81790ba2>] ret_from_fork+0x42/0x70
[ 1542.245053]  [<ffffffff810bf760>] ? kthread_create_on_node+0x1b0/0x1b0
[ 1542.245059] ---[ end trace 38a89ed22c1769df ]---
[ 1542.245112] ------------[ cut here ]------------
Comment 17 Jarod Wilson 2015-06-29 22:15:13 UTC
(In reply to Bjorn Helgaas from comment #12)
> Created attachment 180031 [details]
> debug patch and message improvements
...
> Would you mind trying this patch and attaching the dmesg log?  This isn't in
> final form yet, but I'd like to see if the messages are more intelligible
> and get an idea of whether it's worth changing pcie_isr() to look for the ~0.

Things are definitely better with the change to pcie_isr(), and I'm wondering if page 443 of the pci express 4.0 rev0.3 doc (PCI_Express_Base_4.0_Rev0.3_February19-2014.pdf) even says this is correct behavior:

IMPLEMENTATION NOTE
Data Value of All 1’s
Many platforms, including those supporting RP Extensions for DPC, can return a data value of all 1’s to software when an error is associated with a PCI Express Configuration, I/O, or Memory Read Request. During DPC, the Downstream Port discards Requests destined for the Link and completes them with an error (i.e., either with an Unsupported Request (UR) or Completer Abort (CA) Completion Status). By ending a series of MMIO or configuration space operations with a read to an address with a known data value not equal to all 1’s, software may determine if a Completer has been removed or DPC has been triggered.