Bug 81431

Summary: pci=realloc fails to modify bridge windows, causing devices to fail BAR allocation
Product: Drivers Reporter: TJ (linux)
Component: PCI Assignee: drivers_pci (drivers_pci)
Status: NEW ---    
Severity: high CC: alan, bjorn, oliver.greg, szg00000, yinghai
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 3.16 -> 4.15-rc9 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: v3.15.7 Successful bridge window reallocation
v3.16rc7 Unsuccessful bridge window reallocation
v3.17rc4 Unsuccessful bridge window reallocation
diff -u of dmesg logs between 3.15.7 and 3.17rc4
v4.1rc7 dmesg pci=realloc,use_crs
v4.1rc7 dmesg pci=realloc,nocrs
v4.1.0-rc8 with CONFIG_PCI_DEBUG=y
v4.1.0-rc8 + for-pci-v4.1-rc8 with CONFIG_PCI_DEBUG=y
correct check with old size
correct alignment
correct alignment v2 for multiple bridges
correct alignment v3 for multiple bridges
v4.1.0-rc8 + c15a69d PCI: get correct bridge mmio size ...
v4.1.0-rc8 + 39c93b1 PCI: Optimize bus mem sizing ...
v4.1.0-rc8 + c15a69d + 39c93b1
dmesg v4.1-rc8 + c15a69d + 39c93b1 + 7ebfda8
dmesg v4.1-rc8 + b39e731 + 725530f + a9a729b
dmesg v4.2-rc1 + for-pci-v4.2-rc1
boot dmesg 4.4.0-rc8 + for-pci-v4.5-next
lspci -xxvvvnnk 4.4.0-rc8 + for-pci-v4.5-next
hotplug kern.log 4.4.0-rc8 + for-pci-v4.5-next
working boot dmesg 4.4.0-rc8 + for-pci-v4.5-next
failing boot dmesg 4.4.0-rc8 + for-pci-v4.5-next
dmesg v4.15-rc9 failing
dmesg 4.15.0-rc9 PCI debug (boot, before device insertion)
dmesg 4.15.0-rc9 PCI debug (device insertion)

Description TJ 2014-07-31 07:28:29 UTC
Created attachment 144781 [details]
v3.15.7 Successful bridge window reallocation

Up until v3.15.7 booting with "pci=realloc,use_crs" successfully modifies the PCI bridge windows to allow an external ExpressCard <> ViDock4 + Nvidia Quadro NVS420 with its 2 GPUs to be configured.

With v3.16rc6 and rc7 it fails to do this, leaving the NVS420 inoperable.

I am attaching 2 dmesg captures. They are mainline builds packaged by Ubuntu. I've been using these builds for several years; this is the first time I've found a regression of this kind.
Comment 1 TJ 2014-07-31 07:29:27 UTC
Created attachment 144791 [details]
v3.16rc7 Unsuccessful bridge window reallocation
Comment 2 TJ 2014-08-08 15:42:58 UTC
No fix; regression affects 3.16
Comment 3 TJ 2014-09-09 01:52:55 UTC
Created attachment 149561 [details]
v3.17rc4 Unsuccessful bridge window reallocation

Still affects v3.17rc4.

I'm unable to do bisect runs at present, but this seems to be a pretty significant regression, and PCI experts ought to be able to point a finger at suspect commits.
Comment 4 TJ 2014-09-09 02:25:23 UTC
Created attachment 149581 [details]
diff -u of dmesg logs between 3.15.7 and 3.17rc4

Diff between 3.15.7 and 3.17rc4 with time-stamps removed using:

diff -u <(sed 's/^[\[ [:digit:]\.\]*] //' /var/log/dmesg) <(sed 's/^[\[ [:digit:]\.\]*] //' /var/log/dmesg.0)
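For anyone who wants to repeat the comparison without the shell one-liner, here is a minimal Python sketch of the same idea: strip the leading "[ seconds.micros] " stamp from every line, then produce a unified diff. The function names are hypothetical, and the two sample lines are taken from the logs quoted later in this report.

```python
import difflib
import re

# Matches a leading dmesg timestamp such as "[    0.304561] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s")


def strip_timestamps(lines):
    """Return the lines with any leading "[ secs.usecs] " prefix removed."""
    return [TIMESTAMP.sub("", line) for line in lines]


def diff_dmesg(old_lines, new_lines, old_name="old", new_name="new"):
    """Unified diff of two dmesg captures, ignoring timestamps."""
    return list(difflib.unified_diff(
        strip_timestamps(old_lines),
        strip_timestamps(new_lines),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


if __name__ == "__main__":
    old = ["[    0.304564] pci 0000:00:1c.4: BAR 14: assigned [mem 0xe0000000-0xe6ffffff]"]
    new = ["[    0.281237] pci 0000:00:1c.4: BAR 14: can't assign mem (size 0x9000000)"]
    print("\n".join(diff_dmesg(old, new, "3.15.7", "3.17rc4")))
```

Lines that differ only in their timestamp compare equal, so the diff shows only real changes in PCI resource assignment.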
Comment 5 TJ 2015-06-12 04:30:49 UTC
Created attachment 179761 [details]
v4.1rc7 dmesg pci=realloc,use_crs
Comment 6 TJ 2015-06-12 04:35:36 UTC
Created attachment 179771 [details]
v4.1rc7 dmesg pci=realloc,nocrs
Comment 7 TJ 2015-06-12 04:36:18 UTC
Still affects v4.1rc7
Comment 8 TJ 2015-06-17 17:52:24 UTC
Regression caused by:

[5b28541552ef5eeffc41d6936105f38c2508e566] PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources


git bisect log
# bad: [7171511eaec5bf23fb06078f59784a3a0626b38f] Linux 3.16-rc1
# good: [8e56aed0b0579b667489bcb1d94c223726f0eaa1] PCI: hotplug: Remove unnecessary "dev->bus" test
git bisect start '7171511' '8e56aed' '--' 'drivers/pci/'
# bad: [d785260e2f57d87de5c059de2dabc3cd31b745f0] Merge branch 'pci/host-generic' into next
git bisect bad d785260e2f57d87de5c059de2dabc3cd31b745f0
# bad: [e5558d1a516fa6924fa8d53152b665d4c26f142e] Merge branches 'dma-api', 'pci/virtualization', 'pci/msi', 'pci/misc' and 'pci/resource' into next
git bisect bad e5558d1a516fa6924fa8d53152b665d4c26f142e
# good: [518a6a34f645897ec3440e5cbcf53ced3493ee1c] Merge branches 'pci/hotplug', 'pci/msi', 'pci/virtualization' and 'pci/misc' into next
git bisect good 518a6a34f645897ec3440e5cbcf53ced3493ee1c
# bad: [5b28541552ef5eeffc41d6936105f38c2508e566] PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources
git bisect bad 5b28541552ef5eeffc41d6936105f38c2508e566
# good: [31e9dd2565a6e27a3e698d7e3adf929db8d6c767] PCI: Don't set BAR to zero if dma_addr_t is too small
git bisect good 31e9dd2565a6e27a3e698d7e3adf929db8d6c767
# good: [d739a099d0248c78d374b1b610cdb679c7bc052d] PCI: Don't add disabled subtractive decode bus resources
git bisect good d739a099d0248c78d374b1b610cdb679c7bc052d
# good: [14c8530dbc1b7cd5020c44b391e34bdb731fd098] PCI: Support BAR sizes up to 8GB
git bisect good 14c8530dbc1b7cd5020c44b391e34bdb731fd098
# first bad commit: [5b28541552ef5eeffc41d6936105f38c2508e566] PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources
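The "# good", "# bad" and "# first bad commit" comment lines in a bisect log like the one above are regular enough to summarize mechanically. The following is only a sketch: the function name is hypothetical, and it assumes full 40-character hashes in square brackets, as in this log.

```python
import re

# Matches the comment lines git writes into a bisect log, e.g.
# "# bad: [<40-hex sha>] <subject>".
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Collect the good/bad verdicts and the first bad commit from a log."""
    summary = {"good": [], "bad": [], "first_bad": None}
    for line in text.splitlines():
        match = MARK.match(line.strip())
        if not match:
            continue  # skip the "git bisect ..." command lines
        kind, sha, subject = match.groups()
        if kind == "first bad commit":
            summary["first_bad"] = (sha, subject)
        else:
            summary[kind].append((sha, subject))
    return summary


SAMPLE = """\
# good: [14c8530dbc1b7cd5020c44b391e34bdb731fd098] PCI: Support BAR sizes up to 8GB
git bisect good 14c8530dbc1b7cd5020c44b391e34bdb731fd098
# first bad commit: [5b28541552ef5eeffc41d6936105f38c2508e566] PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources
"""

if __name__ == "__main__":
    print(summarize_bisect_log(SAMPLE)["first_bad"])
```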
Comment 9 Yinghai Lu 2015-06-18 00:44:03 UTC
(In reply to TJ from comment #5)
> Created attachment 179761 [details]
> v4.1rc7 dmesg pci=realloc,use_crs

That log is from v3.15.7, not v4.1-rc7.
Please attach the correct log.
Comment 10 Yinghai Lu 2015-06-18 01:06:12 UTC
Please check if 
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-v4.1-rc8

works or not.
Comment 11 TJ 2015-06-18 01:08:25 UTC
Created attachment 180241 [details]
v4.1.0-rc8 with CONFIG_PCI_DEBUG=y

I've just done another run with v4.1.0-rc8 and CONFIG_PCI_DEBUG=y which might be more useful.
Comment 12 TJ 2015-06-18 01:15:21 UTC
@Yinghai Lu: Our comments on v4.1-rc8 crossed. To avoid confusion the v4.1.0-rc8 dmesg log is for Linus' tree and doesn't include your branch 'for-pci-v4.1-rc8'.

Your branch is building now and I'll upload a log and progress report once it has been tested.
Comment 13 Yinghai Lu 2015-06-18 01:30:18 UTC
v3.15-6
[    0.304561] pci 0000:00:1c.4: res[14]=[mem 0x01000000-0x06ffffff] get_res_add_size add_size 1000000
[    0.304564] pci 0000:00:1c.4: BAR 14: assigned [mem 0xe0000000-0xe6ffffff]

v3.16-rc7:
[    0.281232] pci 0000:00:1c.4: res[14]=[mem 0x01000000-0x08ffffff] get_res_add_size add_size 1000000
[    0.281237] pci 0000:00:1c.4: BAR 14: can't assign mem (size 0x9000000)
[    0.281313] pci 0000:00:1c.4: BAR 14: can't assign mem (size 0x8000000)

must + optional:
0x6000000 + 0x1000000 changed to 0x8000000 + 0x1000000

So the bridge requests 0x2000000 more for the required ("must") part after that commit.
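The arithmetic behind those numbers can be checked directly from the log lines above. This is only a worked sketch: the helper name is hypothetical, and it assumes the logged ranges are inclusive, as kernel resource ranges normally are.

```python
def window_size(start, end):
    """Size in bytes of an inclusive [start-end] resource range."""
    return end - start + 1


# Optional part, from "add_size 1000000" in both logs.
ADD_SIZE = 0x1000000

# Required ("must") part, from the res[14] ranges in each log.
old_must = window_size(0x01000000, 0x06ffffff)  # older log
new_must = window_size(0x01000000, 0x08ffffff)  # v3.16-rc7 log

if __name__ == "__main__":
    print("old must + optional:", hex(old_must), "+", hex(ADD_SIZE))
    print("new must + optional:", hex(new_must), "+", hex(ADD_SIZE))
    print("extra required:", hex(new_must - old_must))
    # The first failed request in v3.16-rc7 is must + optional,
    # the retry is the must part alone.
    print("first request:", hex(new_must + ADD_SIZE), "retry:", hex(new_must))
```

The window assigned in the older log, [mem 0xe0000000-0xe6ffffff], is exactly the old must + optional total of 0x7000000, while the two failing requests in v3.16-rc7 (0x9000000, then 0x8000000) correspond to the new must + optional total and the new must part on its own.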
Comment 14 TJ 2015-06-18 02:38:43 UTC
Created attachment 180251 [details]
v4.1.0-rc8 + for-pci-v4.1-rc8 with CONFIG_PCI_DEBUG=y

That did it, thanks.

Which commit (or commits) are required to fix this? I'd like to get them cherry-picked to the Debian and Ubuntu kernels.
Comment 15 Yinghai Lu 2015-06-18 03:45:12 UTC
Created attachment 180271 [details]
correct check with old size
Comment 16 Yinghai Lu 2015-06-18 03:46:09 UTC
Created attachment 180281 [details]
correct alignment
Comment 17 Yinghai Lu 2015-06-18 03:46:37 UTC
Please check those two patches one by one.
Comment 18 Yinghai Lu 2015-06-18 21:29:51 UTC
Created attachment 180321 [details]
correct alignment v2 for multiple bridges

please use v2.
Comment 19 Yinghai Lu 2015-06-18 21:53:13 UTC
Created attachment 180331 [details]
correct alignment v3 for multiple bridges
Comment 20 TJ 2015-06-19 08:06:53 UTC
Created attachment 180351 [details]
v4.1.0-rc8 + c15a69d PCI: get correct bridge mmio size ...

v4.1.0-rc8 + c15a69d

PCI: get correct bridge mmio size with old size checking

Didn't work
Comment 21 TJ 2015-06-19 08:15:35 UTC
Created attachment 180361 [details]
v4.1.0-rc8 + 39c93b1 PCI: Optimize bus mem sizing ...

v4.1.0-rc8 + 39c93b1

PCI: Optimize bus mem sizing to small size

Doesn't work.
Comment 22 TJ 2015-06-19 08:17:24 UTC
Created attachment 180371 [details]
v4.1.0-rc8 + c15a69d + 39c93b1

v4.1.0-rc8 + c15a69d + 39c93b1

PCI: get correct bridge mmio size with old size checking
PCI: Optimize bus mem sizing to small size

Doesn't work.
Comment 23 TJ 2015-06-19 08:24:04 UTC
Just noticed your for-pci-v4.1-rc8 branch has been rebased and commit hashes have changed so the commit hashes here don't always match yours. These are the current hashes of the applied patches:

6aebe85  PCI: get correct bridge mmio size with old size checking
98674c2  Optimize bus mem sizing to small size
Comment 24 TJ 2015-06-19 13:25:49 UTC
Currently doing a build with 7ebfda8372fb8 added on top of the other 2 commits.

7ebfda8372fb8 PCI: don't release fixed resource for pci=realloc
Comment 25 TJ 2015-06-19 20:25:32 UTC
Created attachment 180421 [details]
dmesg v4.1-rc8 + c15a69d + 39c93b1 + 7ebfda8

v4.1-rc8 + c15a69d + 39c93b1 + 7ebfda8

PCI: get correct bridge mmio size with old size checking
PCI: Optimize bus mem sizing to small size
PCI: don't release fixed resource for pci=realloc

Doesn't work.
Comment 26 Yinghai Lu 2015-06-19 21:33:08 UTC
please check 
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-v4.1-rc8
again.

I dropped PCI: Optimize bus mem sizing to small size v3
from the branch.

I wonder whether the first and third patches in the branch would fix
the problem.
Comment 27 Yinghai Lu 2015-06-19 22:11:36 UTC
you can try:
v4.1-rc8
+ 
b39e7316 (PCI: get correct bridge mmio size with old size checking)
+
725530f4 (PCI: check pref compatible bit for mem64 resource of pcie device)

or

v4.1-rc8
+ 
b39e7316 (PCI: get correct bridge mmio size with old size checking)
+
a9a729bf (PCI: get new realloc size for bridge that does not have children)

They should make your system work again.
Comment 28 Yinghai Lu 2015-06-19 22:35:39 UTC
never mind: 
a9a729bf (PCI: get new realloc size for bridge that does not have children)
may not help.
Comment 29 TJ 2015-06-20 03:25:33 UTC
v4.1-rc8 + b39e7316 + 725530f4 doesn't work.

I'm currently trying a build checked out at e26677c PCI: Don't shrink too much for hotplug bridge. If this works I'll reverse bisect to find the sweet spot.
Comment 30 TJ 2015-06-21 20:18:37 UTC
I've done tests of many combinations from your branch. The first good commit is

v4.1-rc8 -> a9a729b PCI: get new realloc size for bridge that does not have children

I've not been able to identify a minimal sub-set of commits that fix it so far, having tried various combinations which all fail:

v4.1-rc8 -> 3b7ccb3
v4.1-rc8 -> 5698a97
v4.1-rc8 -> 3d3184a
v4.1-rc8 + b39e7316 + a9a729bf
v4.1-rc8 + b39e7316 + b2bbf93
Comment 31 Yinghai Lu 2015-06-21 21:07:04 UTC
I reordered the patch sequence in the branch.
That should make it easier for you to find the patches that solve the problem.
Comment 32 TJ 2015-06-21 21:50:08 UTC
Created attachment 180541 [details]
dmesg v4.1-rc8 + b39e731 + 725530f + a9a729b

I've identified the minimal set of commits required:

b39e731 PCI: get correct bridge mmio size with old size checking
725530f PCI: check pref compatible bit for mem64 resource of pcie device
a9a729b PCI: get new realloc size for bridge that does not have children

Thanks very much for addressing this issue.
Comment 33 TJ 2015-06-25 08:17:33 UTC
I don't see anything merged into mainline so far. Will these patches be making it into v4.2?
Comment 34 Yinghai Lu 2015-06-25 18:13:39 UTC
Not for v4.2.

I should submit them for v4.3 after v4.2-rc1 is released, and they will be marked for stable.

BTW, please check that
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-v4.2-rc1

still works on your setup.

Thanks
Comment 35 Yinghai Lu 2015-06-30 02:10:01 UTC
please check the 
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-v4.2-rc1 

as one patch is dropped.
  PCI: get correct bridge mmio size with old size checking
Comment 36 TJ 2015-07-07 03:45:23 UTC
Sorry for the delay in getting back to this issue.

Testing latest mainline HEAD + for-pci-v4.2-rc1 failed.

mainline HEAD @ 1c4c715 Merge tag 'ext4_for_linus_stable'

I was unable to grab the dmesg output because I accidentally built a defconfig which didn't include the dm_crypt/cryptsetup modules in the initrd. I'm running another build and will report back later.
Comment 37 TJ 2015-07-07 13:09:43 UTC
Created attachment 182081 [details]
dmesg v4.2-rc1 + for-pci-v4.2-rc1

Log of failed v4.2-rc1 + for-pci-v4.2-rc1
Comment 38 Yinghai Lu 2015-07-07 16:39:40 UTC
can you post output from "cat /proc/iomem" ?
Comment 39 Yinghai Lu 2015-07-07 17:02:40 UTC
But the allocation result is the same as in comment 32.

In both cases some ROM BARs cannot be assigned at the end.
Comment 40 TJ 2015-07-07 21:47:09 UTC
Apologies, it is working!

I was misled for three reasons: lspci reported the regions as 'disabled'; the nvidia driver cannot be built against v4.2 due to an EXPORT_SYMBOL_GPL issue, so its tell-tale messages were not available; and the nouveau driver loaded but didn't attach to the external GPUs. It turned out I had to explicitly run "modprobe nouveau modeset=1".
Comment 41 Yinghai Lu 2015-07-08 05:45:41 UTC
If you really want to get rid of the ROM BAR assignment problem, you
can boot with "pci=realloc,assign_pref_bars".
Comment 42 Greg Oliver 2015-09-11 23:05:33 UTC
I hate to resurrect an old thread, but Yinghai - I appear to be having the same issue with PCI through a thunderbolt connection.  I have pulled down your for4.2-rc1 tree and was going to use it, but wanted to check if it has been committed upstream yet, and/or if there is a better revision to use recently before I get started?
Comment 43 Yinghai Lu 2015-09-12 02:22:13 UTC
It is not upstream yet. It could land in v4.4, as I have to post it after v4.3-rc1.

so please try 

git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-pci-v4.3
Comment 44 TJ 2016-01-08 15:28:25 UTC
Created attachment 198991 [details]
boot dmesg 4.4.0-rc8 + for-pci-v4.5-next

This has been broken since August 2014 - can we *please* get the fixes into mainline?!

The for-pci-v4.5-next branch of patches applied to the current v4.4-rc8 master (not sure about revisions since I reported a solution on 2015-07-07) causes an even worse failure than before. Now, lspci cannot report the devices due to:

0c:00.0 PCI bridge [0604]: NVIDIA Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 [10de:05be] (rev ff) (prog-if ff)
        !!! Unknown header type 7f
        Kernel driver in use: pcieport

A hot-plug of the device reports, amongst other things:

pcieport 0000:0c:00.0: Refused to change power state, currently in D3
pcieport 0000:0d:00.0: Refused to change power state, currently in D3
pcieport 0000:0d:02.0: Refused to change power state, currently in D3
pci_bus 0000:0d: busn_res: [bus 0d] is released

I attach boot dmesg, lspci, and hotplug kern.log.
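For readers unfamiliar with the lspci output above: "rev ff", "prog-if ff" and "Unknown header type 7f" together are what one would typically expect if every config-space read from the device returned 0xff, for example because the device is not responding. That is an interpretation, not something the logs prove. A minimal sketch of the decoding, using the standard PCI config header offsets (revision ID at 0x08, header type at 0x0e, with bit 7 as the multi-function flag) and hypothetical function names:

```python
REVISION_ID_OFFSET = 0x08
HEADER_TYPE_OFFSET = 0x0E
HEADER_TYPE_MASK = 0x7F    # low 7 bits select the header layout
MULTIFUNCTION_FLAG = 0x80  # bit 7 marks a multi-function device


def decode_header(config):
    """Decode a few fields from the first 64 bytes of PCI config space."""
    raw = config[HEADER_TYPE_OFFSET]
    return {
        "revision": config[REVISION_ID_OFFSET],
        "header_type": raw & HEADER_TYPE_MASK,
        "multifunction": bool(raw & MULTIFUNCTION_FLAG),
    }


def looks_unresponsive(config):
    """True if the standard header reads back as all ones."""
    return all(byte == 0xFF for byte in config[:64])


if __name__ == "__main__":
    dead = bytes([0xFF]) * 64  # what an all-ones read would look like
    fields = decode_header(dead)
    print(hex(fields["revision"]), hex(fields["header_type"]), looks_unresponsive(dead))
```

A healthy PCI-to-PCI bridge reports header type 0x01, so a value of 0x7f usually says more about the read failing than about the device itself.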
Comment 45 TJ 2016-01-08 15:29:21 UTC
Created attachment 199001 [details]
lspci -xxvvvnnk 4.4.0-rc8 + for-pci-v4.5-next
Comment 46 TJ 2016-01-08 15:29:57 UTC
Created attachment 199011 [details]
hotplug kern.log 4.4.0-rc8 + for-pci-v4.5-next
Comment 47 TJ 2016-01-08 16:41:39 UTC
Created attachment 199021 [details]
working boot dmesg 4.4.0-rc8 + for-pci-v4.5-next

Ignore my last report about it being broken with v4.4 + for-pci-v4.5-next. I can't pinpoint the cause but it seems there was some kind of hardware glitch that survived multiple reboots and power downs of the external PCIe/NVS420 device.

$ uname -r
4.4.0-rc8+

$ lspci -tvvvvnn
-[0000:00]-+-00.0  Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub [8086:2a00]
           +-01.0-[01]----00.0  NVIDIA Corporation G84M [GeForce 8600M GT] [10de:0407]
           +-1a.0  Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 [8086:2834]
           +-1a.1  Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 [8086:2835]
           +-1a.7  Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 [8086:283a]
           +-1b.0  Intel Corporation 82801H (ICH8 Family) HD Audio Controller [8086:284b]
           +-1c.0-[09]----00.0  Marvell Technology Group Ltd. 88E8040 PCI-E Fast Ethernet Controller [11ab:4354]
           +-1c.1-[0b]----00.0  Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection [8086:4229]
           +-1c.4-[0c-0f]----00.0-[0d-0f]--+-00.0-[0e]----00.0  NVIDIA Corporation G98 [Quadro NVS 420] [10de:06f8]
           |                               \-02.0-[0f]----00.0  NVIDIA Corporation G98 [Quadro NVS 420] [10de:06f8]
           +-1d.0  Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 [8086:2830]
           +-1d.1  Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 [8086:2831]
           +-1d.2  Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 [8086:2832]
           +-1d.7  Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 [8086:2836]
           +-1e.0-[03]--+-09.0  Ricoh Co Ltd R5C832 IEEE 1394 Controller [1180:0832]
           |            +-09.1  Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter [1180:0822]
           |            +-09.2  Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter [1180:0592]
           |            \-09.3  Ricoh Co Ltd xD-Picture Card Controller [1180:0852]
           +-1f.0  Intel Corporation 82801HM (ICH8M) LPC Interface Controller [8086:2815]
           +-1f.1  Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) IDE Controller [8086:2850]
           +-1f.2  Intel Corporation 82801HM/HEM (ICH8M/ICH8M-E) SATA Controller [AHCI mode] [8086:2829]
           \-1f.3  Intel Corporation 82801H (ICH8 Family) SMBus Controller [8086:283e]
Comment 48 TJ 2016-01-08 19:25:14 UTC
Created attachment 199041 [details]
failing boot dmesg 4.4.0-rc8 + for-pci-v4.5-next

Unfortunately I was too optimistic. It seems it will work about 1 boot in 15 or so, and requires a complete power off in order to have a chance of working after a reboot.

When it fails none of the bridge windows for the GPUs is activated, leaving just:

0c:00.0 PCI bridge [0604]: NVIDIA Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 [10de:05be] (rev a3)

I've also tried adding "hpmemsize=600M" to try and get hotplug to work to avoid the power/reboot cycling, and also triggering a rescan of the 0c:00.0 bridge via sysfs, but those don't help either.
Comment 49 TJ 2018-01-24 18:13:37 UTC
This is still a regression affecting 4.15-rc9. The patches by Yinghai Lu seem to have gone AWOL.
Comment 50 TJ 2018-01-24 18:14:18 UTC
Created attachment 273847 [details]
dmesg v4.15-rc9 failing
Comment 51 TJ 2018-01-26 19:46:47 UTC
Created attachment 273883 [details]
dmesg 4.15.0-rc9 PCI debug (boot, before device insertion)

Attaching a couple of logs gathered with PCI_DEBUG=y and loglevel debug.

First log is the system booting.
Second log is when the device is probed (ExpressCard/34 inserted)
Comment 52 TJ 2018-01-26 19:47:30 UTC
Created attachment 273885 [details]
dmesg 4.15.0-rc9 PCI debug (device insertion)