Booting v3.14-rc1 on an (outdated) ThinkPad X41 triggers a kernel
pci 0000:00:02.0: can't ioremap flush page - no chipset flushing
That is this pci device:
lspci | grep 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03)
I can't remember seeing that error before. It is apparently printed by
Reported at https://lkml.org/lkml/2014/2/8/201
Bjorn Helgaas asked me to "attach complete dmesg logs of the working and broken kernels to it". But the logs of the working kernel (v3.13.y based) and broken kernel (v3.14-rcy based) are identical up to that error.
Created attachment 128401 [details]
dmesg of v3.14-rc5 boot
At Bjorn's request I attached the first 30 seconds of the dmesg of a v3.14-rc5 boot.
Since the dmesg of v3.13.2 (the last v3.13 I have installed) and that of v3.14-rc5 are identical up to the error we're investigating here, I've not bothered to attach the v3.13.2 dmesg.
Created attachment 128461 [details]
Created attachment 128471 [details]
The diff between /proc/iomem on v3.13.2 and v3.14-rc1 is:
--- iomem-3.13.2 2014-02-08 21:14:30.214030591 +0100
+++ iomem-3.14-rc1 2014-02-08 21:07:22.041189158 +0100
@@ -11,16 +11,13 @@
000e0000-000effff : Extension ROM
000f0000-000fffff : System ROM
00100000-7f6dffff : System RAM
- 00400000-009af63a : Kernel code
- 009af63b-00c932ff : Kernel data
- 00d4f000-00e4dfff : Kernel bss
+ 00400000-009c57bf : Kernel code
+ 009c57c0-00cb6aff : Kernel data
+ 00d78000-00e74fff : Kernel bss
7f6e0000-7f6f4fff : ACPI Tables
7f6f5000-7f6fffff : ACPI Non-volatile Storage
7f700000-7fffffff : reserved
7f800000-7fffffff : Graphics Stolen Memory
-80000000-801fffff : PCI Bus 0000:02
-80200000-8027ffff : 0000:00:02.1
-80280000-80280fff : Intel Flush Page
a0000000-a003ffff : 0000:00:02.0
a0040000-a00403ff : 0000:00:1d.7
a0040000-a00403ff : ehci_hcd
Created attachment 128581 [details]
dmesg of v3.14-rc5 boot with Bjorn's debug patch
I've recompiled v3.14-rc5 with a debug patch (see https://lkml.org/lkml/2014/3/7/484). The log, including the messages that this patch adds, is attached.
It appears that commit 04f982beb900f37bc216d63c9dbc5bdddb4a3d3a ("Merge branch 'pci/msi' into next") is good while commit 96702be560374ee7e7139a34cab03554129abbb4 ("Merge branch 'pci/resource' into next") is bad, just as Bjorn expected. I'll continue bisecting, but I suppose that by now it's clear to Bjorn where things went awry in the handful of commits between the good and bad commits above.
Paul, you can stop bisecting. I see what the problem is, but I probably won't be able to get you a patch to fix it until Monday.
It seems this bug has been fixed in commit ac93ac7403493f8707b7734de9f40d5cb5db9045 ("PCI: Don't check resource_size() in pci_bus_alloc_resource()") and should be closed.
I saw the same message as Paul in 3.14-rc6 and can confirm that it's gone in -rc7.
Should be fixed by the following commit, which appeared in v3.14-rc7. Thanks Paul and Sven!
Here's my analysis of what I did wrong in f75b99d5a77d ("PCI: Enforce bus address limits in resource allocation"), where I introduced the regression:
The problem is basically that I used resource_size() to figure out
whether there's any available space. resource_size() is res->end -
res->start + 1, so applying it to [mem 0x00000000-0xffffffff] returns
zero in a kernel with 32-bit resource addresses, i.e., with