Created attachment 37872 [details]
dmesg diff (-good, +bad)
Matthew Garrett reports that my _CRS patches break the 2530p:
Once it starts doing accelerated operations there's a high probability
that it'll suddenly display garbage all over the screen, suddenly power
down and power back on after a few seconds
Matthew found this in Fedora, introduced between these two kernel versions:
-Linux version 2.6.36-4.fc13.x86_64 (email@example.com) (gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC) ) #1 SMP Tue Nov 16 17:41:39 EST 2010
+Linux version 2.6.36-4.fc15.x86_64 (firstname.lastname@example.org) (gcc version 4.5.1 20101112 (Red Hat 4.5.1-5) (GCC) ) #1 SMP Tue Nov 16 07:31:02 UTC 2010
I reported this against 2.6.37-rc1 because I think it's probably caused
by commit 1af3c2e45, which appeared in .37-rc1 but was probably backported
Created attachment 37892 [details]
/proc/iomem diff (-good, +bad)
Here's the interesting bit:
fee01000-ffffffff : PCI Bus 0000:00
+ ff900000-ffafffff : PCI Bus 0000:03
+ ffb00000-ffcfffff : PCI Bus 0000:02
+ ffd00000-ffefffff : PCI Bus 0000:01
fffa0000-fffa6fff : reserved
+ fffff000-ffffffff : Intel Flush Page
Putting the "Intel Flush Page" at the very last page before 4G is not
going to work. That page contains the ROM processor restart vector,
so I don't think accesses to it will go to the PCI bus.
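To make the address arithmetic concrete, here is a small Python sketch (an illustration, not kernel code) that parses /proc/iomem-style lines and reports which entries touch the last 1MB below 4GB. The helper names and the 1MB cutoff are assumptions made for this example; the sample lines are the "+" entries from the diff above.

```python
FOUR_GB = 0x1_0000_0000
TOP_WINDOW = 0x10_0000  # 1MB; an assumed cutoff, chosen only for illustration


def parse_iomem_line(line):
    """Parse 'start-end : name' (hex, inclusive end) into (start, end, name)."""
    span, name = line.strip().split(" : ", 1)
    start, end = (int(x, 16) for x in span.split("-"))
    return start, end, name


def touches_top_window(start, end, window=TOP_WINDOW):
    """Return True if [start, end] overlaps [4GB - window, 4GB - 1]."""
    lo, hi = FOUR_GB - window, FOUR_GB - 1
    return start <= hi and end >= lo


# The "+" lines from the /proc/iomem diff in this comment.
sample = [
    "ff900000-ffafffff : PCI Bus 0000:03",
    "ffb00000-ffcfffff : PCI Bus 0000:02",
    "ffd00000-ffefffff : PCI Bus 0000:01",
    "fffff000-ffffffff : Intel Flush Page",
]

flagged = [
    name
    for start, end, name in map(parse_iomem_line, sample)
    if touches_top_window(start, end)
]
print(flagged)
```

With the 1MB cutoff only the Intel Flush Page is flagged. Passing window=0x200000 (2MB) also flags PCI Bus 0000:01, since its window ends at 0xffefffff.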
Created attachment 37902 [details]
patch to avoid PCI allocations in 1M below 4GB
Handled-By : Bjorn Helgaas <email@example.com>
Created attachment 37972 [details]
v2 patch to avoid PCI allocations in 1M below 4GB
The previous version would make host bridge windows above 4GB completely
New information: there is an ACPI INT0800 device that is using the
[mjg59@2530p 00:03]$ cat id
[mjg59@2530p 00:03]$ cat resources
state = active
That's enough to tell us we shouldn't put any PCI devices in that
(It's still sort of dubious that the PNP0A08:00 window:
PNP0A08:00: host bridge window [mem 0xfee01000-0xffffffff]
overlaps the INT0800 range, but Windows must handle that, so we'll
have to manage it somehow, too.)
Unfortunately, Linux ignores the resources used by ACPI devices
(except PNP0C01 and PNP0C02 motherboard devices), so the INT0800
device didn't save us in this case. This is a Linux defect, but
it's too complicated to fix right now.
I'm still getting this error with 126.96.36.199-10.fc15.x86_64, which contains the above patch, on my 2530p with 4GB of RAM. Before the fixes, the graphics would corrupt and the machine would reboot within 30 seconds of logging into the desktop. With the 2010-11-23 patch it happens about 5 minutes after logging in. mjg59 says he only has 2GB of RAM, which could be why he no longer sees the problem while I still do.
Thanks a lot for testing this, Dan. Can you attach your
188.8.131.52-10.fc15.x86_64 dmesg please? I just want to double-check
that the patch is doing what I expected. If it's not too much
trouble, you might also try it with the 0xfff00000 changed to
0xffe00000, since it sounds like that might be safer for really
But I doubt that 0xffe00000 will make a difference; we might just
have to bite the bullet and figure out how to avoid all ACPI devices.
mjg59 asked me for /proc/iomem diffs.
'bad' == 184.108.40.206-10.fc15.x86_64
'good' == 2.6.36-4.fc15 with CRS patch reverted
[dcbw@dcbw ~]$ diff -u badiomem.txt goodiomem.txt
--- badiomem.txt 2010-12-03 09:55:39.668107911 -0600
+++ goodiomem.txt 2010-12-03 09:53:06.616013000 -0600
@@ -4,9 +4,9 @@
000a0000-000bffff : PCI Bus 0000:00
000ef000-000fffff : reserved
00100000-b4f2efff : System RAM
- 01000000-0146d3bc : Kernel code
- 0146d3bd-01b7c61f : Kernel data
- 01c6f000-01e3cf07 : Kernel bss
+ 01000000-0146c33c : Kernel code
+ 0146c33d-01b7c59f : Kernel data
+ 01c6f000-01e3cf47 : Kernel bss
b4f2f000-b4f30fff : reserved
b4f31000-b5d6ffff : System RAM
b5d70000-b5d7ffff : ACPI Non-volatile Storage
@@ -48,6 +48,10 @@
d4828000-d4828fff : 0000:00:19.0
d4828000-d4828fff : e1000e
d4829000-d4829fff : 0000:00:03.3
+ d482a000-d482afff : Intel Flush Page
+ d4900000-d4afffff : PCI Bus 0000:01
+ d4b00000-d4cfffff : PCI Bus 0000:02
+ d4d00000-d4efffff : PCI Bus 0000:03
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : reserved
e0000000-efffffff : pnp 00:01
@@ -68,9 +72,5 @@
fee00000-fee00fff : Local APIC
fee00000-fee00fff : reserved
fee01000-ffffffff : PCI Bus 0000:00
- ff800000-ff9fffff : PCI Bus 0000:03
- ffa00000-ffbfffff : PCI Bus 0000:02
- ffc00000-ffdfffff : PCI Bus 0000:01
- ffe7f000-ffe7ffff : Intel Flush Page
ffe80000-ffffffff : reserved
100000000-13bffffff : System RAM
Created attachment 38952 [details]
dmesg diff (-good, +bad)
Created attachment 38962 [details]
full dmesg from bad kernel 220.127.116.11-10.fc15
let me know if there's anything more you need, thanks!
Created attachment 39142 [details]
/proc/iomem from a good kernel (2.6.36-4.fc15)
Created attachment 39262 [details]
patch to avoid allocating PNP resources
Linus thought it was ugly to hack up pcibios_align_resource() to avoid
regions, and as usual, he's right. Here's another try that takes a slightly
different approach -- instead of avoiding the hardcoded 2MB range just
below 4GB, this avoids all PNP device resources. The INT0800 device
should be enough for this particular issue.
This is still ugly in that the PNP devices should be directly in the
iomem_resource tree like PCI devices are, but we can't do that quite yet.
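The idea of carving reserved ranges out of a host bridge window before allocating from it can be sketched as follows. This is a Python illustration of the general technique, not the attached patch. The function name and the reserved ranges are assumptions: the ranges are borrowed from the "reserved" lines in the /proc/iomem listings as stand-ins, because the INT0800 device's actual range is not shown in this report.

```python
def remove_reservations(avail, reserved):
    """Subtract each reserved range from the available range.

    All ranges are (start, end) tuples with inclusive ends.
    Returns the remaining free ranges, sorted by start address.
    """
    free = [avail]
    for r_start, r_end in reserved:
        next_free = []
        for start, end in free:
            if r_end < start or r_start > end:
                # No overlap: keep this free range unchanged.
                next_free.append((start, end))
                continue
            if start < r_start:
                # Keep the part below the reserved range.
                next_free.append((start, r_start - 1))
            if r_end < end:
                # Keep the part above the reserved range.
                next_free.append((r_end + 1, end))
        free = next_free
    return sorted(free)


# Host bridge window from the PNP0A08:00 line earlier in this report.
window = (0xFEE01000, 0xFFFFFFFF)

# Stand-in reserved range (assumption, see the note above).
usable = remove_reservations(window, [(0xFFE80000, 0xFFFFFFFF)])
print([(hex(s), hex(e)) for s, e in usable])
```

An allocator that only picks addresses from the returned ranges can never place a PCI resource on top of a range another device has claimed, regardless of whether it searches top-down or bottom-up.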
Ok, I'll try to backport this to current Fedora kernels (18.104.22.168 + the CRS patch + the "never allocate pci from the last 1M below 4G" patch) and see where I get.
Just to be clear, my current proposed patches do not include a
"never allocate pci from the last 1M below 4G" patch. The patch
in comment 14 avoids PNP resources, which should be enough to
solve the 2530p problem because there's an INT0800 device that
Sorry for the backporting mess; there are many patches between
22.214.171.124 and now. If you want to point me at the patches you're
applying to 126.96.36.199, I'd be happy to give you my opinions.
Yeah, I found the "never allocate from last 1M" patch never got upstream, but it was applied to Fedora 2.6.36 kernels to fix the 2530p issue originally (and it does for mjg59's machine with only 2G). I haven't yet booted the kernel I built, but I'll attach the patches that are applied.
In the end I think we do want to backport this to 2.6.36 and send to stable@ if we can.
Created attachment 39622 [details]
Fedora 2.6.36 kernel patch fixing original 2530p issue
Created attachment 39632 [details]
Fedora 2.6.36 kernel CRS fixes patch that caused the problem originally
Created attachment 39642 [details]
My attempt to backport your PNP avoidance patch to the Fedora 188.8.131.52 kernel on top of the two above patches
Hmm, looking at my backport this morning I missed a bit in pcibios_align_resource() that should have gotten moved to arch_remove_reservations(). Other than that, comments on my backport attempt?
I think you should drop the patch from comment 18. I don't plan
to push that upstream. If we decide we do need to ignore that last
1M or 2M, I'll put that code in arch_remove_reservations() instead
of in pcibios_align_resource().
You mentioned the code that moved from
pcibios_align_resource() to arch_remove_reservations(); that change
*is* in your comment 20 patch. But of course, that patch will change
a little bit if you drop the comment 18 patch.
So I've run at least an hour with the patches posted above and had no issues so far. Without the recent PNP resource patch things would fail within 10 minutes. I'll keep running with this kernel for a day or two before we declare unconditional success :) Thanks! I'll let you know.
Still working fine after 6 hours, no issues so far. I think it's fixed.
Which patches have you tested? Just so we don't lose track.
Created attachment 40142 [details]
v3 patch to avoid top 1M below 4G
It's too late in the .37 cycle to consider the previous patch.
Here's the patch Linus proposed, an alternate way to avoid that
last 1MB. Can anybody test it?
Created attachment 40292 [details]
SeaBIOS/qemu test patch
This is just for documentation. Booting Windows 7 on qemu
with this SeaBIOS patch shows that Windows ignored the last
1MB of memory below 4GB, even if it's not mentioned in the
E820 table or any ACPI device _CRS method.
Created attachment 40462 [details]
avoid E820 regions
Boy, I hope we converge on a solution here soon...
The current proposal for .37 is to revert the top-down allocation
patches and add this additional patch.
If you're based on 2.6.36, you would want to add these patches, which
went in after .36 and will stay in .37:
a1862e3 resources: handle overflow when aligning start of available area
6909ba1 resources: ensure callback doesn't allocate outside available space
5d6b1fa resources: factor out resource_clip() to simplify find_resource()
a9cea01 resources: add a default alignf to simplify find_resource()
and add the patch I'm attaching here.
Fixed by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=46bdfe6a50b88942f5323f837a3afd93a1c86e60 .