Created attachment 37872 [details] dmesg diff (-good, +bad) Matthew Garrett reports that my _CRS patches break the 2530p: Once it starts doing accelerated operations there's a high probability that it'll suddenly display garbage all over the screen, suddenly power down and power back on after a few seconds Matthew found this in Fedora, introduced between these two kernel versions: -Linux version 2.6.36-4.fc13.x86_64 (mjg59@ihatethathostname.lab.bos.redhat.com) (gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC) ) #1 SMP Tue Nov 16 17:41:39 EST 2010 +Linux version 2.6.36-4.fc15.x86_64 (mockbuild@x86-19.phx2.fedoraproject.org) (gcc version 4.5.1 20101112 (Red Hat 4.5.1-5) (GCC) ) #1 SMP Tue Nov 16 07:31:02 UTC 2010 I reported this against 2.6.37-rc1 because I think it's probably caused by commit 1af3c2e45, which appeared in .37-rc1 but was probably backported to fc15.
Created attachment 37892 [details] /proc/iomem diff (-good, +bad) Here's the interesting bit: fee01000-ffffffff : PCI Bus 0000:00 + ff900000-ffafffff : PCI Bus 0000:03 + ffb00000-ffcfffff : PCI Bus 0000:02 + ffd00000-ffefffff : PCI Bus 0000:01 fffa0000-fffa6fff : reserved + fffff000-ffffffff : Intel Flush Page Putting the "Intel Flush Page" at the very last page before 4G is not going to work. That page contains the ROM processor restart vector, so I don't think accesses to it will go to the PCI bus.
Created attachment 37902 [details] patch to avoid PCI allocations in 1M below 4GB
Handled-By : Bjorn Helgaas <bjorn.helgaas@hp.com>
Created attachment 37972 [details] v2 patch to avoid PCI allocations in 1M below 4GB The previous version would make host bridge windows above 4GB completely unusable.
References: https://patchwork.kernel.org/patch/365092/
New information: there is an ACPI INT0800 device that is using the 0xff000000-0xffffffff range: [mjg59@2530p 00:03]$ cat id INT0800 [mjg59@2530p 00:03]$ cat resources state = active mem 0xff000000-0xffffffff That's enough to tell us we shouldn't put any PCI devices in that range. (It's still sort of dubious that the PNP0A08:00 window: PNP0A08:00: host bridge window [mem 0xfee01000-0xffffffff] overlaps the INT0800 range, but Windows must handle that, so we'll have to manage it somehow, too.) Unfortunately, Linux ignores the resources used by ACPI devices (except PNP0C01 and PNP0C02 motherboard devices), so the INT0800 device didn't save us in this case. This is a Linux defect, but it's too complicated to fix right now.
I'm still getting this error with 2.6.36.1-10.fc15.x86_64 on my 2530p which contains the above patch; I've got 4GB of RAM in this machine. Before the fixes for this the graphics would corrupt and a reboot would occur within 30 seconds of logging into the desktop. With the 2010-11-23 patch it happens about 5 minutes after doing so. mjg59 says he only has 2GB of RAM which could be why he's not seeing the problem anymore but I still am.
Thanks a lot for testing this, Dan. Can you attach your 2.6.36.1-10.fc15.x86_64 dmesg please? I just want to double-check that the patch is doing what I expected. If it's not too much trouble, you might also try it with the 0xfff00000 changed to 0xffe00000, since it sounds like that might be safer for really old machines. But I doubt that 0xffe00000 will make a difference; we might just have to bite the bullet and figure out how to avoid all ACPI devices.
mjg59 asked me for /proc/iomem diffs. 'bad' == 2.6.36.1-10.fc15.x86_64 'good' == 2.6.36-4.fc15 with csr patch reverted [dcbw@dcbw ~]$ diff -u badiomem.txt goodiomem.txt --- badiomem.txt 2010-12-03 09:55:39.668107911 -0600 +++ goodiomem.txt 2010-12-03 09:53:06.616013000 -0600 @@ -4,9 +4,9 @@ 000a0000-000bffff : PCI Bus 0000:00 000ef000-000fffff : reserved 00100000-b4f2efff : System RAM - 01000000-0146d3bc : Kernel code - 0146d3bd-01b7c61f : Kernel data - 01c6f000-01e3cf07 : Kernel bss + 01000000-0146c33c : Kernel code + 0146c33d-01b7c59f : Kernel data + 01c6f000-01e3cf47 : Kernel bss b4f2f000-b4f30fff : reserved b4f31000-b5d6ffff : System RAM b5d70000-b5d7ffff : ACPI Non-volatile Storage @@ -48,6 +48,10 @@ d4828000-d4828fff : 0000:00:19.0 d4828000-d4828fff : e1000e d4829000-d4829fff : 0000:00:03.3 + d482a000-d482afff : Intel Flush Page + d4900000-d4afffff : PCI Bus 0000:01 + d4b00000-d4cfffff : PCI Bus 0000:02 + d4d00000-d4efffff : PCI Bus 0000:03 e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff] e0000000-efffffff : reserved e0000000-efffffff : pnp 00:01 @@ -68,9 +72,5 @@ fee00000-fee00fff : Local APIC fee00000-fee00fff : reserved fee01000-ffffffff : PCI Bus 0000:00 - ff800000-ff9fffff : PCI Bus 0000:03 - ffa00000-ffbfffff : PCI Bus 0000:02 - ffc00000-ffdfffff : PCI Bus 0000:01 - ffe7f000-ffe7ffff : Intel Flush Page ffe80000-ffffffff : reserved 100000000-13bffffff : System RAM
Created attachment 38952 [details] dmesg diff (-good, +bad)
Created attachment 38962 [details] full dmesg from bad kernel 2.6.36.1-10.fc15
let me know if there's anything more you need, thanks!
Created attachment 39142 [details] /proc/iomem from a good kernel (2.6.36-4.fc15)
Created attachment 39262 [details] patch to avoid allocating PNP resources Linus thought it was ugly to hack up pcibios_align_resource() to avoid regions, and as usual, he's right. Here's another try that takes a little different approach -- instead of avoiding the hardcoded 2MB range just below 4GB, this avoids all PNP device resources. The INT0800 device should be enough for this particular issue. This is still ugly in that the PNP devices should be directly in the iomem_resource tree like PCI devices are, but we can't do that quite yet.
Ok, I'll try to backport this to current Fedora kernels (2.6.36.1 + the CRS patch + the "never allocate pci from the last 1M below 4G" patch) and see where I get.
Just to be clear, my current proposed patches do not include a "never allocate pci from the last 1M below 4G" patch. The patch in comment 14 avoids PNP resources, which should be enough to solve the 2530p problem because there's an INT0800 device that claims 0xff000000-0xffffffff. Sorry for the backporting mess; there are many patches between 2.6.36.1 and now. If you want to point me at the patches you're applying to 2.6.36.1, I'd be happy to give you my opinions.
Yeah, I found the "never allocate from last 1M" patch never got upstream, but it was applied to Fedora 2.6.36 kernels to fix the 2530p issue originally (and it does for mjg59's machine with only 2G). I haven't yet booted the kernel I built, but I'll attach the patches that are applied. In the end I think we do want to backport this to 2.6.36 and send to stable@ if we can.
Created attachment 39622 [details] Fedora 2.6.36 kernel patch fixing original 2530p issue
Created attachment 39632 [details] Fedora 2.6.36 kernel CRS fixes patch that caused the problem originally
Created attachment 39642 [details] My attempt to backport your PNP avoidance patch to the Fedora 2.6.36.2 kernel on top of the two above patches
Hmm, looking at my backport this morning I missed a bit in pcibios_align_resource() that should have gotten moved to arch_remove_reservations(). Other than that, comments on my backport attempt?
I think you should drop the patch from comment 18. I don't plan to push that upstream. If we decide we do need to ignore that last 1M or 2M, I'll put that code in arch_remove_reservations() instead of in pcibios_align_resource(). You mentioned the code that moved from pcibios_align_resource() to arch_remove_reservations(); that change *is* in your comment 20 patch. But of course, that patch will change a little bit if you drop the comment 18 patch.
So I've run at least an hour with the patches posted above and had no issues so far. Without the recent PNP resource patch things would fail within 10 minutes. I'll keep running with this kernel for a day or two before we declare unconditional success :) Thanks! I'll let you know.
Still working fine after 6 hours, no issues so far. I think it's fixed.
What patches do you have tested? Just so we don't loose track.
Created attachment 40142 [details] v3 patch to avoid top 1M below 4G It's too late in the .37 cycle to consider the previous patch. Here's the patch Linus proposed, an alternate way to avoid that last 1MB. Can anybody test it?
Created attachment 40292 [details] SeaBIOS/qemu test patch This is just for documentation. Booting Windows 7 on qemu with this SeaBIOS patch shows that Windows ignored the last 1MB of memory below 4GB, even if it's not mentioned in the E820 table or any ACPI device _CRS method.
Created attachment 40462 [details] avoid E820 regions Boy, I hope we converge on a solution here soon... The current proposal for .37 is to revert the top-down allocation patches and add this additional patch. If you're based on 2.6.36, you would want to add these patches, which went in after .36 and will stay in .37: a1862e3 resources: handle overflow when aligning start of available area 6909ba1 resources: ensure callback doesn't allocate outside available space 5d6b1fa resources: factor out resource_clip() to simplify find_resource() a9cea01 resources: add a default alignf to simplify find_resource() and add the patch I'm attaching here.
Fixed by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=46bdfe6a50b88942f5323f837a3afd93a1c86e60 .