Bug 16009
Summary: | ioremap error with radeon (KMS) | ||
---|---|---|---|
Product: | Drivers | Reporter: | Yannick Roehlly (yannick.roehlly) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | maciej.rutecki, rjw, yinghai |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.34 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 15310 | ||
Attachments: |
dmesg output
dmesg output booting with "pci=nocrs"
Memory and display info form msinfo32.txt
dmesg log with Yinghai's patch applied. |
Description
Yannick Roehlly
2010-05-19 21:09:08 UTC
yinghai, is this a DRM bug in the radeon driver? Thanks.

Reply-To: yinghai.lu@oracle.com

looks like kernel problem.

[ 0.203628] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.204028] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 0.204058] pci_root PNP0A08:00: host bridge window [io 0x0000-0x0cf7]
[ 0.204058] pci_root PNP0A08:00: host bridge window [io 0x0d00-0xffff]
[ 0.204058] pci_root PNP0A08:00: host bridge window [mem 0x000a0000-0x000bffff]
[ 0.204058] pci_root PNP0A08:00: host bridge window [mem 0x000d0000-0x000dffff]
[ 0.204058] pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xffffffff]
...
[ 0.207053] pci 0000:01:00.0: reg 10: [mem 0xc0000000-0xcfffffff pref]
[ 0.207058] pci 0000:01:00.0: reg 14: [io 0x9000-0x90ff]
[ 0.207063] pci 0000:01:00.0: reg 18: [mem 0xfddf0000-0xfddfffff]
[ 0.207078] pci 0000:01:00.0: reg 30: [mem 0xfddc0000-0xfdddffff pref]
[ 0.207095] pci 0000:01:00.0: supports D1 D2
[ 0.207104] pci 0000:00:01.0: PCI bridge to [bus 01-01]
[ 0.207107] pci 0000:00:01.0: bridge window [io 0x7000-0x9fff]
[ 0.207109] pci 0000:00:01.0: bridge window [mem 0xfdd00000-0xfddfffff]
[ 0.207113] pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
[ 0.230095] pci 0000:00:01.0: no compatible bridge window for [mem 0xbdf00000-0xddefffff 64bit pref]
[ 0.230099] pci 0000:01:00.0: no compatible bridge window for [mem 0xc0000000-0xcfffffff pref]
...
[ 0.240916] pci 0000:00:01.0: BAR 9: can't assign mem pref (size 0x20000000)
...
[ 0.240922] pci 0000:01:00.0: BAR 0: can't assign mem pref (size 0x10000000)
...
[ 19.141094] [drm] Initialized drm 1.1.0 20060810
[ 19.330812] [drm] radeon kernel modesetting enabled.
[ 19.330883] radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 19.330888] radeon 0000:01:00.0: setting latency timer to 64
[ 19.332730] [drm] initializing kernel modesetting (RV620 0x1002:0x95C4).
[ 19.332945] [drm] register mmio base: 0xFDDF0000
[ 19.332946] [drm] register mmio size: 65536
[ 19.332975] ------------[ cut here ]------------
[ ...

please try to boot with pci=nocrs

Thanks

I assume this is the Asus M51Se laptop, T8300, ATI Radeon HD3470 environment mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=11103

I'm pretty sure this will work with "pci=nocrs", but that doesn't really tell us anything about how we should fix this.

The BIOS programmed the 00:01.0 bridge to a range that starts before the upstream host bridge window. We tried to reassign the 00:01.0 window, but there wasn't enough space for its original size (0x20000000), so we just disabled it:

  pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xffffffff]
  pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
  pci 0000:00:01.0: no compatible bridge window for [mem 0xbdf00000-0xddefffff 64bit pref]
  pci 0000:00:01.0: BAR 9: can't assign mem pref (size 0x20000000)
  pci 0000:00:01.0: bridge window [mem pref disabled]

That took away the window leading to the Radeon BAR, so we disabled that, too, even though it was within the host bridge window:

  pci 0000:01:00.0: reg 10: [mem 0xc0000000-0xcfffffff pref]
  pci 0000:01:00.0: no compatible bridge window for [mem 0xc0000000-0xcfffffff pref]
  pci 0000:01:00.0: BAR 0: can't assign mem pref (size 0x10000000)

I can imagine doing something like adjusting the host bridge window if we find a P2P bridge window that doesn't fit, or trying to reallocate the P2P bridge window based on what we actually need (we only need 0x10000000, and it would have worked to allocate that).

But I would *really* like to know how Windows handles this. If we can figure that out, we can make Linux work the same way, which is much more likely to work across many machines.

Yannick, is there any way you can boot Windows on this config and collect the memory map using the "System Information", a.k.a. msinfo32.exe, tool? See https://bugzilla.kernel.org/attachment.cgi?id=26066 for an example of what I'm looking for.

Hi Bjorn,

thanks for considering the problem (thanks to Andrew and Yinghai too).

> I assume this is the Asus M51Se laptop, T8300, ATI Radeon HD3470

Yes, it's the same machine.

> I'm pretty sure this will work with "pci=nocrs", but that doesn't
> really tell us anything about how we should fix this.

Alas, no. I attach the dmesg output of booting with "pci=nocrs".

> Yannick, is there any way you can boot Windows on this config and
> collect the memory map using the "System Information"

For once my Windows 7RC install is useful. ;-) I also attach the memory and display information from msinfo32.exe. I'm sorry, it's in French.

Note that when I faced bug #11103, I noticed that FreeBSD worked with 4GiB memory (with the VESA driver). If you need, I can do tests with it (preferably with the FreeSBIE live CD, as I'm not at ease with BSD slice partitioning).

Yannick

Created attachment 26445
dmesg output booting with "pci=nocrs"
Created attachment 26446
Memory and display info form msinfo32.txt
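For reference, the address arithmetic behind the analysis above can be sketched in stand-alone C. This is not kernel code: `struct range` and the helper names are invented for illustration, and the constants are copied from the dmesg lines quoted in this report. It only restates two observations already made in the discussion: the 00:01.0 prefetchable window programmed by the BIOS (size 0x20000000) is not contained in the host bridge window, while Radeon BAR 0 (size 0x10000000) is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, as printed in dmesg: [start-end].
 * Hypothetical type for illustration, not the kernel's struct resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Number of bytes covered by an inclusive range. */
uint64_t range_size(struct range r)
{
	assert(r.start <= r.end);
	return r.end - r.start + 1;
}

/* True if inner lies entirely within outer. */
bool range_contains(struct range outer, struct range inner)
{
	return outer.start <= inner.start && inner.end <= outer.end;
}

/* Values copied from the log excerpts in this report. */
const struct range host_window   = { 0xc0000000, 0xffffffff };
const struct range bridge_window = { 0xbdf00000, 0xddefffff }; /* 00:01.0 pref */
const struct range radeon_bar0   = { 0xc0000000, 0xcfffffff }; /* 01:00.0 reg 10 */
```

With these helpers, `range_size(bridge_window)` evaluates to 0x20000000 and `range_size(radeon_bar0)` to 0x10000000, matching the sizes in the "can't assign" messages; `range_contains(host_window, bridge_window)` is false, while `range_contains(host_window, radeon_bar0)` is true.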
Reply-To: yinghai.lu@oracle.com

[ 0.230216] pci 0000:00:01.0: address space collision: [mem 0xbdf00000-0xddefffff 64bit pref] conflicts with System RAM [mem 0x00100000-0xbff9ffff]
[ 0.230216] pci 0000:01:00.0: no compatible bridge window for [mem 0xc0000000-0xcfffffff pref]

the BIOS looks crazy... memory is overlapped with mmio

Reply-To: yinghai.lu@oracle.com

[ 0.207901] pci 0000:00:1c.4: PCI bridge to [bus 06-07]
[ 0.207905] pci 0000:00:1c.4: bridge window [io 0xb000-0xbfff]
[ 0.207908] pci 0000:00:1c.4: bridge window [mem 0xfe100000-0xfe8fffff]
[ 0.207913] pci 0000:00:1c.4: bridge window [mem 0xddf00000-0xdfefffff 64bit pref]

looks like you can clear bridge of 00:1c.4, and then kexec current kernel

00:01.0 will get resource it needed. and 00:1c.4 could be pushed to 0xf0000000

Handled-By : Yinghai Lu <yinghai.lu@oracle.com>

I think these:

  0xC0000000-0xFFFFFFFF PCI bus OK
  0xC0000000-0xFFFFFFFF PCI Express Root Port Mobile Intel(R) PM965/GM965/GL960/GS965 Express - 2A01 OK

correspond to these Linux messages:

  pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xffffffff]
  pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]

So my preliminary theory is that Windows changed that 00:01.0 P2P bridge window to conform to the upstream host bridge window.

But I wonder if you could find that bridge in the Device Manager (look under "System Devices" for a Root Port at PCI bus 0, device 1, function 0), and write down or take a screenshot of the "Resources" tab? I want to make sure we're comparing the same device.

Hi Bjorn,

I couldn't find a Root Port at PCI bus 0, device 1, function 0 in the device manager of Windows 7. Nevertheless, displaying the devices "by attachment" makes the radeon card appear as attached to

  PCI Express Root Port Mobile Intel(R) PM965/GM965/GL960/GS965 Express - 2A01

Its resources are:

  Memory range: 00000000FDD00000 - 00000000FDDFFFFF
  Memory range: 00000000C0000000 - 00000000DFFFFFFF
  I/O range: 7000 - 9FFF
  IRQ: 0xFFFFFFFE(-2)
  Memory range: 00000000000A0000 - 00000000000BFFFF
  I/O range: 03B0 - 03BB
  I/O range: 03C0 - 03DF

Sincerely,
Yannick

I don't think this problem is related to _CRS. The system used to work without _CRS. 2.6.34 automatically turns on "pci=use_crs", but the system fails the same way when booted with "pci=nocrs". I think we should concentrate on getting things to work again with "pci=nocrs", and then we can worry about whether enabling _CRS makes any difference.

The problem is the 00:01.0 bridge prefetchable memory aperture that overlaps system memory.

In bug 11103, with kernel 2.6.30-rc1, the dmesg in attachment 20984 shows that we reduced the size of that bridge aperture and reassigned it so it no longer overlaps system memory:

  Linux version 2.6.30-rc1-tip (yannick@tardis) ...
  BIOS-e820: 0000000000100000 - 00000000bffa0000 (usable)
  pci 0000:01:00.0: reg 10 32bit mmio: [0xc0000000-0xcfffffff]
  pci 0000:00:01.0: bridge 64bit mmio pref: [0xbdf00000-0xddefffff] (size 0x20000000)
  pci 0000:00:01.0: BAR 9: can't allocate resource
  pci 0000:01:00.0: BAR 0: can't allocate resource
  pci 0000:01:00.0: BAR 0: got res [0xc0000000-0xcfffffff] bus [0xc0000000-0xcfffffff] flags 0x21208
  pci 0000:01:00.0: BAR 0: moved to bus [0xc0000000-0xcfffffff] flags 0x21208
  pci 0000:00:01.0: PREFETCH window: 0xc0000000-0xcfffffff (size 0x10000000)

In 2.6.34 (attachment 26445), we start with the same overlap, but for some reason, we can't reassign the 00:01.0 bridge aperture:

  Linux version 2.6.34 (yannick@tardis) ...
  BIOS-e820: 0000000000100000 - 00000000bffa0000 (usable)
  pci 0000:01:00.0: reg 10: [mem 0xc0000000-0xcfffffff pref]
  pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
  pci 0000:00:01.0: address space collision: [mem 0xbdf00000-0xddefffff 64bit pref] conflicts with System RAM [mem 0x00100000-0xbff9ffff]
  pci 0000:01:00.0: no compatible bridge window for [mem 0xc0000000-0xcfffffff pref]
  pci 0000:00:01.0: BAR 9: can't assign mem pref (size 0x20000000)
  pci 0000:01:00.0: BAR 0: can't assign mem pref (size 0x10000000)
  pci 0000:00:01.0: bridge window [mem pref disabled]

I looked through the drivers/pci changes between 2.6.33 and .34, and this one:

  d65245c PCI: don't shrink bridge resources

sounds like a possibility. In 2.6.30, we reduced the size of the aperture from 0x20000000 to 0x10000000. If d65245c prevents us from reducing the size, the allocation will fail.

Yannick, can you try checking out cd81e1ea1a4c (the parent of d65245c) and d65245c itself, and booting them to see whether that's what introduced the problem?

If d65245c isn't it, I'm afraid the quickest way forward will be to bisect.

Reply-To: yinghai.lu@oracle.com

On 06/02/2010 02:36 PM, Bjorn Helgaas wrote:
> I don't think this problem is related to _CRS. The system used to
> work without _CRS. 2.6.34 automatically turns on "pci=use_crs", but
> the system fails the same way when booted with "pci=nocrs". I think
> we should concentrate on getting things to work again with "pci=nocrs",
> and then we can worry about whether enabling _CRS makes any difference.
>
> The problem is the 00:01.0 bridge prefetchable memory aperture that
> overlaps system memory.
>
> In bug 11103, with kernel 2.6.30-rc1, the dmesg in attachment 20984
> shows that we reduced the size of that bridge aperture and reassigned
> it so it no longer overlaps system memory:
>
>   Linux version 2.6.30-rc1-tip (yannick@tardis) ...
>   BIOS-e820: 0000000000100000 - 00000000bffa0000 (usable)
>   pci 0000:01:00.0: reg 10 32bit mmio: [0xc0000000-0xcfffffff]
>   pci 0000:00:01.0: bridge 64bit mmio pref: [0xbdf00000-0xddefffff] (size 0x20000000)
>   pci 0000:00:01.0: BAR 9: can't allocate resource
>   pci 0000:01:00.0: BAR 0: can't allocate resource
>   pci 0000:01:00.0: BAR 0: got res [0xc0000000-0xcfffffff] bus [0xc0000000-0xcfffffff] flags 0x21208
>   pci 0000:01:00.0: BAR 0: moved to bus [0xc0000000-0xcfffffff] flags 0x21208
>   pci 0000:00:01.0: PREFETCH window: 0xc0000000-0xcfffffff (size 0x10000000)
>
> In 2.6.34 (attachment 26445), we start with the same overlap, but for
> some reason, we can't reassign the 00:01.0 bridge aperture:
>
>   Linux version 2.6.34 (yannick@tardis) ...
>   BIOS-e820: 0000000000100000 - 00000000bffa0000 (usable)
>   pci 0000:01:00.0: reg 10: [mem 0xc0000000-0xcfffffff pref]
>   pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
>   pci 0000:00:01.0: address space collision: [mem 0xbdf00000-0xddefffff 64bit pref] conflicts with System RAM [mem 0x00100000-0xbff9ffff]
>   pci 0000:01:00.0: no compatible bridge window for [mem 0xc0000000-0xcfffffff pref]
>   pci 0000:00:01.0: BAR 9: can't assign mem pref (size 0x20000000)
>   pci 0000:01:00.0: BAR 0: can't assign mem pref (size 0x10000000)
>   pci 0000:00:01.0: bridge window [mem pref disabled]
>
> I looked through the drivers/pci changes between 2.6.33 and .34, and
> this one:
>
>   d65245c PCI: don't shrink bridge resources
>
> sounds like a possibility. In 2.6.30, we reduced the size of the
> aperture from 0x20000000 to 0x10000000. If d65245c prevents us from
> reducing the size, the allocation will fail.

your analysis is right.

please check if following patch is fixing the problem.

[PATCH] x86, pci: clear bridge resource size if BIOS assign bad one

make sure We can reject wrong size from BIOS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/pci/i386.c | 1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6/arch/x86/pci/i386.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/i386.c
+++ linux-2.6/arch/x86/pci/i386.c
@@ -136,6 +136,7 @@ static void __init pcibios_allocate_bus_
 					 * child resource allocations in this
 					 * range.
 					 */
+					r->start = r->end = 0;
 					r->flags = 0;
 				}
 			}

Hi everybody!
On Thursday 03 June 2010 04:09:14, Yinghai Lu wrote:
> please check if following patch is fixing the problem.
Yes, Yinghai, your patch solves the problem. Kudos!
Bjorn, do you want me to test the git versions you mentioned earlier to
clearly delimit the problem, or is this working patch enough to make it clear?
Yannick
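As a rough illustration of why the one-line change in the patch above can matter, here is a simplified C model. It is not the kernel's implementation: `struct window`, `window_old_size`, `window_request_size` and `window_invalidate` are hypothetical names, and the "never request less than the old size" rule is only an assumed reading of the "don't shrink bridge resources" change discussed earlier in this report. Under that assumption, clearing only `flags` leaves the stale BIOS size of 0x20000000 in place, whereas also clearing `start` and `end` lets the request fall back to the 0x10000000 that the Radeon BAR needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a bridge window; bounds are inclusive.
 * This is NOT the kernel's struct resource. */
struct window {
	uint64_t start;
	uint64_t end;
	unsigned long flags;
};

/* Size the window still claims. A fully cleared range counts as size 0. */
uint64_t window_old_size(const struct window *w)
{
	assert(w->start <= w->end);
	if (w->start == 0 && w->end == 0)
		return 0;
	return w->end - w->start + 1;
}

/* Assumed "don't shrink" policy: never request less than the old size. */
uint64_t window_request_size(const struct window *w, uint64_t needed)
{
	uint64_t old_size = window_old_size(w);

	return needed > old_size ? needed : old_size;
}

/* Model of the failure path after the patch: forget the BIOS range too,
 * not just the flags. */
void window_invalidate(struct window *w)
{
	w->start = w->end = 0;
	w->flags = 0;
}
```

For example, a window of 0xbdf00000-0xddefffff whose `flags` alone were zeroed still yields a request of 0x20000000 for a needed size of 0x10000000; after `window_invalidate()` the same call yields 0x10000000.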
Yannick, would you mind attaching your dmesg log when using Yinghai's patch? I'd like to understand how that fix works, and maybe there's a clue in the log.

Yinghai, if we use your patch, the changelog needs to include the URL of this bug report.

Your patch affects x86, but there are several similar uses of pci_claim_resource() in other architectures. I think most of this code is actually generic and should not be architecture-specific, but until that is cleaned up, we should at least audit the other uses to see whether they need the same fix.

Created attachment 26636
dmesg log with Yinghai's patch applied.
Here is the log, Bjorn.
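The earlier remark that a 0x10000000 window "would have worked" where the BIOS-sized 0x20000000 window could not be placed can be checked with a small first-fit search. This is a sketch under stated assumptions, not the kernel's allocator: `find_aligned_slot` and `ranges_overlap` are invented names, the alignment is assumed to equal the Radeon BAR size (0x10000000), and the busy list contains only the bridge windows quoted in this report (the elided parts of the logs may contain more).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range (hypothetical helper type). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* True if a and b share at least one address. */
bool ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/* First-fit search: store in *base the lowest align-aligned address inside
 * window where size bytes fit without touching any busy range. */
bool find_aligned_slot(struct range window, const struct range *busy,
		       size_t nbusy, uint64_t size, uint64_t align,
		       uint64_t *base)
{
	/* align must be a non-zero power of two */
	assert(size != 0 && align != 0 && (align & (align - 1)) == 0);

	/* Round the window start up to the requested alignment. */
	uint64_t candidate = (window.start + align - 1) & ~(align - 1);

	while (candidate <= window.end && size - 1 <= window.end - candidate) {
		struct range r = { candidate, candidate + size - 1 };
		bool is_free = true;

		for (size_t i = 0; i < nbusy; i++) {
			if (ranges_overlap(r, busy[i])) {
				is_free = false;
				break;
			}
		}
		if (is_free) {
			*base = candidate;
			return true;
		}
		if (candidate > UINT64_MAX - align)
			break;	/* next step would wrap around */
		candidate += align;
	}
	return false;
}

/* Values quoted in this report's dmesg excerpts. */
const struct range host_window = { 0xc0000000, 0xffffffff };
const struct range busy_ranges[] = {
	{ 0xddf00000, 0xdfefffff },	/* 00:1c.4 prefetchable window */
	{ 0xfdd00000, 0xfddfffff },	/* 00:01.0 non-prefetchable window */
	{ 0xfe100000, 0xfe8fffff },	/* 00:1c.4 non-prefetchable window */
};
```

With these inputs, a request for 0x20000000 bytes finds no slot, because every aligned candidate collides with one of the listed windows, while a request for 0x10000000 bytes is placed at 0xc0000000, the range the Radeon BAR already occupies.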
Handled-By : Bjorn Helgaas <bjorn.helgaas@hp.com>
Handled-By : Yinghai Lu <yinghai@kernel.org>
Patch : https://patchwork.kernel.org/patch/104169/

Fixed by commit 837c4ef13c44296bb763a0ca0e84a076592474cf.