Bug 5343 - IOMMU setup broken 2.6.13.2 -> 2.6.14-rc2
Summary: IOMMU setup broken 2.6.13.2 -> 2.6.14-rc2
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Andrew Morton
Reported: 2005-10-01 09:46 UTC by jl-icase
Modified: 2006-08-03 05:51 UTC

Kernel Version: 2.6.14-rc2


Attachments
Correct dmesg output under good Fedora Core kernel 2.6.13-1.1526_FC4smp (20.74 KB, text/plain)
2005-10-16 09:29 UTC, jl-icase

Description jl-icase 2005-10-01 09:46:29 UTC
Most recent kernel where this bug did not occur: 2.6.13.2
Distribution: kernel.org
Hardware Environment: dual-core Athlon w/4GB RAM, ASUS A8N-SLI Premium motherboard
Software Environment: FC4
Problem Description:

IOMMU handling is broken in 2.6.14-rc2, but it used to work in 2.6.13.2.

ASUS BIOS 1007 sets up memory remapping (using PAE or hardware on AMD's Rev.E
CPUs) to make all 4GB visible to the OS.  This works on dual-core Athlons with
both SMP and uniprocessor 2.6.13.2 kernels, but 2.6.14-rc2 SMP kernels lock up
on bootup unless PCI memory remapping is disabled in the BIOS.  With remapping
disabled, the kernel boots, but I see only 3GB instead of 4GB.

Steps to reproduce:

Build the SMP kernel 2.6.14-rc2, enable memory remapping in BIOS, reboot.  If
the kernel does not lock up, run large stream benchmarks (e.g. 2GB).

SMP kernels based on 2.6.12 boot with memory remapping enabled in BIOS, but
lock up when >200MB gets used, e.g. by the stream benchmark.  This was fixed in
2.6.13, which worked fine, e.g.:

 Linux version 2.6.13.1_smp [...]
 BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009e800 (usable)
  BIOS-e820: 000000000009e800 - 00000000000a0000 (reserved)
  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
  BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
  BIOS-e820: 00000000bfff0000 - 00000000bfff3000 (ACPI NVS)
  BIOS-e820: 00000000bfff3000 - 00000000c0000000 (ACPI data)
  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
  BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
[...]
 Checking aperture...
 CPU 0: aperture @ 1a60000000 size 32 MB
 Aperture from northbridge cpu 0 too small (32 MB)
 No AGP bridge found
 Your BIOS doesn't leave a aperture memory hole
 Please enable the IOMMU option in the BIOS setup
 This costs you 64 MB of RAM
 Mapping aperture over 65536 KB of RAM @ 8000000
[...]
 PCI-DMA: Disabling AGP.
 PCI-DMA: aperture base @ 8000000 size 65536 KB
 PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

...and the above works fine.
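As a quick cross-check of the map above, the following sketch sums the regions marked usable and reports how much of that memory sits above the 4GB boundary. The helper names (parse_e820, usable_bytes) are made up for this illustration and are not part of any kernel tool; the only input is the e820 lines quoted in this report.

```python
import re

# e820 lines copied from the 2.6.13.1 boot log quoted in this report.
E820_LOG = """\
 BIOS-e820: 0000000000000000 - 000000000009e800 (usable)
 BIOS-e820: 000000000009e800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
 BIOS-e820: 00000000bfff0000 - 00000000bfff3000 (ACPI NVS)
 BIOS-e820: 00000000bfff3000 - 00000000c0000000 (ACPI data)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
"""

LINE_RE = re.compile(r"BIOS-e820:\s+([0-9a-f]+)\s+-\s+([0-9a-f]+)\s+\((.+?)\)")


def parse_e820(log):
    """Return (start, end, kind) tuples; end is treated as exclusive."""
    return [(int(m.group(1), 16), int(m.group(2), 16), m.group(3))
            for m in LINE_RE.finditer(log)]


def usable_bytes(regions, above=0):
    """Sum the usable bytes located at or above the address `above`."""
    total = 0
    for start, end, kind in regions:
        if kind == "usable" and end > above:
            total += end - max(start, above)
    return total


regions = parse_e820(E820_LOG)
FOUR_GB = 1 << 32
total = usable_bytes(regions)
high = usable_bytes(regions, above=FOUR_GB)
print(f"usable total:     {total / 2**20:.1f} MiB")
print(f"usable above 4GB: {high / 2**20:.1f} MiB")
```

The last usable region is the memory that the BIOS remaps above 4GB; with remapping disabled that region would presumably be absent, which matches the 3GB I see in that case.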

Switching to kernel 2.6.14-rc2, I get a total lockup about 30 seconds after
boot when PCI memory remapping is enabled in BIOS.  This kernel boots only if
BIOS remapping is disabled, but then I see only 3GB (not 4GB) of RAM.  The
IOMMU-related messages become:

 Linux version 2.6.14-rc2_smp
[...]
 PCI-DMA: Disabling IOMMU.

Note that CONFIG_GART_IOMMU=y was used in both kernels (the second kernel had
fewer configuration options turned on, e.g. no NUMA since the CPU was a single
dual-core Athlon).

ASUS BIOS 1007 doesn't have any IOMMU options, but the first example used PCI
memory remapping, while the second one crashed unless remapping was off.  I
believe that AMD's IOMMU is needed to see the full 4GB, so the fact that the
second example locks up unless remapping is disabled is bad news.  On today's
64-bit machines, the old 3GB memory limit is unacceptable -- sometimes even
16-32GB needs to be usable in Linux.
Comment 1 Andi Kleen 2005-10-10 12:04:00 UTC
Can you show the full message log for all cases (especially the lockup)?
Comment 2 jl-icase 2005-10-16 09:17:39 UTC
Memory setup is still broken in 2.6.14-rc4 SMP on dual core Athlons with 4GB
physical RAM (it used to work properly in 2.6.13.2 SMP).

I cannot provide a log because 2.6.14-rc4 SMP locks up on bootup, but what I
could copy from the screen reads:

PCI-DMA: More than 4GB of RAM and no IOMMU.
PCI-DMA: 32bit PCI IO may malfunction.
PCI-DMA: Disabling IOMMU.

...and at that point the new kernel totally locks up.

FYI, ASUS BIOS uses HW memory remapping with Rev. E Athlons (and later) to make
all 4GB visible.  Also, 2.6.14-rc4 was built ("make oldconfig" and take defaults
on all new options) with Fedora Core 4 config-2.6.13-1.1526_FC4smp (this FC4
kernel also works fine).  
Comment 3 jl-icase 2005-10-16 09:29:22 UTC
Created attachment 6312
Correct dmesg output under good Fedora Core kernel 2.6.13-1.1526_FC4smp

This dmesg output results when good Fedora Core kernel 2.6.13-1.1526_FC4smp is
used (the system works fine).  Identically configured (plus defaults on new
config options) 2.6.14-rc4 SMP kernel locks up on bootup after failing to
detect IOMMU.
Comment 4 Andi Kleen 2005-10-17 03:10:42 UTC
There are actually no relevant changes between 2.6.13 and 2.6.14.
I don't know what kernel changes Fedora did. Can you verify that
2.6.13 from kernel.org also didn't show the problem? 
Comment 5 jl-icase 2005-10-17 21:27:37 UTC
2.6.13.2 from kernel.org worked fine -- and that's why 2.6.13-based FC4 kernels
also work fine.  2.6.14-rc{2,4} from kernel.org broke this.
Comment 6 jl-icase 2005-10-20 00:46:09 UTC
Another interesting memory event, possibly related, observed (only once) with
the above-mentioned (good) FC4 kernel:  I got a number of memory access faults
(non-canonical pointers?) on starting known good binaries, but upon reboot
problems went away.  Perhaps even that (good) 2.6.13.2-based FC4 SMP kernel can
have intermittent memory setup problems...  

 WARNING:  Kernel Errors Present
    RPC: error 5 connecting to ...:  2 Time(s)
    mozilla-xremote[3425] general protection rip:3325e121eb rsp:7fffffe2baa8 error:0...:  1 Time(s)
    mozilla-xremote[3444] general protection rip:3325e121eb rsp:7ffffff04c48 error:0...:  1 Time(s)
    mozilla-xremote[3485] general protection rip:3325e121eb rsp:7fffffb7a828 error:0...:  1 Time(s)
    mozilla-xremote[3504] general protection rip:3325e121eb rsp:7fffff8fdac8 error:0...:  1 Time(s)
    mozilla-xremote[3523] general protection rip:3325e121eb rsp:7ffffffae898 error:0...:  1 Time(s)
    mozilla-xremote[3537] general protection rip:3325e121eb rsp:7fffffd8ad78 error:0...:  1 Time(s)
    mozilla-xremote[3567] general protection rip:3325e121eb rsp:7fffff8acbd8 error:0...:  1 Time(s)
    mozilla-xremote[3570] general protection rip:3325e121eb rsp:7fffff9e77c8 error:0...:  1 Time(s)
    mozilla-xremote[3779] general protection rip:3325e121eb rsp:7fffffb99c78 error:0...:  1 Time(s)
    mozilla-xremote[3793] general protection rip:3325e121eb rsp:7fffffa88908 error:0...:  1 Time(s)
    mozilla-xremote[4172] general protection rip:3325e121eb rsp:7ffffff31098 error:0...:  1 Time(s)
    mozilla-xremote[4189] general protection rip:3325e121eb rsp:7ffffff39ff8 error:0...:  1 Time(s)
    mozilla-xremote[4203] general protection rip:3325e121eb rsp:7fffff97f8e8 error:0...:  1 Time(s)
    thunderbird-bin[3435] general protection rip:3325e121eb rsp:7fffffb24ae8 error:0...:  1 Time(s)
    thunderbird-bin[3454] general protection rip:3325e121eb rsp:7fffffa5eab8 error:0...:  1 Time(s)
    thunderbird-bin[3495] general protection rip:3325e121eb rsp:7fffffaf7db8 error:0...:  1 Time(s)
    thunderbird-bin[3514] general protection rip:3325e121eb rsp:7fffffb38f58 error:0...:  1 Time(s)
    xsetroot[3109] general protection rip:3325e121eb rsp:7fffffc2dff8 error:0...:  1 Time(s)
    xsetroot[3608] general protection rip:3325e121eb rsp:7fffffdd1f88 error:0...:  1 Time(s)

--------------------
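On the non-canonical pointer guess: a minimal sketch of the canonical-address rule, assuming the usual 48-bit virtual address width of these CPUs (bits 63..47 must all be equal).  The function name is made up for illustration, and the two sample values are taken from the first log line above.

```python
def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# rip and rsp from the first general protection line in the log.
samples = {"rip": 0x3325e121eb, "rsp": 0x7fffffe2baa8}
for name, value in samples.items():
    print(name, hex(value), "canonical" if is_canonical(value) else "NON-canonical")
```

Both logged registers pass this check, so if a non-canonical address was involved it would have to be a pointer the faulting instruction dereferenced, which the log does not show.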

Comment 7 jl-icase 2005-11-23 07:35:14 UTC
The latest 2.6.14-based FC4 kernels 2.6.14-1.1637_FC4{smp,} fail to boot,
whereas 2.6.13-based FC4 kernels 2.6.13-1.1532_FC4{smp,} worked fine.  The
machine uses a dual-core Athlon 64 CPU and ASUS A8N-SLI Premium motherboard with
the latest BIOS 1008.  The machine has no AGP bus, only PCI-E.

The problem was (re?)-introduced in 2.6.14, possibly related to SWIOTLB handling
in arch/x86_64/kernel/aperture.c -- on bootup, 2.6.14-based kernels complain
about not finding IOMMU and more than 4GB of RAM, then panic.

Since the IOMMU is definitely present, this is probably just a failure to open
a suitable aperture for the IOMMU.  Doing a diff on aperture.c between 2.6.13
and 2.6.14, I find the following:

--- kernel-2.6.13/linux-2.6.13/arch/x86_64/kernel/aperture.c    2005-08-28 17:41:01.000000000 -0600
+++ kernel-2.6.14/linux-2.6.14/arch/x86_64/kernel/aperture.c    2005-10-27 18:02:08.000000000 -0600
@@ -245,6 +245,8 @@

        if (aper_alloc) {
                /* Got the aperture from the AGP bridge */
+       } else if (swiotlb && !valid_agp) {
+               /* Do nothing */
        } else if ((!no_iommu && end_pfn >= 0xffffffff>>PAGE_SHIFT) ||
                   force_iommu ||
                   valid_agp ||

Could the above change be responsible for boot failure with 2.6.14-based
kernels?  Again, 2.6.13-based kernels work fine.
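To make the end_pfn comparison in that hunk concrete, here is a small sketch of the arithmetic only.  It assumes 4KB pages (PAGE_SHIFT = 12) and that end_pfn is the page frame just past the highest usable address; the remapping-off value is a guess that simply drops the region above 4GB from the e820 map in the description.  The rest of the condition is cut off in the diff and is not modeled here.

```python
PAGE_SHIFT = 12  # assumption: 4KB pages on x86_64


def crosses_4gb(end_pfn):
    """Mirror of the `end_pfn >= 0xffffffff>>PAGE_SHIFT` test in the hunk."""
    return end_pfn >= (0xffffffff >> PAGE_SHIFT)


# Highest usable address with BIOS remapping on (from the e820 map).
end_pfn_remap_on = 0x140000000 >> PAGE_SHIFT
# Assumed highest usable address with remapping off (no region above 4GB).
end_pfn_remap_off = 0xbfff0000 >> PAGE_SHIFT

print(hex(end_pfn_remap_on), crosses_4gb(end_pfn_remap_on))
print(hex(end_pfn_remap_off), crosses_4gb(end_pfn_remap_off))
```

So with remapping on, this part of the condition is true and the kernel would normally go on to set up an aperture, unless the newly added swiotlb branch is taken first.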
Comment 8 jl-icase 2005-11-29 12:11:47 UTC
Some ideas: http://lkml.org/lkml/2005/11/6/54 suggests booting with "iommu=soft
swiotlb=65536".  The theory proposed therein says that GART is probably not
fully initialized in a system that has no AGP -- but that GART is needed anyway. 

This is a severe kernel-panics-on-bootup type bug introduced in the 2.6.13->14
transition.  Reverting to 2.6.13.2 kernels makes the bug go away.

Do you still need info, or can this be declared an official Linux kernel bug?
Comment 9 Andi Kleen 2005-11-29 12:17:13 UTC
I got my hands on an Asus board (K8N-DL) and tracked it down to a broken
MCFG table. The workaround is to boot with pci=nommconf (please type it exactly
as written; I suspect quite a few people misspelled it).

I have a workaround in progress to fix up the broken MCFG table.

pci=nommconf helps you, right?
Comment 10 jl-icase 2005-11-29 19:30:45 UTC
Correct!  Kernel 2.6.14-1.1637_FC4smp boots normally with the pci=nommconf boot
option.  Since this doesn't use SWIOTLB, I assume that it should have less
overhead than soft IOMMU.

With pci=nommconf, I see the following on bootup:

[...]
Allocating PCI resources starting at c2000000 (gap: c0000000:20000000)
Checking aperture...
CPU 0: aperture @ 1a30000000 size 32 MB
Aperture from northbridge cpu 0 too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
[...My comment: There is *no* IOMMU option in ASUS BIOS setup...]
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Built 1 zonelists
Kernel command line: ro root=LABEL=/ rhgb quiet 3 pci=nommconf
[...]
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[...My comment: There is no AGP on this system...]
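Since the option is easy to misspell, a small sketch that checks a kernel command line for it.  The function name is made up for illustration; the sample string is the command line from the log above, and the comma handling reflects that pci= accepts a comma-separated list of options.

```python
def has_pci_option(cmdline, option):
    """Return True if `option` appears in any pci=... argument of cmdline."""
    for arg in cmdline.split():
        if arg.startswith("pci="):
            # pci= takes a comma-separated list, e.g. pci=noacpi,nommconf
            if option in arg[len("pci="):].split(","):
                return True
    return False


cmdline = "ro root=LABEL=/ rhgb quiet 3 pci=nommconf"
print(has_pci_option(cmdline, "nommconf"))
```

On a running system the string can be read from /proc/cmdline instead of being typed in.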
Comment 11 Adrian Bunk 2006-08-03 05:13:32 UTC
Andi, what is the status of this issue?
Comment 12 Andi Kleen 2006-08-03 05:42:29 UTC
Should be all solved. Well, the workarounds caused other issues that
are not solved yet, but that doesn't affect this system.
