Most recent kernel where this bug did not occur: n/a
Distribution: Debian sid
Hardware Environment: AMD Athlon 64 X2 dual core with 4 GB RAM, chipset nForce 500 SLI
Software Environment: Linux wine.dyndns.org 220.127.116.11-gb506e24f-dirty #9 SMP Tue Dec 11 11:00:48 CET 2007 x86_64 GNU/Linux
I'm getting a crash in swsusp_save() on suspend, when it tries to access
address 0xffff810008000000 (sorry I don't have the full oops, let me know if
you want me to copy it down by hand). This address is apparently the first page
in the GART IOMMU range:
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
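As a quick sanity check on the address arithmetic, here is a small sketch. It assumes the x86_64 kernel direct mapping of that era starts at PAGE_OFFSET = 0xffff810000000000 (an assumption about this kernel's layout, not something taken from the oops); the base and size come from the "aperture base" line above. The helper names are made up for illustration.

```python
# Assumption: physical memory is mapped linearly starting at this
# virtual address on x86_64 kernels of this generation.
PAGE_OFFSET = 0xFFFF810000000000

APERTURE_BASE = 0x8000000        # "aperture base @ 8000000"
APERTURE_SIZE = 65536 * 1024     # "size 65536 KB" = 64 MB


def virt_to_phys(virt):
    """Translate a direct-mapping virtual address to a physical address."""
    return virt - PAGE_OFFSET


def in_aperture(phys):
    """True if the physical address lies inside the GART aperture."""
    return APERTURE_BASE <= phys < APERTURE_BASE + APERTURE_SIZE


fault_addr = 0xFFFF810008000000  # faulting address reported above
phys = virt_to_phys(fault_addr)
print(hex(phys), in_aperture(phys), phys == APERTURE_BASE)
```

Under that assumption the faulting address translates to physical 0x8000000, which is exactly the first byte of the aperture.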
My guess is that the IOMMU aperture range should somehow be skipped when
Suspending works fine if I boot with iommu=soft, or with mem=3G.
So you haven't tested any kernel earlier than 2.6.23?
Yes, a copy of the oops would be great, please. You shouldn't need to write
it down - try netconsole (Documentation/networking/netconsole.txt).
It's worth setting up netconsole...
netconsole won't work at that point (devices suspended, interrupts disabled). Serial console might be useful, though, but I doubt that box has a serial port.
I guess the problem is present in all kernels to date.
Alexandre, can you attach a dmesg output, please?
Created attachment 13982
dmesg output attached.
I haven't tried other kernels, if that would be useful I could do it, any version you want me to try?
The box does have a serial port but I don't have anything to plug into it I'm afraid.
> netconsole won't work at that point (devices suspended, interrupts
> disabled). Serial console might be useful, though, but I doubt that
> box has a serial port.
> I guess the problem is present in all kernels to date.
at least as long as netconsole output going _into_ suspend goes, i
posted some really bad hacks to lkml some time ago that allow a
per-device exclusion of the suspend sequence. (the suspend_disabled
That way i was able to get a netconsole output far into the suspend, up
to the point where we do the ACPI mmio command that physically suspends
getting output from the system when it is coming out of resume is much
harder. (but this crash is about going into the suspend, right?)
(In reply to comment #4)
> getting output from the system when it is coming out of resume is much
> harder. (but this crash is about going into the suspend, right?)
Yes, but it happens in the middle of the "critical section" in which everything is supposed to be off, except for the CPU executing the code. IOW, it's very much like a resume failure ...
(In reply to comment #3)
> Created an attachment (id=13982)
> dmesg output
> dmesg output attached.
> I haven't tried other kernels, if that would be useful I could do it, any
> version you want me to try?
Hm, it looks like this problem has always been present ...
> The box does have a serial port but I don't have anything to plug into it I'm
Well, it seems that the IOMMU driver should mark the aperture as "nosave" for us (it overlaps with a memory area that the image-creating code considers as useable).
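To make the "nosave" idea concrete, here is a rough userspace model, not kernel code. It assumes 4 KB pages and uses the aperture base and size from the reporter's boot messages. The nosave_regions list and the mark_nosave() and page_is_saveable() helpers are hypothetical stand-ins for the kernel's own bookkeeping (the kernel has a register_nosave_region() helper that takes a PFN range; whether that is the right hook for the IOMMU driver is an assumption here).

```python
PAGE_SHIFT = 12  # assumption: 4 KB pages


def aperture_pfn_range(base, size_bytes):
    """Return (start_pfn, end_pfn) for a physical range, end exclusive."""
    start_pfn = base >> PAGE_SHIFT
    # Round the end up so a partially covered last page is included.
    end_pfn = (base + size_bytes + (1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT
    return start_pfn, end_pfn


# Hypothetical stand-in for the kernel's list of nosave regions.
nosave_regions = []


def mark_nosave(start_pfn, end_pfn):
    """Record a PFN range that the image-creating code must skip."""
    nosave_regions.append((start_pfn, end_pfn))


def page_is_saveable(pfn):
    """True if the page is outside every recorded nosave region."""
    return not any(start <= pfn < end for start, end in nosave_regions)


# Aperture from the boot log: base 0x8000000, size 65536 KB.
first_pfn, last_pfn = aperture_pfn_range(0x8000000, 65536 * 1024)
mark_nosave(first_pfn, last_pfn)
print(hex(first_pfn), hex(last_pfn), last_pfn - first_pfn)
```

With the range recorded, a snapshot loop that consults page_is_saveable() would never touch the page at physical 0x8000000.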
Did you try to enable the IOMMU option in the BIOS setup, BTW?
(In reply to comment #6)
> Did you try to enable the IOMMU option in the BIOS setup, BTW?
There doesn't seem to be any way to configure IOMMU in my BIOS setup, or if there is one I couldn't find it... It's an ASUS M2N-E SLI, chipset nForce 500.
OK, thanks. Probably Asus doesn't think you'd need that.
Unfortunately, I'm not familiar with the IOMMU handling code, so I'm afraid it'll take some time to come up with a fix ...
hmmm, what if you boot with iommu=off?
Actually I retested and the bug is fixed now, most likely by commit 2050d45d7c32cbad7a070d04256237144a0920db.
Author: Pavel Machek <firstname.lastname@example.org>
Date: Thu Mar 13 23:05:41 2008 +0100
x86: fix long standing bug with usb after hibernation with 4GB ram
shipped in 2.6.25-rc7