Most recent kernel where this bug did not occur: long-standing bug; probably never worked
Distribution: unofficial Debian Sid for AMD64
Hardware Environment: Athlon 64 FX-53, Asus A8V Deluxe
Software Environment: gcc (GCC) 4.0.2 20050725 (prerelease) (Debian 4.0.1-3.0.0.1.gcc4)

Problem Description:
When amd64-agp is loaded as a module and swsuspend is used, a spontaneous reboot happens on resume. Resume works fine if amd64-agp is built-in into the kernel.

Originally reported to LKML:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112180650210831&w=2

A very similar problem was discovered and fixed for i386 in April 2004:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108310336904458&w=2

Steps to reproduce:
Boot with "init=/bin/sh".
mount -t sysfs sys sys
swapon -a
modprobe amd64-agp
echo disk > /sys/power/state
Attempt to resume. Notice the sudden reboot after the saved image is successfully loaded from disk.
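For anyone who wants to repeat the steps above more than once, they can be wrapped in a small script. This is only a sketch of the same commands, not something from the original report: the `DRY_RUN` variable and the `run` and `repro` helper names are assumptions made up here. `DRY_RUN` defaults to 1, so the script only prints what it would do; set it to 0 on a machine booted with "init=/bin/sh" to really suspend.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# DRY_RUN, run and repro are hypothetical names introduced for this sketch.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    run mount -t sysfs sys sys      # sysfs is needed for /sys/power/state
    run swapon -a                   # swsusp writes the image to swap
    run modprobe amd64-agp          # the bug only shows with the driver as a module
    # The redirection cannot go through run(), so it is handled separately.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ echo disk > /sys/power/state"
    else
        echo disk > /sys/power/state
    fi
}

repro
```

Running it with `DRY_RUN=0` ends with the machine suspending to disk; the reboot or oops described in the report then shows up on the following resume, after the image has been loaded.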
Can you try compiling amd64-agp into the kernel, as opposed to building it as a module? If that helps, we know it is a different problem than the one on i386.
Yes, compiling amd64-agp into the kernel (not as a module) helps. That's what I meant by "Resume works fine if amd64-agp is built-in into the kernel."
Oops, sorry, I did not notice that. Then it should be some problem with agp, not a generic one like in the i386 case. Try placing while(1) at the beginning and at the end of the amd64-agp resume function; if it hangs instead of rebooting, you can find out exactly where the problem lies. This should probably go to davej...
amd64-agp does not have a resume function. But even if it had one, it would be of no help, because the problem occurs right in swsusp_arch_resume in arch/x86_64/kernel/suspend_asm.S during the copying of pages. It occurs between the labels "loop" and "done". It might also be of interest that the problem manifests itself either as a spontaneous reboot or as an oops, depending on the kernel version and the exact .config. I've tested 2.6.13-rc5-mm1 too, which reboots. At the moment I have a configuration with 2.6.13-rc6, which oopses. I'll attach a transcription of the oops.
Created attachment 5561: oops with 2.6.13-rc6
Sanity check: does it suspend/resume when amd64-agp is not loaded at all? Not compiled in, not loaded. Does it work in 32-bit mode? Okay, that driver is quite strange: it hooks early into the boot process, has a strange CONFIG_GART_IOMMU option and works around some hardware problems. What about simply forcing it into the kernel (i.e. disabling the module option for this driver in the kernel config)?
Yes, suspend/resume works when amd64-agp is not loaded at all. And yes, in 32-bit mode suspend/resume works even with amd64-agp loaded as a module.
Finally I got back home to my amd64 machine after 4 months. I am testing 2.6.15-rc7 now and I'm pleased to report that the bug seems fixed. I cannot reproduce the spontaneous reset anymore. Resuming now works.