Distribution: Fedora Core 2 with kernel 2.6.7-1.486 RPM from http://people.redhat.com/~arjanv
Hardware Environment: Gigabyte 8ITXE motherboard, Pentium 4, Adaptec 29160N
Software Environment: (see above)
Problem Description: Due to problems in the aic7xxx driver (its I/O task refuses to go to sleep), the system does not suspend to ACPI S3: it tries, gives up, and resumes normal operation. Running echo 3 > /proc/acpi/sleep produces the following:

PM: Preparing system for suspend
Stopping tasks: ==========
 stopping tasks failed (1 tasks remaining)
Restarting tasks...<6> Strange, ahc_dv_0 not stopped
 done

Steps to reproduce: echo 3 > /proc/acpi/sleep
As Arjan's 2.6.7-1.486 RPM is based on 2.6.8-RC1, the cause might be the following change: http://marc.theaimsgroup.com/?l=linux-scsi&m=108306129820558&w=2 This change is present in mainline 2.6.8-RC1.
Created attachment 3385 [details] aic7xxx_swsusp.patch
If I apply this patch (attached), I am now able to enter ACPI S3 suspend, but resume is still broken: the first time I try to access the hard disk after resuming, the kernel prints "Kernel panic: Loop 1".
Comment on attachment 3385 [details] aic7xxx_swsusp.patch
I'm floating a patch on the mailing lists; see http://marc.theaimsgroup.com/?l=linux-scsi&m=109054640414945&w=2
Has this been fixed in recent kernels?
Not that I know of. Last time I checked, the SCSI midlayer in general was also missing suspend/resume support. Specifically, you need to quiesce every device on the SCSI bus (that is, resolve all of its outstanding transactions) before you can put the bus adapter itself to sleep, so you have to implement the suspend/resume driver model in the sd and scd drivers, et cetera. I made patches for this and for aic7xxx back in the 2.6.12 timeframe, but I had trouble getting the full-time linux-scsi maintainers to take an interest in them. If someone else is interested in advocating for this functionality and resurrecting the patches, I can probably dig my work out and freshen it up.
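To illustrate the ordering constraint described in the comment above (devices first, adapter last), here is a minimal user-space sketch in C. It is not kernel code and not the patches mentioned in this thread: every type and function name (model_dev, model_host, model_host_suspend and so on) is hypothetical, and "quiescing" is modelled by simply zeroing a counter of in-flight commands.

```c
#include <stddef.h>

#define MAX_TRACE 16

/* Hypothetical model of one device on the SCSI bus (disk, CD-ROM, ...). */
struct model_dev {
    const char *name;
    int outstanding;   /* commands still in flight */
    int suspended;
};

/* Hypothetical model of the host bus adapter and its attached devices. */
struct model_host {
    const char *name;
    struct model_dev *devs;
    size_t ndevs;
    int suspended;
};

/* Records the order in which devices were put to sleep or woken up. */
struct order_trace {
    const char *names[MAX_TRACE];
    size_t count;
};

static void trace_step(struct order_trace *trace, const char *name)
{
    if (trace->count < MAX_TRACE)
        trace->names[trace->count++] = name;
}

/* Quiesce one device. A real driver would wait for outstanding commands
 * to complete; this model just marks them as done. */
static void model_dev_quiesce(struct model_dev *dev, struct order_trace *trace)
{
    dev->outstanding = 0;
    dev->suspended = 1;
    trace_step(trace, dev->name);
}

/* Suspend: every device first, the adapter last. Returns 0 on success. */
static int model_host_suspend(struct model_host *host, struct order_trace *trace)
{
    for (size_t i = 0; i < host->ndevs; i++)
        model_dev_quiesce(&host->devs[i], trace);

    /* The adapter may only sleep once nothing is in flight on its bus. */
    for (size_t i = 0; i < host->ndevs; i++)
        if (host->devs[i].outstanding != 0)
            return -1;

    host->suspended = 1;
    trace_step(trace, host->name);
    return 0;
}

/* Resume: the adapter first, then the devices in reverse order. */
static void model_host_resume(struct model_host *host, struct order_trace *trace)
{
    host->suspended = 0;
    trace_step(trace, host->name);
    for (size_t i = host->ndevs; i-- > 0; ) {
        host->devs[i].suspended = 0;
        trace_step(trace, host->devs[i].name);
    }
}
```

Calling model_host_suspend() on a host with two devices records both device names before the adapter name; model_host_resume() records the adapter first. In the kernel this ordering would come from the driver model's parent/child relationships rather than from an explicit loop like this one.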
One also needs to write code in the SD driver that handles spin-up/spin-down in a clean way. I got this working on my specific hard drive but if I remember correctly, some people were worried about the effect of the code on all configurations. I think I was sending an explicit spinup command. Getting everyone to agree upon logic to cleanly reinitialize, and spin up if necessary, a SCSI hard drive after resume is probably the major remaining issue here. Of course resume doesn't work at all right now so anything we do is probably better than nothing ;)
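The comment above does not say which command was sent for the explicit spin-up. Assuming it was the standard SCSI START STOP UNIT command (opcode 0x1B, a 6-byte CDB), a minimal sketch of building that CDB in plain C might look like the following. The macro and function names are made up for illustration and are not the kernel's own identifiers.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative names; the field layout follows the 6-byte START STOP UNIT CDB. */
#define SSU_OPCODE 0x1B  /* START STOP UNIT operation code */
#define SSU_IMMED  0x01  /* byte 1, bit 0: return status before the operation completes */
#define SSU_START  0x01  /* byte 4, bit 0: 1 = spin up, 0 = spin down */

/* Fill a 6-byte CDB asking the drive to spin up (start != 0) or spin down. */
static void build_start_stop_cdb(uint8_t cdb[6], int start, int immed)
{
    memset(cdb, 0, 6);
    cdb[0] = SSU_OPCODE;
    if (immed)
        cdb[1] |= SSU_IMMED;
    if (start)
        cdb[4] |= SSU_START;
}
```

On resume one would presumably use start = 1 with immed = 0, so that the command only completes once the disk is ready. Whether sending this unconditionally is safe for all drives and configurations is exactly the open question raised in the comment.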
I get a crash on resume using SUSE 10.x (kernel 2.6.18) on a SCSI-only system with my aha2940UW. There is no disk activity once the resume operation has finished; X11 etc. are just as they were before suspend. Then aic7xxx locks up and prints a flood of messages, too fast for me to read.
Err, ping? This is still broken in the latest 2.6.22-rc3 (with or without the scsi-misc git patches) on my Dell Precision 530 MT (suspend to whatever is broken). I always get the same error(s) after resume. I am attaching my lspci output, and since I managed to log all the errors the card dumps after resume, I am attaching the broken dmesg as well. Does anyone care at all about this issue?
Created attachment 11611 [details] dmesg
Created attachment 11612 [details] lspci
Unfortunately, this seems to require someone with deep SCSI knowledge to fix a couple of drivers. That said, we have suspend support in the SATA drivers, so the SCSI midlayer must have been updated.
Some work does appear to have been done on the scsi midlayer, but I don't know if this has been merged yet or if it's in a state that works for more than just SATA. Of course, the SCSI midlayer needs to be sorted out before it makes sense to merge any changes for SCSI HBA drivers. See here: http://lwn.net/Articles/157057/
It looks like some parts are merged and others are not (but maybe they were done in some other way?!). I guess a good idea would be to ask the patch author :) Anyway, if someone has a patch for this issue, experimental or not, I can test it.
Any updates on this bug? Do the problems still exist with the latest kernel? Maybe Nathan and Rafael can outline what still needs to be done and whether those are "projects" that need to be announced so that someone (if not you) takes ownership. Thanks.
Yes, the problems still exist in 2.6.23-rc7 and the latest -mm kernel.
Problem still exists but I've taken this bug as far as I can on my own...
Created attachment 13227 [details] aic7xxx-add-suspend-resume
Patch to add suspend/resume support to aic7xxx.
With the above patch it should work. We only have to take care to save some extra PCI registers (which the PCI core doesn't know about). And as we're lazy we're doing a full SCSI bus reset on resume, so we don't have to worry about the internal state of the aic7xxx anyway. Patch has been tested by Jens Axboe and accepted in scsi-misc, so we can close this.
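The approach described in the closing comment (save the device-specific registers the PCI core does not know about, restore them on resume, then reset the SCSI bus rather than reconstruct the controller's internal state) can be modelled in a few lines of user-space C. This is only a sketch of the idea and not the attached patch: the struct, the register count and all function names are hypothetical, and no real aic7xxx register is named.

```c
#include <stdint.h>

#define MODEL_NREGS 3  /* hypothetical number of device-specific registers */

/* Hypothetical model of the adapter state relevant to suspend/resume. */
struct model_adapter {
    uint8_t regs[MODEL_NREGS];   /* live device-specific registers */
    uint8_t saved[MODEL_NREGS];  /* copy kept in driver memory across suspend */
    int bus_resets;              /* number of SCSI bus resets issued */
};

/* Suspend: remember the registers that generic PCI state saving would miss. */
static void model_suspend(struct model_adapter *a)
{
    for (int i = 0; i < MODEL_NREGS; i++)
        a->saved[i] = a->regs[i];
}

/* Simulate the power loss during sleep: register contents are gone. */
static void model_power_cycle(struct model_adapter *a)
{
    for (int i = 0; i < MODEL_NREGS; i++)
        a->regs[i] = 0;
}

/* Resume: put the saved registers back, then reset the SCSI bus so that
 * no pre-suspend controller state needs to be reconstructed. */
static void model_resume(struct model_adapter *a)
{
    for (int i = 0; i < MODEL_NREGS; i++)
        a->regs[i] = a->saved[i];
    a->bus_resets++;
}
```

The bus reset trades some resume latency (devices have to renegotiate) for simplicity, which matches the "as we're lazy" remark in the comment.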