Bug 6840
Summary: | HPA needs to be reinitilized on resume | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Lee Trager (lt73) |
Component: | IDE | Assignee: | Rafael J. Wysocki (rjwysocki) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | CC: | eric, eric, federico, forrestwenner, pavel, rjwysocki |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.17-Present | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216 | ||
Attachments: |
My attempt at fixing the hpa problem - its broken
Working patch against 2.6.18-rc4
Patch using correct Linux design
Patch for 2.6.21 |
Description
Lee Trager
2006-07-15 18:35:03 UTC
I have the same problem with my ATA (not SATA) drives.

Hardware: GigaByte K8NS-939 MB with nForce3 chipset (amd74xx ide driver) with Athlon 64 3000+, 120 GB SATA drive (/dev/sda, root), 80 GB Samsung ATA drive (/dev/hda, secondary HD), NEC ATAPI DVDRW drive (/dev/hdc). 2.6.16 kernel (gentoo).

The computer suspends to RAM (ACPI S3) fine and wakes up, but ATA devices become unusable and eventually lock up the system. The SATA HDD resumes correctly and continues to work, but attempting to use the ATA HDD or ATAPI DVDRW gives many timeout errors and usually (always with the HDD, sometimes with the DVDRW) locks up the system (NumLock and CapsLock LEDs flash, system completely unresponsive).

Examples of errors:

Jul 15 00:40:17 [kernel] hdc: ide_intr: huh? expected NULL handler on exit
Jul 15 00:40:17 [kernel] hdc: ATAPI reset complete
Jul 15 00:41:53 [kernel] hdc: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error }
Jul 15 00:41:53 [kernel] hdc: cdrom_decode_status: error=0x44 { AbortedCommand LastFailedSense=0x04 }
Jul 15 00:41:53 [kernel] ide: failed opcode was: unknown
Jul 15 00:41:59 [kernel] hdc: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error }
Jul 15 00:41:59 [kernel] hdc: cdrom_decode_status: error=0x44 { AbortedCommand LastFailedSense=0x04 }
Jul 15 00:41:59 [kernel] ide: failed opcode was: unknown
Jul 15 00:42:06 [kernel] hdc: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error }
Jul 15 00:42:06 [kernel] hdc: cdrom_decode_status: error=0x44 { AbortedCommand LastFailedSense=0x04 }
Jul 15 00:42:06 [kernel] ide: failed opcode was: unknown
Jul 15 00:42:12 [kernel] hdc: cdrom_decode_status: status=0x51 { DriveReady SeekComplete Error }
Jul 15 00:42:12 [kernel] hdc: cdrom_decode_status: error=0x44 { AbortedCommand LastFailedSense=0x04 }
Jul 15 00:42:12 [kernel] ide: failed opcode was: unknown
Jul 15 00:42:12 [kernel] hdc: DMA disabled
Jul 15 00:42:12 [kernel] hdc: ide_intr: huh? expected NULL handler on exit
Jul 15 00:42:12 [kernel] hdc: ATAPI reset complete
Jul 15 00:42:12 [kernel] ISO 9660 Extensions: Microsoft Joliet Level 3
Jul 15 00:42:12 [kernel] ISO 9660 Extensions: RRIP_1991A
Jul 15 00:42:42 [kernel] hdc: tray open
Jul 15 00:42:42 [kernel] end_request: I/O error, dev hdc, sector 64
Jul 15 00:42:42 [kernel] Buffer I/O error on device hdc, logical block 8
Jul 15 00:42:42 [kernel] hdc: tray open
Jul 15 00:42:42 [kernel] end_request: I/O error, dev hdc, sector 64
Jul 15 00:42:42 [kernel] Buffer I/O error on device hdc, logical block 8
Jul 15 00:42:42 [kernel] hdc: tray open
Jul 15 00:42:42 [kernel] end_request: I/O error, dev hdc, sector 64
Jul 15 00:42:42 [kernel] Buffer I/O error on device hdc, logical block 8
Jul 15 00:42:42 [kernel] hdc: tray open
Jul 15 00:42:42 [kernel] end_request: I/O error, dev hdc, sector 64
Jul 15 00:42:42 [kernel] Buffer I/O error on device hdc, logical block 8
(hdc is the DVDRW)
Jul 4 11:40:25 [kernel] hda: dma_timer_expiry: dma status == 0x21
(hda is the ATA HDD)

Re-setting DMA with hdparm does not help. Restarting the dbus service locks up the system. Unloading the ide_cd and ide_disk modules before suspend doesn't help; the system locks up attempting to reload them after resume.

I also see this bug, with a configuration almost identical to the original reporter's. Apparently I don't know my way around the system well enough to capture information while my system is in the midst of locking up, so I'm glad the original poster figured it out. Adding myself as a CC so I can track/test the fix when it rolls out.

I've spoken to a developer of the suspend2 project and apparently this is a problem with the IDE driver. I've joined the linux-ide mailing list and I'll try to figure out a fix sometime next week; work's a little crazy this week.

I have this bug too with a Thinkpad T41 after installing Fedora 5. However, I did not experience it with Fedora 4 even when running the same kernel build (I use Volker Braun's thinkpad kernel rpms rather than the redhat ones and tried all the old kernels he had available after experiencing the bug).

Someone on the Linux Kernel IDE mailing list suggested I try libata, which is included in the mm sources. There was a kernel bug unrelated to this one that prevented me from testing it. While libata may be a solution, I'm not sure how great a solution it is. libata is not stable, and for production use it's not the best idea. Does anyone know of a way to fix the current driver?

Do you have host-protected-area enabled?

Hmm, this is strange:

> Most recent kernel where this bug did not occur: Unknown

T40 definitely worked with suspend-to-disk before. Try a few different (vanilla!) versions and see where it broke.

Honestly, I used this laptop more as a mobile desktop for a year or so. I'll try the early 2.6 series. Also, I did try this with 2.6.17.x-2.6.18-rcx. I tried the vanilla, mm-sources, git-sources, gentoo-sources, and a Knoppix live CD (just to make sure it wasn't a config error); same problem on all of them.

The oldest kernel I could compile was 2.6.12, which probably has something to do with my gcc version (4.1.1) or glibc version (2.4). Anyway, the bug is in 2.6.12 as well. How do I know if the host protected area is enabled?

OK, I googled around for it and disabled it in my BIOS, and I get the same thing.

Alan Cox on the LKML suggested that the problem is that the kernel does not restore HPA on resume. According to him, the only two fixes for this are to format the drive and make sure to get HPA off, or to patch the kernel. I attempted to make a patch against 2.6.18-rc4, but all it does now is never come out of sleep mode. I'll post the patch here in case someone wants to play with it, but I'll keep trying to get this to work.

Created attachment 8833 [details]
My attempt at fixing the hpa problem - its broken
It would be great if someone could shed some light on why this isn't working.
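
Alan Cox's diagnosis can be illustrated with a toy model. The sketch below is not kernel code; `ToyDisk` and `ToyDriver` are hypothetical names, and it assumes the usual HPA behaviour in which a volatile "set max address" is lost when the drive loses power, so the drive comes back from S3 clipped to the HPA boundary while the driver still believes in the capacity it saw at boot.

```python
from dataclasses import dataclass


@dataclass
class ToyDisk:
    """Hypothetical model of a drive with a Host Protected Area."""
    native_sectors: int       # full physical capacity
    hpa_sectors: int          # capacity visible while the HPA is in effect
    visible_sectors: int = 0  # what the drive currently exposes

    def __post_init__(self) -> None:
        self.power_cycle()

    def power_cycle(self) -> None:
        # Assumption: the unclipped size does not survive power loss,
        # so the drive returns to the HPA-clipped size.
        self.visible_sectors = self.hpa_sectors

    def set_max(self, sectors: int) -> None:
        # Stand-in for unclipping the drive, bounded by its native size.
        self.visible_sectors = min(sectors, self.native_sectors)

    def read(self, sector: int) -> bool:
        # True if the request succeeds, False for an I/O error.
        return 0 <= sector < self.visible_sectors


class ToyDriver:
    """Hypothetical driver that caches the capacity it saw at boot."""

    def __init__(self, disk: ToyDisk, restore_hpa_on_resume: bool) -> None:
        self.disk = disk
        self.restore_hpa_on_resume = restore_hpa_on_resume
        disk.set_max(disk.native_sectors)  # boot: disable the HPA
        self.capacity = disk.visible_sectors

    def suspend_resume(self) -> None:
        self.disk.power_cycle()            # S3 cuts power to the drive
        if self.restore_hpa_on_resume:
            self.disk.set_max(self.capacity)

    def read_last_sector(self) -> bool:
        return self.disk.read(self.capacity - 1)
```

With `ToyDriver(ToyDisk(1000, 900), restore_hpa_on_resume=False)`, reading the last sector succeeds before `suspend_resume()` and fails afterwards; with `restore_hpa_on_resume=True` it keeps working, which is the behaviour a resume-time fix aims for.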
I've created a patch that fixes the problem. I submitted it to the LKML and am waiting for it to be submitted to 2.6.18.

Created attachment 8838 [details]
Working patch against 2.6.18-rc4
Renamed for better description.

Patch looks okay to me, can you push it through the usual channels? (I guess we can CODE_FIX it now, and close it when the patch is merged?)

I am going to be traveling Wednesday, 8/23, and next Monday (8/28). On Thursday (8/24) and Friday (8/25), I will be on East Coast time, and for the most part I will not be working. I may be checking my email. I should be able to respond to brief phone conversations if needed, so please don't hesitate to call me on my cell phone (650-454-6982) if you need me. Otherwise, I'll be back in the office on Tuesday, 8/29. -Eric.

I posted the patch to the linux-ide mailing list and haven't gotten much response. Should I post it to the Linux kernel mailing list? This is my first patch, so I'm not sure what the proper process is.

Reply-To: pavel@ucw.cz

Hi!

> I posted the patch to the linux-ide mailing list and haven't gotten much response.
> Should I post it to the Linux kernel mailing list? This is my first patch, so I'm
> not sure what the proper process is.

Just mail the patch to B.Zolnierkiewicz@elka.pw.edu.pl, cc: linux-ide@vger.kernel.org, cc: linux-kernel, cc: andrew morton.

Pavel

Created attachment 8892 [details]
Patch using correct Linux design
This is a redo of the previous patch since the old one didn't follow correct
Linux design and thus would never get into the kernel (heh, it's my first patch).
Anyway can someone please post if this patch works correctly for them?
Thanks,
Lee
For what it is worth, I have applied the latest patch here to my latest stable gentoo kernel sources (2.6.17-r8), and have not had a problem with sleep since then.

This was in one of the mm-sources (forgot which version) but was taken out because apparently it made some unable to resume from sleep. On a side note, about a week ago my dad came home with an IBM Thinkpad T40 (same as mine) that had Fedora Core 5 installed on it, with HPA (according to boot dmesg). Resume works fine on this laptop using the redhat sources. I have been unable to check out the redhat source on my laptop since I just started college. Hopefully when I get some time I'll try it out. It would be great if some other people could test this on different laptops even if they don't have the problem.

Hi Lee, I have this same problem with HPA. I'll retest your latest patch this week on my ThinkPad A21m.

Lee, the patch from Comment #20 fixes this problem for me, many thanks!

Resume fails in 2.6.20 as well, but the patch does not apply cleanly to test it:

root@thunk:[linux-2.6.21]# patch -p1 < ../ibm-hpa.patch
patching file include/linux/ide.h
Hunk #1 succeeded at 1005 (offset 18 lines).
patching file drivers/ide/ide.c
Hunk #1 FAILED at 1229.
Hunk #2 succeeded at 1281 (offset 37 lines).
1 out of 2 hunks FAILED -- saving rejects to file drivers/ide/ide.c.rej
patching file drivers/ide/ide-disk.c

Created attachment 11355 [details]
Patch for 2.6.21
I updated the patch for 2.6.21 and have tested it for a day, it works just the
same as the old one does. If this patch continues to work for you guys tell me
and I'll resubmit it to the kernel.
As for libata, I'll try to port this patch over when I get a chance, currently
I'm swamped at school and don't have the time to figure it out.
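
When a patch stops applying cleanly on a newer tree, as with the rejected hunk in drivers/ide/ide.c quoted earlier in this thread, it can help to summarise which hunks need rebasing by hand. The following is a minimal sketch, assuming the `patch` output wording shown in that comment; `failed_hunks` is a hypothetical helper name, not part of any tool.

```python
import re
from typing import Dict, List


def failed_hunks(patch_output: str) -> Dict[str, List[int]]:
    """Map each patched file to the hunk numbers that failed to apply."""
    failures: Dict[str, List[int]] = {}
    current = None
    # Scan for "patching file <path>" and "Hunk #<n> FAILED" in order.
    # The summary line ("1 out of 2 hunks FAILED") is deliberately not
    # matched because it has no "Hunk #<n>" prefix.
    pattern = re.compile(r"patching file (\S+)|Hunk #(\d+) FAILED")
    for match in pattern.finditer(patch_output):
        if match.group(1):
            current = match.group(1)
        elif current is not None:
            failures.setdefault(current, []).append(int(match.group(2)))
    return failures
```

Fed the output quoted above, this reports hunk 1 of drivers/ide/ide.c as the only one to port manually; files whose hunks all applied (with or without an offset) do not appear in the result.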
For each "stable" release of the "gentoo-sources" of the kernel, I have applied this patch, and resume works. That's for 2.6.17 - 2.6.20, which I just upgraded to yesterday. Absent the patch, resume from sleep eventually triggers failures.

The patch in Comment #26 applies cleanly to 2.6.21 and 2.6.21.1 and allows me to resume after suspending, thanks Lee!

I've submitted the patch to the kernel via the LKML (linux-ide). We'll see if it gets in.

Can you please give us a link to the patch?

Sorry, I meant a link to the LKML post with the patch, if that's not a problem.

Lee's patch was merged before 2.6.22-rc5, AFAICS. I'm closing the bug, please reopen if necessary.