|Summary:||STD regression rc1 -> rc234, suspend fails completely|
|Product:||Power Management||Reporter:||Rafael J. Wysocki (rjwysocki)|
|Component:||Hibernation/Suspend||Assignee:||Andreas Mohr (andi)|
|Severity:||normal||CC:||andi, hancockrwd, htejun|
|Bug Depends on:||9320|
|Bug Blocks:||7216, 9243|
2.6.24-rc4 (failing suspend) boot log (dmesg)
lspci -nn (on -rc4, JFYI)
dmesg of clean 2.6.24-rc4 plus bug9320-dbg2 patch, suspend works fine
dmesg of 2.6.24-rc4 plus db3, boot plus suspend
DSDT source of my EPOX 8K5A2+ board, latest BIOS (2003)
dmesg of clean 2.6.24-rc4 plus dbg0 patch, suspend works fine
2.6.24-rc4 with dbg6
Description Rafael J. Wysocki 2007-12-09 05:39:07 UTC
Subject : STD regression rc1 -> rc234, suspend fails completely
Submitter : Andreas Mohr <email@example.com>
References : http://lkml.org/lkml/2007/12/8/34
Handled-By : Robert Hancock <firstname.lastname@example.org>
             Tejun Heo <email@example.com>
Comment 1 Andreas Mohr 2007-12-09 11:52:57 UTC
Created attachment 13927 [details] 2.6.24-rc4 (failing suspend) boot log (dmesg)
Comment 2 Andreas Mohr 2007-12-09 11:54:07 UTC
Created attachment 13928 [details] lspci -nn (on -rc4, JFYI)
Comment 3 Andreas Mohr 2007-12-09 13:39:13 UTC
OK, problem semi-solved I think, see http://lkml.org/lkml/2007/12/9/139
Comment 4 Andreas Mohr 2007-12-09 14:07:16 UTC
#9320 is the root cause of this problem, and it seems they may be able to do something about _GTM failure.
Comment 5 Tejun Heo 2007-12-09 22:16:08 UTC
Created attachment 13933 [details] bug9320-dbg2.patch
Please test this patch and report kernel log. Thanks.
Comment 6 Andreas Mohr 2007-12-10 13:36:43 UTC
Created attachment 13954 [details] dmesg of clean 2.6.24-rc4 plus bug9320-dbg2 patch, suspend works fine
OK, on a cleanly remade 2.6.24-rc4 with the bug9320-dbg2 patch, suspend/resume works fine with no actually failing _GTM invocation to be seen, however methinks the _GTF handling isn't quite perfect yet (this seems to be the invocation for the primary port, though, so I'm a bit confused). I should possibly do some more ASL investigations...
Comment 7 Tejun Heo 2007-12-10 19:45:26 UTC
Created attachment 13964 [details] bug9320-dbg3.patch
* Please try this patch and post the result.
* Please post ASL of DSDT.
Thanks.
Comment 8 Andreas Mohr 2007-12-11 15:43:10 UTC
Created attachment 13983 [details] dmesg of 2.6.24-rc4 plus db3, boot plus suspend
Comment 9 Andreas Mohr 2007-12-11 15:44:51 UTC
Created attachment 13984 [details] DSDT source of my EPOX 8K5A2+ board, latest BIOS (2003)
Comment 10 Robert Hancock 2007-12-11 16:23:03 UTC
OK, so the taskfile your BIOS is trying to send is a SET FEATURES - transfer mode command. To figure out the right mode, it looks up the values of PMUE, PMUT, and PMPT in some lookup tables. These are located in PCI config space at 0x53 (low 3 bits), 0x53 (top 4 bits) and 0x4B (all 8 bits) respectively.

I'm not sure what those values hold on your system (lspci -vvvxxx would show this), but it's a good bet that one of them isn't in the lookup tables, so the subsequent dereference that works out one of the taskfile parameters fails.

This seems like a very similar braindead BIOS implementation to what Torsten has on his NVIDIA chipset board (bug 9320). I think it's again a matter of libata programming the controller with slightly different register values from what the BIOS expects to be there, and then the BIOS code choking when it sees those.
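The failure mode suspected above can be modelled roughly as follows. This is only a sketch: the table contents and all function names are hypothetical (the board's real tables are in the attached DSDT and are not reproduced here). The one ACPI behaviour it relies on is that the ASL Match operator returns Ones when nothing matches, which is assumed here to be 0xFFFFFFFF, consistent with the "Index (0FFFFFFFF)" in the ACPI exception quoted in comment 15.

```python
ONES = 0xFFFFFFFF  # assumed value of ASL "Ones" with 32-bit integers

# Hypothetical table contents, for illustration only.
TIMING_TABLE = [0x20, 0x22, 0x33, 0x47, 0x5D]
MODE_TABLE = [0x0C, 0x0B, 0x0A, 0x09, 0x08]


class PackageLimitError(Exception):
    """Stand-in for AE_AML_PACKAGE_LIMIT."""


def match_eq(table, value):
    """Return the index of value in table, or ONES if it is absent."""
    for index, entry in enumerate(table):
        if entry == value:
            return index
    return ONES


def lookup_mode(register_value):
    """Translate a timing register value via two parallel tables."""
    index = match_eq(TIMING_TABLE, register_value)
    if index >= len(MODE_TABLE):
        # The BIOS code does not check for a miss, so the index is used as-is.
        raise PackageLimitError(f"Index ({index:09X}) is beyond end of object")
    return MODE_TABLE[index]
```

With these made-up tables, `lookup_mode(0x20)` succeeds, while `lookup_mode(0x99)` raises because 0x99 is not in the timing table.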
Comment 11 Tejun Heo 2007-12-11 16:33:05 UTC
Those mode values are supposed to be programmed by _STM. Suspend/resume cycles look like the following from ATA-ACPI's POV.

1. _GTM is called to store the current transfer mode setting.
2. suspend.
3. resume.
4. _STM is called with the parameter saved from #1 to restore transfer mode setting.
5. _GTF is called and the resulting TFs are executed.

I'll attach a debug patch. Let's see whether we're missing _STM.
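The ordering of the five steps can be sketched as a toy model. Everything here is hypothetical (class, attribute and function names are made up, and this is not libata code); it also assumes, for illustration, that the controller's timing setup is lost while suspended.

```python
class FakePort:
    """Toy model of one ATA port; names are made up for this sketch."""

    def __init__(self, timing):
        self.timing = timing      # what the controller is programmed with
        self.saved_gtm = None     # value returned by _GTM before suspend
        self.log = []             # order in which the ACPI methods ran

    def gtm(self):
        self.log.append("_GTM")
        return self.timing

    def stm(self, timing):
        self.log.append("_STM")
        self.timing = timing

    def gtf(self):
        self.log.append("_GTF")
        # The BIOS derives its taskfiles from the timing it finds programmed.
        return [("SET FEATURES - transfer mode", self.timing)]


def suspend(port):
    port.saved_gtm = port.gtm()   # step 1: remember the current setting
    port.timing = None            # step 2: assumed to be lost while suspended


def resume(port):
    port.stm(port.saved_gtm)      # step 4: restore what step 1 saved
    return port.gtf()             # step 5: taskfiles to execute on the device
```

If step 4 were skipped, `gtf()` would build its taskfiles from whatever happens to be in the controller at that point, which is the case the debug patch is meant to rule out.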
Comment 12 Tejun Heo 2007-12-11 16:38:29 UTC
Okay, dbg3 already contains that. Here's an excerpt from the log in comment #8.

  ata1: XXX _GTM saved on suspend
  ... (suspend/resume)
  ata1: XXX _STM performed on resume 78:14/78:3c 15
  ...
  ata1.00: _GTF evaluation failed (AE 0x300d)
  ata1.00: ACPI: _GTF invalid, disabled
  ata1.00: configured for UDMA/100
  ata1.01: configured for UDMA/33

So, the above _STM is supposed to fill whatever _GTF needs. I'll post a debug patch to dump 0x53 and 0x4B before and after _GTM/_STM calls. Arghh...
Comment 13 Tejun Heo 2007-12-11 17:00:07 UTC
Created attachment 13985 [details] bug9530-dbg0.patch
Please apply this patch on top of -rc4 and report the log. Thanks.
Comment 14 Andreas Mohr 2007-12-12 13:26:19 UTC
Created attachment 14002 [details] dmesg of clean 2.6.24-rc4 plus dbg0 patch, suspend works fine
Note the I/O errors in dmesg a couple seconds after resume...
Comment 15 Robert Hancock 2007-12-12 16:03:50 UTC
It looks like after the _STM, we have reasonable values (PMPT=0x20):

  pata_via 0000:00:11.1: XXX PCI 0x48=0x20209999 0x50=0xf1f60707

but by the time _GTF is evaluated, they've been changed to ones not in the BIOS tables (PMPT=0x99):

  ata1: XXX evaluating _GTF
  pata_via 0000:00:11.1: XXX PCI 0x48=0x99999999 0x50=0x37370707
  ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0FFFFFFFF) is beyond end of object
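For anyone wanting to decode such dumps by hand, here is a small sketch that pulls PMPT, PMUT and PMUE out of the two logged values, using the field positions given in comment 10. It assumes the debug patch printed each value as a 32-bit little-endian config read, so that the byte at 0x4B is the top byte of the 0x48 dword and 0x53 is the top byte of the 0x50 dword. The function names are made up for this example.

```python
def config_byte(dword, dword_offset, byte_offset):
    """Pick one config-space byte out of a little-endian 32-bit read."""
    shift = 8 * (byte_offset - dword_offset)
    return (dword >> shift) & 0xFF


def decode(dword_48, dword_50):
    """Split the dumped dwords into the three fields the BIOS looks up."""
    byte_4b = config_byte(dword_48, 0x48, 0x4B)
    byte_53 = config_byte(dword_50, 0x50, 0x53)
    return {
        "PMPT": byte_4b,                 # 0x4B, all 8 bits
        "PMUT": (byte_53 >> 4) & 0x0F,   # 0x53, top 4 bits
        "PMUE": byte_53 & 0x07,          # 0x53, low 3 bits
    }
```

Under that assumption, `decode(0x20209999, 0xf1f60707)` gives PMPT=0x20 and `decode(0x99999999, 0x37370707)` gives PMPT=0x99, which agrees with the two PMPT values quoted above.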
Comment 16 Tejun Heo 2007-12-13 22:38:20 UTC
OIC, thanks a lot Robert. It seems we'll have to evaluate _GTF right after _STM and cache the result. Will prep another patch.
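The idea of evaluating _GTF right after _STM and caching the result could look roughly like this. It is a sketch of the approach only, not the code from the patch: the class, its methods and the use of LookupError to represent a failing BIOS method are all hypothetical.

```python
class PortAcpi:
    """Hypothetical per-port state; not the structure libata actually uses."""

    def __init__(self, evaluate_gtf):
        self._evaluate_gtf = evaluate_gtf   # callable standing in for _GTF
        self._cached_gtf = None

    def on_stm_done(self):
        # Evaluate immediately, while the registers still hold what _STM wrote.
        try:
            self._cached_gtf = self._evaluate_gtf()
        except LookupError:
            self._cached_gtf = None         # BIOS method failed; nothing to replay

    def taskfiles(self):
        # Later configuration steps read the cache and never re-run _GTF.
        return self._cached_gtf or []
```

The point of the design is that `taskfiles()` stays valid even if the driver reprograms the timing registers between `on_stm_done()` and the moment the taskfiles are needed.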
Comment 17 Tejun Heo 2007-12-14 05:08:01 UTC
Created attachment 14031 [details] bug9320-dbg6.patch
Please test this patch and report the kernel log. Thanks.
Comment 18 Andreas Mohr 2007-12-14 16:12:55 UTC
Created attachment 14045 [details] 2.6.24-rc4 with dbg6
Hmm... I think I don't want to like this ;) Now it "filtered out" any _GTF or _GTM invocation (or maybe you simply removed corresponding logging). Thanks!
Comment 19 Tejun Heo 2007-12-14 17:01:28 UTC
Actually, that's exactly the way it's intended, so no problem evaluating _STM and _GTF, great. All commands in the _GTF are SETXFERs, which only disturb the libata device configuration process. Filtering them out is DTRT. I'll forward the patchset upstream. Thanks a lot for testing.
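A minimal sketch of what filtering SETXFER entries out of a _GTF result means, assuming the usual _GTF layout of 7 register bytes per command (features first, command last). The opcode 0xEF (SET FEATURES) and subcommand 0x03 (set transfer mode) are the standard ATA values; the function names are made up and this is not the code from the patchset.

```python
ATA_CMD_SET_FEATURES = 0xEF   # ATA SET FEATURES command opcode
SETFEATURES_XFER = 0x03       # SET FEATURES subcommand: set transfer mode

GTF_ENTRY_SIZE = 7            # assumed: 7 register bytes per _GTF command


def split_gtf(buffer):
    """Cut a raw _GTF buffer into 7-byte taskfile entries."""
    if len(buffer) % GTF_ENTRY_SIZE:
        raise ValueError("_GTF buffer length is not a multiple of 7")
    return [bytes(buffer[i:i + GTF_ENTRY_SIZE])
            for i in range(0, len(buffer), GTF_ENTRY_SIZE)]


def is_setxfer(entry):
    """True if the entry is SET FEATURES - set transfer mode."""
    feature, command = entry[0], entry[6]
    return command == ATA_CMD_SET_FEATURES and feature == SETFEATURES_XFER


def filter_gtf(buffer):
    """Return the entries that should still be sent to the device."""
    return [entry for entry in split_gtf(buffer) if not is_setxfer(entry)]
```

For a buffer that contains only SETXFER entries, as reported here, `filter_gtf()` returns an empty list, which would explain why no _GTF taskfile shows up in the dbg6 log.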
Comment 20 Tejun Heo 2007-12-14 22:20:05 UTC
Patchset posted. Thanks for all the testing.
http://thread.gmane.org/gmane.linux.ide/26379
Andreas, can you please resolve this bug as CODE_FIX?
Comment 21 Andreas Mohr 2007-12-15 03:44:09 UTC
Nice to hear that it works absolutely correctly! Thanks for all your hard work and very timely help! Resolving but not closing (that should probably be done once the fix has arrived upstream).
Comment 22 Rafael J. Wysocki 2007-12-20 16:02:01 UTC
Fixed by:

commit ededa4d396b15c282aa60d6aacddfc07f0142dbf
Merge: 64396ac... 140b5e5...
Author: Linus Torvalds <firstname.lastname@example.org>
Date: Mon Dec 17 19:29:32 2007 -0800

    Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ededa4d396b15c282aa60d6aacddfc07f0142dbf
Comment 23 Andreas Mohr 2007-12-22 04:58:34 UTC
[bug already closed, JFYI] -rc6 (includes Tejun's libata patch) verified to work nicely, thanks!