Most recent kernel where this bug did not occur:
I rebuild daily, if rebuild script does not fail. The last errorless boot was on October 29. The first boot after this was October 31.

Distribution: slack
Hardware Environment:
VIA C7, CN700 IDE interface: VT82C586A/B/VT82C686/A/B/VT823x/A/C
Diskless PXE boot

Software Environment:
Problem Description:

dmesg:
ata3: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf900 irq 14
ata4: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf908 irq 15
ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0FFFFFFFF) is beyond end of object [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.PATA.GTM_] (Node c1c0b420), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.PATA.CHN0._GTM] (Node c1c0b228), AE_AML_PACKAGE_LIMIT
ata3: ACPI get timing mode failed (AE 0x300d)
ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0FFFFFFFF) is beyond end of object [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.PATA.GTM_] (Node c1c0b420), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.PATA.CHN1._GTM] (Node c1c0b330), AE_AML_PACKAGE_LIMIT
ata4: ACPI get timing mode failed (AE 0x300d)
Created attachment 13445 [details] config
Created attachment 13446 [details] dmesg
Reply-To: akpm@linux-foundation.org

On Wed, 7 Nov 2007 13:35:00 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9320
>
> Summary: PATA scan: ACPI Exception AE_AML_PACKAGE_LIMIT... is beyond end of object
> Product: ACPI
> Version: 2.5
> KernelVersion: Linux version 2.6.24-rc2
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: acpi_other@kernel-bugs.osdl.org
> ReportedBy: bruinjm@xs4all.nl
>
> [snip]

Seems to be another post-2.6.23 regression. Is it an acpi thing or an ata thing, or just something which the new acpi+ata stuff exposed??
Reply-To: mjg59@srcf.ucam.org

On Wed, Nov 07, 2007 at 02:07:54PM -0800, Andrew Morton wrote:
> Seems to be another post-2.6.23 regression. Is it an acpi thing or an ata
> thing, or just something which the new acpi+ata stuff exposed??

I suspect that this is just because acpi is enabled by default on ata now. Is it actually causing any problems?
Please post the acpidump for this machine.
It's not causing any problems; the dmesg just looks different. Just to be sure, I attached a disk, and it can be read from and written to.
Created attachment 13470 [details] dmesg with pata disk
Created attachment 13471 [details] acpidump
Here is the offending code (called from _GTM):

    Method (GTM, 6, Serialized)
    {
        ...
        Store (Match (DerefOf (Index (TIM0, 0x01)), MEQ, Arg0, MTR, 0x00, 0x00), Local6)
        Store (DerefOf (Index (DerefOf (Index (TIM0, 0x00)), Local6)), Local7)

It appears to me that the Match is failing and returning 0xFFFFFFFF, which in turn is blindly used as an index into a package in the second Store statement.

I seem to remember there was an issue where _STM (Set Timing Mode) was not being called before _GTM (in the ata driver), and things within the ASL/AML were not getting initialized properly.
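To make the failure mode concrete, here is a rough Python model of the two Store statements. It is only a sketch: `aml_match_meq`, `aml_index`, `gtm_lookup` and the small `KEYS`/`VALUES` tables are hypothetical stand-ins, not part of any ACPI implementation, and it assumes Match yields Ones (0xFFFFFFFF with 32-bit integers) when nothing matches.

```python
ONES = 0xFFFFFFFF  # assumed "no match" result of Match with 32-bit AML integers


def aml_match_meq(package, value):
    """Model of Match (pkg, MEQ, value, MTR, 0, 0): index of the first equal element."""
    for i, element in enumerate(package):
        if element == value:
            return i
    return ONES


def aml_index(package, index):
    """Model of DerefOf (Index (pkg, index)) including the interpreter's bounds check."""
    if index >= len(package):
        raise IndexError(
            "AE_AML_PACKAGE_LIMIT, Index (0%X) is beyond end of object" % index
        )
    return package[index]


def gtm_lookup(keys, values, raw):
    """The two Store statements: find raw in keys, then read values at that position."""
    local6 = aml_match_meq(keys, raw)
    return aml_index(values, local6)  # no check for ONES, just like the ASL


# Hypothetical tables standing in for the two rows of TIM0.
KEYS = [0x10, 0x20, 0x30]
VALUES = [100, 200, 300]

if __name__ == "__main__":
    print(gtm_lookup(KEYS, VALUES, 0x20))
    try:
        gtm_lookup(KEYS, VALUES, 0x99)
    except IndexError as exc:
        print(exc)
```

A value that is present in the first table resolves normally; a value that is absent turns into an out-of-range index, which is the shape of the error in the boot log.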
Should I reassign this bug to: product: io/storage, component: ide?
I see the same problem:

[ 44.260000] scsi8 : pata_amd
[ 44.260000] scsi9 : pata_amd
[ 44.260000] ata9: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
[ 44.260000] ata10: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[ 44.460000] ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0FFFFFFFF) is beyond end of object [20070126]
[ 44.460000] ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.GTM_] (Node ffff81010031aa20), AE_AML_PACKAGE_LIMIT
[ 44.460000] ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.CHN0._GTM] (Node ffff81010031a7c0), AE_AML_PACKAGE_LIMIT
[ 44.460000] ata9: ACPI get timing mode failed (AE 0x300d)
[ 44.470000] ata9.01: ATA-7: Maxtor 6L250R0, BAH41G10, max UDMA/133
[ 44.470000] ata9.01: 490234752 sectors, multi 16: LBA48
[ 44.470000] ata9.01: limited to UDMA/33 due to 40-wire cable
[ 44.510000] ata9.01: configured for UDMA/33
[ 44.510000] ata10: port disabled. ignoring.

But my hardware is nothing like Hans': It's an Opteron board with an nVidia MCP55 chipset, so this might be BIOS specific?

dmidecode says:
Vendor: American Megatrends Inc.
BIOS Revision: 8.12

Except for the message, the drive works normally. Even the detection of the 40-wire cable is correct.

From the disassembled ACPI:

    Device (CHN0)
    {
        [snip]
        Method (_GTM, 0, NotSerialized)
        {
            Store ("GTM_CHN0", Debug)
            Return (GTM (PMPT, PMUE, PMUT, PSPT, PSUE, PSUT))
        }
    [snip]
    Method (GTM, 6, Serialized)
    {
        [snip]
        Store (Match (DerefOf (Index (TIM0, One)), MEQ, Arg0, MTR, Zero, Zero), Local6)
        Store (DerefOf (Index (DerefOf (Index (TIM0, Zero)), Local6)), Local7)
        Store (Local7, DMA0)
        Store (Local7, PIO0)
        Store (Match (DerefOf (Index (TIM0, One)), MEQ, Arg3, MTR, Zero, Zero), Local6)
        Store (DerefOf (Index (DerefOf (Index (TIM0, Zero)), Local6)), Local7)
        Store (Local7, DMA1)
        Store (Local7, PIO1)
        ...

seems to be the same code.
Arg0 and Arg3, aka PMPT and PSPT, seem to be used only in Method (_STM, 3, NotSerialized), as far as I can see...
That was calling _GTF without calling _STM first. _GTM doesn't have any prerequisite (it can't). Can someone familiar with ACPI tell me why the method is failing? At any rate, libata should work fine regardless of ACPI failures. Maybe it's time to start a blacklist that skips ATA-ACPI on some boards, to avoid those annoying messages during boot.
dup of bug# 7907? assign to yakui for investigation...
(In reply to comment #11)
> I see the same problem:
> ...
> It's an Opteron board with an nVidia MCP55 chipset, so this might be BIOS
> specific?
> dmidecode says:
> Vendor: American Megatrends Inc.
> BIOS Revision: 8.12

BIOS Information
Vendor: Phoenix Technologies, LTD
Version: 6.00 PG
Release Date: 05/18/2007
Yakui, any ideas?
This bug is triggered by calling _GTM, and according to the ACPI 3.0 spec there is no prerequisite for calling _GTM (the method is called to get the current channel timing settings). It may be appropriate to stop using ACPI for ATA (setting libata_noacpi to 1) when this machine is detected. Thanks.
This bug is not a duplicate of bug 7907. The root cause of bug 7907 is that _GTF is called before _STM, whereas this bug is caused by an incorrect _GTM method definition in the BIOS. It can be fixed by disabling ACPI for ATA (setting libata_noacpi to 1) when the machine is found in an ATA blacklist. Thanks.
Okay, time to start yet another blacklist I guess. Hans de Brin, please report the result of 'dmidecode'. Thanks.
Hans de Brin, could you please post dmidecode info..thanks..
Created attachment 13782 [details] dmidecode
hi, Tejun, would you please help to add the quirk?
I'm brewing patches and will post them soon. Please stand by a bit. Thanks.
Created attachment 13806 [details] bug9320-dbg0.patch Please apply this patch on top of 2.6.24-rc3 and report kernel boot log.
Created attachment 13807 [details] bug9320-dbg1.patch After that, unapply dbg0 and apply this one and report the boot log. Thanks.
Created attachment 13832 [details] dbg0_dmesg
Created attachment 13833 [details] dbg1_dmesg
Created attachment 13843 [details] bug9320-acpi-blist.patch Please test this patch and report boot log. Thanks.
Created attachment 13860 [details] bug9320-acpi-blist_dmesg Since I am about to move, all spare PATA disks are in boxes (at the bottom of the pile), so I did not test any disk actions.
That's good enough. Thanks.
For all of the systems that get added to this blacklist, it would be good to archive their acpidump here. For we might discover by examining the acpidump that there is some magic "Windows bug compatibility" sequence of hoops that Linux could instead jump through in order to run on the installed base of BIOS.
Yes, can the reporters here please post their decompiled ACPI DSDT AML code as an attachment (as Torsten posted partially already)?
We've found one case where _GTM execution fails if the PATA port has no devices connected: in that case the controller registers hold values that the ASL code doesn't handle. This is likely something similar, though with a different cause, since the port is not empty here. Torsten, can you post your full DSDT ASL dump as well as the output of "lspci -vvvxxx"?
Created attachment 13932 [details] bug9320-dbg2.patch Hans, please test this patch. It seems we won't need the blacklist after all.
FYI bug9320-dbg2.patch fixes hibernation on my Shuttle SK41G (pata_via) that previously failed without libata.noacpi=1.
Created attachment 13953 [details] bug9320-dbg2-dmesg
Cool, so it was the same problem. Oh.. VIA. Torsten, can you please post the DSDT ASL Robert asked for?
Created attachment 13976 [details] lspci -vvvxxx

Didn't have time to check my mail, so I only just now read your requests.
The attached lspci is from 2.6.24-rc3-mm2.

I only saw the ACPI error message with 2.6.23-mm1; both 2.6.24-rc2-mm1 and 2.6.24-rc3-mm2 worked without it.

Before 2.6.23-mm1 I see:
ata9: ACPI get timing mode failed (AE 0x1001)

On 2.6.23-mm1 I see:
ata9: ACPI get timing mode failed (AE 0x300d)

Both 2.6.24-rc's say:
ata9: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata9.01: ATA-7: Maxtor 6L250R0, BAH41G10, max UDMA/133
ata9.01: 490234752 sectors, multi 16: LBA48
ata9: nv_mode_filter: 0x7f39f&0x701f->0x701f, BIOS=0x7000 (0xc00000) ACPI=0x701f (900:60:0x14)
ata9.01: configured for UDMA/33

All 2.6.23 correctly complained:
ata9.01: limited to UDMA/33 due to 40-wire cable

DSDT will follow...
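For anyone trying to read the nv_mode_filter line, the hex values look like libata transfer-mode masks. The sketch below decodes them under the assumption that the mask is laid out as 7 PIO bits, then 5 MWDMA bits, then 8 UDMA bits; that layout is my understanding of libata and is not stated anywhere in this report. The names `highest_mode` and `describe` are made up for the example.

```python
# Assumed mask layout (not taken from this report): PIO bits 0-6,
# MWDMA bits 7-11, UDMA bits 12-19.
PIO_SHIFT, PIO_BITS = 0, 7
MWDMA_SHIFT, MWDMA_BITS = 7, 5
UDMA_SHIFT, UDMA_BITS = 12, 8


def highest_mode(mask, shift, bits):
    """Return the highest mode number set in one field, or None if the field is empty."""
    field = (mask >> shift) & ((1 << bits) - 1)
    return field.bit_length() - 1 if field else None


def describe(mask):
    """Highest PIO, MWDMA and UDMA mode allowed by a transfer mask."""
    return {
        "pio": highest_mode(mask, PIO_SHIFT, PIO_BITS),
        "mwdma": highest_mode(mask, MWDMA_SHIFT, MWDMA_BITS),
        "udma": highest_mode(mask, UDMA_SHIFT, UDMA_BITS),
    }


if __name__ == "__main__":
    drive_mask, limit_mask = 0x7F39F, 0x701F   # values from the log line
    print(describe(drive_mask))                # what the drive/controller offer
    print(hex(drive_mask & limit_mask))        # the filter is a plain bitwise AND
    print(describe(drive_mask & limit_mask))   # what survives the filter
```

Under that assumed layout, the filtered mask tops out at UDMA mode 2, which is UDMA/33 and agrees with the "configured for UDMA/33" line.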
Created attachment 13980 [details] DSDT from Bios v0208

(In reply to comment #32)
> We've found one case where the _GTM method execution fails if the PATA port
> has no devices connected due to register values on the controller in that
> case that the ASL code doesn't handle.

I also have only one device attached to the pata port of the MCP55.

Hmm... I just noticed this:

ata9: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata10: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
[snip]
ata9.01: configured for UDMA/33
ata10: port disabled. ignoring

Shouldn't the second/slave drive be ata9.02? This board does not have a second physical pata port and I think even the chipset only has one...

DSDT attached, I hope, I extracted/disassembled it correctly.
OK. So inside GTM, we do this:

    Store (Match (DerefOf (Index (TIM0, One)), MEQ, Arg0, MTR, Zero, Zero), Local6)

In other words, find the table inside TIM0 with index 1, then look up the value in Arg0 in that table and store the found index in Local6. For CHN0 (the failing one), this comes from PMPT, which is part of an operation region in the controller's PCI config space. PMPT comes from config address 0x5B, whose value in your config space dump is 0x99.

Now the table with index 1 in TIM0 contains:

    Package (0x05) { 0x11, 0x20, 0x22, 0x47, 0xA8 },

0x99 is NOT in this table, and so Local6 gets 0xFFFFFFFF (not found), which the code then uses to look up as the index into the table with index 0 in TIM0 (which seems to be the corresponding output cycle time values):

    Package (0x05) { 0x3C, 0x78, 0xB4, 0xF0, 0x0384 },

and causes the interpreter to blow up.

Question is how the value 0x99 got there. According to the AMD-766 IDE controller docs (the closest thing we have to specs I think, except that NVIDIA has their registers 16 bytes ahead), address 0x5B (which matches 0x4B in the document) is the EIDE Controller Drive Timing Control value:

"The value in each 4-bit field, plus one, specifies a time period in 30 nanosecond PCI clocks. Note: The default state, A8h, results in a recovery time of 270ns and an active pulse width of 330ns for a 30ns PCI clock (total cycle time = 600ns) which corresponds to ATA PIO Mode 0."

So 0x99 means 300ns active and 300ns recovery time, or 600ns cycle time, which I guess is PIO mode 0. There is no way the BIOS set primary master timing to that value since the drive should be UDMA/133 capable.

Tejun, wasn't there a problem fixed somewhere in pata_amd where we would clobber the timing mode to PIO0 before calling _GTM? It seems this BIOS implementation is very non-tolerant of the controller having been poked with values other than what it expects.
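The register arithmetic above can be checked with a few lines of Python. This is a sketch only: `decode_drive_timing` is a made-up helper, and the assignment of the high nibble to the active pulse and the low nibble to recovery is inferred from the A8h example in the quoted AMD-766 text rather than from a register map.

```python
PCI_CLOCK_NS = 30  # PCI clock period used in the quoted AMD-766 description

# First-row values of the BIOS lookup table (TIM0 index 1), as quoted above.
TIM0_RAW_VALUES = [0x11, 0x20, 0x22, 0x47, 0xA8]


def decode_drive_timing(reg):
    """Split an 8-bit drive timing value into (active_ns, recovery_ns, cycle_ns).

    Assumption: high nibble = active pulse width, low nibble = recovery time,
    each stored as "clocks minus one".
    """
    active_ns = (((reg >> 4) & 0xF) + 1) * PCI_CLOCK_NS
    recovery_ns = ((reg & 0xF) + 1) * PCI_CLOCK_NS
    return active_ns, recovery_ns, active_ns + recovery_ns


if __name__ == "__main__":
    for reg in (0xA8, 0x99):
        active, recovery, cycle = decode_drive_timing(reg)
        known = reg in TIM0_RAW_VALUES
        print("0x%02X: active %dns, recovery %dns, cycle %dns, in BIOS table: %s"
              % (reg, active, recovery, cycle, known))
```

Both 0xA8 and 0x99 decode to a 600ns cycle, but only 0xA8 appears in the BIOS table, so a functionally equivalent PIO0 setting is still enough to make the lookup miss.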
(My Asus A8N-SLI Deluxe has the same IDE controller, but I recall its _GTM implementation calculated the actual values from the register settings rather than using a lookup table - a much more robust implementation.) Calling _GTM on this BIOS if timings have been modified (other than with _STM) seems unreliable unless the exact settings the BIOS expects to be used are set. The approach used by pata_acpi (only use ACPI methods to modify the timings) would likely be safer on such a system.
Setting the mode to PIO0 is done right after reset because we don't know the drive state at that point. This caused a problem w/ cable detection using ACPI or BIOS settings, because programming PIO0 on pata_amd clobbers the DMA mode setting too and we lose the BIOS/ACPI setting before looking at it. The fix was to cache the _GTM value while initializing the controller, before initiating the probing sequence (init_gtm). It seems that Torsten's problem will be fixed by init_gtm, but unfortunately that thing is queued for the 2.6.25 merge as it involves a bunch of other changes. I guess we'll have to backport the init_gtm part to fix this one. I'll get to it.
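The ordering issue can be illustrated with a toy model. Everything here is hypothetical and heavily simplified (`Controller`, `LookupTableBios`, `probe_late_gtm`, `probe_cached_gtm` are invented names, and this is not the actual init_gtm code); the table pairs are taken from the two TIM0 rows quoted earlier in this report, and 0x99 is assumed to be what the driver's PIO0 programming leaves in the register.

```python
class Controller:
    """Toy controller with a single drive timing register."""

    def __init__(self, timing_reg):
        self.timing_reg = timing_reg


class LookupTableBios:
    """Toy BIOS whose _GTM only understands register values from its own table."""

    # raw register value -> cycle time, paired up from the TIM0 rows in this report
    KNOWN = {0x11: 0x3C, 0x20: 0x78, 0x22: 0xB4, 0x47: 0xF0, 0xA8: 0x0384}

    def gtm(self, ctrl):
        if ctrl.timing_reg not in self.KNOWN:
            raise LookupError("AE_AML_PACKAGE_LIMIT")
        return self.KNOWN[ctrl.timing_reg]


DRIVER_PIO0 = 0x99  # assumption: value left behind when the driver programs PIO0


def probe_late_gtm(bios, ctrl):
    """Program PIO0 first, then ask the BIOS: _GTM sees an unknown value."""
    ctrl.timing_reg = DRIVER_PIO0
    return bios.gtm(ctrl)


def probe_cached_gtm(bios, ctrl):
    """Evaluate _GTM while the BIOS setting is intact, then program PIO0."""
    cached = bios.gtm(ctrl)
    ctrl.timing_reg = DRIVER_PIO0
    return cached


if __name__ == "__main__":
    bios = LookupTableBios()
    print(probe_cached_gtm(bios, Controller(0x20)))
    try:
        probe_late_gtm(bios, Controller(0x20))
    except LookupError as exc:
        print("late _GTM failed:", exc)
```

The only difference between the two probe functions is the order of the two steps, which is the point of caching the _GTM result before the probing sequence touches the timings.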
Created attachment 13986 [details] bug9320-dbg4.patch Please apply the attached patch on top of -rc4 and report the result. Thanks.
OK, I tested with vanilla 2.6.24-rc4 and the dbg4 patch.

Relevant parts of the vanilla 2.6.24-rc4 dmesg:

ata9: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14
ata10: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15
ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0FFFFFFFF) is beyond end of object [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.GTM_] (Node ffff81007ff188c0), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.CHN0._GTM] (Node ffff81007ff18660), AE_AML_PACKAGE_LIMIT
ata9: ACPI get timing mode failed (AE 0x300d)
ata9.01: ATA-7: Maxtor 6L250R0, BAH41G10, max UDMA/133
ata9.01: 490234752 sectors, multi 16: LBA48
ata9.01: limited to UDMA/33 due to 40-wire cable
ata9.01: configured for UDMA/33
ata10: port disabled. ignoring.

With dbg4 applied, the following lines are added to the dmesg:

+ata3: XXX skipping _GTM on empty channel
+ata4: XXX skipping _GTM on empty channel
+ata5: XXX skipping _GTM on empty channel
+ata6: XXX skipping _GTM on empty channel
+ata7: XXX skipping _GTM on empty channel
+ata8: XXX skipping _GTM on empty channel
+ata9: XXX skipping _GTM on empty channel
+ata10: XXX skipping _GTM on empty channel

The above ACPI error messages disappear, the detection is still identical. Even if the XXX message claims that ata9 is empty the drive is still getting detected correctly...

Looking at the patch I think it is missing the addition of the ATA_FLAG_XXX to pata_amd, as I do not use pata_via. But even with this flag added to all ata_port_info's I got no further output.
PS: I just saw that the empty/not-empty detection seems to be completely busted for the MCP55, as the ata devices do not start with ata0:

ata1+2: sata_sil24 with two drives
ata3..8: sata_nv, only one port has a drive
ata9: pata_amd, one drive
ata10: ??? BIOS bug? Did someone think that there should always be two IDE channels?

ata3: XXX skipping _GTM on empty channel
ata4: XXX skipping _GTM on empty channel
ata3: SATA max UDMA/133 cmd 0xcc00 ctl 0xc880 bmdma 0xc400 irq 23
ata4: SATA max UDMA/133 cmd 0xc800 ctl 0xc480 bmdma 0xc408 irq 23
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7: MAXTOR STM3320820AS, 3.AAE, max UDMA/133
ata3.00: 625142448 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata4: SATA link down (SStatus 0 SControl 300)

The drive on port ata3 works normally...
<rant>Gees... the initial _GTM value should be collected after device presence detection (for crap VIA ACPI implementations) but before configuration is changed.</rant> Will prep another patch soon. Thanks for your patience.
Okay, we can't do that. I think what we should do here is tell the ACPI interpreter to shut up even if something blows up. I'm forwarding patches upstream now. Other than annoying messages, everything should work fine.
I thought that Torsten was saying that the ACPI errors went away with the latest patch?
Yeah, but that would break acpi cable detection for pata_amd. I'll write about it in detail to lkml, linux-ide and linux-acpi and cc you. Thanks.
Created attachment 14014 [details] bug9320-dbg4-dmesg
Created attachment 14015 [details] bug9320-dbg5.patch Hans, can you please give this patch a shot? This is the final revision and is already posted for inclusion in mainline. You might see evaluation failure messages but everything else should be fine. Thanks.
Created attachment 14030 [details] bug9320-dbg6.patch Okay, one more revision. This should be it. Please test this. Thanks.
Created attachment 14043 [details] bug9320-dbg6-dmesg
Great, it works. Thanks.
Patchset posted. Resolving as CODE_FIX. Thanks a lot for all the testing. http://thread.gmane.org/gmane.linux.ide/26379
Fixed by:

commit ededa4d396b15c282aa60d6aacddfc07f0142dbf
Merge: 64396ac... 140b5e5...
Author: Linus Torvalds <torvalds@woody.linux-foundation.org>
Date: Mon Dec 17 19:29:32 2007 -0800

    Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ededa4d396b15c282aa60d6aacddfc07f0142dbf