Most recent kernel where this bug did not occur: didn't test but 2.6.23 worked fine Distribution: gentoo but doesn't matter as the kernel doesn't get to userland yet ;) Hardware Environment: hp nc8430 Software Environment: Problem Description: booting with 2.6.24-rc4 causes the following errors/messages while initializing sata/ahci: [ 26.494962] ahci 0000:00:1f.2: version 3.0 [ 26.494982] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 17 (level, low) -> IRQ 17 [ 26.495344] ahci 0000:00:1f.2: nr_ports (4) and implemented port map (0x1) don't match, using nr_ports [ 26.495566] ahci 0000:00:1f.2: forcing PORTS_IMPL to 0xf [ 27.498361] ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 1.5 Gbps 0xf impl SATA mode [ 27.498583] ahci 0000:00:1f.2: flags: 64bit ncq ilck stag pm led clo pmp pio slum part [ 27.498806] PCI: Setting latency timer of device 0000:00:1f.2 to 64 [ 27.499023] scsi0 : ahci [ 27.499241] scsi1 : ahci [ 27.499456] scsi2 : ahci [ 27.499667] scsi3 : ahci [ 27.499892] ata1: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705100 irq 219 [ 27.500112] ata2: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705180 irq 219 [ 27.500332] ata3: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705200 irq 219 [ 27.501822] ata4: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705280 irq 219 [ 28.137999] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) next comes the first failure: [ 28.140561] ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 failed (Emask=0x1 Stat=0x51 Err=0x04) [ 28.141091] ata1: failed to recover some devices, retrying in 5 secs [ 34.786461] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 34.789004] ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 failed (Emask=0x1 Stat=0x51 Err=0x04) then the second failure: [ 34.789534] ata1.00: ACPI on devcfg failed the second time, disabling (errno=-5) [ 34.789753] ata1.00: revalidation failed (errno=1) [ 34.789914] ata1: failed to recover some devices, retrying in 5 secs [ 41.434924] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 41.437002] ata1.00: configured for UDMA/100 after this the driver works and boot proceeds fine. so it's not a show stopper for me but something isn't right. also i'm not sure if it's really an ACPI problem, but it was some ACPI command that was last issued/logged so i thought it'd put this under ACPI, feel free to reclassify. if you need any extra info, let me know but i'd like to avoid bisecting if possible. Steps to reproduce: just compile/boot with 2.6.24-rc4.
Reply-To: akpm@linux-foundation.org On Mon, 10 Dec 2007 05:55:20 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=9533 Another box-killing regression to track, please. Either ATA or ACPI.
Reply-To: akpm@linux-foundation.org On Mon, 10 Dec 2007 12:52:43 -0800 Andrew Morton <akpm@linux-foundation.org> wrote: > On Mon, 10 Dec 2007 05:55:20 -0800 (PST) > bugme-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=9533 > > Another box-killing regression to track, please. Either ATA or ACPI. er, no, not box-killing - just scary warnings. It's not clear what "kernel doesn't get to userland yet" is referring to - something else I guess. Your desire to avoid doing a bisection search is a good one - I've been trying to do one for a couple of days on and off and there are so many fatally buggy bisection points between 2.6.23 and 2.6.24-rc1 that I've given up on the attempt.
(In reply to comment #2) > Reply-To: akpm@linux-foundation.org > > It's not clear what "kernel doesn't get to userland yet" is referring to - > something else I guess. it just means that the bug/phenomenon occurs during kernel boot/init when there's no userland yet at all, so it doesn't really matter what in userland is, it's clearly a kernel issue. > Your desire to avoid doing a bisection search is a good one - I've been > trying to do one for a couple of days on and off and there are so many > fatally buggy bisection points between 2.6.23 and 2.6.24-rc1 that I've > given up on the attempt. for me it's just that i'm in the middle of setting up a fairly complex system here and i need all the time for that, not bisecting something that's not a show-stopper (else i'd have had to debug it already :). if you need to stuff in a printk somewhere or similar things, i'm glad to try.
On 12/10/2007 03:52 PM, Andrew Morton wrote: > On Mon, 10 Dec 2007 05:55:20 -0800 (PST) > bugme-daemon@bugzilla.kernel.org wrote: > >> http://bugzilla.kernel.org/show_bug.cgi?id=9533 > > Another box-killing regression to track, please. Either ATA or ACPI. > - > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Quite possibly the BIOS is buggy and handed a bad taskfile to the driver.
(In reply to comment #4) > Quite possibly the BIOS is buggy and handed a bad taskfile to the driver. hi Chuck, did you miss the part where i said that 2.6.23 had worked fine? somehow i doubt the BIOS would know which linux kernel is booting. not to mention that at this phase of booting the BIOS is long out of the picture, afaik.
Since Chuck replied via email, he didn't see your comment.
libata acpi was enabled by default in 2.6.24-rc1: 6ebe9d8667d3f449350cbc6686cd917ecf1f40dd And yes the driver will ask ACPI (the BIOS) for taskfiles and send them to the devices.
(In reply to comment #7) > libata acpi was enabled by default in 2.6.24-rc1: > > 6ebe9d8667d3f449350cbc6686cd917ecf1f40dd > > And yes the driver will ask ACPI (the BIOS) for taskfiles and send them to > the > devices. should libata.noacpi=1 fix it?
(In reply to comment #8) > (In reply to comment #7) > > libata acpi was enabled by default in 2.6.24-rc1: > > > > 6ebe9d8667d3f449350cbc6686cd917ecf1f40dd > > > > And yes the driver will ask ACPI (the BIOS) for taskfiles and send them to > the > > devices. > > should libata.noacpi=1 fix it? ok, it fixed my problem (well, apparently i got back the old behaviour which is good enough for me). consider updating Documentation/kernel-parameters.txt because right now it only says: libata.noacpi [LIBATA] Disables use of ACPI in libata suspend/resume when set. Format: <int> i.e., no mention of boot problems.
Andrew Morton wrote: > On Mon, 10 Dec 2007 12:52:43 -0800 > Andrew Morton <akpm@linux-foundation.org> wrote: > >> On Mon, 10 Dec 2007 05:55:20 -0800 (PST) >> bugme-daemon@bugzilla.kernel.org wrote: >> >>> http://bugzilla.kernel.org/show_bug.cgi?id=9533 >> Another box-killing regression to track, please. Either ATA or ACPI. > > er, no, not box-killing - just scary warnings. > > It's not clear what "kernel doesn't get to userland yet" is referring to - > something else I guess. > > Your desire to avoid doing a bisection search is a good one - I've been > trying to do one for a couple of days on and off and there are so many > fatally buggy bisection points between 2.6.23 and 2.6.24-rc1 that I've > given up on the attempt. This one will be fixed by ACPI fixes which are currently being tested. I just need to get one more test result. I'll post the patch to 9533 too. Thanks.
Created attachment 13996 [details] bug9320-dbg4.patch Please apply the attached patch on top of -rc5 and report the result. Thanks.
(In reply to comment #11) > Created an attachment (id=13996) [details] > bug9320-dbg4.patch > > Please apply the attached patch on top of -rc5 and report the result. thanks, it works now. in case you're interested, the new dmesg i got: [ 25.573366] ahci 0000:00:1f.2: version 3.0 [ 25.573387] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 17 (level, low) -> IRQ 17 [ 25.573751] ahci 0000:00:1f.2: nr_ports (4) and implemented port map (0x1) don't match, using nr_ports [ 25.573972] ahci 0000:00:1f.2: forcing PORTS_IMPL to 0xf [ 26.575477] ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 1.5 Gbps 0xf impl SATA mode [ 26.575697] ahci 0000:00:1f.2: flags: 64bit ncq ilck stag pm led clo pmp pio slum part [ 26.575920] PCI: Setting latency timer of device 0000:00:1f.2 to 64 [ 26.576143] scsi0 : ahci [ 26.576362] scsi1 : ahci [ 26.576575] scsi2 : ahci [ 26.576787] scsi3 : ahci [ 26.577013] ata1: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705100 irq 219 [ 26.577234] ata2: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705180 irq 219 [ 26.577454] ata3: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705200 irq 219 [ 26.577674] ata4: SATA max UDMA/133 abar m1024@0xf4705000 port 0xf4705280 irq 219 [ 27.215083] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 27.216757] ata1: XXX evaluating _GTF [ 27.217492] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 filtered out [ 27.217656] ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 filtered out [ 27.217968] ata1.00: ACPI cmd c6/00:10:00:00:00:a0 succeeded [ 27.218134] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 succeeded [ 27.218991] ata1.00: ATA-7: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, max UDMA/100 [ 27.219160] ata1.00: yyyyyyyyyyy sectors, multi 16: LBA48 NCQ (depth 31/32) [ 27.220965] ata1: XXX evaluating _GTF [ 27.221695] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 filtered out [ 27.221859] ata1.00: ACPI cmd b1/c1:00:00:00:00:a0 filtered out [ 27.222170] ata1.00: ACPI cmd c6/00:10:00:00:00:a0 succeeded [ 27.222336] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 succeeded [ 27.223208] ata1.00: configured for UDMA/100 [ 27.225281] ata1.00: configured for UDMA/100 [ 27.225443] ata1: EH complete [ 27.542894] ata2: SATA link down (SStatus 0 SControl 0) [ 27.862708] ata3: SATA link down (SStatus 0 SControl 0) [ 28.182523] ata4: SATA link down (SStatus 0 SControl 0) then it proceeds fine with the disk it found. the corresponding parts from the disassembled DSDT: Method (C24E, 3, NotSerialized) { If (LOr (LEqual (Arg0, Zero), LEqual (Arg0, One))) { Store (Buffer (0x07) { /* 0000 */ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0xEF }, Local0) Store (Arg1, Index (Local0, One)) } If (LEqual (Arg0, 0x02)) { Store (Buffer (0x07) { /* 0000 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF5 }, Local0) } If (LEqual (Arg0, 0x03)) { Store (Buffer (0x07) { /* 0000 */ 0xC1, 0x00, 0x00, 0x00, 0x00, 0x00, 0xB1 }, Local0) } If (LEqual (Arg0, 0x04)) { Store (Buffer (0x07) { /* 0000 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xC6 }, Local0) Store (Arg1, Index (Local0, One)) } If (LEqual (Arg0, 0x05)) { Store (Buffer (0x07) { /* 0000 */ 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0xEF }, Local0) Store (Arg1, Index (Local0, Zero)) } Store (Or (0xA0, ShiftLeft (Arg2, 0x04)), Index (Local0, 0x05)) Return (Local0) } Method (C24F, 2, NotSerialized) { Store (Zero, Local0) Store (Zero, Local4) While (LLess (Local0, SizeOf (Arg0))) { If (LNotEqual (DerefOf (Index (Arg0, Local0)), Zero)) { Store (C24E (Local0, DerefOf (Index (Arg0, Local0)), Arg1), Local2) If (LEqual (Local4, Zero)) { Store (Local2, Local1) { Store (Local2, Local1) Store (One, Local4) } Else { Concatenate (Local1, Local2, Local3) Store (Local3, Local1) } } Increment (Local0) } Return (Local1) } (i haven't found any references to method C24F)
Thanks for testing. I'll forward patches upstream soon.
Fixed by: commit ededa4d396b15c282aa60d6aacddfc07f0142dbf Merge: 64396ac... 140b5e5... Author: Linus Torvalds <torvalds@woody.linux-foundation.org> Date: Mon Dec 17 19:29:32 2007 -0800 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ededa4d396b15c282aa60d6aacddfc07f0142dbf