Bug 10884

Summary: CD/DVD drive not detected without acpi=off
Product: IO/Storage
Reporter: Lars Luthman (lars.luthman)
Component: Serial ATA
Assignee: Tejun Heo (tj)
Status: RESOLVED INVALID
Severity: normal
CC: acpi-bugzilla, holger, htejun, jgarzik, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.25.5
Subsystem:
Regression: No
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 56331    
Attachments: dmesg output without acpi=off
dmesg output with acpi=off
acpidump output
lspci -v output too, in case it's useful
/proc/interrupts without any ACPI boot options
/proc/interrupts with acpi=off
dmesg output for pci=nommconf
dmesg output for acpi=noirq
dmesg output for noapic
dmesg output with libata.noacpi=1
BIOS screenshot, Main
BIOS screenshot, Main/IDE Channel 1 Master
BIOS screenshot, Main/IDE Primary/Slave
BIOS screenshot, Advanced
BIOS screenshot, Main/IDE Primary/Slave, AUTO
dmesg-acpi-off-ahci-disable
dmesg-acpi-on-ahci-disable
tree /sys/bus/scsi/devices/ when acpi=off
tree /sys/bus/scsi/devices/ when acpi=on
lspci -vxxx when acpi=off
lspci -vxxx when acpi=on
dmesg without the patch
dmesg with the patch
dmesg without the patch when acpi is enabled
dmesg-acpi-off-ahci-disable
dmesg-acpi-off-ahci-disable-with-the-patch
detection-debug.patch
dmesg-acpi-off-with-sata-detect-patch
dmesg-acpi-on-with-sata-detect-patch
lspci acpi=off
lspci acpi=on

Description Lars Luthman 2008-06-07 12:38:02 UTC
Latest working kernel version: none
Earliest failing kernel version: 2.6.25.5
Distribution: Debian
Hardware Environment: Zepto 6025WD laptop, Intel based
Software Environment: Debian Testing with kernel from kernel.org
Problem Description: The CD/DVD RW device is not detected at all by the kernel if I boot it without any special options. I get an ACPI error, but the boot continues and everything except the CD seems to work. The error is

ACPI Error (evregion-0316): No handler for Region [ERAM] (ffff81007f35e720) [EmbeddedControl] [20070126]
ACPI Error (exfldio-0289): Region EmbeddedControl(3) has no handler [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.LPCB.EC0_._REG] (Node ffff81007f35de10), AE_NOT_EXIST
ACPI: BIOS _OSI(Linux) query ignored
ACPI: DMI System Vendor: Zepto
ACPI: DMI Product Name: Znote
ACPI: DMI Product Version: 6025WD
ACPI: DMI Board Name: Znote
ACPI: DMI BIOS Vendor: Phoenix Technologies LTD
ACPI: DMI BIOS Date: 07/27/2007
ACPI: Please send DMI info above to linux-acpi@vger.kernel.org
ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org

I tried acpi_osi=Linux, but it didn't make a difference. There was another message that said

PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report

but that didn't help either.

However, if I boot with acpi=off the kernel does detect the CD device and it seems to work OK. But then it doesn't detect the built-in SD card reader at all.

Steps to reproduce:

1. Get Zepto 6025WD.
2. Install Linux 2.6.25.5.
3. Boot.
4. Try to access the CD/DVD RW device - there is none.
Comment 1 Lars Luthman 2008-06-07 12:38:40 UTC
Created attachment 16430 [details]
dmesg output without acpi=off
Comment 2 Lars Luthman 2008-06-07 12:39:01 UTC
Created attachment 16431 [details]
dmesg output with acpi=off
Comment 3 Zhang Rui 2008-06-11 02:00:24 UTC
>ACPI Error (evregion-0316): No handler for Region [ERAM] (ffff81007f35e720)
>[EmbeddedControl] [20070126]
hmm, could you please attach the acpidump output?
you can get the latest pmtools at http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Comment 4 Lars Luthman 2008-06-11 03:47:55 UTC
Created attachment 16455 [details]
acpidump output
Comment 5 Lars Luthman 2008-06-11 03:48:24 UTC
Created attachment 16456 [details]
lspci -v output too, in case it's useful
Comment 6 Len Brown 2008-06-12 13:45:39 UTC
dmesg.acpi=off shows the device:

ata4: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x1810 irq 14
...
ata4.01: ATAPI: TSSTcorp CDDVDW SN-S082H, SB00, max UDMA/33
ata4.01: configured for UDMA/33
scsi 3:0:1:0: CD-ROM            TSSTcorp CDDVDW SN-S082H  SB00 PQ: 0 ANSI: 5
Uniform Multi-Platform E-IDE driver
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Driver 'sr' needs updating - please use bus_type methods
sr0: scsi3-mmc drive: 62x/62x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 3:0:1:0: Attached scsi CD-ROM sr0
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 3:0:1:0: Attached scsi generic sg1 type 5

When you use this device, does its interrupts show up on irq 14?
please attach the /proc/interrupts for both normal and acpi=off boots.

Rather than "acpi=off", does booting with any of these help?:
pci=nommconf
acpi=noirq
noapic
Comment 7 Lars Luthman 2008-06-14 02:09:41 UTC
Attaching /proc/interrupts for normal and acpi=off boots.

None of pci=nommconf, acpi=noirq and noapic gets me a device file for the device, attaching dmesg output for all three cases.
Comment 8 Lars Luthman 2008-06-14 02:10:23 UTC
Created attachment 16467 [details]
/proc/interrupts without any ACPI boot options
Comment 9 Lars Luthman 2008-06-14 02:10:51 UTC
Created attachment 16468 [details]
/proc/interrupts with acpi=off
Comment 10 Lars Luthman 2008-06-14 02:11:16 UTC
Created attachment 16469 [details]
dmesg output for pci=nommconf
Comment 11 Lars Luthman 2008-06-14 02:12:02 UTC
Created attachment 16470 [details]
dmesg output for acpi=noirq
Comment 12 Lars Luthman 2008-06-14 02:12:26 UTC
Created attachment 16471 [details]
dmesg output for noapic
Comment 13 Shaohua 2008-06-17 23:55:05 UTC
how about boot option libata.noacpi=1?
Comment 14 Lars Luthman 2008-06-24 00:44:36 UTC
libata.noacpi=1 does not help, attaching dmesg output.
Comment 15 Lars Luthman 2008-06-24 00:45:34 UTC
Created attachment 16594 [details]
dmesg output with libata.noacpi=1
Comment 16 Zhang Rui 2008-07-09 19:59:08 UTC
I tried to reproduce your problem on my laptop. It only happens if I set AHCI to a specific mode, and it seems that this is not a Linux kernel bug.
Please list all your BIOS options for the SATA/IDE settings.
Comment 17 Lars Luthman 2008-07-10 08:33:25 UTC
Attaching screenshots from the relevant BIOS setup screens (Main, Main/IDE Channel 1 Master, Main/IDE Primary/Slave, Advanced).

The comments for the option Advanced/Large Disk Access Mode said that it should be set to "Other" for UNIX (it didn't mention Linux) so I tried that - it didn't help with the CD/DVD device and it broke my SD card reader, so I set it back to "DOS".
Comment 18 Lars Luthman 2008-07-10 08:34:34 UTC
Created attachment 16787 [details]
BIOS screenshot, Main
Comment 19 Lars Luthman 2008-07-10 08:35:38 UTC
Created attachment 16788 [details]
BIOS screenshot, Main/IDE Channel 1 Master
Comment 20 Lars Luthman 2008-07-10 08:36:30 UTC
Created attachment 16789 [details]
BIOS screenshot, Main/IDE Primary/Slave
Comment 21 Lars Luthman 2008-07-10 08:37:30 UTC
Created attachment 16790 [details]
BIOS screenshot, Advanced
Comment 22 Zhang Rui 2008-07-10 18:34:00 UTC
could you please change the type of IDE PRIMARY/SLAVE to "auto" and see if it helps?
please attach the screenshot after setting it to "auto".
Comment 23 Lars Luthman 2008-07-11 02:45:13 UTC
Screenshot attached. The only change was that some of the options on the Primary/Slave page got greyed out. It is still reported as [CD-ROM] on the Main page, and no device file appears in /dev.

While booting with a patched 2.6.22.5 (for audio and wireless) I found this in the dmesg output:

PCI: Probing PCI hardware (bus 00)
PCI: Transparent bridge - 0000:00:1e.0
PCI: Bus #07 (-#0a) is hidden behind transparent bridge #06 (-#07) (try 'pci=assign-busses')
Please report the result to linux-kernel to fix this permanently

Could it be relevant? Will try pci=assign-busses later, have to leave for a couple of hours now.
Comment 24 Lars Luthman 2008-07-11 02:46:09 UTC
Created attachment 16795 [details]
BIOS screenshot, Main/IDE Primary/Slave, AUTO
Comment 25 Zhang Rui 2008-11-16 23:24:01 UTC
Does the problem still exist in the latest kernel?
I did some tests on my test box and found that enabling/disabling ACPI can result in different SATA register values.
But sorry, I have no idea how to debug this problem, and IMO this may be an ATA/SCSI problem?

cc Tejun and Jeff.
Comment 26 Tejun Heo 2008-11-17 17:28:59 UTC
That's one weird bug.  Can you try 2.6.27.6 and see whether the problem is still there?  ACPI on/off is somehow affecting device presence detection on PATA, which is really weird.  I'll prep debug patches for 2.6.27.6 as soon as the problem is confirmed on 2.6.27.  Thanks.
Comment 27 Lars Luthman 2008-11-18 05:33:09 UTC
Tried with 2.6.27.6 now, the behaviour is a bit different.

Without acpi=off the kernel doesn't even finish booting, it gets stuck printing line after line of

  ACPI: EC: non-query interrupt received, switching to interrupt mode

After five minutes of this I rebooted with acpi=off, and then the DVD drive works, just as with earlier versions. It appears as /dev/scd0 (with a symbolic link /dev/sr0) and I can mount and read DVDs.

So not only is the problem still there in 2.6.27.6, it has gotten worse. Or maybe it's just detected earlier.
Comment 28 Tejun Heo 2008-11-18 16:37:32 UTC
Zhang Rui, can you please help here?  I don't have a clue.
Comment 29 Zhang Rui 2008-11-18 21:36:28 UTC
okay, the symptom on my test box is not the same as Lars'.
i.e. I have this problem only if AHCI mode is disabled in the BIOS.
I probably don't have this issue when SATA is set to AHCI mode, but I need to double check later.

some debug info is attached.
Comment 30 Zhang Rui 2008-11-18 21:40:43 UTC
Created attachment 18919 [details]
dmesg-acpi-off-ahci-disable
Comment 31 Zhang Rui 2008-11-18 21:43:26 UTC
Created attachment 18920 [details]
dmesg-acpi-on-ahci-disable
Comment 32 Zhang Rui 2008-11-18 21:46:20 UTC
Created attachment 18921 [details]
tree /sys/bus/scsi/devices/  when acpi=off
Comment 33 Zhang Rui 2008-11-18 21:58:26 UTC
Created attachment 18922 [details]
tree /sys/bus/scsi/devices/  when acpi=on
Comment 34 Zhang Rui 2008-11-18 22:02:37 UTC
Created attachment 18923 [details]
lspci -vxxx when acpi=off
Comment 35 Zhang Rui 2008-11-18 22:06:44 UTC
Created attachment 18924 [details]
lspci -vxxx when acpi=on
Comment 36 Zhang Rui 2008-11-18 22:30:32 UTC
when in AHCI mode, with ACPI enabled, I can only get one SCSI device
i.e.
#tree /sys/bus/scsi/devices/
/sys/bus/scsi/devices/
`-- 1:0:0:0 -> ../../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0

when acpi=off, kernel failed to get the rootfs.
Comment 37 Zhang Rui 2008-11-24 22:49:49 UTC
Hi, Tejun,
any other info needed?
Comment 38 Tejun Heo 2008-11-24 22:58:31 UTC
Does the following patch help?

  http://article.gmane.org/gmane.linux.ide/36218/raw
Comment 39 Zhang Rui 2008-11-24 23:25:00 UTC
no, :(

last time when I investigated this bug, I found that one port/link is inactive while it's active if acpi=off.
And the status of this port/link is read from a register.
I forget what the register is, but if you want, I can try to find it out.
Comment 40 Tejun Heo 2008-11-25 21:50:06 UTC
You're getting "nobody cared" on the interrupt handler, so the driver wouldn't really work right.  The misdetection could be a different problem though.  Can you please post the kernel log with the above patch applied?  Let's first clear the irq problem out of the way; then, I'll prep a patch to follow the detection sequence.  Thanks.
Comment 41 Zhang Rui 2008-11-26 00:03:34 UTC
Created attachment 19026 [details]
dmesg without the patch
Comment 42 Zhang Rui 2008-11-26 00:17:44 UTC
Created attachment 19027 [details]
dmesg with the patch
Comment 43 Tejun Heo 2008-11-26 00:41:11 UTC
Can you please try acpi=off w/o and w/ the patch?  Thanks.
Comment 44 Zhang Rui 2008-11-26 22:17:21 UTC
Created attachment 19043 [details]
dmesg without the patch when acpi is enabled

hah, I don't have this problem in 2.6.28-rc6.
do i still need to try the patch?
Comment 45 Zhang Rui 2008-11-26 22:29:35 UTC
Created attachment 19044 [details]
dmesg-acpi-off-ahci-disable

oops, forget what I said. :)
the problem only exists if acpi=off and ahci mode is disabled.
Comment 46 Tejun Heo 2008-11-26 23:32:59 UTC
Ah.. strange.  W/ acpi off, your machine is experiencing IRQ storm and this is with the spurious IRQ detection patch applied, right?  libata definitely needs to be more resilient against these problems but as for this specific problem, I'm running out of ideas.  Does irqpoll help?
Comment 47 Zhang Rui 2008-11-27 17:15:21 UTC
Created attachment 19059 [details]
dmesg-acpi-off-ahci-disable-with-the-patch

well, the dmesg in comment #45 was taken without the patch.
and with the patch applied, everything works fine. :p
Comment 48 Tejun Heo 2008-11-27 18:30:23 UTC
Thanks for confirming.  There are still issues regarding the patch but we now know what the problem is.  I'll keep discussing it on linux-ide.
Comment 49 Zhang Rui 2008-11-27 18:41:38 UTC
hmmm, the irq problem is clear now.
next is the sata detection problem, :)
I can do the tests as soon as the patch is available.
Comment 50 Tejun Heo 2008-11-27 21:13:03 UTC
Created attachment 19060 [details]
detection-debug.patch

Heh... almost forgot about that.  Here it is.
Comment 51 Zhang Rui 2008-12-01 22:37:59 UTC
Created attachment 19098 [details]
dmesg-acpi-off-with-sata-detect-patch
Comment 52 Zhang Rui 2008-12-01 22:39:06 UTC
Created attachment 19099 [details]
dmesg-acpi-on-with-sata-detect-patch
Comment 53 Zhang Rui 2008-12-01 22:41:18 UTC
Nothing changed; still only one active link is detected.
Comment 54 Tejun Heo 2008-12-02 18:49:54 UTC
This is interesting.

The following is w/ acpi off.

[   11.640762] ata1: XXX devmask=0x3
[   11.791030] ata1: XXX WAIT_READY M 0
[   11.791147] ata1: XXX WAIT_READY S 0
[   11.791286] ata1.00: XXX CLASSIFY TF 00/01:01:14:eb
[   11.791413] ata1.01: XXX CLASSIFY TF 00/04:01:14:eb

and acpi on.

[    0.886948] ata1: XXX devmask=0x0
[    1.038015] ata1: XXX WAIT_READY M 0
[    1.038147] ata1.00: XXX CLASSIFY TF 7f/7f:7f:7f:7f
[    1.038277] ata1.01: XXX CLASSIFY TF 7f/7f:7f:7f:7f

devmask is determined by writing values to the IDE TF registers and reading them back.  If a device is attached to the channel, the device stores the value and replies with the stored value.  If not, a bogus value is reported.  Classification works by reading the TF registers after SRST.  After SRST is complete, the device posts its status and classification code in its TF registers, and libata uses those values to determine whether a device is attached and, if so, which type.

With acpi turned on, the device is failing both detection methods and behaving exactly as if there is no device attached to the channel at all.  This is really strange.  I've never seen something like this.  I can't really think of any way acpi can affect this low level operation of ATA.  :-(
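
The two detection methods described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not libata's code: the FakeSlot class, the register names, the 0x55/0xAA test pattern and the helper names are all hypothetical, and the parser assumes the last two fields of the debug patch's "CLASSIFY TF" line are the LBA mid and LBA high registers. The signature values it checks (0x14/0xEB for ATAPI, 0x00/0x00 for ATA) are the standard post-reset device signatures.

```python
# Sketch only: class, register names and log field order are assumptions.

FLOATING = 0x7F  # value read from every register in the acpi-on log


class FakeSlot:
    """Toy model of one device position (master or slave) on a channel."""

    def __init__(self, present):
        self.present = present
        self.regs = {}

    def write(self, reg, value):
        if self.present:  # an attached device latches the written value
            self.regs[reg] = value

    def read(self, reg):
        # With nothing attached, the read returns a bogus bus value.
        if not self.present:
            return FLOATING
        return self.regs.get(reg, FLOATING)


def device_responds(slot):
    """Write a pattern to two TF registers and check that it reads back."""
    slot.write("nsect", 0x55)
    slot.write("lbal", 0xAA)
    return slot.read("nsect") == 0x55 and slot.read("lbal") == 0xAA


def devmask(master, slave):
    """Bit 0 is set if the master responded, bit 1 if the slave did."""
    mask = 0
    if device_responds(master):
        mask |= 1 << 0
    if device_responds(slave):
        mask |= 1 << 1
    return mask


def classify(tf_line):
    """Classify a device from a 'CLASSIFY TF xx/xx:xx:xx:xx' log line.

    Assumes the last two fields are LBA mid and LBA high.
    """
    raw = tf_line.split("TF", 1)[1].strip()
    fields = raw.replace("/", ":").split(":")
    lbam, lbah = int(fields[-2], 16), int(fields[-1], 16)
    if (lbam, lbah) == (0x14, 0xEB):
        return "ATAPI"
    if (lbam, lbah) == (0x00, 0x00):
        return "ATA"
    return "none/unknown"
```

Fed the lines quoted in this comment, classify("ata1.00: XXX CLASSIFY TF 00/01:01:14:eb") returns "ATAPI", while the acpi-on line with 7f in every field returns "none/unknown", which matches a channel that looks empty.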

Can you attach the results of "lspci -nnvvvxxx" with acpi turned off and on?  Maybe the controller channel has been disabled somehow.
Comment 55 Zhang Rui 2008-12-02 19:22:09 UTC
Created attachment 19114 [details]
lspci acpi=off
Comment 56 Zhang Rui 2008-12-02 19:23:50 UTC
Created attachment 19115 [details]
lspci acpi=on
Comment 57 Tejun Heo 2008-12-02 19:44:38 UTC
Thanks, that explained what was going on.

With ACPI turned on, bits 16-17 of 00:1f.1:54-57, the SIG_MODE field of the IDE_CONFIG register, are set to 01b, which is Tri-state (Disabled), while with ACPI off they are 00b, Normal (Enabled).  The signal pins are tristated and thus the controller can't see the device.  This probably has something to do with ACPI docking support.  cc'ing Holger Macht.  Holger, with ACPI turned on, this laptop tristates the IDE port and thus fails to detect the cdrom attached there.  This smells like something regarding the dock didn't go as planned.  Any ideas?

eek... Holger is not registered to OSDL bz.  I'll ask him to take a look via email.
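
The register check described in this comment can be reproduced from an "lspci -xxx" hex dump. The following is a minimal sketch, assuming the usual row format of that dump ("50: xx xx ..." with 16 bytes per row); the function names are hypothetical and the two sample rows are made up for illustration, not copied from the attachments. Only the two SIG_MODE values named in this report are decoded.

```python
# Sketch only: helper names and the sample rows below are hypothetical.

SIG_MODE_NAMES = {
    0b00: "Normal (Enabled)",
    0b01: "Tri-state (Disabled)",
}


def config_dword(hexdump, offset):
    """Read a little-endian 32-bit value from `lspci -xxx` style hex rows."""
    space = {}
    for line in hexdump.strip().splitlines():
        addr, _, data = line.partition(":")
        base = int(addr, 16)
        for i, byte in enumerate(data.split()):
            space[base + i] = int(byte, 16)
    return sum(space[offset + i] << (8 * i) for i in range(4))


def sig_mode(hexdump):
    """Decode bits 16-17 of the dword at config offset 0x54 (IDE_CONFIG)."""
    value = (config_dword(hexdump, 0x54) >> 16) & 0b11
    return SIG_MODE_NAMES.get(value, "other (not described in this report)")


# Made-up rows covering config offsets 0x50-0x5f.
ACPI_OFF_ROW = "50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
ACPI_ON_ROW = "50: 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00"

print(sig_mode(ACPI_OFF_ROW))
print(sig_mode(ACPI_ON_ROW))
```

Because PCI config space is little-endian, bits 16-17 of the dword at 0x54 live in the low two bits of the byte at offset 0x56, which is the seventh byte of the 0x50 row.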
Comment 58 Holger Macht 2008-12-03 02:29:01 UTC
Just to make sure, all the latest dmesg output is with 2.6.28-rc-something, right? Or still 2.6.27.6?

Anyway, there's
[    0.242695] ACPI: ACPI Dock Station Driver: 1 docks/bays found
in the log, and I wonder whether this actually is the ata drive or a possible other dock station. Actually I can't find anything in acpidump ;-) So what does /sys/devices/platform/dock.?/type contain?
Comment 59 Zhang Rui 2008-12-03 21:27:01 UTC
yes, they were taken from 2.6.28-rcX

#cat /sys/devices/platform/dock.0/type
dock_station

In fact, I use an ATX box for this mobile board, and there is no dock station available.
Comment 60 Lars Luthman 2008-12-26 06:09:18 UTC
I just tested kernel 2.6.28, the bug is still there. No DVD device file when I boot without acpi=off.
Comment 61 Lars Luthman 2009-03-27 14:49:13 UTC
Tested again with 2.6.29, same problem. The bug is still there.
Comment 62 Lars Luthman 2009-08-11 14:23:09 UTC
Turns out this bug has been fixed by a BIOS upgrade. Not Linux's fault then. Everything seems to work now.
Comment 63 Tejun Heo 2009-08-11 23:55:01 UTC
Resolving as invalid.  Thanks.