Bug 6117

Summary: _DIS Interrupt Links prevents power button events -- MS-6167
Product: ACPI Reporter: Ryan Underwood (nemesis)
Component: Config-Interrupts Assignee: ykzhao (yakui.zhao)
Status: REJECTED INSUFFICIENT_DATA    
Severity: normal CC: acpi-bugzilla, rui.zhang
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.15.3 Subsystem:
Regression: --- Bisected commit-id:
Attachments: acpidump
dmesg
debug patch
dmesg after 1st debug patch
dmesg after 1st debug patch
acpidump after 1st debug patch
revised DSDT
aml code
lspci -xxxv
latest dmesg

Description Ryan Underwood 2006-02-21 19:24:40 UTC
Most recent kernel where this bug did not occur: 2.4.27-pre6
Distribution: Debian
Hardware Environment: MSI MS-6167 latest BIOS
Software Environment: 
Problem Description:
This change to drivers/acpi/pci_link.c from 2.4.27-pre6 to -rc1 broke the ACPI 
button:
@@ -698,6 +679,9 @@
 	acpi_link.count++;
 
 end:
+	/* disable all links -- to be activated on use */
+	acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL);
+
 	if (result)
 		kfree(link);

Steps to reproduce:
Build 2.4.27-rc1 without these lines: the ACPI button works.
Build with these lines: the ACPI button does not generate an IRQ or events.

The DSDT is posted on sourceforge.
Comment 1 Ryan Underwood 2006-02-21 22:21:49 UTC
By the way, I commented out those lines from 2.6.15.3 and the power button works 
now (IRQ is generated, event appears at /proc/acpi/event)
Comment 2 Shaohua 2006-03-06 17:49:11 UTC
Please attach the dmesg and acpidump. I'd like to see if there are devices 
which share an interrupt with ACPI.
Comment 3 Ryan Underwood 2006-03-06 18:16:08 UTC
Created attachment 7519
acpidump
Comment 4 Ryan Underwood 2006-03-06 18:16:26 UTC
Created attachment 7520
dmesg
Comment 5 Shaohua 2006-03-06 19:21:40 UTC
Created attachment 7521
debug patch

I guess I got the root cause. Can you please try the attached patch?
Comment 6 Ryan Underwood 2006-03-06 23:37:07 UTC
No, that did not fix the problem.  Here is new dmesg and acpidump.
Comment 7 Ryan Underwood 2006-03-06 23:40:09 UTC
Created attachment 7523
dmesg after 1st debug patch
Comment 8 Ryan Underwood 2006-03-06 23:40:14 UTC
Created attachment 7524
dmesg after 1st debug patch
Comment 9 Ryan Underwood 2006-03-06 23:40:42 UTC
Created attachment 7525
acpidump after 1st debug patch
Comment 10 Shaohua 2006-03-06 23:45:36 UTC
Oops. Please add the two lines at the end of 'acpi_scan_init' (scan.c) like 
this:

	if (result)
		acpi_device_unregister(acpi_root, ACPI_BUS_REMOVAL_NORMAL);
+extern void eisa_set_level_irq(unsigned int irq);
+eisa_set_level_irq(9);
+
      Done:
	return_VALUE(result);
Comment 11 Ryan Underwood 2006-03-07 09:23:02 UTC
Same problem, no IRQ and no event.  BTW, was I supposed to also leave the 
previous change?  I removed it.
Comment 12 Ryan Underwood 2006-03-07 13:47:53 UTC
Maybe if you explain the problem I can help find the solution.  Is the 
triggering mode of IRQ 9 wrong?  If so can the kernel parameter 
acpi_pic_sci=edge still help?
Comment 13 Shaohua 2006-03-07 16:51:07 UTC
Sure. The _DIS method changes the IRQ's trigger mode on your system. We 
set IRQ 9 (the ACPI interrupt) to level trigger and then load the pci_link 
driver (which calls the _DIS method). So, from my understanding, we should 
reset IRQ 9 to level trigger after all _DIS calls. I wonder why my last patch 
doesn't help.
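
The ordering described above can be sketched in user-space C. This is only a model under stated assumptions: the two ELCR ports are folded into one 16-bit mask, and `simulated_dis` stands in for the assumed side effect of this board's _DIS (dropping the link's current IRQ back to edge trigger). Both function names are hypothetical and are not kernel APIs.

```c
#include <stdint.h>

/* Model: bit n of the mask is set when IRQ n is level triggered. */
static uint16_t elcr_set_level(uint16_t elcr, unsigned int irq)
{
	return (uint16_t)(elcr | (1u << irq));
}

/*
 * Assumption: on this board _DIS clears the ELCR bit of the link's
 * current IRQ, so that IRQ falls back to edge trigger.
 */
static uint16_t simulated_dis(uint16_t elcr, unsigned int link_irq)
{
	return (uint16_t)(elcr & ~(1u << link_irq));
}
```

With IRQ 9 set to level first, `simulated_dis(mask, 9)` clears bit 9 again, so in this model the level setting only survives if `elcr_set_level(mask, 9)` is repeated after the last _DIS.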
Comment 14 Shaohua 2006-03-07 16:55:59 UTC
Hey, maybe there is an IRQ pending just before we reset the IRQ to level 
trigger, and the pending IRQ is missed, which leaves the ACPI controller in a 
wrong state.
Comment 15 Shaohua 2006-03-07 17:05:15 UTC
Created attachment 7529
revised DSDT

Can you try this DSDT? You should override your DSDT with it. I removed the 'irq
trigger change' code from your original DSDT, so _DIS will no longer have the
side effect.
Comment 16 Ryan Underwood 2006-03-07 17:32:42 UTC
What about disabling interrupts while changing the trigger mode?
Comment 17 Ryan Underwood 2006-03-07 17:33:31 UTC
Has DSDT-as-initrd been merged yet?  If not, what is the recommended way to 
override the DSDT currently?
Comment 18 Ryan Underwood 2006-03-07 17:40:17 UTC
Never mind, I found the following.
http://gentoo-wiki.com/HOWTO_Fix_Common_ACPI_Problems#3._Build-in_Options_for_Kernel_2.6.9_and_Later
Comment 19 Shaohua 2006-03-07 17:47:08 UTC
Created attachment 7530
aml code

Or maybe I should give you the revised AML code, so you can compile it your
way.
Comment 20 Ryan Underwood 2006-03-07 18:07:47 UTC
It's OK, I just hope there is a way to fix it besides DSDT override for a silly 
problem.  Maybe a quirks entry?
Comment 21 Shaohua 2006-03-07 18:13:23 UTC
Sure, but does this motherboard work in WinXP?
Comment 22 Ryan Underwood 2006-03-07 18:26:58 UTC
The fixed DSDT did not change the behavior.  Maybe something else is going on?  
Since this is very strange, I did verify that the kernel is rebuilt, and AmlCode 
appears in System.map:
c02e33a0 D AmlCode
Comment 23 Ryan Underwood 2006-03-07 18:27:30 UTC
Yeah, everything seems to work in WinXP, and also in any Linux <= 2.4.26.
Comment 24 Ryan Underwood 2006-03-07 18:30:31 UTC
   tbget-0284: *** Info: Table [DSDT] replaced by host OS
Comment 25 Shaohua 2006-03-07 18:31:54 UTC
Can you send me the lspci -xxxv output then?
Comment 26 Ryan Underwood 2006-03-07 18:36:42 UTC
Created attachment 7531
lspci -xxxv
Comment 27 Shaohua 2006-03-07 18:49:18 UTC
Oops, I forgot that the ELCR register is just I/O ports.
Please give me the output of 'inb 0x4d0' and 'inb 0x4d1' with and without the 
overridden DSDT (but without any patch). I'm sorry for my last mistake.
Comment 28 Ryan Underwood 2006-03-07 19:02:46 UTC
You want *both* with and without the override DSDT?
Comment 29 Shaohua 2006-03-07 19:07:28 UTC
Yes. The ELCR register determines irq's level/edge trigger.
Comment 30 Ryan Underwood 2006-03-07 19:47:18 UTC
With DSDT:
zephyr:~# inb 0x4d0
Port 0x4d0 has value 0x0 (00000000).
zephyr:~# inb 0x4d1
Port 0x4d1 has value 0xe (00001110).

Without DSDT:
zephyr:~# inb 0x4d0
Port 0x4d0 has value 0x0 (00000000).
zephyr:~# inb 0x4d1
Port 0x4d1 has value 0xc (00001100).
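
A small sketch of how these two port values map to per-IRQ trigger modes, assuming the usual ELCR layout (port 0x4d0 covers IRQ 0-7, port 0x4d1 covers IRQ 8-15, a set bit means level triggered). The helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Returns 1 if `irq` is level triggered according to the two ELCR
 * port values, 0 if it is edge triggered or out of range.
 */
static int elcr_irq_is_level(uint8_t port_4d0, uint8_t port_4d1,
			     unsigned int irq)
{
	if (irq < 8)
		return (port_4d0 >> irq) & 1;
	if (irq < 16)
		return (port_4d1 >> (irq - 8)) & 1;
	return 0;
}
```

For the readings above, `elcr_irq_is_level(0x00, 0x0e, 9)` is true and `elcr_irq_is_level(0x00, 0x0c, 9)` is false, while IRQ 10 and IRQ 11 are level triggered in both cases.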
Comment 31 Shaohua 2006-03-07 19:58:28 UTC
Thanks! It looks like the revised DSDT makes IRQ 9 level triggered. That's good.
But as you said, the power button still can't generate an ACPI interrupt. 
That's strange. I'll look more closely at your DSDT. I'll let you know if I 
have other ideas.
Comment 32 Len Brown 2006-03-08 00:21:36 UTC
Is the issue that acpi events, such as the power button,
don't work, or that poweroff doesn't work?

i.e. does
# init 0
still work?

> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 11 12 14 15) *9
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 11 12 14 15) *9

The BIOS hands the system to the OS with LNKA and LNKD
with a current IRQ that is outside the list of legal IRQs.
(this is a BIOS bug, but not a rare one)
Linux disables these links, and then when it enables them,
it chooses IRQ11 from the list for both LNKA and LNKD.
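
The check behind this can be sketched as below: a plain membership test of the current IRQ (_CRS) against the list of possible IRQs (_PRS). It is a stand-alone illustration with a hypothetical function name, not the pci_link.c code.

```c
#include <stddef.h>

/* Returns 1 when `active` appears in the `possible` list, else 0. */
static int irq_in_possible_list(unsigned int active,
				const unsigned int *possible, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (possible[i] == active)
			return 1;
	return 0;
}
```

With the list from the dmesg lines above (3 4 5 6 7 10 11 12 14 15), the current IRQ 9 is not a member, while IRQ 11 is.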

Out of curiosity...
Is it possible to disconnect/unload the devices/drivers
on LNKA and see if your ACPI button presses register
interrupts on IRQ11 in /proc/interrupts?
Comment 33 Ryan Underwood 2006-03-08 09:49:31 UTC
Yes, ACPI poweroff works, it's just the button press doesn't generate an IRQ/
event in /proc/acpi/event.

I tried doing what you said, but when I remove all the other drivers, the entry 
in /proc/interrupts goes away.  I re-inserted the sound driver which I figured 
would be low/no interrupt traffic when not in use.  However, watching the # of 
interrupts while pressing the power button provided no change in the number.
Comment 34 Shaohua 2006-03-08 17:04:32 UTC
Then how about something like below: (the change is in acpi_pci_link_allocate, 
just delete one line)
        /*
	 * forget active IRQ that is not in possible list
	 */
	if (i == link->irq.possible_count) {
		if (acpi_strict)
			printk(KERN_WARNING PREFIX "_CRS %d not found"
			       " in _PRS\n", link->irq.active);
-		link->irq.active = 0;
	}
Please try this with the DSDT override. BTW, with the DSDT override, did you 
observe other ACPI interrupts besides the button?
Comment 35 Len Brown 2006-03-08 18:21:01 UTC
It should be easy to fix this system; the question
is the risk of the fix to other systems.  Options in
order of risk:

1. add cmdline flag to skip the Link _DIS for all links
1a. optionally invoke it automatically with DMI

Note #1. The ELCR init code assumes we've disabled all
the links, and just has a special case for the SCI.
So that needs to be un-done in this case.

2. For systems with non-zero(2a) CRS outside PRS
   don't _DIS that particular link.

Note #2, also has ELCR issue as in Note #1.

Note #2a, some systems return CRS 0 always...

   This is a workaround for the CRS outside PRS issue,
   where we can't possibly get back the initial
   condition after we evaluate _DIS.

3. Don't run _DIS on links that are *used*
   We don't know what links will be used, so we
   need to keep track of referenced links and then
   evaluate _DIS on the others.

   The assumption here is that we'll get spurious
   interrupts if we don't run _DIS on unused links --
   which is why we started running _DIS in the first place.

   This is not possible in the general case, because
   a device can be added anytime after boot which
   creates a new reference to a Link.


4. Never run _DIS, except with special cmdline or
   DMI to enable that option.

Note 4. need to repair ELCR code, and Link init code at same time.
Note 4a. probably makes sense to try this first on PIC-mode systems
where we do NO irq balancing by default.  Unclear if IOAPIC-mode
systems will need to be different from PIC-mode in this area.

I recommend that we go with option #4 and see what we run into.
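
Option 3 above can be sketched roughly as follows. This is a user-space model only: `struct link_state`, its fields, and `disable_unused_links` are hypothetical names, and setting `disabled` stands in for evaluating _DIS. It also shows the limitation noted for option 3, since a reference added after the pass finds the link already disabled.

```c
#include <stddef.h>

struct link_state {
	const char *name;
	unsigned int refcount;	/* devices currently routed through the link */
	int disabled;		/* set once the simulated _DIS has run */
};

/*
 * Run the simulated _DIS only on links that nobody references.
 * Returns the number of links disabled by this call.
 */
static size_t disable_unused_links(struct link_state *links, size_t count)
{
	size_t i, n = 0;

	for (i = 0; i < count; i++) {
		if (links[i].refcount == 0 && !links[i].disabled) {
			links[i].disabled = 1;
			n++;
		}
	}
	return n;
}
```

A caller would bump `refcount` whenever a device is found behind a link and run `disable_unused_links` once enumeration is believed to be complete.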

Comment 36 Ryan Underwood 2006-03-08 19:28:20 UTC
If the whole point of running _DIS was to get around spurious interrupts which 
only occur on some systems, I agree with option 4 as well - assuming the case of 
spurious interrupts is more of a rarity than the case of a broken BIOS like 
mine.

Note: I guess if the BIOS gives you a strange CRS/PRS *and* you get spurious 
interrupts when _DIS is not invoked, you are screwed?
Comment 37 Ryan Underwood 2006-03-08 19:56:32 UTC
David, what other ways could I generate an ACPI interrupt?

Also, your one-line patch did not fix the problem, even with the DSDT override.  
I applied it alone; was it intended to go with one of the other patches?
Comment 38 Ryan Underwood 2006-03-08 19:57:13 UTC
Created attachment 7538
latest dmesg

I enabled acpi=strict to get a few more debug messages.  You might notice
that other things are getting IRQ 9 now.
Comment 39 Shaohua 2006-03-09 18:16:37 UTC
I double-checked your DSDT. The _DIS really just does:
1. clear ELCR register
2. disable routing devices

The overridden DSDT removed 1. For 2, we will enable the routing devices later. 
So very likely the enabling doesn't fully reverse the disabling operations. We 
need the chipset datasheet to understand the DSDT code.

The biggest issue that confuses me is why changing routing devices affects 
ACPI interrupts. IIRC, ACPI interrupts don't go to routing devices. But maybe 
this is just a strange device.

Reassigning this to Len. I guess he will provide a patch for option 4 in 
comment 35.
Comment 40 Ryan Underwood 2006-03-10 10:05:19 UTC
http://cdrom.amd.com/21860/22548.pdf
It doesn't seem very useful though.
Comment 42 Ryan Underwood 2006-06-15 13:49:28 UTC
Anything new?  I'd actually like to give this motherboard to someone else as I 
plan to upgrade, but it would be nice if I could check that this bug were fixed 
first.
Comment 43 ykzhao 2007-12-13 23:27:01 UTC
Hi, Ryan
Does the problem still exist in the latest kernel release?
Please try 2.6.23 and attach the dmesg output.
Comment 44 Ryan Underwood 2007-12-24 09:19:46 UTC
I'm not living near the computer right now but I'll try as soon as I can.
Comment 45 ykzhao 2008-02-01 23:51:18 UTC
There has been no response for more than one month. The bug will be rejected. 
If the problem still exists, please open a new bug.
Thanks.