Bug 6316 - PnPACPI problems with new resource manager - Fujitsu Lifebook C-1020
Summary: PnPACPI problems with new resource manager - Fujitsu Lifebook C-1020
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Interrupts (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Shaohua
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-04-02 06:32 UTC by Michael Karcher
Modified: 2007-03-10 21:24 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.16
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
DSDT source code (141.40 KB, text/plain)
2006-04-02 06:35 UTC, Michael Karcher
Details
Fixes flags in pnpacpi code (7.07 KB, patch)
2006-04-07 06:26 UTC, Michael Karcher
Details | Diff
Updated patch for kernels since 2.6.19 (7.20 KB, patch)
2007-03-08 01:14 UTC, Michael Karcher
Details | Diff

Description Michael Karcher 2006-04-02 06:32:04 UTC
Most recent kernel where this bug did not occur: 2.6.15
Distribution: Debian
Hardware Environment: Fujitsu Lifebook C1020
Software Environment:
Problem Description:

ACPI Plug'n'play fails to enable the parallel port after S3 with

> ACPI Error (exoparg2-0397): Index value (000000005) beyond package end (4)
[20060127]
> ACPI Error (psparse-0517): Method parse/execution failed
[\_SB_.PCI0.PIB_.SIOD.PSRS] (Node c144e420), AE_AML_PACKAGE_LIMIT
> ACPI Error (psparse-0517): Method parse/execution failed
[\_SB_.PCI0.PIB_.SIOD.ECP_._SRS] (Node c144e580), AE_AML_PACKAGE_LIMIT
> pnp: Failed to activate device 00:0a.

The cause of this problem seem to be the PSRS function on setting the DMA value
trying to set dma channel 5. This should not happen, as the only allowed
channels are 0, 1 and 3. The DSDT (attached) looks for the DMA channel mask byte
at offset 0x14 in the resource buffer passed to _SRS. The new abstraction layer
of the resource manager puts in the 3-byte-version of the interrupt descriptor
at offset 8, so everything afterwards is shifted one byte, so the DSDT does not
read the DMA mask byte now at 0x15, but the resource descriptor type byte 0x2A,
it then uses FindSetLeftBit to find the number of the highest (and it should be
only) set bit in this byte, which is 5.

Steps to reproduce:
Reproducable only on the given hardware, error message appears after entering S3
Comment 1 Michael Karcher 2006-04-02 06:35:03 UTC
Created attachment 7743 [details]
DSDT source code

problem occurs in line 2659, where Local0 is 5, and DMAP is a package of length
4.
Comment 2 Robert Moore 2006-04-03 15:17:35 UTC
Is the sequence of events something like: 1) GetCurrentResources on device 
ECP, 2) make modifications to the resource descriptor(s), 3) 
SetCurrentResources on ECP?

You are saying that in step 3, the resource template is constructed 
incorrectly?
Comment 3 Robert Moore 2006-04-03 15:34:52 UTC
Please post the contents of the buffer that is being passed to 
acpi_set_current_resources, and if possible, the buffer returned from the 
earlier call to get_current_resources or get_possible_resources.
Comment 4 Robert Moore 2006-04-03 16:04:04 UTC
I think the short description may be: resource manager is emitting a length 3 
IRQ descriptor when the length 2 short IRQ descriptor is expected 
(IRQNoFlags). Still need to see the buffers requested previously.


Comment 5 Michael Karcher 2006-04-04 04:35:34 UTC
Exactly, Robert. Your comment 4 says it all. The Buffer returned from CRS can be
found in the DSDT and looks like

ResourceTemplate ()
{
    IO (Decode16,0x0378,0x0378,0x01,0x08)
    IRQNoFlags () {7}
    IO (Decode16,0x0778,0x0778,0x01,0x08)
    DMA (Compatibility, NotBusMaster, Transfer8) {3}
}

or, as hexadecimal bytes:
 0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F 10 11 12 13 14 15
-----------------------------------------------------------------
47 01 78 03 78 03 01 08 22 80 00 47 01 78 07 78 07 01 08 2A 08 00

whereas the buffer passed to _SRS looks like
 0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F 10 11 12 13 14 15 16
--------------------------------------------------------------------
47 01 78 03 78 03 01 08 23 80 00 00 47 01 78 07 78 07 01 08 2A 08 00

The firmware code looks at byte 14, and thus tries to program the parallel port
to DMA 5 which fails, as the lookup table for the data written into the
configuration register is only for the range 0-3.
Comment 6 Robert Moore 2006-04-04 13:55:15 UTC
Here is what I am seeing with the current ACPICA version (20060331):

[ACPI Debug]  String: [0x07] "ECP_CRS"
[ACPI Debug]  Buffer: [0x18]
  0000: 47 01 78 03 78 03 01 08 22 80 00 47 01 78 07 78
  0010: 07 01 08 2A 08 00 79 00

[ACPI Debug]  String: [0x07] "ECP_SRS"
[ACPI Debug]  Buffer: [0x19]
  0000: 47 01 78 03 78 03 01 08 22 80 00 47 01 78 07 78
  0010: 07 01 08 2A 08 00 79 00 00

Other than an extra trailing zero (should not cause a problem), the buffers 
are identical.

We did make a couple of fixes to the resource manager since the 20060127 
release, and this problem may be fixed (current release is 20060331.)

It still may be useful to see the buffer in the internal resource format, as 
returned by the get_current_resources interface and passed as a parameter to 
the set_current_resources interface. Please dump these buffers and post them.

In the meantime, I will regenerate the 20060127 release and see if the result 
can be reproduced.


Comment 7 Robert Moore 2006-04-04 14:47:49 UTC
The extra byte is being generated because the resource manager believes that 
the interrupt is level triggered. The default IRQNoFlags descriptor can only 
be used for edge triggered interrupts.

I suspect that the reason behind this is because we fixed an error with two 
defines in actypes.h. Originally, they were:
/*
 *  IRQ Attributes
 */
#define ACPI_EDGE_SENSITIVE             (UINT8) 0x00
#define ACPI_LEVEL_SENSITIVE            (UINT8) 0x01

Which is backwards from the ACPI spec and the IRQ flags. We fixed this, and 
the new defines are reversed:

#define ACPI_LEVEL_SENSITIVE            (UINT8) 0x00
#define ACPI_EDGE_SENSITIVE             (UINT8) 0x01

You may be running into some code that assumes the previous definition and is 
inadvertently setting the IRQ to level-triggered instead of edge-triggered.
Comment 8 Robert Moore 2006-04-04 14:52:37 UTC
Could not reproduce under 20060127 with a test program that runs _CRS via 
acpi_get_current_resources, then calls acpi_set_current_resources with the 
same buffer.

Therefore, I think that a driver above the ACPICA code is indavertently 
setting the IRQ to level-triggered, forcing the resource manager to emit the 
larger IRQ resource descriptor.
Comment 9 Robert Moore 2006-04-05 12:09:00 UTC
You might also want to look for code of the form:

if (irq->triggered)
{
   // assumes level-sensitive
}

should be:

if (irq->triggered == ACPI_LEVEL_SENSITIVE)
Comment 10 Robert Moore 2006-04-05 15:04:57 UTC
->triggering if using the acpi headers
Comment 11 Michael Karcher 2006-04-07 06:26:15 UTC
Created attachment 7801 [details]
Fixes flags in pnpacpi code

This fix to the Plug'n'play provider using ACPI as backend fixes this bug, I
suppose. Tested on one machine (A Thinkpad T20) yet.
Comment 12 Michael Karcher 2006-04-07 06:29:26 UTC
Hello robert,
you are right. The problem is in the pnpacpi driver. It messes up the flags of
the resources if it just wants to reenable a device that already was activated
during boot. A fix for that problem, that fixes a similar to the reported
problem on my laptop, is in attachment 7801 [details]. I will get access to the machine
the original report was for tomorrow, and will test the fix there too.

Thanks for your help, Michael Karcher

PS: what is the right category/component for PnPACPI? Is it ACPI/Other or
Drivers/PnP or something else?
Comment 13 Robert Moore 2006-04-07 13:51:01 UTC
>res->irq_resource[i].flags = IORESOURCE_IRQ;  // Also clears _UNSET flag

So this code was hardcoding the interrupt flags regardless of what the BIOS 
said? If so, it's a very good thing that we caught this.

There is no subsection for PnP ACPI issues, perhaps there should be. In the 
meantime "other" may have to suffice.
Comment 14 Michael Karcher 2006-04-08 02:37:50 UTC
Hello Robert,
you are right. The current (unpatched) code hardcoded the flags to no(!) bit for the type set, which is an invalid combination in the PNP layer (they have a bit for all four possibilities), leading to uninitialized variables if I read the code correctly.
The problem only occurred on re-activating a device with the resources it was booted with, everything is working fine on activating a device the PnP-layer auto-assigned resources to, as an other codepath is used in that case. So you could also look at this problem as a S3 resume issue.

The patch now has also been tested on the machine I reported the problem for (a Fujitsu Lifebook C-1020) and works well on that machine too.

Michael Karcher

PS: who is responsible for closing the resolved bug? Me or you?
Comment 15 Shaohua 2006-05-17 20:47:00 UTC
Acked. The patch looks quite good to me.
Comment 16 Luming Yu 2006-05-17 23:30:39 UTC
Tested with Fujitsu Lifebook S6130, didn't find the error described in the bug 
description. So, my lifebook S6130 is quite different with yours, and is not 
affected by this bug.  Anyway, I tested the patch, no error found.

Thanks,
Luming
Comment 17 Len Brown 2007-03-08 00:01:38 UTC
a different patch went into 2.6.19 mucking with
rsparser.c dma  resources.

what part of this patch still makes sense?
Comment 18 Michael Karcher 2007-03-08 01:14:03 UTC
Created attachment 10652 [details]
Updated patch for kernels since 2.6.19

The bug is still present in current kernels. I have attached an updated version
of this patch.
Comment 19 Len Brown 2007-03-08 20:32:59 UTC
patch in comment #18 applied to acpi-test
Comment 20 Len Brown 2007-03-10 21:24:57 UTC
shipped in 2.6.21-rc3-git6
closed

Note You need to log in before you can comment on or make changes to this bug.