Bug 3219 - C-state failure on IBM ThinkPad R40e
Summary: C-state failure on IBM ThinkPad R40e
Status: REJECTED DUPLICATE of bug 3549
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: i386 Linux
Importance: P2 normal
Assignee: Venkatesh Pallipadi
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-08-15 12:44 UTC by Josef Kufner
Modified: 2004-12-16 16:55 UTC

See Also:
Kernel Version: 2.6.8 - vanilla
Tree: Mainline
Regression: ---


Attachments
Output of acpidmp (115.56 KB, text/plain)
2004-09-08 14:11 UTC, Josef Kufner
Test 1 (407 bytes, patch)
2004-09-09 13:50 UTC, Venkatesh Pallipadi
Test 2 (383 bytes, patch)
2004-09-09 13:53 UTC, Venkatesh Pallipadi
Test 3 (336 bytes, text/plain)
2004-09-12 14:21 UTC, Josef Kufner
test4 patch (1.22 KB, patch)
2004-09-23 21:23 UTC, Venkatesh Pallipadi

Description Josef Kufner 2004-08-15 12:44:33 UTC
Distribution: All
Hardware Environment: IBM Thinkpad R40e
Software Environment: Kernel 2.6.8 Vanilla (without any patches)
Problem Description: System immediately hangs when the "processor" module is
loaded, or when ACPI is initialized (if it is not compiled as a kernel module)

Steps to reproduce: in kernel menuconfig select:
[*] ACPI Support
<M>   Processor
and try to boot

(Sorry, I'm a non-native English speaker.)
Comment 1 Venkatesh Pallipadi 2004-09-03 14:05:42 UTC
This is what I understand the problem is:

You have this in config
[*] ACPI Support
<M>   Processor
Later you boot with that kernel and try to "insmod processor.ko" and the 
system hangs.

If you have  
[*] ACPI Support
<*>   Processor

System hangs during the boot itself.

Am I correct in my understanding of the problem?


Can you please provide more details.

1) Was it working on some kernel version before 2.6.8 and started failing now? 
Or it never worked?

2) Does it print anything in console (when you insmod from text console), 
before hanging?

3) Please attach the output of acpidmp, available in /usr/sbin/ or 
in pmtools:  http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/


Comment 2 Josef Kufner 2004-09-08 14:11:30 UTC
Created attachment 3648
Output of acpidmp

> 3) Please attach the output of acpidmp, available in /usr/sbin/ or 
> in pmtools:  http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Here it is.
Comment 3 Josef Kufner 2004-09-08 14:13:18 UTC
> Am I correct in my understanding of the problem?
Yes.

> 1) Was it working on some kernel version before 2.6.8 and
> started failing now? Or it never worked?
I tested 2.6.0. It behaves the same as 2.6.8. => It never worked.

> 2) Does it print anything in console (when you insmod from text console),
> before hanging?
Insmod completes without error. The system hangs after insmod has finished.
There is nothing unusual on the console; the CPU is detected and all seems to
be OK.
It says: "ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)"

Comment 4 Venkatesh Pallipadi 2004-09-08 14:17:53 UTC
Do you have any wireless driver loaded on this system at the time of the hang? 
Comment 5 Venkatesh Pallipadi 2004-09-08 14:20:40 UTC
Do you have preemption enabled in kernel config?
Comment 6 Josef Kufner 2004-09-09 03:20:28 UTC
> Do you have preemption enabled in kernel config?
I tried disabling it, but the behavior is the same.

> Do you have any wireless driver loaded on this system at the time of the hang?
No.
Comment 7 Venkatesh Pallipadi 2004-09-09 13:49:07 UTC
OK. Maybe this is a platform-specific bug that shows up when it tries to
enter the C2/C3 state.

Can you try doing this. Keep the CPU busy by running something in background. 
Say "make -j8" of kernel. And while it is busy try adding the processor 
driver. If it is the problem with some C states on this platform, then it 
should not hang when CPU is busy. Probably it will hang once the make is done.

Comment 8 Venkatesh Pallipadi 2004-09-09 13:50:24 UTC
Created attachment 3656
Test 1

You can also try this test patch, which disables C-state part of processor
driver to see if the hang goes away.
Comment 9 Venkatesh Pallipadi 2004-09-09 13:53:03 UTC
Created attachment 3657
Test 2

If the "Test 1" patch made the hang go away, then
- Remove Test 1 patch
- Apply Test 2 patch
- And try again.

This patch only disables C3 state. 

Hopefully we should get to the root cause of the problem with these patches.
Comment 10 Josef Kufner 2004-09-12 14:21:32 UTC
Created attachment 3660
Test 3

Patch "test1" works. The system does not hang after insmod.

Patch "test2" does not work. The system hangs after insmod.

I tried disabling the C2 state (the same way patch "test2" disables C3), and
the system does not hang after insmod.

/proc/acpi/processor/CPU/info:
processor id:		 0
acpi id:		 1
bus mastering control:	 yes
power management:	 yes
throttling control:	 yes
limit interface:	 yes

/proc/acpi/processor/CPU/power:
active state:		 C1
default state:		 C1
bus master activity:	 00000000
states:
   *C1: 		 promotion[--] demotion[--] latency[000] usage[00208796]
    C2: 		 <not supported>
    C3: 		 promotion[--] demotion[--] latency[250] usage[00000000]


It seems that something is wrong with the C2 state ;)
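
For anyone comparing several of these power listings, here is a minimal Python
sketch that turns the per-state lines into a dictionary. The function name
parse_power_states is hypothetical, and the sample assumes each "usage[...]"
fragment sits on the same line as its state (the listing may be wrapped when
pasted). It is only an illustration of the output format shown in this
comment, not a tool from pmtools or the kernel tree.

```python
import re

# Sample text: the listing from this comment, with each "usage[...]" fragment
# kept on the same line as its state (an assumption about the real layout).
SAMPLE = """\
active state:            C1
default state:           C1
bus master activity:     00000000
states:
   *C1:                  promotion[--] demotion[--] latency[000] usage[00208796]
    C2:                  <not supported>
    C3:                  promotion[--] demotion[--] latency[250] usage[00000000]
"""

# One state line: optional "*" marker, state name, then either
# "<not supported>" or the promotion/demotion/latency/usage fields.
STATE_RE = re.compile(
    r"^\s*(?P<active>\*?)(?P<name>C\d+):\s+"
    r"(?:<not supported>"
    r"|promotion\[(?P<promo>[^\]]+)\]\s+demotion\[(?P<demo>[^\]]+)\]\s+"
    r"latency\[(?P<latency>\d+)\]\s+usage\[(?P<usage>\d+)\])\s*$"
)


def _target(value):
    """Map the '--' placeholder to None, keep state names such as 'C2'."""
    return None if value == "--" else value


def parse_power_states(text):
    """Return {state name: info dict} for every C-state line in the text."""
    states = {}
    for line in text.splitlines():
        match = STATE_RE.match(line)
        if not match:
            continue  # header lines such as "active state:" are skipped
        name = match.group("name")
        if match.group("promo") is None:
            states[name] = {"supported": False, "active": False}
            continue
        states[name] = {
            "supported": True,
            "active": match.group("active") == "*",
            "promotion": _target(match.group("promo")),
            "demotion": _target(match.group("demo")),
            "latency": int(match.group("latency")),
            "usage": int(match.group("usage")),
        }
    return states


if __name__ == "__main__":
    for name, info in parse_power_states(SAMPLE).items():
        print(name, info)
```

With the sample above, C2 comes back as unsupported and C3 as supported but
with a usage count of zero, which matches the observation that C3 is never
entered once C2 is disabled.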
Comment 11 Venkatesh Pallipadi 2004-09-13 09:54:36 UTC
Yes. It surely looks like a C2 related bug on this platform. Can you please 
make sure that you are running with the latest BIOS. Also check whether there 
are any C-state specific options in the BIOS.
Comment 12 Josef Kufner 2004-09-14 01:24:58 UTC
The BIOS is the latest version (updating it was the first thing I did), and
there are no C-state specific options in the BIOS.
Comment 13 Venkatesh Pallipadi 2004-09-15 10:01:53 UTC
Len: IBM Thinkpad R40e C2 state entry seems to be hanging. Have you come
across any other similar complaints? Should we just disable C2 and C3 on this
platform? Right now it just hangs while adding the processor driver.


Comment 14 Thomas Renninger 2004-09-21 12:28:59 UTC
Same problem on same machine R40e.
Bug reporter told that he has bought about ten of those machines for holding
lessons. He tested on three machines, all hang when trying to load processor module.

Test 3 (Disabling C2) prevented freeze -> machine seems to work fine.
Comment 15 Venkatesh Pallipadi 2004-09-21 13:35:27 UTC
>> Test 3 (Disabling C2) prevented freeze -> machine seems to work fine.

This doesn't say anything about whether C3 has the problem or not, because
the current C-state promotion/demotion algorithm doesn't use C3 if C2 is not
supported.

I feel that even C3 may have an issue here. Looks like this machine has to go 
onto some C-state black list.
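
To illustrate why the Test 3 result is inconclusive for C3, here is a small
Python model of a promotion chain in which each state can only promote to the
next deeper one. This is a simplified sketch inferred from the
promotion[]/demotion[] fields in the /proc output in this bug, not the
kernel's actual code; the names build_policy and reachable_states are
hypothetical.

```python
ORDER = ["C1", "C2", "C3"]


def build_policy(valid):
    """Link neighbouring states, but only when both of them are valid.

    valid maps state names to booleans, e.g. {"C1": True, "C2": False, ...}.
    Returns (promotion, demotion) dictionaries; None means "no target".
    """
    promotion = {state: None for state in ORDER}
    demotion = {state: None for state in ORDER}
    for shallow, deep in zip(ORDER, ORDER[1:]):
        if valid.get(shallow) and valid.get(deep):
            promotion[shallow] = deep
            demotion[deep] = shallow
    return promotion, demotion


def reachable_states(valid, start="C1"):
    """Follow the promotion links from the start state and list what is hit."""
    promotion, _ = build_policy(valid)
    seen = []
    state = start
    while state is not None:
        seen.append(state)
        state = promotion[state]
    return seen


if __name__ == "__main__":
    cases = {
        "unpatched": {"C1": True, "C2": True, "C3": True},
        "Test 1 (no C2/C3)": {"C1": True, "C2": False, "C3": False},
        "Test 2 (no C3)": {"C1": True, "C2": True, "C3": False},
        "Test 3 (no C2)": {"C1": True, "C2": False, "C3": True},
    }
    for label, valid in cases.items():
        print(label, "->", reachable_states(valid))
```

In this model Test 1 and Test 3 both leave only C1 reachable, so the fact that
Test 3 does not hang cannot distinguish "only C2 is broken" from "C2 and C3
are both broken".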

Comment 16 Thomas Renninger 2004-09-23 18:18:05 UTC
This is a nice example of how useful it would be to be able to override the
DSDT. Please correct me if I'm wrong, but it should not be that difficult to
override the _CST function in the DSDT to get the machines working.

In this case people buying a R40e will get sooner or later a BIOS update as IBM
is Linux friendly, others might not have this luck.

However, they still have to buy a USB floppy and/or are confronted with the
difficulties/risks of flashing their BIOS (->booting DOS to start the .exe?).

Providing a new DSDT in /etc/DSDT.aml, mkinitrd and rebooting would be quite
useful in this case..., right?
Comment 17 Venkatesh Pallipadi 2004-09-23 21:22:30 UTC
Probably true. But, supplying the updated DSDT file for every platform, for 
every BIOS update, and expecting the user to rebuild the kernel on some DSDT 
updates, somehow doesn't seem very practical to me.


Anyways, I am still trying to find out, whether C3 states are totally broken 
on these systems or is there a kernel bug somewhere.
There are two ways in which the BIOS can export C-state information: P_LVLx
and _CST. Looking at the acpidmp, these two sources of information agree with
each other. But in the kernel we only use P_LVLx. Not sure whether both of
these tables can be broken. In that case we may see issues with "the other
OS" as well.

Can someone try the attached test4 patch. We do everything as usual, except 
for port accesses which takes us to C2 and C3 state. With this patch, I expect 
system to continue running, even after adding processor module. I am curious 
to see the C2, and C3 address that gets printed after processor module is 
added and also "/proc/acpi/processor/CPU/power" output after adding the module.

Comment 18 Venkatesh Pallipadi 2004-09-23 21:23:42 UTC
Created attachment 3706
test4 patch
Comment 19 Thomas Renninger 2004-09-24 08:53:01 UTC
Tested patch4:
/proc/acpi/processor/*/power:
active state:            C2
default state:           C1
bus master activity:     00000001
states:
    C1:                  promotion[C2] demotion[--] latency[000] usage[00000010]
   *C2:                  promotion[C3] demotion[C1] latency[003] usage[00043169]
    C3:                  promotion[--] demotion[C2] latency[250] usage[01402130]

tail /var/log/messages:
Sep 24 17:35:25 ibm3 kernel: lvl2[0x00008014] lvl3[0x00008015]
Sep 24 17:35:25 ibm3 kernel: ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)

According to your statement before:
Current kernels never use _CST function?
So my try to patch the DSDT could never work...:

ACPI spec:
Also notice that if the _CST object exists and the _PTC object does not exist,
OSPM will use the processor control register defined in P_BLK and the P_LVLx
registers in the _CST object.

I tried to provide an empty _PTC template (I think this could never work) and
shortened _CST to only return the C1 state:
Name(_PTC, ResourceTemplate() 
            { 
	        Register(FFixedHW, 0, 0, 0) 
	    } )
to let the kernel use the _CST method (which is never used in current kernels,
right?)

Just for me: Where can I find values to fill above template, are they provided
in some Celeron/Pentium spec?

Conclusion:
1) The broken R40e system cannot be fixed with an overridden DSDT, at least a
new FADT is needed, or the kernel needs to be recompiled with patch/test 3?

2) Current kernels violate the ACPI spec -> _CST is never used.
Comment 20 Venkatesh Pallipadi 2004-09-24 09:45:00 UTC
AFAIK, pre-ACPI 2.0 had only P_LVLx-based C-state support. With this we can
have up to the C3 state. This is what is supported in current kernels. The
fields used for this are in the FADT and the Processor object in the DSDT.

ACPI 2.0 introduced _CST objects to support more than C3 (C4,...) state. The 
support for this is work in progress.

Having said that, I don't think providing _CST support is going to give us
anything different on this platform (except for C4 state). Looking at the
acpidmp, both P_LVLx and _CST objects have the same IO ports specified (8014
and 8015 for C2 and C3).

And it looks like we hang when we try to do an "in" on 8014 (and/or 8015).
That's what the test4 patch is telling us.

So, if you want to change _CST to use any other IO ports by patching DSDT, you 
can also change Processor object to change the IO ports used through P_LVLx 
mechanism. And changing _CST and _PTC will not be of much use. Kernel doesn't 
violate ACPI spec. It just supports pre ACPI 2.0 as far as C-state support is 
concerned :).

Len: This system seems to have bad C-state information in both P_LVLx and
_CST. When we try an "in" on the port mentioned, while trying to go to a
C-state, we hang. Just commenting out the "in" instruction seems to work fine.
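
For reference, the two ports follow from the layout of the processor register
block: per the ACPI specification, P_LVL2 is at offset 4 and P_LVL3 at offset
5 from the P_BLK address given in the Processor object. A minimal Python
sketch of that arithmetic is below. The P_BLK value 0x8010 is not quoted
anywhere in this bug; it is inferred from the lvl2/lvl3 addresses printed by
the test4 patch, and the function name p_lvl_ports is hypothetical.

```python
# Offsets of the level registers inside the processor register block (P_BLK).
P_LVL2_OFFSET = 4  # a read from this port requests C2
P_LVL3_OFFSET = 5  # a read from this port requests C3


def p_lvl_ports(pblk_address):
    """Return the (P_LVL2, P_LVL3) I/O port addresses for a given P_BLK."""
    return pblk_address + P_LVL2_OFFSET, pblk_address + P_LVL3_OFFSET


if __name__ == "__main__":
    # Assumed P_BLK, inferred from "lvl2[0x00008014] lvl3[0x00008015]".
    lvl2, lvl3 = p_lvl_ports(0x8010)
    print("lvl2[0x%08x] lvl3[0x%08x]" % (lvl2, lvl3))
```

So changing the P_BLK address in the Processor object moves both ports
together, which is the P_LVLx counterpart of editing the addresses in _CST.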
Comment 21 Venkatesh Pallipadi 2004-10-13 06:48:58 UTC
Patch in bugzilla #3549 is a clean workaround for this problem (until the 
actual problem gets fixed by BIOS, that is).
Comment 22 Venkatesh Pallipadi 2004-11-08 13:43:32 UTC

*** This bug has been marked as a duplicate of 3549 ***
Comment 23 Peter Tiggerdine 2004-12-16 16:39:24 UTC
So this is a BIOS problem, and not that the kernel only supports pre-ACPI 2.0
C-states? If this is the case, then surely, as IBM are Linux friendly, once we
e-mail them about the problem they can make a fix.... (I have one of these
laptops; it is annoying not having power management and C2/C3 ability.)
Comment 24 Venkatesh Pallipadi 2004-12-16 16:55:02 UTC
It is true that current-day Linux only supports pre-ACPI 2.0 C-states.
But in this particular system the BIOS has the same information in both
P_LVLx and _CST. So even if Linux supported ACPI 2.0 based C-states (_CST,
that is), it would fail in the same way. That's why I can say that this is a
BIOS bug.
