Distribution: All
Hardware Environment: IBM Thinkpad R40e
Software Environment: Kernel 2.6.8 vanilla (without any patches)
Problem Description: The system hangs immediately when the "processor" module is loaded, or when ACPI is initialized if the processor driver is built into the kernel instead of compiled as a module.
Steps to reproduce: In kernel menuconfig select:
[*] ACPI Support
<M> Processor
and try to boot.
(Sorry, I'm a non-native English speaker.)
This is what I understand the problem to be. You have this in your config:
[*] ACPI Support
<M> Processor
Later you boot with that kernel, try to "insmod processor.ko", and the system hangs. If you have
[*] ACPI Support
<*> Processor
the system hangs during the boot itself. Am I correct in my understanding of the problem? Can you please provide more details:
1) Was it working on some kernel version before 2.6.8 and started failing now? Or it never worked?
2) Does it print anything in console (when you insmod from text console), before hanging?
3) Please attach the output of acpidmp, available in /usr/sbin/ or in pmtools: http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Created attachment 3648
Output of acpidmp
> 3) Please attach the output of acpidmp, available in /usr/sbin/ or
> in pmtools: http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Here it is.
> Am I correct in my understanding of the problem?
Yes.
> 1) Was it working on some kernel version before 2.6.8 and
> started failing now? Or it never worked?
I tested 2.6.0. It behaves the same as 2.6.8, so it never worked.
> 2) Does it print anything in console (when you insmod from text console),
> before hanging?
insmod completes without error. The system hangs after insmod has finished. There is nothing unusual on the console; the CPU is detected and everything seems to be OK. It says:
"ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)"
Do you have any wireless driver loaded on this system at the time of the hang?
Do you have preemption enabled in kernel config?
> Do you have preemption enabled in kernel config?
I tried disabling it, but the behavior is the same.
> Do you have any wireless driver loaded on this system at the time of the hang?
No.
OK. This may be a platform-specific bug that shows up when the system tries to enter the C2/C3 state. Can you try the following? Keep the CPU busy by running something in the background, say "make -j8" of the kernel, and while it is busy try loading the processor driver. If the problem is with some C-states on this platform, then it should not hang while the CPU is busy. It will probably hang once the make is done.
Created attachment 3656
Test 1
You can also try this test patch, which disables the C-state part of the processor driver, to see if the hang goes away.
Created attachment 3657
Test 2
If the "Test 1" patch made the hang go away, then:
- Remove the Test 1 patch
- Apply the Test 2 patch
- And try again.
This patch disables only the C3 state. Hopefully these patches will get us to the root cause of the problem.
Created attachment 3660
Test 3
Patch "test1" works: the system does not hang after insmod. Patch "test2" does not work: the system hangs after insmod. I then tried disabling the C2 state (the same way patch "test2" disables C3), and the system does not hang after insmod.
/proc/acpi/processor/CPU/info:
processor id: 0
acpi id: 1
bus mastering control: yes
power management: yes
throttling control: yes
limit interface: yes
/proc/acpi/processor/CPU/power:
active state: C1
default state: C1
bus master activity: 00000000
states:
*C1: promotion[--] demotion[--] latency[000] usage[00208796]
C2: <not supported>
C3: promotion[--] demotion[--] latency[250] usage[00000000]
It seems that something is wrong with the C2 state ;)
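For anyone comparing several of these machines, the power listing above can be checked mechanically. The following is only a rough sketch in Python: the function name `parse_power` and the regular expressions are illustrative assumptions based on the text format shown in this comment, not on any kernel interface definition.

```python
import re

# Sample copied from the /proc/acpi/processor/CPU/power listing above.
POWER_TEXT = """\
active state: C1
default state: C1
bus master activity: 00000000
states:
*C1: promotion[--] demotion[--] latency[000] usage[00208796]
C2: <not supported>
C3: promotion[--] demotion[--] latency[250] usage[00000000]
"""

# A state line optionally starts with "*" (the active state), then "Cn:".
STATE_RE = re.compile(r"^\s*(\*?)(C\d+):\s*(.*)$")
# Each field looks like name[value], e.g. latency[250].
FIELD_RE = re.compile(r"(\w+)\[([^\]]*)\]")


def parse_power(text):
    """Return a dict mapping state name to its parsed fields (hypothetical helper)."""
    states = {}
    for line in text.splitlines():
        match = STATE_RE.match(line)
        if match is None:
            continue  # header lines such as "active state: C1"
        star, name, rest = match.groups()
        if "<not supported>" in rest:
            states[name] = {"supported": False, "active": False}
            continue
        fields = dict(FIELD_RE.findall(rest))
        states[name] = {
            "supported": True,
            "active": star == "*",
            "promotion": fields.get("promotion"),
            "demotion": fields.get("demotion"),
            "latency": int(fields.get("latency", "0")),
            "usage": int(fields.get("usage", "0")),
        }
    return states


if __name__ == "__main__":
    for name, info in parse_power(POWER_TEXT).items():
        print(name, info)
```

With the Test 3 listing, this reports C2 as unsupported and a usage count of zero for C3, i.e. C3 was never entered during that run.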
Yes. It surely looks like a C2 related bug on this platform. Can you please make sure that you are running with the latest BIOS. Also check whether there are any C-state specific options in the BIOS.
The BIOS is the latest (updating it was the first thing I did), and there are no C-state specific options in the BIOS.
Len: C2 state entry seems to hang on the IBM Thinkpad R40e. Have you come across any other similar complaints? Should we just disable C2 and C3 on this platform? Right now it simply hangs when the processor driver is loaded.
Same problem on the same machine, an R40e. The bug reporter said that he has bought about ten of these machines for teaching. He tested three of them, and all hang when trying to load the processor module. Test 3 (disabling C2) prevented the freeze -> the machine seems to work fine.
>> Test 3 (Disabling C2) prevented freeze -> machine seems to work fine.
This doesn't say anything about whether C3 has the problem or not, because the current C-state promotion/demotion algorithm doesn't use C3 if C2 is not supported. I feel that even C3 may have an issue here. It looks like this machine has to go onto some C-state blacklist.
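To illustrate why disabling C2 also keeps the CPU out of C3, here is a toy model in Python of a promotion chain in which each state can only be reached from the one before it. This is a simplified sketch and not the kernel's code: `build_policy` and `reachable` are hypothetical names, and the rule that C3 is linked in only when C2 is valid is taken from the statement above.

```python
STATES = ("C1", "C2", "C3")


def build_policy(valid):
    """Build promotion/demotion links for the valid states.

    valid: dict such as {"C1": True, "C2": False, "C3": True}.
    Assumption for this sketch: C3 is only linked in through C2.
    """
    policy = {name: {"promotion": None, "demotion": None} for name in STATES}
    if valid.get("C2"):
        policy["C1"]["promotion"] = "C2"
        policy["C2"]["demotion"] = "C1"
        if valid.get("C3"):
            policy["C2"]["promotion"] = "C3"
            policy["C3"]["demotion"] = "C2"
    return policy


def reachable(policy, start="C1"):
    """Follow promotion links from start and list every state visited."""
    chain = [start]
    while policy[chain[-1]]["promotion"] is not None:
        chain.append(policy[chain[-1]]["promotion"])
    return chain


if __name__ == "__main__":
    # Test 3 situation: C2 disabled, C3 still advertised by the BIOS.
    print(reachable(build_policy({"C1": True, "C2": False, "C3": True})))
    # Unpatched situation: all three states valid.
    print(reachable(build_policy({"C1": True, "C2": True, "C3": True})))
```

In this model the Test 3 configuration never leaves C1, which matches the promotion[--] entries and the zero C3 usage count reported with that patch, so that run cannot tell us whether C3 itself is safe.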
This is a nice example of how useful it could be to be able to override the DSDT. Please correct me if I'm wrong, but it should not be that difficult to override the _CST function in the DSDT to get the machines working. In this case, people buying an R40e will sooner or later get a BIOS update, as IBM is Linux friendly; others might not be so lucky. However, they still have to buy a USB floppy drive and/or face the difficulties and risks of flashing their BIOS (-> booting DOS to start the .exe?). Providing a new DSDT in /etc/DSDT.aml, running mkinitrd, and rebooting would be quite useful in this case..., right?
Probably true. But supplying an updated DSDT file for every platform and every BIOS update, and expecting the user to rebuild the kernel on some DSDT updates, doesn't seem very practical to me. Anyway, I am still trying to find out whether C3 states are totally broken on these systems or whether there is a kernel bug somewhere.
There are two ways in which the BIOS can export C-state information: P_LVLx and _CST. Looking at the acpidmp, the two agree with each other. But in the kernel we only use P_LVLx. I am not sure whether both of them can be broken; in that case we may see issues with "the other OS" as well.
Can someone try the attached test4 patch? It does everything as usual, except for the port accesses that take us into the C2 and C3 states. With this patch, I expect the system to continue running even after the processor module is added. I am curious to see the C2 and C3 addresses that get printed after the processor module is added, and also the "/proc/acpi/processor/CPU/power" output after adding the module.
Created attachment 3706
test4 patch
Tested patch4:
/proc/acpi/processor/*/power:
active state: C2
default state: C1
bus master activity: 00000001
states:
C1: promotion[C2] demotion[--] latency[000] usage[00000010]
*C2: promotion[C3] demotion[C1] latency[003] usage[00043169]
C3: promotion[--] demotion[C2] latency[250] usage[01402130]
tail /var/log/messages:
Sep 24 17:35:25 ibm3 kernel: lvl2[0x00008014] lvl3[0x00008015]
Sep 24 17:35:25 ibm3 kernel: ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)
According to your earlier statement, current kernels never use the _CST function? So my attempt to patch the DSDT could never work...
ACPI spec: Also notice that if the _CST object exists and the _PTC object does not exist, OSPM will use the processor control register defined in P_BLK and the P_LVLx registers in the _CST object.
I tried to provide an empty _PTC template (I think this could never work) and shortened _CST to return only the C1 state:
Name(_PTC, ResourceTemplate() { Register(FFixedHW, 0, 0, 0) } )
to make the kernel use the _CST method (which current kernels never use, right?).
Just for me: where can I find values to fill in the above template? Are they provided in some Celeron/Pentium spec?
Conclusion:
1) The broken R40e system cannot be fixed with an overridden DSDT; at least a new FADT is needed, or the kernel needs to be recompiled with patch/test 3?
2) Current kernels violate the ACPI spec -> _CST is never used.
AFAIK, pre-ACPI 2.0 had only P_LVLx-based C-state support. With this we can have up to the C3 state. This is what current kernels support. The fields used for this are in the FADT and the Processor object in the DSDT. ACPI 2.0 introduced _CST objects to support states beyond C3 (C4, ...). Support for this is work in progress.
Having said that, I don't think _CST support is going to give us anything different on this platform (except for the C4 state). Looking at the acpidmp, both the P_LVLx and the _CST objects specify the same I/O ports (8014 and 8015 for C2 and C3), and it looks like we hang when we try to do an "in" on 8014 (and/or 8015). That's what the test4 patch is telling us. So, if you want to change _CST to use other I/O ports by patching the DSDT, you can just as well change the Processor object to change the I/O ports used by the P_LVLx mechanism; changing _CST and _PTC will not be of much use.
The kernel doesn't violate the ACPI spec. It just supports pre-ACPI 2.0 as far as C-state support is concerned :).
Len: This system seems to have bad C-state information in both P_LVLx and _CST. When we do an "in" on the port mentioned while trying to enter a C-state, we hang. Just commenting out the "in" instruction seems to work fine.
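As a cross-check of the port numbers discussed above: in the ACPI processor register block (P_BLK), P_LVL2 sits at offset 4 and P_LVL3 at offset 5, after the 4-byte P_CNT register. The small Python sketch below parses the lvl2/lvl3 line printed by the test4 patch and derives the implied P_BLK base. The helper names are hypothetical, and the base address 0x8010 is an inference from the logged values, not something stated in the acpidmp output quoted in this report.

```python
import re

# Offsets of the level registers inside P_BLK, per the ACPI specification.
P_LVL2_OFFSET = 4
P_LVL3_OFFSET = 5

# Line printed by the test4 patch, copied from the log quoted earlier.
LOG_LINE = "Sep 24 17:35:25 ibm3 kernel: lvl2[0x00008014] lvl3[0x00008015]"

LVL_RE = re.compile(r"lvl2\[(0x[0-9a-fA-F]+)\]\s+lvl3\[(0x[0-9a-fA-F]+)\]")


def parse_lvl_ports(line):
    """Extract the C2 and C3 I/O port addresses from a test4 log line."""
    match = LVL_RE.search(line)
    if match is None:
        raise ValueError("no lvl2/lvl3 addresses found in line")
    return int(match.group(1), 16), int(match.group(2), 16)


def pblk_base(lvl2, lvl3):
    """Derive the P_BLK base and verify both ports fit the P_LVLx layout."""
    base = lvl2 - P_LVL2_OFFSET
    if lvl3 != base + P_LVL3_OFFSET:
        raise ValueError("lvl2/lvl3 do not match the P_BLK+4 / P_BLK+5 layout")
    return base


if __name__ == "__main__":
    lvl2, lvl3 = parse_lvl_ports(LOG_LINE)
    print("C2 port: 0x%04x, C3 port: 0x%04x" % (lvl2, lvl3))
    print("implied P_BLK base: 0x%04x" % pblk_base(lvl2, lvl3))
```

The two logged ports are consistent with a single P_BLK, which fits the observation that the kernel is reading exactly the addresses the BIOS advertises and that the hang comes from the port read itself.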
Patch in bugzilla #3549 is a clean workaround for this problem (until the actual problem gets fixed by BIOS, that is).
*** This bug has been marked as a duplicate of 3549 ***
So this is a BIOS problem, and not a consequence of the kernel only supporting pre-ACPI 2.0 C-states? If that is the case, then surely, as IBM is Linux friendly, they can produce a fix once we have e-mailed them about the problem... (I have one of these laptops; it is annoying not having power management and C2/C3 capability.)
It is true that present-day Linux only supports pre-ACPI 2.0 C-states. But on this particular system the BIOS has the same information in both P_LVLx and _CST. So even if Linux supported ACPI 2.0 based C-states (_CST, that is), it would fail in the same way. That's why I can say that this is a BIOS bug.