Bug 3549 - add option/dmi to disable c2/c3 in processor.c
Status: CLOSED CODE_FIX
Product: ACPI
Classification: Unclassified
Component: Config-Processors
Hardware: i386 Linux
Importance: P2 normal
Assigned To: Len Brown
Duplicates: 3219 3406
Depends on:
Blocks:
Reported: 2004-10-11 06:02 UTC by Andi Kleen
Modified: 2008-07-26 00:45 UTC (History)

See Also:
Kernel Version: 2.6.9rc3
Tree: Mainline
Regression: ---


Attachments
Disable C2/C3 on R40 and add option (2.21 KB, patch)
2004-10-11 06:04 UTC, Andi Kleen
One Medion laptop seems to have the same problem. (438 bytes, patch)
2004-10-11 06:05 UTC, Andi Kleen
acpi_cstate_limit patch included in 2.6.10-rc2 (3.58 KB, patch)
2004-12-01 15:18 UTC, Len Brown
max_cstate patch on top of previous two patches for 2.6.10-rc3 (6.93 KB, patch)
2004-12-01 15:20 UTC, Len Brown
IBM_R40e_C2C3lockup_fix: Blacklist by Board, not BIOS Version (576 bytes, patch)
2005-03-07 02:25 UTC, Thomas R. (TRauMa)
linux-2.6.11-IBM_R40e_C2C3lockup_fix-2: Blacklist all known BIOS Versions (2.02 KB, patch)
2005-03-08 00:08 UTC, Thomas R. (TRauMa)
Updated patch for 2.6.13-rc5 (2.38 KB, patch)
2005-08-06 09:37 UTC, Martin West

Description Andi Kleen 2004-10-11 06:02:39 UTC
Some IBM and Medion laptops crash in C2/C3. It might be a Linux or a BIOS
bug, but it needs some workaround. This patch allows disabling c2/c3
and adds a DMI table for it for the IBM T40
Comment 1 Andi Kleen 2004-10-11 06:04:23 UTC
Created attachment 3804 [details]
Disable C2/C3 on R40 and add option


It's the IBM R40e, not the T40, that has the problem, sorry (typo in previous comment).
Comment 2 Andi Kleen 2004-10-11 06:05:15 UTC
Created attachment 3805 [details]
One Medion laptop seems to have the same problem.
Comment 3 Matthew Garrett 2004-10-19 12:12:21 UTC
There are already two more recent versions of the BIOS than 1SET60WW - is this
the best string to test?
Comment 4 Venkatesh Pallipadi 2004-11-08 13:43:46 UTC
*** Bug 3219 has been marked as a duplicate of this bug. ***
Comment 5 Len Brown 2004-11-14 22:11:02 UTC
*** Bug 3406 has been marked as a duplicate of this bug. ***
Comment 6 Len Brown 2004-11-15 19:40:03 UTC
shipped in Linux 2.6.10-rc2 - closed. 
Comment 7 Len Brown 2004-12-01 15:18:47 UTC
Created attachment 4189 [details]
acpi_cstate_limit patch included in 2.6.10-rc2
Comment 8 Len Brown 2004-12-01 15:20:45 UTC
Created attachment 4190 [details]
max_cstate patch on top of previous two patches for 2.6.10-rc3
Comment 9 Len Brown 2004-12-05 19:37:37 UTC
patch in comment #8 shipped in 2.6.10-rc3 
Comment 10 mark struberg 2005-02-17 02:26:11 UTC
I use an R40e with bios version 1SET65WW (which still hangs in C2 and C3) with
FC3 2.6.10-1.766

Adding my bios version to drivers/acpi/processor.c structure processor_dmi_table
got me running again.

All R40e bios versions may be found here (14 different)
http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-52981

I had BIOS version 1SET56WW before with the same problem, and I guess the other
versions have it too (though I have not tested them).

Is there a possibility to check those bios versions and extend the
processor_dmi_table?
Comment 11 Thomas R. (TRauMa) 2005-03-04 17:39:28 UTC
Parent is right: I still have to patch the kernel or else it crashes on my
machine, so this CODE_FIX doesn't work for me. Particularly nice if booting from
CD... :-(. Please add the other BIOS version strings, too.
Comment 12 Martin West 2005-03-05 04:06:35 UTC
Hi, I have an R40E with the latest bios and running 2.6.10-1.770_FC2 with this
problem. Happy to be a guinea pig on testing. I tried recompiling the kernel w/o
cpu_freq but still had the problem.
Comment 13 Thomas R. (TRauMa) 2005-03-07 02:25:36 UTC
Created attachment 4684 [details]
IBM_R40e_C2C3lockup_fix: Blacklist by Board, not BIOS Version

OK, I think this should work across BIOS versions, for all R40e. All guinea
pigs, have fun testing :-). (The patch is against 2.6.11; in 2.6.10 you have to
patch the file drivers/acpi/processor.c instead: search for R40e and apply by
hand.)

If I was the bug owner, I'd REOPEN this bug. A fix still crashing on 80% of
R40e != fix
Comment 14 Martin West 2005-03-07 12:56:37 UTC
Not sure this fix will work: my machine number is 2684HTG. I think 2684 is the
R40E, but I'll check.
Comment 16 Thomas R. (TRauMa) 2005-03-07 23:56:07 UTC
So, if we add 2685, we catch all R40e? Still would be better than putting 14
BIOS versions in the code? OTOH there still is the slim possibility that IBM
releases a fixed BIOS, then we'd have to check against the version anyway.
Comment 17 Thomas R. (TRauMa) 2005-03-07 23:59:28 UTC
Ah, sry, only saw your first comment. Is there a way to only match against the
first four chars in DMI_MATCH? If not, it's hardcoding all versions. :|
Comment 18 Thomas R. (TRauMa) 2005-03-08 00:08:42 UTC
Created attachment 4687 [details]
linux-2.6.11-IBM_R40e_C2C3lockup_fix-2: Blacklist all known BIOS Versions

Ugly. But it should work.
Comment 19 Martin West 2005-03-08 01:35:06 UTC
life is never perfect :<)
Comment 20 Martin West 2005-03-08 01:39:15 UTC
PS, I don't have a processor_idle.c, I have just processor.c - I can hand craft
those changes into my processor.c 

Also I'm just in the process of switching to FC3.
Comment 21 Martin West 2005-03-08 03:48:24 UTC
whey-hey - the last fix worked on my machine, well done, thanks.
Comment 22 Andi Kleen 2005-05-08 06:45:21 UTC
Additional patch from Thomas needs to be integrated too.
Comment 23 Thomas R. (TRauMa) 2005-06-20 09:12:11 UTC
Still not in 2.6.12...
Comment 24 Martin West 2005-06-20 09:34:19 UTC
How irritating :-(
Comment 25 Len Brown 2005-08-05 10:11:45 UTC
has anybody ported the latest R40e DMI hook to linux-2.6.13-rc5?

RE: booting off a CD without this DMI hook
note that you can use "processor.max_cstate=1" manually at boot time
(and you can change /sys/module/processor/parameters/max_cstate
 at run-time if you want to experiment to see why the R40e dies)
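
For anyone scripting the experiment, the run-time value can be read back from that sysfs file. This is only a minimal user-space sketch: read_int_param() is a hypothetical helper name, not anything in the kernel, and it assumes the file holds a single non-negative integer so that -1 can serve as the error value.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one integer from a sysfs-style parameter file.
 * Returns the parsed value, or -1 if the file cannot be opened or does not
 * start with an integer. Assumes valid values are never negative. */
int read_int_param(const char *path)
{
    FILE *f;
    int value;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fscanf(f, "%d", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Calling read_int_param("/sys/module/processor/parameters/max_cstate") should then give the current limit, or -1 when the processor module is not loaded.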
Comment 26 Martin West 2005-08-06 06:15:27 UTC
no but I can test if you wish. Just give me a couple of days. By dmi hook do you
mean the patch attached here?
Comment 27 Martin West 2005-08-06 07:29:47 UTC
Just reworked the attached patch to the new table format. Just compiling the kernel.
Comment 28 Martin West 2005-08-06 09:29:39 UTC
The reworked patch compiled ok, and the new kernel booted ok on my R40E. I'll
attach the patch file in a sec.
Comment 29 Martin West 2005-08-06 09:37:53 UTC
Created attachment 5526 [details]
Updated patch for 2.6.13-rc5
Comment 30 Thomas R. (TRauMa) 2005-10-31 06:19:17 UTC
Just to update: patch isn't in 2.6.14, but I still hope it'll be in soon.

Meanwhile, IBM released a new BIOS for R40e - can someone test it and confirm that
the bug is still there? (I'm not able to patch my laptop, company policy).

The bios is
1.37  	1SET69WW (1.37)  	30 Sep 2005  	Current version
on
http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=ibm&lndocid=MIGR-50301

If C2/C3 still crash the system, I'll add this version to the blacklist
(the changelog of the BIOS patch doesn't mention this problem, so I presume it'll be
necessary).
Comment 31 Martin West 2005-10-31 10:09:46 UTC
Okey doke, I'll give it a whirl
Comment 32 Martin West 2005-10-31 15:01:06 UTC
Yep, still a problem with the new BIOS, and adding a new entry with the latest
bios name gets us going again. This was on 2.6.13.4.
Comment 33 Len Brown 2005-11-30 20:26:54 UTC
i think it is appropriate to send this workaround to distros
and to stable@kernel.org, but I don't think it is wise
to check it into the upstream kernel or we'll hide
and never fix the root cause.
Comment 34 Andi Kleen 2005-12-01 05:59:59 UTC
It's already in the upstream kernel.

If it's only in old Thinkpads I'm not sure it's really worth debugging.
Comment 35 Martin West 2005-12-01 10:34:11 UTC
I agree, with this fix you get battery stats, throttling, power off, ... which
is perfectly acceptable, well for me.
Comment 36 Thomas R. (TRauMa) 2005-12-03 09:35:44 UTC
I don't know if anyone at IBM/Lenovo is aware of this bug. In my experience they
fix such things in BIOS even for very old laptops. But even if they do, we'd
still need the patch (I'm thinking about those poor linux first timers that see
the kernel hang, they'll just give up and never come back, I don't think they'll
go and get a new BIOS).

As far as I understand it (which isn't that far), the crash is probably due to
the ACPI BIOS being faulty, so we won't fix this properly in kernel anyway. And
C2/C3 missing isn't sooo bad if you have speedstep.
Comment 37 Len Brown 2006-02-02 14:29:20 UTC
workaround shipped in 2.6.16-rc1-git6 -- closing.

When i get my hands on an R40e, I'll root cause and fix Linux.
Comment 38 Martin West 2006-02-03 01:08:27 UTC
Thanks
Comment 39 Martin West 2006-04-02 09:21:23 UTC
Hi Guys, I've just tried the 2.6.16.1 kernel (vanilla), which appears to have
the fix, but it doesn't seem to detect the offending BIOS on boot (i.e. no "ACPI:
processor limited to max C-state" message) and I get a hang when I modprobe
processor.

Any suggestions?
Comment 40 Martin West 2006-04-02 14:03:16 UTC
hhhmm, mea culpa, I upgraded to bios 1SET69WW and forgot to update the patch
here. Should I attach a patch here or start a new bug?
Comment 41 Martin West 2006-04-02 14:47:53 UTC
K, adding

        { set_max_cstate, "IBM ThinkPad R40e", {
          DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
          DMI_MATCH(DMI_BIOS_VERSION,"1SET69WW") }, (void*)1},

Gets us going again. Need to add

        { set_max_cstate, "IBM ThinkPad R40e", {
          DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
          DMI_MATCH(DMI_BIOS_VERSION,"1SET70WW") }, (void*)1},

as well

Gee, I'd love to install a Linux without patching the kernel :-)
Comment 42 Martin West 2006-04-02 14:58:31 UTC
PPS ... it would be slightly more efficient to put the latest BIOSes at the head of
the table.
Comment 43 Thomas R. (TRauMa) 2006-04-02 16:17:27 UTC
Heh, if we ever have enough BIOS versions to cover to make it matter, we should
perhaps switch to other forms of DMI blacklisting (regular expressions, anyone?
;-)).

On a more constructive note, Len, I assume you didn't get your hands on an R40e?
I have some free time right now, so we could do some remote debugging if you
want (you tell me what to do, I keep compiling and crashing like crazy).
Comment 44 Andi Kleen 2006-04-02 16:21:28 UTC
The match is already a prefix match (like bla*) 
Comment 45 Thomas R. (TRauMa) 2006-04-02 17:06:19 UTC
Wow, guess I asked the wrong people last time around, then. Or is this new? So
we could just add one

 	{ set_max_cstate, "IBM ThinkPad R40e", {
 	  DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
 	  DMI_MATCH(DMI_BIOS_VERSION,"1SET") }, (void*)1},

and be done? Or does this collide with other laptops' bios versions?
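
To make the matching question concrete, here is a small user-space model of a table lookup with one short "1SET" entry. It is a sketch, not the kernel code: struct entry, entry_matches() and lookup_max_cstate() are made-up names, and it assumes (as far as I know) that the kernel compares each DMI_MATCH string with strstr(), i.e. as a substring that is not anchored to the start of the field. On that assumption "1SET" would cover 1SET60WW through 1SET70WW, but it would also hit any other BIOS whose version string contains "1SET" anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the DMI fields used in this bug. */
enum field { BIOS_VENDOR, BIOS_VERSION, FIELD_COUNT };

struct match {
    enum field slot;      /* which DMI field to inspect */
    const char *substr;   /* text that must occur somewhere in that field */
};

struct entry {
    const char *ident;
    struct match matches[2];
    int max_cstate;       /* limit to apply when the entry matches */
};

/* Two of the entries quoted in this bug, with the short "1SET" string. */
const struct entry demo_table[] = {
    { "IBM ThinkPad R40e",
      { { BIOS_VENDOR, "IBM" }, { BIOS_VERSION, "1SET" } }, 1 },
    { "Medion 41700",
      { { BIOS_VENDOR, "Phoenix Technologies LTD" },
        { BIOS_VERSION, "R01-A1J" } }, 1 },
};
#define DEMO_TABLE_LEN (sizeof demo_table / sizeof demo_table[0])

/* An entry matches only if every listed substring is found in its field. */
int entry_matches(const struct entry *e, const char *const fields[FIELD_COUNT])
{
    size_t i;

    for (i = 0; i < 2; i++) {
        const struct match *m = &e->matches[i];

        if (fields[m->slot] == NULL ||
            strstr(fields[m->slot], m->substr) == NULL)
            return 0;
    }
    return 1;
}

/* Return the limit of the first matching entry, or -1 if nothing matches. */
int lookup_max_cstate(const struct entry *table, size_t n,
                      const char *const fields[FIELD_COUNT])
{
    size_t i;

    assert(table != NULL);
    for (i = 0; i < n; i++)
        if (entry_matches(&table[i], fields))
            return table[i].max_cstate;
    return -1;
}
```

Because the scan stops at the first hit, the order of entries only affects how many comparisons are made, not the result, as long as entries do not overlap.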
Comment 46 Martin West 2006-04-03 01:15:44 UTC
I'm just investigating whether we can key on something else more consistent.
Comment 47 Martin West 2006-04-03 01:17:25 UTC
Using 1SET depends on whether other good BIOSes use the same prefix; I'll investigate.
Comment 48 Martin West 2006-04-03 01:24:34 UTC
Just done a few random checks on BIOS names at

http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=TPAD-MATRIX

and 1SET appears to be unique to the R40E.
Comment 49 Martin West 2006-04-03 02:05:04 UTC
1SET worked on my machine, table now looks like

static struct dmi_system_id __cpuinitdata processor_power_dmi_table[] = {
        { set_max_cstate, "IBM ThinkPad R40e", {
          DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
          DMI_MATCH(DMI_BIOS_VERSION,"1SET") }, (void*)1},
        { set_max_cstate, "Medion 41700", {
          DMI_MATCH(DMI_BIOS_VENDOR,"Phoenix Technologies LTD"),
          DMI_MATCH(DMI_BIOS_VERSION,"R01-A1J")}, (void *)1},
        { set_max_cstate, "Clevo 5600D", {
          DMI_MATCH(DMI_BIOS_VENDOR,"Phoenix Technologies LTD"),
          DMI_MATCH(DMI_BIOS_VERSION,"SHE845M0.86C.0013.D.0302131307")},
         (void *)2},
        {},
};

Comment 50 Andi Kleen 2006-04-03 04:50:10 UTC
The problem is to ensure it doesn't match on good IBM laptops. We don't
want to penalize them just for bugs in a specific BIOS.
Comment 51 Martin West 2006-04-03 05:56:51 UTC
I don't think it does, see my earlier append. 
Comment 52 Martin West 2008-07-26 00:45:04 UTC
Did this make it to the main sources? Reason I ask is that I just installed Ubuntu 8.04, which has kernel 2.6.24-19, and it hangs at boot again.
Comment 53 Martin West 2008-07-26 00:45:55 UTC
PS and ok with acpi=off
