Some IBM and Medion laptops crash in C2/C3. It might be a Linux bug or a BIOS bug, but either way it needs a workaround. This patch allows disabling C2/C3 and adds a DMI table for it, with an entry for the IBM T40.
Created attachment 3804 [details] Disable C2/C3 on R40 and add option
It's the IBM R40e, not the T40, that has the problem. Sorry (typo in previous comment).
Created attachment 3805 [details] One Medion laptop seems to have the same problem.
There are already two more recent versions of the BIOS than 1SET60WW - is this the best string to test?
*** Bug 3219 has been marked as a duplicate of this bug. ***
*** Bug 3406 has been marked as a duplicate of this bug. ***
shipped in Linux 2.6.10-rc2 - closed.
Created attachment 4189 [details] acpi_cstate_limit patch included in 2.6.10-rc2
Created attachment 4190 [details] max_cstate patch on top of previous two patches for 2.6.10-rc3
patch in comment #8 shipped in 2.6.10-rc3
I use an R40e with BIOS version 1SET65WW (which still hangs in C2 and C3) with FC3 2.6.10-1.766. Adding my BIOS version to the processor_dmi_table structure in drivers/acpi/processor.c got me running again. All R40e BIOS versions (14 different ones) may be found here: http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-52981 I had BIOS version 1SET56WW before with the same problem, and I guess the other versions have it too (though I have not tested them). Is there a possibility to check those BIOS versions and extend the processor_dmi_table?
Parent is right: I still have to patch the kernel or else it crashes on my machine, so this CODE_FIX doesn't work for me. Particularly nice if booting from CD... :-(. Please add the other BIOS version strings, too.
Hi, I have an R40E with the latest bios and running 2.6.10-1.770_FC2 with this problem. Happy to be a guinea pig on testing. I tried recompiling the kernel w/o cpu_freq but still had the problem.
Created attachment 4684 [details] IBM_R40e_C2C3lockup_fix: Blacklist by Board, not BIOS Version
OK, I think this should work across BIOS versions, for all R40e. All guinea pigs, have fun testing :-). (The patch is against 2.6.11; in 2.6.10 you have to patch the file drivers/acpi/processor.c instead: search for R40e and apply by hand.) If I was the bug owner, I'd REOPEN this bug. A fix still crashing on 80% of R40e != fix
Not sure this fix will work: my machine number is 2684HTG. I think 2684 is the R40E, but I'll check.
R40E is 2684 and 2685 http://www-307.ibm.com/pc/support/site.wss/product.do?brandind=10&familyind=116178&operatingsystemind=49979&template=%2Fproductselection%2Flandingpages%2FdownloadsDriversLandingPage.vm&validate=true
So, if we add 2685, we catch all R40e? That would still be better than putting 14 BIOS versions in the code. OTOH there is still the slim possibility that IBM releases a fixed BIOS, and then we'd have to check against the version anyway.
Ah, sorry, I only saw your first comment. Is there a way to match against only the first four chars in DMI_MATCH? If not, it's hardcoding all versions. :|
Created attachment 4687 [details] linux-2.6.11-IBM_R40e_C2C3lockup_fix-2: Blacklist all known BIOS Versions Ugly. But it should work.
life is never perfect :<)
PS: I don't have a processor_idle.c, I have just processor.c. I can hand-craft those changes into my processor.c. Also, I'm just in the process of switching to FC3.
Whey-hey, the last fix worked on my machine. Well done, thanks.
Additional patch from Thomas needs to be integrated too.
Still not in 2.6.12...
How irritating :-(
Has anybody ported the latest R40e DMI hook to linux-2.6.13-rc5? Re: booting off a CD without this DMI hook, note that you can pass "processor.max_cstate=1" manually at boot time (and you can change /sys/module/processor/parameters/max_cstate at run-time if you want to experiment to see why the R40e dies).
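For anyone who wants to script the run-time experiment mentioned above, here is a minimal sketch in C. The helper names read_max_cstate and write_max_cstate are made up for illustration; the only thing taken from this thread is the path /sys/module/processor/parameters/max_cstate. Writing to it presumably needs root, and whether a change takes effect immediately is not something this sketch verifies.

```c
#include <stdio.h>

/* Hypothetical helper: read the C-state limit from a parameter file such as
 * /sys/module/processor/parameters/max_cstate.
 * Returns the value, or -1 if the file is missing or cannot be parsed. */
int read_max_cstate(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Hypothetical helper: write a new limit to the same file.
 * Returns 0 on success, -1 on failure. */
int write_max_cstate(const char *path, int limit)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%d\n", limit) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_max_cstate("/sys/module/processor/parameters/max_cstate", 1) would be the programmatic equivalent of echoing 1 into that file.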
no but I can test if you wish. Just give me a couple of days. By dmi hook do you mean the patch attached here?
Just reworked the attached patch to the new table format. Just compiling the kernel.
The reworked patch compiled OK, and the new kernel boots OK on my R40E. I'll attach the patch file in a sec.
Created attachment 5526 [details] Updated patch for 2.6.13-rc5
Just to update: the patch isn't in 2.6.14, but I still hope it'll be in soon. Meanwhile, IBM released a new BIOS for the R40e. Can someone test it and confirm that the bug is still there? (I'm not able to patch my laptop, company policy.) The BIOS is 1SET69WW (1.37), 30 Sep 2005. Current version on http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=ibm&lndocid=MIGR-50301 If C2/C3 still crash the system, I'll add this version to the blacklist (the changelog of the BIOS update doesn't mention this problem, so I presume it'll be necessary).
Okey doke, I'll give it a whirl.
Yep, still a problem with the new BIOS, and adding a new entry with the latest bios name gets us going again. This was on 2.6.13.4.
i think it is appropriate to send this workaround to distros and to stable@kernel.org, but I don't think it is wise to check it into the upstream kernel or we'll hide and never fix the root cause.
It's already in the upstream kernel. If it's only in old Thinkpads I'm not sure it's really worth debugging.
I agree, with this fix you get battery stats, throttling, power off, ... which is perfectly acceptable, well for me.
I don't know if anyone at IBM/Lenovo is aware of this bug. In my experience they fix such things in the BIOS even for very old laptops. But even if they do, we'd still need the patch (I'm thinking about those poor Linux first-timers who see the kernel hang: they'll just give up and never come back, I don't think they'll go and get a new BIOS). As far as I understand it (which isn't that far), the crash is probably due to the ACPI BIOS being faulty, so we won't fix this properly in the kernel anyway. And C2/C3 missing isn't sooo bad if you have SpeedStep.
workaround shipped in 2.6.16-rc1-git6 -- closing. When i get my hands on an R40e, I'll root cause and fix Linux.
Thanks
Hi guys, I've just tried the 2.6.16.1 kernel (vanilla), which appears to have the fix, but it doesn't seem to detect the offending CPU on boot (i.e. no "ACPI: processor limited to max C-state" message) and I get a hang when I modprobe processor. Any suggestions?
Hmm, mea culpa, I upgraded to BIOS 1SET69WW and forgot to update the patch here. Should I attach a patch here or start a new bug?
OK, adding

    { set_max_cstate, "IBM ThinkPad R40e", {
      DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
      DMI_MATCH(DMI_BIOS_VERSION,"1SET69WW") }, (void*)1},

gets us going again. We need to add

    { set_max_cstate, "IBM ThinkPad R40e", {
      DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
      DMI_MATCH(DMI_BIOS_VERSION,"1SET70WW") }, (void*)1},

as well. Gee, I'd love to install a Linux without patching the kernel :-)
PPS: it would be slightly more efficient to put the latest BIOSes at the head of the table.
Heh, if we ever have enough BIOS versions to cover to make it matter, we should perhaps switch to other forms of DMI blacklisting (regular expressions, anyone? ;-)). On a more constructive note, Len, I assume you didn't get your hands on an R40e? I have some free time right now, so we could do some remote debugging if you want (you tell me what to do, I keep compiling and crashing like crazy).
The match is already a prefix match (like bla*)
Wow, guess I asked the wrong people last time around, then. Or is this new? So we could just add one

    { set_max_cstate, "IBM ThinkPad R40e", {
      DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
      DMI_MATCH(DMI_BIOS_VERSION,"1SET") }, (void*)1},

and be done? Or does this collide with other laptops' BIOS versions?
I'm just investigating whether we can key on something else more consistent.
Using 1SET depends on whether other good BIOSes use the same prefix. I'll investigate.
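To make the question concrete, here is a small userspace sketch of what a prefix match on "1SET" would cover. It assumes the prefix semantics described in the previous comment; it is not the kernel's DMI code, and bios_version_has_prefix and count_prefix_matches are invented names. The version strings are only the ones reported in this thread.

```c
#include <stddef.h>
#include <string.h>

/* R40e BIOS versions mentioned in this thread (not the full list of 14). */
const char *const r40e_versions[] = {
    "1SET56WW", "1SET60WW", "1SET65WW", "1SET69WW", "1SET70WW",
};
const size_t r40e_version_count =
    sizeof(r40e_versions) / sizeof(r40e_versions[0]);

/* Return 1 if `version` starts with `prefix`, else 0.
 * Assumption: this models the "bla*" style match described above. */
int bios_version_has_prefix(const char *version, const char *prefix)
{
    return strncmp(version, prefix, strlen(prefix)) == 0;
}

/* Count how many of the given versions start with `prefix`. */
size_t count_prefix_matches(const char *const *versions, size_t n,
                            const char *prefix)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        hits += (size_t)bios_version_has_prefix(versions[i], prefix);
    return hits;
}
```

Under that assumption a single "1SET" entry covers every version listed here, so the open question is only whether a non-R40e BIOS also starts with "1SET".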
Just done a few random checks on BIOS names at http://www-307.ibm.com/pc/support/site.wss/document.do?lndocid=TPAD-MATRIX and 1SET appears to be unique to the R40E.
1SET worked on my machine, table now looks like

    static struct dmi_system_id __cpuinitdata processor_power_dmi_table[] = {
        { set_max_cstate, "IBM ThinkPad R40e", {
          DMI_MATCH(DMI_BIOS_VENDOR,"IBM"),
          DMI_MATCH(DMI_BIOS_VERSION,"1SET") }, (void*)1},
        { set_max_cstate, "Medion 41700", {
          DMI_MATCH(DMI_BIOS_VENDOR,"Phoenix Technologies LTD"),
          DMI_MATCH(DMI_BIOS_VERSION,"R01-A1J")}, (void *)1},
        { set_max_cstate, "Clevo 5600D", {
          DMI_MATCH(DMI_BIOS_VENDOR,"Phoenix Technologies LTD"),
          DMI_MATCH(DMI_BIOS_VERSION,"SHE845M0.86C.0013.D.0302131307")}, (void *)2},
        {},
    };
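For readers without the kernel source at hand, the following is a rough userspace model of how such a table could be walked. Everything here is a simplification: struct fake_dmi, struct quirk, quirk_table and lookup_max_cstate are hypothetical names, the match is modeled with strstr (which accepts a prefix as well as any other substring; that this mirrors the kernel is an assumption), and the walk stops at the first hit instead of invoking a callback like set_max_cstate.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the DMI strings the firmware reports. */
struct fake_dmi {
    const char *bios_vendor;
    const char *bios_version;
};

/* Hypothetical, flattened version of one table entry. */
struct quirk {
    const char *ident;        /* human-readable name, NULL ends the table */
    const char *bios_vendor;  /* string to look for in the vendor field */
    const char *bios_version; /* string to look for in the version field */
    int max_cstate;           /* deepest C-state allowed on a match */
};

/* Same three machines as the table above, then an empty terminator. */
const struct quirk quirk_table[] = {
    { "IBM ThinkPad R40e", "IBM", "1SET", 1 },
    { "Medion 41700", "Phoenix Technologies LTD", "R01-A1J", 1 },
    { "Clevo 5600D", "Phoenix Technologies LTD",
      "SHE845M0.86C.0013.D.0302131307", 2 },
    { NULL, NULL, NULL, 0 },
};

/* Return the C-state limit of the first entry whose vendor AND version
 * strings both occur in the machine's DMI data, or `default_limit`. */
int lookup_max_cstate(const struct fake_dmi *dmi, int default_limit)
{
    for (const struct quirk *q = quirk_table; q->ident; q++) {
        if (strstr(dmi->bios_vendor, q->bios_vendor) &&
            strstr(dmi->bios_version, q->bios_version))
            return q->max_cstate;
    }
    return default_limit;
}
```

In this model an R40e reporting vendor "IBM" and version "1SET70WW" is limited to C1, while an IBM machine whose BIOS version does not contain "1SET" keeps whatever default the caller passes in.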
The problem is to ensure it doesn't match on good IBM laptops. We don't want to penalize them just for bugs in a specific BIOS.
I don't think it does; see my earlier append.
Did this make it to the main sources? The reason I ask is that I just installed Ubuntu 8.04, which has kernel 2.6.24-19, and it hangs at boot again.
PS: it boots OK with acpi=off.