Bug 12231 - fan not controlled on a Dell Latitude E5400
Summary: fan not controlled on a Dell Latitude E5400
Status: CLOSED INVALID
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386
Hardware: All Linux
Importance: P1 high
Assignee: ykzhao
URL:
Keywords:
Depends on:
Blocks: 56331
Reported: 2008-12-15 11:24 UTC by Sébastien Hinderer
Modified: 2013-04-09 06:23 UTC (History)
CC List: 12 users

See Also:
Kernel Version: any
Subsystem:
Regression: No
Bisected commit-id:


Attachments
The output of acpidump. (33.05 KB, application/octet-stream)
2008-12-15 11:33 UTC, Sébastien Hinderer
try the debug patch in which rsdt is used instead of XSDT (815 bytes, patch)
2008-12-15 17:46 UTC, ykzhao
try the debug patch in which rsdt is used instead of XSDT (2.83 KB, patch)
2008-12-15 17:58 UTC, ykzhao
Kernel configuration (10.87 KB, application/bzip2)
2008-12-20 04:28 UTC, Sébastien Hinderer
Output of dmesg (8.45 KB, application/bzip2)
2008-12-20 04:32 UTC, Sébastien Hinderer
Output of lspci -vxxx (3.78 KB, application/bzip2)
2008-12-20 04:33 UTC, Sébastien Hinderer
The complete output of detect-sensors (4.17 KB, text/plain)
2009-01-01 19:47 UTC, Sébastien Hinderer
The output of acpidump. (137.34 KB, application/octet-stream)
2009-08-18 11:26 UTC, Łukasz Jachymczyk

Description Sébastien Hinderer 2008-12-15 11:24:50 UTC
Latest working kernel version:
Earliest failing kernel version:
Distribution:Debian unstable
Hardware Environment:Dell Latitude E5400
Software Environment:Dual boot: Linux + Windows XP
Problem Description:The fan is constantly working although CPU load and temperature are low. Under Windows XP this does not happen.

Steps to reproduce:boot the system and listen to its fan. Run acpi -f.
Comment 1 Sébastien Hinderer 2008-12-15 11:33:13 UTC
Created attachment 19311 [details]
The output of acpidump.
Comment 2 Zhang Rui 2008-12-15 17:05:54 UTC
There is no ACPI fan device in the acpidump you attached,
so the fan is controlled by either the BIOS or some platform-specific driver.
Maybe you can try the hwmon drivers.
Cc Jean.
Comment 3 ykzhao 2008-12-15 17:13:26 UTC
Hi, Sebastien
    Will you please attach the output of dmesg and lspci -vxxx?
    From the acpidump there seems to be another issue: the 32-bit and 64-bit (X) GPE block lengths do not match.
    >32/64X bit length mismatch in Gpe0Block: 128/64
    Thanks.
Comment 4 ykzhao 2008-12-15 17:46:59 UTC
Created attachment 19320 [details]
try the debug patch in which rsdt is used instead of XSDT

Will you please try the debug patch on the latest kernel and see whether the Fan issue still exists?
    In the debug patch the RSDT table is used instead of XSDT.
    Thanks.
Comment 5 ykzhao 2008-12-15 17:58:46 UTC
Created attachment 19321 [details]
try the debug patch in which rsdt is used instead of XSDT

Sorry, the previous attachment was the incorrect patch.
Thanks.
Comment 6 Sébastien Hinderer 2008-12-20 04:28:56 UTC
Created attachment 19397 [details]
Kernel configuration
Comment 7 Sébastien Hinderer 2008-12-20 04:32:10 UTC
Created attachment 19398 [details]
Output of dmesg
Comment 8 Sébastien Hinderer 2008-12-20 04:33:19 UTC
Created attachment 19399 [details]
Output of lspci -vxxx
Comment 9 Sébastien Hinderer 2008-12-21 16:32:49 UTC
The patch did not fix the fan issue.
However, it may be worth noting that the problem appears only
when the laptop is plugged into AC power.
When the laptop runs on its battery only, the fan seems to be handled properly.
Comment 10 Zhang Rui 2008-12-28 17:11:19 UTC
Please check your BIOS and see if there is an option like "fan always on when AC".
Comment 11 Sébastien Hinderer 2008-12-28 23:49:33 UTC
I can't check my BIOS right now because I am blind and have no sighted assistance available, but I think such an option is not set, because when Windows XP is started, the fan is not constantly working even when the AC adapter is plugged.
Comment 12 Jean Delvare 2009-01-01 10:19:59 UTC
Sébastien, it might help to know which hardware monitoring chip is present in your system, if any. Do you have lm-sensors installed? If you don't, please install it, then attach the full output of sensors-detect.
Comment 13 Sébastien Hinderer 2009-01-01 19:47:30 UTC
Created attachment 19598 [details]
The complete output of detect-sensors
Comment 14 Sébastien Hinderer 2009-01-01 20:10:47 UTC
detect-sensors was run with a 2.6.28-rc9 kernel to which the patch
previously attached to this bug report had been applied.
In addition, the i8k module could find some
information:
cat /proc/i8k
1.0 A06 BWB814J 31 -22 0 -22 0 -1 -22
According to drivers/char/i8k.c:i8k_proc_show, this information
is to be interpreted as follows:
1) "1.0": format of output
2) "A06": BIOS version
3) "BWB814J": BIOS machine id
4) "31": CPU temperature
5) "-22": Left fan status
6) "0": Right fan status
7) "-22": Left fan speed
8) "0": Right fan speed
9) "-1": AC power
10) "-22": Fn Key status
The reported CPU temperature (4) is consistent with the one reported by
acpi. However, 5, 6, 7 and 8 look odd to me: as if the information
should be reported differently, e.g. fan status could be -22 for the two
fans, and their speed could be 0. The code looks correct, though, so
perhaps it's the hardware which makes the information available in a
different way?
Ccing the author of the i8k module.
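The field layout listed above can be turned into a small parser. This is only a minimal Python sketch, not part of the i8k driver or of i8kutils: the names `I8kStatus` and `parse_i8k` are hypothetical, and the ten-field layout is assumed from the list in this comment.

```python
from typing import NamedTuple


class I8kStatus(NamedTuple):
    """One line of /proc/i8k, with fields named after the list above."""
    format_version: str
    bios_version: str
    machine_id: str
    cpu_temp: int
    left_fan_status: int
    right_fan_status: int
    left_fan_speed: int
    right_fan_speed: int
    ac_power: int
    fn_key_status: int


def parse_i8k(line: str) -> I8kStatus:
    """Split a /proc/i8k line into its ten whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    # The first three fields are strings, the remaining seven are integers.
    return I8kStatus(*fields[:3], *(int(f) for f in fields[3:]))


# The sample line is the one quoted in this comment.
status = parse_i8k("1.0 A06 BWB814J 31 -22 0 -22 0 -1 -22")
print(status)
```

Named fields make it easier to see which values are negative: here the left fan status and speed and the Fn key status are -22, while the right fan status and speed are 0.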
Comment 15 Sébastien Hinderer 2009-01-01 20:26:48 UTC
Regarding comment #9: I am not sure the fan is handled correctly when the laptop is on battery, so this remark should perhaps be treated with care.
Comment 16 Jean Delvare 2009-01-02 02:27:58 UTC
Regarding the output of sensors-detect: there is an unknown SMSC Super-I/O in the machine. It may or may not include fan speed control outputs, but anyway at this point it doesn't really matter, because we (obviously) do not have any hwmon driver for this chip. SMSC makes a lot of custom chips for PC system vendors, and in general we do not have access to specifications for these chips.

You may want to load the coretemp driver to get accurate CPU temperatures (possibly better than what ACPI reports) but that's about all lm-sensors can do for you. As far as your fan issues are concerned, the solution has to come from either ACPI or laptop-specific code such as the i8k driver. You might also try to ask Dell themselves for help, they are generally Linux-friendly.
Comment 17 Samuel Thibault 2009-01-02 13:22:26 UTC
I have some weird information too:
1.0 A08 BRN4Z3J 54 -22 0 27660 0 -1 -22
-22 may be -EINVAL perhaps? :)
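As a quick check of that guess, errno 22 is EINVAL on Linux, so a field value of -22 is at least consistent with a negated error code. The Python sketch below only illustrates the mapping; `describe_i8k_value` is a hypothetical helper, and it assumes that a negative value in a fan or Fn key field is a negated errno. That assumption should not be applied blindly to every field (the AC power field, for example, may use -1 with a different meaning).

```python
import errno
import os


def describe_i8k_value(value: int) -> str:
    """Describe a field value, treating a negative value as a negated errno.

    Assumption: negative fan/Fn fields carry a kernel error code.
    """
    if value < 0 and -value in errno.errorcode:
        return f"{errno.errorcode[-value]}: {os.strerror(-value)}"
    return str(value)


print(describe_i8k_value(-22))
print(describe_i8k_value(27660))
```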
Comment 18 Samuel Thibault 2009-01-04 21:16:03 UTC
I tried the i8kutils package, it works just fine:
i8kctl fan - 1
started the right fan (I do not physically have a left fan)
i8kctl fan - 2
made it turn faster
i8kctl fan - 0
made it stop.
Comment 19 Philipp Hagemeister 2009-01-04 21:41:26 UTC
Which BIOS revision do you use, Samuel? On my machine (E5400, BIOS A07), i8kctl changes get overridden in less than a second, i.e. the fan stops/slows down and powers up again almost immediately. Additionally, i8kutils is not available for x86_64, so even if it works it's not a viable solution.
Comment 20 Samuel Thibault 2009-01-04 21:46:11 UTC
As can be seen above, my BIOS is A08.

Why do you say i8kutils is not available for x86_64?  My box _is_ x86_64.
Comment 21 Philipp Hagemeister 2009-01-04 22:15:36 UTC
Oops, my mistake. Sorry, I should have seen that you already posted your BIOS version. Where did you get version A08? The newest version I can see at http://tinyurl.com/e5400bios is A07.

And you're right, i8kutils works fine on x86_64. For whatever reason, the Debian package (http://packages.debian.org/sid/i8kutils) lists only x86.
Comment 22 Samuel Thibault 2009-01-05 00:46:54 UTC
Ah, wait, my laptop is a D430, I was just pointing out that these -22
strange values are completely normal, it doesn't affect whether fan
control works or not.
Comment 23 Alessandro Fachin 2009-01-31 12:51:47 UTC
I have the same problem on Ubuntu 8.10; the laptop is the same Latitude e8400 with Intel 8400... Could I send any information? I have an A07 BIOS.
Comment 24 Sébastien Hinderer 2009-01-31 13:18:56 UTC
Dell's support couldn't provide any useful information.
As far as I could understand, the person I talked to said something like
"If it works with Windows and not with Linux, it's a Linux problem".
This message was not very useful...
Does anybody have an idea about what could be done now?
Comment 25 Alessandro Fachin 2009-01-31 13:36:16 UTC
Have you also tried the new BIOS version A08?
Release Date:	1/8/2009
Version:	A08
Comment 26 Alessandro Fachin 2009-01-31 13:37:21 UTC
Yes, I only just read all the comments...
Comment 27 Alessandro Fachin 2009-01-31 14:21:39 UTC
It's kind of funny because hibernate and suspend work, and all devices work well except the fan... I don't know what to do... but does running "i8kctl fan - 0" stop the fan?
Comment 28 Dave Young 2009-03-04 22:21:32 UTC
I have same issue (dell e5400)

cat /proc/i8k
1.0 A06 FC7MJ2X 62 -22 1 -22 93450 -1 -22

'i8kctl fan l 0' makes the left fan stop, but then it starts again immediately.
Comment 29 Sébastien Hinderer 2009-05-08 10:15:32 UTC
Does anyone notice a difference in fan behaviour when the laptop runs on battery rather than on the AC adapter?
It seems to me that the fan is used in a much more reasonable way when I use the battery...
Does this give you guys any ideas?
Comment 30 Łukasz Jachymczyk 2009-07-07 21:44:53 UTC
I found this bug is also present on a Dell E6400 (Intel Core 2 Duo T9600 + nvidia Quadro NVS 160M) with BIOS ver. A12 and Linux 2.6.30.

I notice the same fan behaviour as described above.

Please write if you need more info.
Comment 31 Łukasz Jachymczyk 2009-08-18 11:26:42 UTC
Created attachment 22763 [details]
The output of acpidump.

This is acpidump from Latitude E6400 bios A12. Fan is still working almost all the time, I can't see difference whether it's on AC or battery.
Comment 32 mariachi 2011-05-25 22:09:19 UTC
Has any progress been made on this front? It's 2011 now, kernel 2.6.39 and this issue is still around (BIOS A16).
Comment 33 Sébastien Hinderer 2011-05-28 16:44:22 UTC
In response to #32. Not aware of any progress. Bug still present.
I would be interested in any suggestion of an action that could help get this solved.
Comment 34 mariachi 2011-05-28 16:57:21 UTC
This has been suggested, but I haven't been able to test it (I don't have that much knowledge). Maybe someone with better skills could try it: https://bbs.archlinux.org/viewtopic.php?pid=780692#p780692
Comment 35 mariachi 2011-05-28 17:06:45 UTC
BTW: doing the (Shift+Fn) + 1,5,3,2,4 (one at a time, in this order) and disabling BIOS thermal control will fix the issue. This has to be set every boot though, and sometimes the BIOS screen that shows up after pressing the keys will freeze the computer.
Comment 36 Sébastien Hinderer 2011-05-28 18:12:38 UTC
In response to #34:
The mentioned suggestion has been obsoleted by commit bc1f419c76a2d6450413ce4349f4e4a07be011d5
In response to #35:
Which BIOS version is it? Did you try making the change in the BIOS before the system starts? Does it work, and are you able to save the change persistently?
Comment 37 mariachi 2011-05-28 18:21:17 UTC
#36: BIOS is the latest (A16). The procedure is done after you have completely booted your system (I can do it right now, for instance). Right now my fan is completely silent (mode 0) using that "trick" + dellfand. These changes reset when you halt the system and you need to do it every boot, and I don't know of any way to save them. BTW: my system is currently 54ºC hot.
Comment 38 Sébastien Hinderer 2011-05-29 17:11:41 UTC
I am not able to reproduce the shift+Fn trick, it does not enter the BIOS here.
Mariachi: my question was: if you go into the BIOS while the machine starts up, _before_ the bootloader appears, are you able to turn off fan control, too? If yes: in which menu is it?
Incidentally: are you Latitude E5400 users able to use the built-in wireless adapter? Even on WPA-protected networks?
Comment 39 mariachi 2011-05-29 17:28:54 UTC
No, I cannot change the fan control from the BIOS setup on system start.
Try reading the section that begins with "New stuff":
http://en.gentoo-wiki.com/wiki/Dell_Latitude_E6x00#CPU_overheating_throttling
My wireless with the Broadcom card never worked quite right, so I bought an Intel 5300 and now I'm a happy camper.
Comment 40 Sébastien Hinderer 2011-06-05 09:16:08 UTC
Doesn't anybody have a (good) contact at Dell who could pass this bug on to someone inside Dell?
Comment 41 mariachi 2011-06-05 09:18:18 UTC
This bug has been around since day 1, I don't believe Dell cares about it :(
Comment 42 Sébastien Hinderer 2011-06-05 19:04:25 UTC
But the fact they didn't react does not prove they don't care, perhaps they just don't know, or perhaps the problem was not reported to the right person so far. They are supposed to be linux friendly. They could at least add the ability to turn off BIOS fan control permanently (persistent after reboot) in one of their BIOS updates...
Comment 43 Alan 2011-06-05 19:18:55 UTC
https://lists.us.dell.com/mailman/listinfo/linux-desktops

is probably a better starting point
Comment 44 Alan 2011-06-05 19:20:05 UTC
I would guess that the SMM (system management mode) code on the notebook is continually overriding the kernel attempts to adjust it. It may be worth disabling hwmon if that occurs and seeing what happens as one reason we've seen BIOS fan control fail is if Linux hwmon reconfigures the temperature sensors.
Comment 45 Sébastien Hinderer 2011-06-05 19:38:32 UTC
In response to #43: are you suggesting to post the issue on that list ? If this is your suggestion, I am willing to subscribe to the list and post the issue there.

In response to #44: I have no hwmon package installed, there is no mention of hwmon in dmesg and no hwmon appears in lsmod, so I have no idea how I should disable this feature, which I suspect is not enabled at all on my system.
Comment 46 Matt Domsch 2011-06-06 14:00:35 UTC
Adding the Dell engineering team.
Comment 47 mariachi 2011-06-09 07:50:58 UTC
Thank you Matt! Maybe now something can be done about this ;)
Comment 48 Sébastien Hinderer 2011-07-01 15:28:47 UTC
By the way. In my logs I have messages like these:
CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
CPU0: Core temperature/speed normal
[Hardware Error]: Machine check events logged
CPU1: Core temperature above threshold, cpu clock throttled (total events = 26)
CPU1: Core temperature/speed normal
[Hardware Error]: Machine check events logged
So I installed mcelog and here is what it logs:
Hardware event. This is not a software error.
MCE 0
CPU 0 THERMAL EVENT TSC 1404bf3f020 
TIME 1309528922 Fri Jul  1 16:02:02 2011
Processor 0 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88020003 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 15
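To follow how often each core is throttled, the "total events" counter can be pulled out of the kernel log lines quoted above. This is a minimal Python sketch under the assumption that the message format is exactly the one pasted in this comment; `THROTTLE_RE` and `latest_throttle_counts` are hypothetical names.

```python
import re

# Matches the throttling message as quoted in this comment.
THROTTLE_RE = re.compile(
    r"CPU(?P<cpu>\d+): Core temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<count>\d+)\)"
)


def latest_throttle_counts(log_lines):
    """Return {cpu number: most recent 'total events' value} for the given lines."""
    counts = {}
    for line in log_lines:
        match = THROTTLE_RE.search(line)
        if match:
            counts[int(match["cpu"])] = int(match["count"])
    return counts


sample = [
    "CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)",
    "CPU0: Core temperature/speed normal",
    "[Hardware Error]: Machine check events logged",
    "CPU1: Core temperature above threshold, cpu clock throttled (total events = 26)",
    "CPU1: Core temperature/speed normal",
]
counts = latest_throttle_counts(sample)
print(counts)
```

Feeding it the lines of dmesg output (for example read from standard input) would show whether the counters keep growing while the reported temperature stays low.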
Comment 49 Jean Delvare 2011-07-01 16:01:19 UTC
Sébastien, does the coretemp driver report excessive temperatures too?
Comment 50 Sébastien Hinderer 2011-07-01 17:01:09 UTC
Jean: the coretemp driver was initially not loaded. So I loaded it a few minutes ago. I got two messages at load time:
[11320.610670] coretemp coretemp.0: TjMax is assumed as 100 C!
[11320.610730] coretemp coretemp.1: TjMax is assumed as 100 C!
Since then nothing else has appeared in dmesg so I assume nothing else has been logged, except if it was logged in another place but then please let me know where to look.
Comment 51 Jean Delvare 2011-07-01 18:42:08 UTC
The purpose of the coretemp driver isn't to log anything in the kernel, but to expose temperature values to user-space through sysfs. Run "sensors" (from the lm-sensors package) to get the temperature values.
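For readers who want to read those sysfs values directly instead of running "sensors", here is a minimal Python sketch. It assumes the common hwmon layout in which each device directory under /sys/class/hwmon contains a "name" file and "temp*_input" files holding millidegrees Celsius; the exact paths vary between kernel versions (older kernels may expose coretemp under a platform device directory instead), and the function names are hypothetical.

```python
from pathlib import Path


def millidegrees_to_celsius(raw: str) -> float:
    """Convert the content of a temp*_input file to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_hwmon_temperatures(root: str = "/sys/class/hwmon") -> dict:
    """Return {"<driver name>/<file name>": temperature in degrees Celsius}."""
    readings = {}
    for sensor in sorted(Path(root).glob("hwmon*/temp*_input")):
        try:
            name = (sensor.parent / "name").read_text().strip()
            readings[f"{name}/{sensor.name}"] = millidegrees_to_celsius(
                sensor.read_text()
            )
        except (OSError, ValueError):
            # Skip sensors that cannot be read or do not follow the assumed layout.
            continue
    return readings


if __name__ == "__main__":
    for label, celsius in read_hwmon_temperatures().items():
        print(f"{label}: {celsius:.1f} C")
```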
Comment 52 Sébastien Hinderer 2011-07-01 19:02:16 UTC
For reference, see comments #13 and #16.
Output of sensors:
acpitz-virtual-0
Adapter: Virtual device
temp1:        +59.5°C  (crit = +102.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +59.0°C  (high = +100.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1:       +60.0°C  (high = +100.0°C, crit = +100.0°C)
This temperature matches the one reported by acpi -t
The real issue is that the operating system seems to have no control over the fan.
Comment 53 Jean Delvare 2011-07-01 19:20:07 UTC
The idea to use the coretemp driver is to check the reported temperature when the kernel logs a thermal throttling event. It will tell us whether the CPU is really heating  or not.

That being said, the coretemp driver doesn't yet report the thermal thresholds, so please report the output of:
# modprobe msr
# rdmsr -x -p 1 0x19b

But I completely agree that the root problem appears to be the fan not kicking in when it should.
Comment 54 Sébastien Hinderer 2011-07-01 19:32:15 UTC
~# rdmsr -x -p 1 0x19b
3
Comment 55 Jean Delvare 2011-07-01 19:47:00 UTC
Hmm, looks like the threshold isn't set, which would mean that the alarm triggers at TjMax i.e. 100°C on your CPU. I'm curious if you'll see any temperature even close to this being reported by the coretemp driver.

Maybe I don't really understand the thermal threshold mechanism after all, sorry for the noise.
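For reference, the value returned by rdmsr can be split into bit fields. The Python sketch below decodes register 0x19b (IA32_THERM_INTERRUPT) using the bit layout as I recall it from the Intel Software Developer's Manual; that layout is an assumption here and should be checked against the manual for the CPU in question. The function name is hypothetical.

```python
def decode_therm_interrupt(value: int) -> dict:
    """Decode an IA32_THERM_INTERRUPT (MSR 0x19b) value.

    Bit positions are assumed from the Intel SDM and should be verified.
    """
    return {
        "high_temp_int_enable": bool(value & (1 << 0)),
        "low_temp_int_enable": bool(value & (1 << 1)),
        "prochot_int_enable": bool(value & (1 << 2)),
        "forcepr_int_enable": bool(value & (1 << 3)),
        "critical_temp_int_enable": bool(value & (1 << 4)),
        "threshold1_value": (value >> 8) & 0x7F,
        "threshold1_int_enable": bool(value & (1 << 15)),
        "threshold2_value": (value >> 16) & 0x7F,
        "threshold2_int_enable": bool(value & (1 << 23)),
    }


# "rdmsr -x" prints hexadecimal, so the reported "3" is parsed as base 16.
decoded = decode_therm_interrupt(int("3", 16))
for field, setting in decoded.items():
    print(f"{field}: {setting}")
```

Under that assumed layout, a value of 3 means that only the high- and low-temperature interrupts are enabled and both programmable thresholds are zero and disabled, which matches the reading that no threshold is set.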
Comment 56 Sébastien Hinderer 2011-07-01 19:57:23 UTC
As I said, I only loaded coretemp when you asked for it, so it is not usually loaded, but I have never seen such a temperature for my CPU.

Still, there are these messages I showed in one of my previous comments and which appear from time to time in my logs:
CPU1: Core temperature above threshold, cpu clock throttled (total events = 782)
CPU1: Core temperature/speed normal
(and the same kind of messages for CPU0)

When this happens the temperature is clearly not close to 100 degrees C, so I suspect that there are two thresholds, but I don't really know what I am talking about here.
Comment 57 Sébastien Hinderer 2011-08-24 19:44:21 UTC
I'm wondering about CONFIG_INTEL_IDLE.
It is disabled in my kernel. Would it help to enable it ?
And also, the help text in the kernel configuration for this item says:
Enable intel_idle, a cpuidle driver that includes knowledge of
native Intel hardware idle features.  The acpi_idle driver
can be configured at the same time, in order to handle
processors intel_idle does not support.

This sounds interesting to me, but I couldn't find the mentioned acpi_idle driver.
Could anybody please say whether these two are relevant and where to find the acpi_idle driver, please ?
Thanks.
Comment 58 Sébastien Hinderer 2011-08-24 19:54:17 UTC
In addition, here is the output of
dmesg | grep idle
[    0.000000] 	RCU dyntick-idle grace-period acceleration is enabled.
[    0.009477] using mwait in idle threads.
[    0.805768] cpuidle: using governor ladder
[    0.805770] cpuidle: using governor menu
[    9.106195] ACPI: acpi_idle registered with cpuidle
[    9.106915] Marking TSC unstable due to TSC halts in idle

I'm just wondering whether there wouldn't be something to tweak here, in what the kernel does when the CPU is idle.

Thanks for any feedback.
Comment 59 Thomas Renninger 2011-08-24 21:00:46 UTC
Google says:
Dell Latitude E5400 Intel Core2 Duo P8400 2.26GHz
This CPU should not get controlled by intel_idle.c, it doesn't hurt to enable it, but acpi_idle (processor.ko) would still stay active -> no change.

Which C-states are supported on your machine depends on BIOS (with acpi_idle driver, intel_idle does not depend on BIOS).
There is a new userspace tool: tools/power/cpupower in the kernel since 3.1-rc1 that tells you which C-states are supported:
cpupower idle-info

powertop is also a convenient tool to find out about unnecessary power consumption.
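On kernels without the cpupower tool, the idle states registered with cpuidle can also be read from sysfs. This is a minimal Python sketch assuming the usual /sys/devices/system/cpu/cpuN/cpuidle/stateM layout with "name" and "usage" files; the function name is hypothetical and the layout should be checked on the running kernel.

```python
from pathlib import Path


def list_cpuidle_states(cpu: int = 0, root: str = "/sys/devices/system/cpu") -> list:
    """Return [(state directory, state name, usage count)] for one CPU."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    state_dirs = [
        d for d in base.glob("state*") if d.name[len("state"):].isdigit()
    ]
    # Sort numerically so that state10 comes after state2.
    state_dirs.sort(key=lambda d: int(d.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        try:
            name = (state_dir / "name").read_text().strip()
            usage = int((state_dir / "usage").read_text())
        except (OSError, ValueError):
            continue  # Skip entries that do not follow the assumed layout.
        states.append((state_dir.name, name, usage))
    return states


if __name__ == "__main__":
    for directory, name, usage in list_cpuidle_states():
        print(f"{directory}: {name} (entered {usage} times)")
```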
Comment 60 Thomas Renninger 2011-08-24 21:04:33 UTC
What helped on my wife's HP, which also had constantly running fans and on which I spent quite some time searching on the software side: cleaning the fan.
Often a vacuum cleaner is enough. On this machine it was not, and it took me quite some time to remove the dozens of different screws to be able to access the fan and remove the dust.
Comment 61 Sébastien Hinderer 2011-08-25 08:58:56 UTC
In response to #60:
Fan, CPU and heatsink were replaced three days ago and this did not improve the situation.
Motherboard about to be replaced, too.
The thing is that, once the fan starts to be noisy, it never seems to go back to its quiet state even though the temperature goes down. For instance, one can still hear the fan being rather loud even after more than one hour of idle time on a system running in text mode, with only a few services started, while the temperature reported by acpi -t is 31.5 degrees Celsius.
Comment 62 Sébastien Hinderer 2011-08-28 08:49:46 UTC
So, the motherboard of the laptop with the noisy fan was finally replaced and that didn't seem to solve the problem either.
The engineer then had the idea to use another thermal grease. So he took out the heatsink, removed the thermal grease on it (provided by the manufacturer, and which was completely dry just three days after he had put it on the heatsink), and put another thermal grease on the heatsink, from MIcrosi, he said. And that improved the situation a lot. The noise is now completely reasonable and acceptable.

So, for those users who have this problem, cleaning the fan and heatsink and changing the thermal grease may be a good thing to try first. On this device the heatsink is not difficult to access: 5 screws to unscrew and one panel to remove.

Thanks very much to all those who have helped solve this problem!
Comment 63 mariachi 2012-06-03 08:01:20 UTC
I've been away for the past year, so only now did I remember about this...
I'm going to change the grease on my laptop too. It has the exact same problem: once the fan kicks in, even if the temperature goes down it never stops -- only way to fix this is to suspend or reboot. But now I see this is marked as invalid so I guess no one will work on this anymore? It has been way too long anyway, I'm not expecting this to be fixed...
Comment 64 Alan 2012-06-03 11:40:08 UTC
If your laptop shows it with a modern kernel then probably best to open a new bug specific to that laptop. It could easily be a software problem in your case.
Comment 65 Sébastien Hinderer 2012-06-07 19:04:45 UTC
In response to #63. I'm fully convinced that the problem has absolutely nothing to do with anything software related. The change of grease has been done in August 2011, so almost one year ago now and the fan is still okay. One way to see that Linux is not causing the problem is to boot Windows. One then notices that the problem persists under that OS.
