Bug 3008
Summary: | CPU generates more heat in 2.6.7 than in 2.4, causing critical temperatures - Acer TravelMate 233xc | ||
---|---|---|---|
Product: | ACPI | Reporter: | Ville P (drc) |
Component: | Power-Thermal | Assignee: | Konstantin Karasyov (konstantin.karasyov) |
Status: | CLOSED PATCH_ALREADY_AVAILABLE | ||
Severity: | high | CC: | acpi-bugzilla, kernel.org |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.7 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | ACPI dump from the machine; DMI data | ||
Description
Ville P
2004-07-03 09:45:20 UTC
Created attachment 3304: ACPI dump from the machine
Created attachment 3305: DMI data
Try to modify HZ to 100 in include/asm-i386/param.h. Does it help?

Yes it helps. The computer works fine now.

Is there already a patch in the works? When could it be expected, especially e.g. for the reported FC2?

Since this bug is a bit urgent: Is there any way to do a quick fix, meaning turning the fans always to full speed or something? Thought about disabling ACPI, but I wonder if that's a clever solution (irq-sharing etc., which is also combined with ACPI afaik).

Ville, as your comment said, what does the cpu do after boot instead of idling, if you don't start any program? Is there a fs issue? Does the HD run fast? Or any other phenomena? thanks, -zhen

I don't know what the cpu does, is there a way to find out? The "no idle looping" comment was from this mail: http://sourceforge.net/mailarchive/message.php?msg_id=8862563 The system load is minimal and there doesn't seem to be any extra disk activity.

Changing "Hz" and recompiling the kernel did not help here at all. The system still shows high temperature and degrades service due to heat. :-(( Any more ideas on that topic? Is the CPU really hotter, or did previous kernels just not notice that the CPU mode was automatically degraded?

All right, can you figure out if the cpu is really _hotter_? Or is just the value from the acpi thermal module abnormally high?

Could you take a look at bug http://bugme.osdl.org/show_bug.cgi?id=3191? There, SMBus unhiding reveals an acpi thermal issue, a resource conflict. Maybe the problem is the same here...

Considering the fact that my laptop's cooling mode is handled at the firmware level, it would seem that the cpu is actually really too hot. Also the fact that if you restart the machine too soon after an emergency shutdown the computer shuts down before Linux is even loaded.

After trying 2.6.8 and having no chance I tried disabling acpi/processor, and it works perfectly. Changing HZ to 100 seemed to be only a badly working workaround.
The first thing I noticed was that the cooling fan spins slower. The computer also runs a lot cooler judging by hand. Since thermal_zone is also disabled with acpi/processor I can't say what the change in degrees is. The processor load when running applications decreased a lot. Video playback used to cause 30% cpu load, now the load isn't even noticeable (like it was in 2.4). One thing I noticed about acpi/processor/*/power was that the idle mode was C3 all the time in 2.6, while it was always C2 in 2.4.

Do these overheating systems all run in C3 when idle? Does the problem go away if you disable C3 with the boot parameters mentioned in bug 3549 (available in 2.6.10-rc2)? Other ways to disable C3 are to plug in a USB mouse (causes bus-master activity), or run a low priority cycle soaker so that there is never any idle time.

My system runs in C3 even with bus-master activity (USB mouse). I've never seen it be anything else than C3 in 2.6. It was always C2 in 2.4 when I checked. I noticed the C3 disabling ability has been added, but I haven't had a chance to test yet since I'm running 2.6.9 still.

> My system runs in C3 even with bus-master activity (USB mouse)
hmm, unexpected. Please show your /proc/acpi/processor/CPU0/power
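For anyone comparing C-state reports across kernels, the contents of /proc/acpi/processor/CPU0/power can be checked with a few lines of Python. This is only a rough sketch: it assumes the text layout quoted elsewhere in this report ("active state:", then one "Cn:" entry per state, with "<not supported>" for missing states). The line breaks in SAMPLE and the helper name parse_power are illustrative assumptions, not part of any kernel interface.

```python
import re

# Sample text modelled on the CPU4 output quoted in this report.
# The exact line layout is an assumption; only the field names come from the report.
SAMPLE = """\
active state: C1
default state: C1
bus master activity: 00000000
states:
*C1: promotion[--] demotion[--] latency[000] usage[00000000]
C2: <not supported>
C3: <not supported>
"""


def parse_power(text):
    """Return (active_state, {state_name: is_supported}) for a power file dump."""
    active = None
    states = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"active state:\s*(C\d+)", line)
        if m:
            active = m.group(1)
            continue
        # State entries look like "*C1: ..." or "C2: <not supported>".
        m = re.match(r"\*?(C\d+):\s*(.*)", line)
        if m:
            states[m.group(1)] = "<not supported>" not in m.group(2)
    return active, states


if __name__ == "__main__":
    active, states = parse_power(SAMPLE)
    print("active:", active)
    for name, supported in sorted(states.items()):
        print(name, "supported" if supported else "not supported")
```

On a machine that still exposes this procfs file, the same function could be fed the file's contents instead of SAMPLE, for example read with open("/proc/acpi/processor/CPU0/power").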
Notes from Stefan's system: in syslog it says:
Nov 14 14:00:08 www kernel: CPU1: Running in modulated clock mode
Nov 14 14:00:08 www kernel: CPU0: Running in modulated clock mode
Nov 14 14:00:13 www kernel: CPU0: Temperature above threshold
Nov 14 14:00:13 www kernel: CPU1: Temperature above threshold
This means the system noticed it is overheating and it has kicked in thermal throttling, but it quickly goes critical anyway.
# cat /proc/acpi/processor/CPU4/power
active state: C1
default state: C1
bus master activity: 00000000
states:
*C1: promotion[--] demotion[--] latency[000] usage[00000000]
C2: <not supported>
C3: <not supported>
This says that on the 4-way SMP, C2/C3 is not supported, so the failure may be totally different from Ville's system. Can you verify that the fans on this system have not failed and that the processor heat sinks are properly installed using thermal grease?

Stefan, can you run this system with acpi=off and see if it still overheats? Legacy and hardware methods should keep it from burning (though you haven't mentioned exactly what hardware you're running).

I used acpi/processor both as a module and built-in in 2.6.9 with HZ 100 and the system went to C2 when there was bus master activity. So either I remembered incorrectly or this happened differently with .7 and .8 since I never tried 2.6.9 with acpi/processor enabled.

Is there still an issue here with 2.6.13? Note that the bit about the effectiveness of the idle loop has to be off-track. Power savings in the idle loop should not be needed for thermal control. Indeed, with idle=poll (a busy-spin idle loop), the system should still be able to effectively cool itself.

Problem still exists with 2.6.13 (acpi 20050902 patch applied). The difference I noted was that throttling was set to 4 (50% throttling) automatically(?), which enabled the machine not to shut down, at least not as quickly. The first temperature measure was 59 degrees Celsius, and it rose to 62 in less than a minute.
System load was 0.01 to 0.02 during that time, with only minimal services and login shell running, no X. cooling_mode was reported as critical the whole time the machine was running with that kernel. When I manually set throttling to 0 the machine started heating up more quickly, ending up in critical shutdown (reported as 77

In kernel 2.6.15-17 (Ubuntu patch) the problem no longer exists. System functions without problems with acpi and its processor support enabled.

Ville, I'm closing this bug now. If you find that the issue remains, please reopen.