Bug 93301

Summary: fan speed not being adjusted once booted up
Product: ACPI
Reporter: Radek Podgorny (radek)
Component: Power-Fan
Assignee: Aaron Lu (aaron.lu)
Status: CLOSED UNREPRODUCIBLE
Severity: normal
CC: aaron.lu, lenb, manuelkrause, mika.westerberg, rui.zhang, vevais
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.18.0+
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments:
  acpidump.txt
  grep . /sys/class/thermal/thermal_zone*/*

Description Radek Podgorny 2015-02-15 22:59:11 UTC
starting from kernel 3.18, the fan speed is set according to the temperature at boot and never changes after that.

in other words, when booting up with a cool laptop, the fan speed is low and it never spins up, even under a heavy compile job (or any other load). when booted with a hot laptop, the fan is spun up to (almost?) full speed and never slows down, even after the temperature drops to reasonable levels.

when booted cold, "acpi -tc" lists thermal 0 as 45 degrees and it stays that way. on a hot boot, the temp is 70 and never changes. the other thermal zones do change (not sure about the correctness of the values, though).

i've bisected the problem to commit 6ab3430129e258ea31dd214adf1c760dfafde67a.

the hardware is hp mini 5103.

this may (or may not) be related to bugs:

https://bugzilla.kernel.org/show_bug.cgi?id=92431
https://bugzilla.kernel.org/show_bug.cgi?id=91411
Comment 1 Aaron Lu 2015-02-16 03:13:04 UTC
Your acpidump please:
# acpidump > acpidump.txt
Comment 2 Zhang Rui 2015-02-16 03:23:35 UTC
Can you turn the fan off by manually setting "cur_state" to 0 for all the cooling devices whose "type" is "Fan"?
Comment 3 Radek Podgorny 2015-02-16 10:46:09 UTC
Created attachment 167081 [details]
acpidump.txt
Comment 4 Radek Podgorny 2015-02-16 10:49:52 UTC
yes, "echo 0 > /sys/class/thermal/cooling_device*/cur_state" turns the fan off. thermal 0 then shows 0 and does not rise even under heavy load (which is probably why the fan stays off).
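
Editorial sketch of the step above as an explicit loop. When the glob matches more than one cooling device, some shells (bash, for example) reject the redirect as ambiguous, so writing to each device in turn is more portable. The loop also applies the "type" equals "Fan" filter asked for in comment 2. The function name fans_off and its optional base-directory argument are assumptions made for illustration, not something used in this thread, and writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Sketch: set cur_state to 0 for every cooling device whose type is "Fan".
# fans_off and its base-directory argument are illustrative names only.
fans_off() {
    base="${1:-/sys/class/thermal}"
    for dev in "$base"/cooling_device*; do
        # Skip anything that does not look like a cooling device directory.
        [ -r "$dev/type" ] || continue
        if [ "$(cat "$dev/type")" = "Fan" ]; then
            echo 0 > "$dev/cur_state"
            echo "$(basename "$dev"): cur_state=$(cat "$dev/cur_state")"
        fi
    done
}
```

Run as root, a plain "fans_off" call would act on /sys/class/thermal; passing another directory lets the loop be tried against a copy of the tree first.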
Comment 5 Aaron Lu 2015-02-17 07:40:43 UTC
Looks like the thermal temperature doesn't change accordingly?
Please show us:
# grep . /sys/class/thermal/thermal_zone*/*
Comment 6 Radek Podgorny 2015-02-17 14:02:15 UTC
Created attachment 167341 [details]
grep . /sys/class/thermal/thermal_zone*/*

...this is after "echo 0 > /sys/class/thermal/cooling_device*/cur_state"
Comment 7 Manuel Krause 2015-02-18 22:13:01 UTC
Could one of the assignees please check whether this is a duplicate of https://bugzilla.kernel.org/show_bug.cgi?id=92431

Thank you!
Comment 8 Radek Podgorny 2015-02-18 22:42:51 UTC
yup, that's one of the bugs i've mentioned in the description. the thing is, bug 92431 seems to be talking about fan-always-full-on situations only, whereas this one also covers the fan-always-at-low-speed scenarios, so please consider that when deciding whether this is a dupe or not.
Comment 9 Aaron Lu 2015-02-25 05:57:18 UTC
>> /sys/class/thermal/thermal_zone0/temp:51000
So the temperature of thermal_zone 0 is 51 °C, which is higher than trip points 6 and 5, so cdev0 and cdev1 should be turned on to do cooling. Can you please check whether, in such a situation, the cur_state of cdev0 and cdev1 is 1?
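
Editorial sketch of the check requested above: it reads a zone's temperature, lists the trip points whose temperature has been reached, and prints cur_state for each cooling device bound to the zone. It assumes the usual thermal sysfs layout (temp and trip_point_N_temp in millidegrees Celsius, cdevN links inside the zone directory). The function name zone_report and its argument are hypothetical.

```shell
#!/bin/sh
# Sketch: report which trip points a zone has reached and the state of its
# bound cooling devices. zone_report is an illustrative name only.
zone_report() {
    zone="${1:-/sys/class/thermal/thermal_zone0}"
    temp=$(cat "$zone/temp")    # millidegrees Celsius
    echo "temp=$temp"
    for f in "$zone"/trip_point_*_temp; do
        [ -r "$f" ] || continue
        trip=$(cat "$f")
        if [ "$temp" -ge "$trip" ]; then
            echo "reached: $(basename "$f")=$trip"
        fi
    done
    # cdevN entries link to cooling devices; cdevN_trip_point files are skipped
    # because they have no cur_state underneath them.
    for c in "$zone"/cdev[0-9]*; do
        [ -r "$c/cur_state" ] || continue
        echo "$(basename "$c"): cur_state=$(cat "$c/cur_state")"
    done
}
```

If a trip point shows up as reached while the matching cdev still reports cur_state=0, the cooling device is not being switched on as expected.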
Comment 10 Aaron Lu 2015-02-26 09:03:44 UTC
BTW, the cur_state for cdev0 should be:
/sys/class/thermal/thermal_zone0/cdev0/cur_state

Please also attach:
ls -l /sys/class/thermal/thermal_zone*/*

Is the temperature of those thermal zones constant no matter what the load is?
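
Editorial sketch of one way to answer the question above: sample a zone's temp file a few times while a load is running and report whether the value ever differed from the first reading. The function name watch_temp and its arguments (zone directory, sample count, interval in seconds) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: sample a thermal zone's temperature and report whether it changed.
# watch_temp and its arguments are illustrative names only.
watch_temp() {
    zone="${1:-/sys/class/thermal/thermal_zone0}"
    count="${2:-10}"       # number of samples
    interval="${3:-2}"     # seconds between samples
    first=$(cat "$zone/temp")
    changed=no
    i=0
    while [ "$i" -lt "$count" ]; do
        now=$(cat "$zone/temp")
        echo "$(basename "$zone"): $now"
        if [ "$now" != "$first" ]; then
            changed=yes
        fi
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    echo "changed=$changed"
}
```

For example, "watch_temp /sys/class/thermal/thermal_zone0 30 2" run during a compile job would take about a minute; a final "changed=no" on a zone under load would point to a stuck reading.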
Comment 11 Zhang Rui 2015-03-02 03:30:23 UTC
(In reply to Radek Podgorny from comment #0)
> when booted cold, "acpi -tc" lists thermal 0 as 45 degrees and stays that
> way. on hot boot, the temp is 70 and never changes. other thermal zones
> change (not sure about the correctness of the values, though).
> 
I think the root cause is the bogus temperature.

> i've bisected the problem to commit 6ab3430129e258ea31dd214adf1c760dfafde67a.
>

This is interesting. Can you please double-check whether the kernel built from "git checkout 6ab3430129e258ea31dd214adf1c760dfafde67a" has this problem, and whether reverting 6ab3430129e258ea31dd214adf1c760dfafde67a then fixes it?
Comment 12 Radek Podgorny 2015-03-10 21:45:59 UTC
...what the heck?!? it looks like i've bisected this to the wrong revision. i've just checked again and both 6ab3430129e258ea31dd214adf1c760dfafde67a and 7be180cc7a0c5768a984126d9468afc82dcf93a2 now seem to be behaving correctly.

the tripping temperature still has to be different or something - or i'm that stupid (i hope). :-( ...i'll try to investigate this further as i'm sure i did the bisection correctly the first time.

anyway, as a sidenote, everything seems to be fixed on my stock 3.18.6-1-ARCH kernel.
Comment 13 Zhang Rui 2015-03-14 09:15:53 UTC
Closing the bug report because the problem is gone in 3.18.6.
Please feel free to re-open it if you can reproduce the problem again in the latest upstream kernel.
Comment 14 Radek Podgorny 2015-03-14 11:42:18 UTC
hmm, i was too quick. i've just hit the problem again on 3.18.6. ...still, not reopening for now as this behaves completely randomly for me and each boot is different.