Bug 75741
Summary: | HP 2510p fan comes on at boot, never shuts off | | |
---|---|---|---|
Product: | ACPI | Reporter: | Jake Edge (jake) |
Component: | Power-Fan | Assignee: | Rafael J. Wysocki (rjw) |
Status: | CLOSED UNREPRODUCIBLE | | |
Severity: | normal | CC: | auxsvr, lenb, rjw |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 3.15-rc4 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
Jake Edge
2014-05-08 17:36:17 UTC
Perhaps a BIOS setting is related in this case, the one about keeping the fan always on when connected to AC.

Well, that BIOS setting is (and was) "Disabled" (i.e. don't always keep fan on when plugged in), BUT, I did realize I hadn't tested any of this when I wasn't connected to the mains -- and there is a difference in behavior. One thing I noticed is that it is not coming on when it gets to the BIOS, but instead when it gets to GRUB. I must have been mistaken earlier. Anyway, when *not* connected to the mains, the fan comes on at GRUB time, but then goes away on its own after a bit (20-30 seconds or so). I did reconfirm that when it is connected, the fan comes on and stays on (until a sleep) ... maybe that helps narrow it down some?

This seems to be BIOS-related, but perhaps we can work around it somehow. For starters, there is a revert of an ACPI AC commit that might be related to that in principle. Can you please check if https://patchwork.kernel.org/patch/4124871/ (on top of 3.15-rc4) makes any difference?

Ok, some truly weird stuff here (at least to me, it's probably old hat to you guys :) ... I applied that patch and rebuilt (which, incidentally, caused the fans to come on and then eventually go off as the system cooled -- good stuff, thanks! :) ... i tried restarting the system, but it got hung up somewhere and never brought up the KDE login manager ... so I power cycled:

- the fan came on strong (but not full blast) at power on, then throttled back (but not off) at GRUB time, then it slowed down to off somewhere in the boot sequence such that by the time I had put in my password for the encrypted /home, it was off (maybe that helps with where in the sequence it happened? probably not ...) -- so I thought: problem solved

BUT

- i restarted from the KDE menu (a warm reboot I guess) and the problem returned.
The fan came on at GRUB time and never went off (until I slept/resumed)

All of the above is with the mains connected, I tried without the mains and the behavior from before (correct behavior imo) continued (I did notice that the fan seemed to shut down much earlier in the boot process than it did on AC with a cold boot as described above ... it happened very quickly after starting to boot, long before the encrypted fs password prompt.) I am guessing, but don't know for sure, that all of this was true before I applied the patch.

so, what's next?

My experience with those HP boxes is that they have some kind of "memory" or "hysteresis" in case you power cycle them forcibly. That probably is related to the SMM being confused or something like that. Do I understand correctly that the behavior after a cold boot is *always* as expected, while the behavior after a warm reboot is *always* that the fan doesn't do what you'd expect it to do?

Yes. I just confirmed that the patch makes no difference, either. In both cases, the fan comes on after a warm boot and stays on until i sleep/resume. I guess it isn't as easy as a test for is_fan_on() and does_it_need_to_be() at boot time, eh? :) thanks ...

There are a few possibilities there. The first one is that the BIOS is supposed to notify us of temperature changes, but it doesn't. The second one is that it sends the notifications, but we ignore them for some unknown reason. Finally, if the notifications are there and we don't ignore them, it still may appear to the thermal subsystem that the temperature doesn't change. If we are able to figure out which is the case, we may be able to find a remedy (if notifications are not coming, we may need to poll for example etc.). Hopefully, I'll have more time to spend on debugging this next week. Please ping me in case you have some spare cycles. :-)

Well if you have some pointers as to where in the code to look and/or instrument (printk or some such?)
to figure out whether those notifications are being sent and if they are being ignored, I may have some time to poke at it.

It appears that the BIOS is not notifying the thermal subsystem when a warm boot happens. I put some pr_info into acpi_thermal_notify() which does get called at cold boot time (ACPI_THERMAL_NOTIFY_TEMPERATURE followed by ACPI_THERMAL_NOTIFY_THRESHOLDS in the same second both for TZ0), there are a bunch more that get done at resume time (for more of the zones), which is presumably why that sets things straight, but none at all for warm boot. So, we somehow need to poll this BIOS or something? sounds kind of ugly ... can we even detect warm vs. cold boot? Dunno ... I am out of my depth (if I wasn't already :)

It looks like the method we use for rebooting this machine confuses the BIOS and the thermal reporting gets stuck. Maybe we should poke at the EC or something like that to revive it. Does something like switching to battery and back to AC make any difference?

Hmm, I seem to be falling into non-reproducibility, at least with any reliability. I sometimes get no thermal events from the BIOS on a warm boot, but the fan turns off on its own during boot. Sometimes it turns off after a minute or two, even without any thermal events from the BIOS (?). But sometimes a warm boot will trigger thermal events ... there isn't a pattern that I see. So far, I haven't been able to get 10 minutes of fan as I was a few days ago (maybe the house/office was warmer that day or something? -- though why sleep/immediate-resume would 'cool' things off enough seems a little puzzling). One of the times the fan *was* on, unplugging the AC, then plugging it back in did seem to cause the fan to go off. I suspect there is some problem here, but tracking it down may be difficult. I'll try to remember to test it out on a warm day again and see if that makes any difference. Otherwise, I don't think we have much to go on ...
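One of the possibilities raised above is that the thermal subsystem never sees the temperature change. A rough way to check that from userspace, without kernel instrumentation, might be to sample the thermal zones in sysfs after a cold boot and again after a warm reboot. The following is only a sketch: it assumes the ACPI zones show up under /sys/class/thermal as thermal_zone*/temp in millidegrees Celsius, and the helper names (read_zone_temps, changed_zones, watch) are made up for illustration.

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable zone under `base`."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # Assumption: the file holds millidegrees Celsius as an integer.
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            # Skip zones that fail to read or return something unexpected.
            continue
    return temps


def changed_zones(before, after, threshold=1.0):
    """Names of zones whose reading moved by at least `threshold` degrees C."""
    return sorted(
        zone for zone in before
        if zone in after and abs(after[zone] - before[zone]) >= threshold
    )


def watch(interval=5.0, samples=12, base="/sys/class/thermal"):
    """Print one reading per interval and return all samples taken."""
    history = []
    for _ in range(samples):
        history.append(read_zone_temps(base))
        print(history[-1])
        time.sleep(interval)
    return history
```

Running watch() under load after each kind of boot and passing the first and last samples to changed_zones() would show whether the reported temperatures move at all. If they do move while the fan state stays stuck, that points away from the "temperature appears unchanged" case. The ACPI thermal driver also has a documented thermal.tzp polling-rate parameter (in deciseconds); whether it helps on this machine was not tested in this report.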
OK I will close it as "not reproducible", then, and if you find a reliable way to reproduce, please reopen. :-)

On an HP mini 5102, 3.4.33 works fine. Starting from 3.7, if I remember correctly, the fan would work fine on boot and turn to full speed on resume from suspend to RAM. On 3.15-rc5, the fan is off on boot and stays that way no matter what the load is, up to 67 degrees Celsius (I can't make it hotter than that). After resume from suspend-to-RAM, it is on, full speed again. This could be a BIOS bug, but Windows 7 and earlier linux versions work perfectly. Also, HP offers BIOS upgrades that fix problems after resuming from suspend, but the program that does the upgrade is buggy and I'm not willing to do the upgrade for fear it might make things worse.

On 3.15-rc5, this time the fan never starts, even after resume from suspend! The computer reaches 73 °C and temp5 remains 0?!

I tried 3.15-rc5 just to see if anything had changed -- it hadn't for me on the 2510p ... in fact, annoyingly, the 'fan comes on at warm reboot -- doesn't stop until sleep' problem came back -- it happened on first boot of the kernel (not that i think it is specific to rc5), which was warm ... then i cold booted, no fan problems, then warm booted, fan on and stays on ... unplugging AC then plugging back in seemed to make no difference ... Of course, i no longer have those debug prints in ... but could put them back in if we really want to try to track this down -- i'm kind of ambivalent ...

Also, after reboot into 3.4.33, the fan turned on only after lm_sensors or laptop-mode were loaded; even during POST the fan was off, while the system was too hot. Could the following messages on 3.15-rc5 be related to this? 3.4.33 does not display them.
ACPI Warning: SystemIO range 0x00000428-0x0000042f conflicts with OpRegion
ACPI: If an ACPI driver is available for this device, you should use it in
ACPI Warning: SystemIO range 0x00000530-0x0000053f conflicts with OpRegion
ACPI: If an ACPI driver is available for this device, you should use it in
ACPI Warning: SystemIO range 0x00000500-0x0000052f conflicts with OpRegion
ACPI: If an ACPI driver is available for this device, you should use it in
lpc_ich: Resource conflict(s) found affecting gpio_ich

Jake seems to have given up on debugging the 2510p, which is the subject of this bug report, so we'll leave it closed.

auxsvr@gmail.com - if you are having problems with the HP mini 5102, then you should probably file a new report.
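For anyone picking up the HP mini 5102 question in a new report, comparing the kernel logs of 3.4.33 and 3.15-rc5 may be easier with the conflicting I/O ranges pulled out of the dmesg text. A minimal sketch, assuming the warning wording is exactly as quoted above (the messages in this report are truncated, so the pattern matches only the part that is shown); conflicting_ranges and SAMPLE are illustrative names, not part of any tool:

```python
import re

# Matches only the portion of the warning that is quoted in this report.
RANGE_RE = re.compile(
    r"ACPI Warning: SystemIO range 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) "
    r"conflicts with OpRegion"
)


def conflicting_ranges(log_text):
    """Return [(start, end), ...] for each SystemIO conflict warning found."""
    return [(int(a, 16), int(b, 16)) for a, b in RANGE_RE.findall(log_text)]


# The three warnings as they appear in the comment above.
SAMPLE = """\
ACPI Warning: SystemIO range 0x00000428-0x0000042f conflicts with OpRegion
ACPI Warning: SystemIO range 0x00000530-0x0000053f conflicts with OpRegion
ACPI Warning: SystemIO range 0x00000500-0x0000052f conflicts with OpRegion
"""
```

Feeding the saved dmesg output of each kernel to conflicting_ranges() and comparing the two lists would show whether the set of conflicts differs between versions. It says nothing by itself about whether the conflicts are related to the fan behavior.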