Bug 210457
| Summary: | Fan sporadically maxed on wake-up due to unavailable sensor temperature | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Daniel T. (pterion) |
| Component: | Platform_x86 | Assignee: | drivers_platform_x86 (drivers_platform_x86) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | enetor, jwrdegoede, kernel.org, ntran005, pterion, reescf, rui.zhang, s-cvhajmmblfsofmpsh |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 5.4.0-53-generic | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Daniel T.
2020-12-02 15:37:48 UTC
Lenovo ThinkPad X720 is also affected by this, but it is not sporadic, as far as I can tell. Not once since installing my current kernel have I seen normal behaviour following sleep. The problem persists if the machine is rebooted. Only powering off and restarting restores the expected behaviour.

The bug seems to be a regression. See https://bugzilla.kernel.org/show_bug.cgi?id=196129. My machine was not affected by the original bug, but the problem looks to be the same.

Following sleep:

    cat /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon7/temp*
    0 1000 0 0 0 0 0
    cat: /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon7/temp1_input: No such device or address
    cat: /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon7/temp2_input: No such device or address
    0 0 0 0 0 0 0

and

    cat /proc/acpi/ibm/thermal
    temperatures: -128 -128 0 0 0 0 0 0 0 0 1 0 0 0 0 0

Prior to sleep, only the second sensor is missing and acpitz-acpi-0 gives a sane reading. Following, the first goes AWOL and acpitz-acpi-0 is stuck at 48. Removing and reloading thinkpad_acpi makes no difference and, for me, re-sleeping and rewaking makes no difference either.

I should have looked at the date of this report. The problem I'm seeing is new. I installed a new kernel this week and the problem started then. ArchLinux kernel package version is 5.11.4-arch1-1. I did NOT see this bug with 5.10.16.arch1-1.

I tried adding acpi.ec_freeze_events=Y acpi.ec_suspend_yield=Y to my kernel command line and rebooting, but I never got past the screen displaying the command. The machine froze and I had to poweroff. Is there a newer invocation I could try here?

The bug does NOT manifest if I boot linux-lts 5.10.21-1. Following sleep:

    cat /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon6/temp*
    0 1000 0 0 0 0 0
    38000
    cat: /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon6/temp2_input: No such device or address
    0 0 0 0 0 0 0

and the fan is at 0RPM, while acpitz-acpi-0 and thinkpad-isa-0000 temp1 are both +38.0°C.
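For anyone who wants to log the sensor state across suspend/resume cycles rather than eyeballing `cat` output, the check above could be scripted roughly as follows. This is only a sketch, not part of the driver or any official tool: the function names `read_hwmon_temps` and `unavailable_sensors` are made up for illustration, and it assumes that a sensor which fails with "No such device or address" surfaces in Python as an `OSError` when the file is read. Values are returned as reported by sysfs (millidegrees Celsius, so 38000 corresponds to the +38.0°C mentioned above).

```python
import re
from pathlib import Path


def read_hwmon_temps(hwmon_dir):
    """Map sensor index -> raw tempN_input value, or None if unreadable.

    hwmon_dir is a directory such as
    /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon7 (the hwmonN index
    differs between boots and kernels, as the outputs above show).
    """
    readings = {}
    for path in Path(hwmon_dir).glob("temp*_input"):
        match = re.fullmatch(r"temp(\d+)_input", path.name)
        if match is None:
            continue
        index = int(match.group(1))
        try:
            readings[index] = int(path.read_text().strip())
        except (OSError, ValueError):
            # A failed read (e.g. "No such device or address") or
            # non-numeric content is recorded as an unavailable sensor.
            readings[index] = None
    return dict(sorted(readings.items()))


def unavailable_sensors(readings):
    """Return the indices of sensors that could not be read."""
    return [index for index, value in readings.items() if value is None]


if __name__ == "__main__":
    base = Path("/sys/devices/platform/thinkpad_hwmon/hwmon")
    for hwmon_dir in sorted(base.glob("hwmon*")):
        temps = read_hwmon_temps(hwmon_dir)
        print(hwmon_dir.name, temps, "unavailable:", unavailable_sensors(temps))
```

On an affected machine after resume, one would expect sensor 1 to show up in the unavailable list in addition to sensor 2; before sleep (or on the LTS kernel) only sensor 2 should be listed.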
I have the same issue on an X1 Carbon 5th generation (i.e. the lack of temp1 makes the fan run at max speed) and the bug still exists in 5.11.7.arch1-1. Going back to the LTS kernel (5.10.24-1) makes temp1 appear.

This seems like a duplicate of #211313, right?

Probably it's the same issue. However, here we know that the main issue is that the first entry of /proc/acpi/ibm/thermal is erroneously -128 some (or most) of the time (so the fan is just a symptom, not the issue).

    temperatures: -128 -128 0 0 0 0 0 0 0 0 1 0 0 0 0 0

IIRC, the first entry is /sys/devices/platform/thinkpad_hwmon/hwmon/hwmon*/temp1_input. (By the way, the second entry being -128 is fine.)

On my machine, the issue seems to have been caused by a firmware bug, which only manifested in symptoms with the kernel update. That is, the firmware was the same for 3+ years with no issue, the new kernel triggered the bug, but a firmware update seems to have resolved it.

(In reply to cfr from comment #8)
> On my machine, the issue seems to have been caused by a firmware bug, which
> only manifested in symptoms with the kernel update. That is, the firmware
> was the same for 3+ years with no issue, the new kernel triggered the bug,
> but a firmware update seems to have resolved it.

Thank you.

Can the other reporters of this bug please also see if the latest BIOS resolves this? ThinkPad BIOS updates are available on lvfs, so they can be done under Linux through fwupdmgr.

(In reply to Hans de Goede from comment #9)
> (In reply to cfr from comment #8)
> > On my machine, the issue seems to have been caused by a firmware bug, which
> > only manifested in symptoms with the kernel update. That is, the firmware
> > was the same for 3+ years with no issue, the new kernel triggered the bug,
> > but a firmware update seems to have resolved it.
>
> Thank you.
>
> Can the other reporters of this bug please also see if the latest BIOS
> resolves this? ThinkPad BIOS updates are available on lvfs, so they can be
> done under Linux through fwupdmgr.

The situation is NOT resolved (it is the same for me) after the recent firmware/BIOS update.

I'm running X1 Carbon BIOS 1.48 (latest) and experienced this issue on kernel 5.4.0-58-generic. It was also intermittent and did not happen on every resume from lid close/suspend.

After updating to kernel 5.12.12(-arch1-1), the issue still existed, but only after multiple suspend/resume cycles (I couldn't find a deterministic way to reproduce it). After a BIOS update I wasn't able to reproduce it anymore. The BIOS now reports: UEFI BIOS Version: N1MET65W (1.50), Embedded Controller Version: N1MHT31W (1.20), Machine Type Model: 20HQS3KG00. However, I no longer have the laptop, so I can't tell whether the issue returns after running it for a longer time (days, or so).
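Since several reporters could only trigger the problem after multiple suspend/resume cycles, a small check of the first entry in /proc/acpi/ibm/thermal may help when testing a new BIOS or kernel. The sketch below is an illustration only: `parse_thermal`, `first_sensor_missing`, and the `SENSOR_ABSENT` constant are hypothetical names, and it assumes the file keeps the single `temperatures:` line format quoted in this report. Following the discussion above, only the first entry being -128 is treated as the faulty state; the second entry being -128 is ignored.

```python
SENSOR_ABSENT = -128  # value shown in this report for a missing sensor


def parse_thermal(text):
    """Return the integer readings from the 'temperatures:' line."""
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key.strip() == "temperatures":
            return [int(token) for token in rest.split()]
    raise ValueError("no 'temperatures:' line found")


def first_sensor_missing(temps):
    """True if the first entry (temp1 in hwmon terms) reads as absent."""
    return bool(temps) and temps[0] == SENSOR_ABSENT


if __name__ == "__main__":
    with open("/proc/acpi/ibm/thermal") as handle:
        temps = parse_thermal(handle.read())
    state = "MISSING" if first_sensor_missing(temps) else "ok"
    print(f"first sensor: {state} ({temps})")
```

Running this after each resume (for example from a systemd sleep hook, if you use one) would give a simple record of how often the first sensor disappears before and after a firmware update.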