Bug 214205 - thermal sensor broken after resume, fan speed at 100% speed on Lenovo ThinkPad Carbon X1 4th Gen
Status: CLOSED UNREPRODUCIBLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: EC
Hardware: Intel Linux
Importance: P1 normal
Assignee: acpi_ec
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-27 20:30 UTC by Stanislav Brabec
Modified: 2022-09-04 14:17 UTC

See Also:
Kernel Version: 5.14.0-rc7.g61596f4
Subsystem:
Regression: No
Bisected commit-id:


Description Stanislav Brabec 2021-08-27 20:30:44 UTC
On the Lenovo ThinkPad Carbon X1 4th Gen, the fan speed often rises to maximum after resume.

Reproducibility varies, but recently it has been about 85%.

The problem never appears on a boot from power-off.

But once it appears after resume, a reboot does not help. A full shutdown is required to get the fan working properly again.

It can be worked around by repeating suspend/resume, sometimes many times, until the first value is no longer -128:
cat /proc/acpi/ibm/thermal | ( read t x y ; echo $x ; if test "$x" = -128 ; then systemctl suspend ; fi )
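A bounded version of this workaround might look like the sketch below. It is an illustration only, not a tested fix. The names THERMAL_FILE, SUSPEND_CMD, MAX_TRIES, SETTLE_SECONDS, first_temp and retry_suspend are made up for the example, and the 15-second settle delay is a guess.

```shell
#!/bin/sh
# Sketch only: suspend again while the first sensor reads -128, up to a limit.
# THERMAL_FILE, SUSPEND_CMD, MAX_TRIES and SETTLE_SECONDS are hypothetical
# knobs for this example; they are not part of thinkpad_acpi or systemd.
THERMAL_FILE="${THERMAL_FILE:-/proc/acpi/ibm/thermal}"
SUSPEND_CMD="${SUSPEND_CMD:-systemctl suspend}"
MAX_TRIES="${MAX_TRIES:-10}"

# Print the first temperature from a line such as
# "temperatures: 47 -128 0 0 0 0 0 0" (second whitespace-separated field).
first_temp() {
    awk '/^temperatures:/ { print $2; exit }' "$THERMAL_FILE"
}

# Return 0 if the sensor reads something other than -128 at the end,
# 1 if it is still -128 after MAX_TRIES suspends, 2 if the file is unreadable.
retry_suspend() {
    [ -r "$THERMAL_FILE" ] || return 2
    tries=0
    while [ "$(first_temp)" = "-128" ] && [ "$tries" -lt "$MAX_TRIES" ]; do
        $SUSPEND_CMD
        tries=$((tries + 1))
        # Assumption: give the machine time to suspend and resume before
        # reading the sensor again.
        sleep "${SETTLE_SECONDS:-15}"
    done
    [ "$(first_temp)" != "-128" ]
}
```

Calling retry_suspend after a resume would then suspend again only while the reading is broken, and give up after MAX_TRIES attempts instead of looping forever.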

With ~1% probability, a different problem appears: the fan does not stop during resume. It may be related.

Another strange ACPI behavior, much less annoying: the computer often forgets wakeup preferences for USB devices during the suspend/wake cycle, e.g. /sys/bus/usb/devices/1-4.2.1.2/power/wakeup goes from enabled to disabled.

The fan problem is caused by the failed resume of the thermal sensor, which returns -128 instead of the correct value.

cat /proc/acpi/ibm/thermal
When it fails:
temperatures:   -128 -128 0 0 0 0 0 0
When OK:
temperatures:	47 -128 0 0 0 0 0 0

The sensors command shows the same problem.

Hardware:
ThinkPad X1 Carbon 4th
20FB002UMC
BIOS N1FET75W (1.49) 05/25/2021

Output of hwprobe:
https://linux-hardware.org/?probe=44d0412dd3

This bug is similar to bug 196129 for 5th Gen.
Comment 1 Luke Midworth 2021-08-30 02:52:56 UTC
I have the ThinkPad X1 Carbon 4th Gen laptop as well, and I experience this same problem about 90% of the time I resume from sleep. The issue does not occur in any other situation.

I'm getting the same output of cat /proc/acpi/ibm/thermal when the bug appears:
temperatures:   -128 -128 0 0 0 0 0 0

Hardware:
20FC000RAU
BIOS N1FUJ42W (1.49)

Output of hwprobe:
https://linux-hardware.org/?probe=b7c59b9453
Comment 2 kernel.org 2021-09-14 09:50:06 UTC
I don't know whether it's the same problem, but I see the same symptoms and the same `/proc/acpi/ibm/thermal` output on an X260 with Firmware version: R02ET74W (1.47 ) and Firmware Revision 1.15.

https://bugzilla.redhat.com/show_bug.cgi?id=1480844 says that it might be a firmware bug.
Comment 3 Gautam Iyer 2021-09-26 14:36:22 UTC
I can confirm I have exactly the same problem. On some resumes the fans start blaring and I see the -128 in /proc/acpi/ibm/thermal that Stanislav Brabec reported.

I have an X1 Yoga (4 years old now...):

Arch 	x86_64
Kernel 	5.14.7-arch1-1
Vendor 	LENOVO
Model 	ThinkPad X1 Yoga 1st 20FQ000RUS
BIOS N1FET75W (1.49 ) (Updated 05/25/2021)

Output of hwprobe:
https://linux-hardware.org/?probe=dd105ba06b
Comment 4 Marc Nijdam 2022-02-19 00:07:49 UTC
The recent ThinkPad Carbon X1 4th Gen firmware update notes (Version 1.52, dated 2022/01/27) state:

...
[Problem fixes]
- Fixed an issue where fan might rotated with max speed due to not reading CPU
  temperature correctly.
...

(Reference: https://download.lenovo.com/pccbbs/mobiles/n1fur45w.txt)

I can confirm that after updating from 1.48 to 1.52 the fan has not once run at max speed after resume.

Kernel: 5.13.0-28-generic, (Ubuntu 20.04.1)

To update, I used the fwupdmgr command.

Make sure the firmware matches your exact model (for the ThinkPad Carbon X1 4th Gen this is 20FB and 20FC).

Download and extract n1ful45w.zip from this page:

https://pcsupport.lenovo.com/ie/en/products/Laptops-and-netbooks/ThinkPad-X-Series-laptops/ThinkPad-X1-Carbon-Type-20FB-20FC/downloads/DS111756

Connect a power cord first. Then install the cab files from the shell with fwupdmgr:

    fwupdmgr install N1FET78W.cab
    reboot

Wait patiently until the BIOS is updated and your system is ready again. Then:

    fwupdmgr install N1FHT36W.cab
    reboot

Wait patiently until the ECP (Embedded Controller Program) is updated. You are done.
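
To check which BIOS version is actually running after the update, something like the following sketch could be used. It assumes the version string has the form quoted in this report, e.g. "N1FET75W (1.49 )". The helper names bios_version_number and bios_at_least are invented for the example, and 1.52 is simply the version whose release notes mention the fix.

```shell
#!/bin/sh
# Sketch only; both helper names are hypothetical.

# Print the dotted number inside the parentheses of a string such as
# "N1FET75W (1.49 )". Prints nothing if the string has no such part.
bios_version_number() {
    printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if version $1 (major.minor) is greater than or equal to version $2.
bios_at_least() {
    have_major=${1%%.*}; have_minor=${1#*.}
    want_major=${2%%.*}; want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}
```

On a running system the input string would typically come from /sys/class/dmi/id/bios_version, for example: bios_at_least "$(bios_version_number "$(cat /sys/class/dmi/id/bios_version)")" 1.52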
Comment 5 Marc Nijdam 2022-02-21 22:20:20 UTC
Regarding my previous post, I cheered too soon: after 3 days of use, the issue has now happened again for the first time.

#####

cat /proc/acpi/ibm/thermal
temperatures:	-128 -128 0 0 0 0 0 0

#####

sensors
iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +24.0°C  

pch_skylake-virtual-0
...

BAT0-acpi-0
...

coretemp-isa-0000
...

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        6912 RPM
CPU:              N/A  
GPU:              N/A  
temp3:         +0.0°C  
... 
temp8:         +0.0°C  

nvme-pci-0500
...

acpitz-acpi-0
...

####

cat /var/log/syslog

...
Feb 21 22:38:16 mni-X1 kernel: [27787.254306] thermal thermal_zone2: failed to read out thermal zone (-61)
...
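
To see how often this message shows up, a log file can be searched for it. The sketch below counts matching lines in a file passed as an argument; the function name count_thermal_read_failures is made up for the example. On Linux, -61 corresponds to ENODATA, i.e. the read returned no data.

```shell
#!/bin/sh
# Sketch only; count_thermal_read_failures is a hypothetical helper name.
# Prints the number of lines in log file $1 that contain the kernel message
# "failed to read out thermal zone".
count_thermal_read_failures() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so "|| true" keeps the function from reporting failure in that case.
    grep -c 'failed to read out thermal zone' "$1" || true
}
```

For example, count_thermal_read_failures /var/log/syslog would print the number of such lines in the current syslog file.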
Comment 6 Marc Nijdam 2022-02-21 22:21:14 UTC
Regarding my previous post, I cheered too soon: after 3 days of use, the issue has now happened again for the first time.

#####

cat /proc/acpi/ibm/thermal
temperatures:	-128 -128 0 0 0 0 0 0

#####

sensors
iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +24.0°C  

pch_skylake-virtual-0
...

BAT0-acpi-0
...

coretemp-isa-0000
...

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        6912 RPM
CPU:              N/A  
GPU:              N/A  
temp3:         +0.0°C  
... 
temp8:         +0.0°C  

nvme-pci-0500
...

acpitz-acpi-0
...

####

cat /var/log/syslog

...
Feb 21 22:38:16 xyz-X1 kernel: [27787.254306] thermal thermal_zone2: failed to read out thermal zone (-61)
...
Comment 7 Stanislav Brabec 2022-05-02 23:58:41 UTC
From my side, there has not been a single occurrence of the problem since I installed BIOS version 1.52 (2022-02-27). At least 100 suspend-resume cycles were performed without the issue reappearing. Before the update, about 85% of resumes had broken thermal values.

So I am tempted to close the bug as RESOLVED.

Is anybody still able to reproduce it with the latest BIOS and the latest Intel Management Engine?
Comment 8 Zhang Rui 2022-06-21 09:14:25 UTC
Thanks for the update.
Bug closed.
Comment 9 Myrdden42 2022-09-04 12:20:56 UTC
Hello, I'm on a Carbon X1 4th Gen with BIOS update 1.53 and kernel 5.19.6.arch1-1, and this issue still exists for me. It happens not only on resume: whenever the fans start to spin up at all, there's about a 1/3 chance that the speed sensor breaks and shows 65536 until the fans are manually stopped for a while. This occurs regardless of whether the BIOS is allowed to control the fans or they are controlled by a program like thinkfan or by hand; for instance, manually setting the fans to their slowest speed still results in them maxing out. The error is the same as in Comment 6; however, my output of /proc/acpi/ibm/fan seems to be normal even when the fans are malfunctioning.
Comment 10 Myrdden42 2022-09-04 14:17:42 UTC
Sorry, there doesn't appear to be a way to edit comments. I meant to say that the output of /proc/acpi/ibm/thermal seems to display the correct CPU temperature even when the fans are malfunctioning.
