Bug 217548

Summary: thinkpad_acpi: System freeze on Thinkpad P14s Gen3 AMD Machine Type 21J5
Product: Platform Specific/Hardware
Component: x86-64
Reporter: Simon Schröter (simon.schroeter)
Assignee: platform_x86_64 (platform_x86_64)
Status: NEW
Severity: normal
Priority: P3
CC: a5151b, kernel, piotr.socha
Hardware: i386
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Simon Schröter 2023-06-13 21:33:23 UTC
I'm using thinkfan to manage my fan speed. On my new ThinkPad P14s Gen3 AMD (Machine Type 21J5), it causes the system to freeze a random amount of time after thinkfan starts. The same problem appears with zcfan, a zero-config fan tool that also uses `thinkpad_acpi`.

This is example log output from thinkfan, covering three boots.

```
-- Boot 40cba4c2651649b4a54e90663138bc5e --
Mai 30 10:43:38 copper systemd[1]: Starting simple and lightweight fan control program...
Mai 30 10:43:38 copper thinkfan[898]: Daemon PID: 899
Mai 30 10:43:38 copper systemd[1]: Started simple and lightweight fan control program.
Mai 30 10:43:38 copper thinkfan[899]: Temperatures(bias): 77(0) -> Fans: level 7
Mai 30 10:43:45 copper thinkfan[899]: Temperatures(bias): 74(0) -> Fans: level 5
Mai 30 10:43:55 copper thinkfan[899]: Temperatures(bias): 60(0) -> Fans: level 3
Mai 30 10:44:05 copper thinkfan[899]: Temperatures(bias): 56(0) -> Fans: level 2
Mai 30 10:44:10 copper thinkfan[899]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 10:46:05 copper thinkfan[899]: Temperatures(bias): 49(0) -> Fans: level 0
-- Boot fd3f2ec8214340a0922ff31e68d09722 --
Mai 30 11:55:58 copper systemd[1]: Starting simple and lightweight fan control program...
Mai 30 11:55:58 copper thinkfan[651]: Daemon PID: 654
Mai 30 11:55:58 copper thinkfan[654]: Temperatures(bias): 86(0) -> Fans: level 127
Mai 30 11:55:58 copper systemd[1]: Started simple and lightweight fan control program.
Mai 30 11:56:14 copper thinkfan[654]: Temperatures(bias): 73(0) -> Fans: level 5
Mai 30 11:56:26 copper thinkfan[654]: Temperatures(bias): 68(0) -> Fans: level 4
Mai 30 11:56:48 copper thinkfan[654]: Temperatures(bias): 64(0) -> Fans: level 3
Mai 30 11:57:13 copper thinkfan[654]: Temperatures(bias): 58(0) -> Fans: level 2
Mai 30 11:58:50 copper thinkfan[654]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 11:59:27 copper thinkfan[654]: Temperatures(bias): 67(0) -> Fans: level 3
Mai 30 12:00:21 copper thinkfan[654]: Temperatures(bias): 59(0) -> Fans: level 2
Mai 30 12:00:46 copper thinkfan[654]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 12:02:11 copper thinkfan[654]: Temperatures(bias): 49(0) -> Fans: level 0
-- Boot 400f452fc3fd4221a1821cb9ed5fea3e --
Mai 30 13:11:10 copper systemd[1]: Starting simple and lightweight fan control program...
Mai 30 13:11:10 copper thinkfan[633]: Daemon PID: 635
Mai 30 13:11:10 copper systemd[1]: Started simple and lightweight fan control program.
Mai 30 13:11:10 copper thinkfan[635]: Temperatures(bias): 86(0) -> Fans: level 127
Mai 30 13:11:26 copper thinkfan[635]: Temperatures(bias): 75(0) -> Fans: level 7
Mai 30 13:11:36 copper thinkfan[635]: Temperatures(bias): 69(0) -> Fans: level 4
Mai 30 13:11:51 copper thinkfan[635]: Temperatures(bias): 63(0) -> Fans: level 3
Mai 30 13:12:13 copper thinkfan[635]: Temperatures(bias): 71(0) -> Fans: level 4
Mai 30 13:12:20 copper thinkfan[635]: Temperatures(bias): 63(0) -> Fans: level 3
Mai 30 13:12:50 copper thinkfan[635]: Temperatures(bias): 59(0) -> Fans: level 2
Mai 30 13:13:57 copper thinkfan[635]: Temperatures(bias): 74(0) -> Fans: level 4
Mai 30 13:13:59 copper thinkfan[635]: Temperatures(bias): 81(0) -> Fans: level 7
Mai 30 13:14:09 copper thinkfan[635]: Temperatures(bias): 66(0) -> Fans: level 4
Mai 30 13:14:21 copper thinkfan[635]: Temperatures(bias): 61(0) -> Fans: level 3
Mai 30 13:14:31 copper thinkfan[635]: Temperatures(bias): 59(0) -> Fans: level 2
Mai 30 13:15:45 copper thinkfan[635]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 13:20:45 copper thinkfan[635]: Temperatures(bias): 49(0) -> Fans: level 0
```
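In the log above, the last thinkfan message of every boot reads `49(0) -> Fans: level 0`. Whether that transition is related to the freeze is not established in this report. As a minimal sketch for checking the same thing on longer journals, the following Python snippet extracts the final temperature and fan level logged in each boot. The function name `last_level_per_boot` and the regular expressions are illustrative assumptions based only on the line format shown above.

```python
import re

# Matches thinkfan lines in the format shown in this report, e.g.
#   "... thinkfan[899]: Temperatures(bias): 77(0) -> Fans: level 7"
LINE_RE = re.compile(r"Temperatures\(bias\): (\d+)\(\d+\) -> Fans: level (\d+)")
# Matches the journalctl boot separator, e.g. "-- Boot 40cb... --"
BOOT_RE = re.compile(r"^-- Boot ([0-9a-f]+) --$")


def last_level_per_boot(log_text):
    """Return {boot_id: (temperature, level)} for the last thinkfan line of each boot."""
    result = {}
    boot = None
    for line in log_text.splitlines():
        m = BOOT_RE.match(line.strip())
        if m:
            boot = m.group(1)
            continue
        m = LINE_RE.search(line)
        if m and boot is not None:
            # Later lines overwrite earlier ones, so the last one per boot wins.
            result[boot] = (int(m.group(1)), int(m.group(2)))
    return result


# Excerpt of the log above (last two thinkfan lines of the first two boots).
SAMPLE = """\
-- Boot 40cba4c2651649b4a54e90663138bc5e --
Mai 30 10:44:10 copper thinkfan[899]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 10:46:05 copper thinkfan[899]: Temperatures(bias): 49(0) -> Fans: level 0
-- Boot fd3f2ec8214340a0922ff31e68d09722 --
Mai 30 12:00:46 copper thinkfan[654]: Temperatures(bias): 54(0) -> Fans: level 1
Mai 30 12:02:11 copper thinkfan[654]: Temperatures(bias): 49(0) -> Fans: level 0
"""

if __name__ == "__main__":
    for boot_id, (temp, level) in last_level_per_boot(SAMPLE).items():
        print(boot_id, temp, level)
```

The same function could be fed the full output of `journalctl -u thinkfan` to see whether the pattern holds across more boots.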

My system is:
- Distribution ArchLinux
- Kernel 6.3.4-arch1-1
- Thinkfan version 1.3.1
- Package thinkfan version 1.3.1-1
- Notebook Thinkpad P14s Gen3 AMD Machine Type 21J5
- BIOS version 1.35 (latest)


I also opened an issue in the thinkfan GitHub repository: https://github.com/vmatare/thinkfan/issues/228
Comment 1 p3sto 2023-08-02 04:42:00 UTC
I've got the exact same problem on a ThinkPad P16s Gen1 (AMD).
Comment 2 Piotr 2023-08-03 19:33:39 UTC
Same problem here
Model: T14 Gen 3 AMD
Type: 21CF
UEFI BIOS: 1.35 (newest)
ECP: 1.25 (newest)

I've tested:
Debian Stable - Kernel 6.1.38
Debian Testing - Kernel 6.3.7
Debian Unstable - Kernel 6.4.4

The same problem occurs when Thinkfan 1.3.1 is enabled, and also when running pwmconfig from the fancontrol package.

Unfortunately, journalctl shows nothing: the whole laptop freezes before anything is written to the journal.
Comment 3 Carl Hjerpe 2023-09-07 21:50:23 UTC
Some additional information

Same hardware as @Piotr, kernel 6.5.1 (NixOS)

I've managed to troubleshoot a bit further by writing my own fan control script in Python. If fan speeds are written too quickly, the kernel locks up hard and instantly. If sound is playing when it hangs, it keeps looping the same second or so of audio; I'm not sure for how long, because it's unbearable.

Writing fan speeds once per second is enough to freeze the kernel.

https://github.com/Lillecarl/nixos/blob/master/fancontrol.py This is the script I used to perform the tests.
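To experiment with how write frequency affects the freeze, one option is a small helper that skips redundant writes and enforces a minimum gap between writes to the fan interface. The sketch below is not the linked script and is not a confirmed workaround. The class name `ThrottledFanWriter` and the 5-second default are arbitrary assumptions, and no interval is known to be safe on the affected machines. It assumes the `thinkpad_acpi` procfs interface at `/proc/acpi/ibm/fan`, which accepts `level <n>` writes when the module is loaded with `fan_control=1`.

```python
import time

# thinkpad_acpi fan interface; writing requires the fan_control=1 module option.
FAN_PATH = "/proc/acpi/ibm/fan"


class ThrottledFanWriter:
    """Write fan levels, skipping repeats and enforcing a minimum gap between writes."""

    def __init__(self, path=FAN_PATH, min_interval=5.0, clock=time.monotonic):
        self.path = path                  # file to write "level <n>" commands to
        self.min_interval = min_interval  # seconds; arbitrary, not a known-safe value
        self.clock = clock                # injectable so the logic can be tested
        self._last_level = None
        self._last_write = None

    def set_level(self, level):
        """Return True if the level was written, False if the write was skipped."""
        now = self.clock()
        if level == self._last_level:
            return False  # nothing changed, so avoid touching the interface
        if self._last_write is not None and now - self._last_write < self.min_interval:
            return False  # too soon after the previous write
        with open(self.path, "w") as f:
            f.write(f"level {level}\n")
        self._last_level = level
        self._last_write = now
        return True
```

Pointing `path` at an ordinary file allows a dry run of the control loop without touching the hardware; running against the real interface on an affected machine may still freeze it.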