Most recent kernel where this bug did not occur: unknown
Distribution: Slackware 12.0
Hardware Environment: Averatec 2371 laptop
Software Environment: N/A

Problem Description:
My laptop will occasionally (usually on a cold boot) shut off because the fan doesn't spin up and the thermal limits of the CPU are hit. If I'm lucky I'll notice (or remember to notice), and I can reboot and usually get it to work as it should. Typically, booting into that popular proprietary operating system before booting to Linux will make things work as desired.

Steps to reproduce:
Happens seemingly at random. Often on a cold boot, but that doesn't always have to be the case.

I came across this problem before with my previous laptop, an Averatec 3270. It was only temporary and was caused by some changes to the EC polling mode that went in during the -rc phase but were reverted before -final was released.

Please let me know what I can provide to help debug this.
Created attachment 12459
acpidump output

output from acpidump
Please paste the output from

    more /proc/acpi/thermal_zone/*/* | cat

and the same for /proc/acpi/fan/*/*

Does this output change as the system heats up?
Hi Len,

Here's the output for thermal_zone. Nothing gets registered for fan, either in dmesg or in /proc/acpi/fan. The only change is that .../THRM/temperature goes up as expected.

Thanks!
-Cal

::::::::::::::
/proc/acpi/thermal_zone/THRM/cooling_mode
::::::::::::::
<setting not supported>
::::::::::::::
/proc/acpi/thermal_zone/THRM/polling_frequency
::::::::::::::
polling frequency: 1 seconds
::::::::::::::
/proc/acpi/thermal_zone/THRM/state
::::::::::::::
state: ok
::::::::::::::
/proc/acpi/thermal_zone/THRM/temperature
::::::::::::::
temperature: 44 C
::::::::::::::
/proc/acpi/thermal_zone/THRM/trip_points
::::::::::::::
critical (S5): 120 C
hot (S4): 110 C
passive: 90 C: tc1=2 tc2=1 tsp=100 devices=P001 P002
I'm excited to find this BIOS exports a _TZP -- a rare feature indeed. And I'm pleased to find the _TZP=10 properly indicated as a 1.0 second polling frequency on this thermal zone.

It would be interesting to know if that polling is really necessary: if you echo 0 > polling_frequency, kill acpid, and cat /proc/acpi/event, do you see any thermal events as the system heats -- particularly if you can get _TMP to cross 90?

But back to the problem at hand, which is no fan and a thermal shutdown.

    ThermalZone (THRM)
    {
        Name (_TZP, 0x0A)
        Method (_CRT, 0, NotSerialized)
        {
            Return (KELV (ACRT))
        }

        Method (_HOT, 0, NotSerialized)
        {
            Return (KELV (AHOT))
        }

        Method (KELV, 1, NotSerialized)
        {
            And (Arg0, 0xFF, Local0)
            Multiply (Local0, 0x0A, Local0)
            Add (Local0, 0x0AAC, Local0)
            Return (Local0)
        }

        Method (_TMP, 0, NotSerialized)
        {
            RTMP ()
            If (LEqual (ATSP, Zero))
            {
                Return (KELV (0x40))
            }
            Else
            {
                Return (KELV (ATMP))
            }
        }

        Name (_PSL, Package (0x02)
        {
            \_PR.P001,
            \_PR.P002
        })
        Method (_TSP, 0, NotSerialized)
        {
            Multiply (ATSP, 0x0A, Local0)
            Return (Local0)
        }

        Method (_TC1, 0, NotSerialized)
        {
            Return (TC1)
        }

        Method (_TC2, 0, NotSerialized)
        {
            Return (TC2)
        }

        Method (_PSV, 0, NotSerialized)
        {
            Return (KELV (APSV))
        }
    }
}

There are no _ACx trip points for active (fan) cooling, and there are no PNP0C0B fan devices in the DSDT or SSDT. In other words, there is no ACPI fan control on this system.

Can you confirm that you see the same problem if booted with acpi=off, or at least with CONFIG_ACPI_THERMAL=n or "thermal.off=1"?

It is interesting that booting Windows causes the fan issue to clear. Perhaps rebooting Linux would also clear the problem? If it requires Windows, then there may be some platform-specific hooks in Windows for this machine. The system doesn't have PNP0C14, though, so they would have to be native hooks and not WMI.

Please also confirm that the system still has this issue when the kernel has CONFIG_HWMON=n, as sometimes the sensors drivers can interfere with firmware.
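For anyone checking the units: the KELV method in the DSDT excerpt turns a Celsius byte into tenths of a kelvin, and _TZP/_TSP are expressed in tenths of a second. The following is only a rough Python sketch of that arithmetic, not kernel code; the function names are made up for illustration.

```python
# Sketch of the unit conversions used by the THRM thermal zone.
# Function names are hypothetical; only the arithmetic comes from the DSDT.


def kelv(raw):
    """Mirror the DSDT KELV method: (raw & 0xFF) * 0x0A + 0x0AAC."""
    return (raw & 0xFF) * 0x0A + 0x0AAC


def decikelvin_to_celsius(dk):
    """Convert tenths of a kelvin to Celsius (0x0AAC = 2732, i.e. 273.2 K)."""
    return (dk - 2732) / 10.0


def tenths_to_seconds(tenths):
    """Convert an ACPI value in tenths of a second (_TZP, _TSP) to seconds."""
    return tenths / 10.0


if __name__ == "__main__":
    # Fallback value _TMP returns when ATSP is zero: KELV (0x40)
    print(decikelvin_to_celsius(kelv(0x40)))
    # _TZP of 0x0A, shown by the kernel as a 1 second polling frequency
    print(tenths_to_seconds(0x0A))
```

Note that the fallback branch of _TMP, KELV (0x40), corresponds to 64 C, so a reading stuck at exactly 64 C would suggest ATSP is being read as zero.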
Also, when the system fails, can you poll _TMP to see what temperature is reported at the time of the shutdown?

It should really enter processor thermal throttling at 90, and that should slow the system down and prevent it from getting any hotter -- even if the fans are broken.

Also, 120 is a very high critical trip point. Hard to say if that is calibrated to reality, but most processors will shut down automatically before they hit 120.
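One possible way to capture the last reading before a thermal shutdown is to poll the temperature file and sync every sample to disk. This is only a sketch: the path is the one from the /proc output earlier in this report, and the function names, the default log name thrm.log, and the one-second interval are assumptions.

```python
import os
import re
import sys
import time

# Path taken from the /proc output in this report; adjust it if the
# thermal zone has a different name on your machine.
TEMP_FILE = "/proc/acpi/thermal_zone/THRM/temperature"


def parse_temperature(text):
    """Return the integer Celsius value from a line such as
    'temperature: 44 C', or None if the text does not match."""
    match = re.search(r"temperature:\s*(-?\d+)\s*C", text)
    return int(match.group(1)) if match else None


def log_temperature(temp_file, log_file, interval=1.0, samples=None):
    """Append 'unix_time celsius' lines to log_file, syncing each line to
    disk so the last reading survives an abrupt power-off.
    samples=None means run until interrupted."""
    count = 0
    with open(log_file, "a") as log:
        while samples is None or count < samples:
            with open(temp_file) as f:
                celsius = parse_temperature(f.read())
            log.write("%d %s\n" % (time.time(), celsius))
            log.flush()
            os.fsync(log.fileno())
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical default log name; pass another path as the first argument.
    log_temperature(TEMP_FILE, sys.argv[1] if len(sys.argv) > 1 else "thrm.log")
```

After the machine shuts itself off, the last line of the log should show roughly what _TMP reported just before the failure, and whether it ever crossed the 90 C passive trip point.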
ping for response from bug reporter.
My apologies. I've changed jobs recently and have had no time to further debug this. I hope to have the info Len requested by the end of next week. Thanks for the ping!
I have to close this bug for insufficient info. Please reopen if this bug still bothers you...