Bug 8913 - overheating due to fan off - Averatec 2371
Summary: overheating due to fan off - Averatec 2371
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Thermal
Hardware: All Linux
Importance: P1 normal
Assignee: Len Brown
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-20 08:40 UTC by Cal Peake
Modified: 2007-12-09 22:26 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.23-rc3
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
acpidump output (131.47 KB, text/plain)
2007-08-20 08:41 UTC, Cal Peake

Description Cal Peake 2007-08-20 08:40:05 UTC
Most recent kernel where this bug did not occur: unknown
Distribution: Slackware 12.0
Hardware Environment: Averatec 2371 laptop
Software Environment: N/A
Problem Description:

My laptop will occasionally (usually on a cold boot) shut off because the fan doesn't spin up and the CPU hits its thermal limits. If I'm lucky I'll notice (or remember to check) and I can reboot and usually get it to work as it should. Typically, booting into that popular proprietary operating system before booting into Linux makes things work as desired.

Steps to reproduce:

Happens seemingly at random. Often on a cold boot but that doesn't always have to be the case.

I came across this problem before with my previous laptop, an Averatec 3270. It was only temporary and was caused by some changes to the EC polling mode that went in during the -rc phase but were reverted before -final was released.

Please let me know what I can provide to help debug this.
Comment 1 Cal Peake 2007-08-20 08:41:20 UTC
Created attachment 12459
acpidump output

output from acpidump
Comment 2 Len Brown 2007-08-24 00:35:55 UTC
please paste the output from more /proc/acpi/thermal_zone/*/* | cat
and the same for /proc/acpi/fan/*/*

Does this output change as the system heats up?
Comment 3 Cal Peake 2007-08-24 13:08:07 UTC
Hi Len,

Here's the output for thermal_zone. Nothing gets registered for fan, either in dmesg or in /proc/acpi/fan. The only change is that .../THRM/temperature goes up as expected.

Thanks! -Cal

::::::::::::::
/proc/acpi/thermal_zone/THRM/cooling_mode
::::::::::::::
<setting not supported>
::::::::::::::
/proc/acpi/thermal_zone/THRM/polling_frequency
::::::::::::::
polling frequency:       1 seconds
::::::::::::::
/proc/acpi/thermal_zone/THRM/state
::::::::::::::
state:                   ok
::::::::::::::
/proc/acpi/thermal_zone/THRM/temperature
::::::::::::::
temperature:             44 C
::::::::::::::
/proc/acpi/thermal_zone/THRM/trip_points
::::::::::::::
critical (S5):           120 C
hot (S4):                110 C
passive:                 90 C: tc1=2 tc2=1 tsp=100 devices=P001 P002
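
For anyone scripting against this output, here is a minimal Python sketch of parsing the trip_points listing above. The function name parse_trip_points is hypothetical, and the line format is assumed to match only what was pasted in this comment; other machines may print additional fields.

```python
import re

def parse_trip_points(text):
    """Map trip point name -> temperature in degrees C.

    Assumes lines shaped like 'name (state): <int> C' or
    'name: <int> C: extra fields', as in the paste above.
    """
    points = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+)(?:\s*\([^)]*\))?:\s*(\d+)\s*C\b", line)
        if m:
            points[m.group(1)] = int(m.group(2))
    return points

# Sample copied from the trip_points output in this comment.
sample = """\
critical (S5):           120 C
hot (S4):                110 C
passive:                 90 C: tc1=2 tc2=1 tsp=100 devices=P001 P002
"""

print(parse_trip_points(sample))
```

Only the leading temperature of each line is extracted; the tc1/tc2/tsp values and the device list on the passive line are ignored in this sketch.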
Comment 4 Len Brown 2007-08-24 15:02:46 UTC
I'm excited to find this BIOS exports a _TZP --
a rare feature indeed.  And I'm pleased to find
the _TZP=10 properly indicated as a 1.0 second polling
frequency on this thermal zone.

It would be interesting to know whether that polling is really
necessary. If you echo 0 > polling_frequency, kill acpid,
and cat /proc/acpi/event, do you see any thermal events
as the system heats -- particularly if you can get _TMP
to cross 90?

But back to the problem at hand, which is no fan
and a thermal shutdown.

        ThermalZone (THRM)
        {
            Name (_TZP, 0x0A)
            Method (_CRT, 0, NotSerialized)
            {
                Return (KELV (ACRT))
            }

            Method (_HOT, 0, NotSerialized)
            {
                Return (KELV (AHOT))
            }

            Method (KELV, 1, NotSerialized)
            {
                And (Arg0, 0xFF, Local0)
                Multiply (Local0, 0x0A, Local0)
                Add (Local0, 0x0AAC, Local0)
                Return (Local0)
            }

            Method (_TMP, 0, NotSerialized)
            {
                RTMP ()
                If (LEqual (ATSP, Zero))
                {
                    Return (KELV (0x40))
                }
                Else
                {
                    Return (KELV (ATMP))
                }
            }

            Name (_PSL, Package (0x02)
            {
                \_PR.P001,
                \_PR.P002
            })
            Method (_TSP, 0, NotSerialized)
            {
                Multiply (ATSP, 0x0A, Local0)
                Return (Local0)
            }

            Method (_TC1, 0, NotSerialized)
            {
                Return (TC1)
            }

            Method (_TC2, 0, NotSerialized)
            {
                Return (TC2)
            }

            Method (_PSV, 0, NotSerialized)
            {
                Return (KELV (APSV))
            }
        }
    }
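
As a worked illustration of the KELV arithmetic in the excerpt above, here is a small Python sketch. It assumes (this is not stated in the DSDT itself) that the EC fields ACRT, AHOT, APSV and ATMP hold whole degrees Celsius, which is consistent with the 120/110/90 C trip points reported under /proc. The function names are hypothetical.

```python
def kelv(raw):
    """Mirror of the KELV method: a Celsius byte -> tenths of a kelvin."""
    # And (Arg0, 0xFF), Multiply by 0x0A, Add 0x0AAC (2732, i.e. 273.2 K)
    return (raw & 0xFF) * 10 + 0x0AAC

def deci_kelvin_to_celsius(dk):
    """Convert tenths of a kelvin back to whole degrees C."""
    return (dk - 2732) // 10

# Trip points as reported by /proc, plus the 0x40 fallback used by _TMP.
for label, celsius in [("passive", 90), ("hot", 110),
                       ("critical", 120), ("_TMP fallback", 0x40)]:
    dk = kelv(celsius)
    print(label, hex(dk), deci_kelvin_to_celsius(dk), "C")

# _TZP is expressed in tenths of a second, so 0x0A is a 1.0 s period.
tzp = 0x0A
print("polling period:", tzp / 10, "s")
```

The 0x40 fallback in _TMP therefore corresponds to a reported 64 C whenever ATSP reads zero.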

There are no _ACx trip points for active (fan) cooling,
and there are no PNP0C0B fan devices in the DSDT or SSDT.
That is, there is no ACPI fan control on this system.
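
The same check can be repeated on other machines with a short script. The following Python sketch assumes the DSDT and SSDT have already been disassembled to ASL text (for example with iasl -d); the function name find_active_cooling is hypothetical, and the sample is a condensed copy of the excerpt above rather than the full table.

```python
import re

def find_active_cooling(dsl_text):
    """Return (sorted _ACx names found, whether a PNP0C0B fan device appears)."""
    ac_names = sorted(set(re.findall(r"\b_AC[0-9]\b", dsl_text)))
    has_fan = "PNP0C0B" in dsl_text
    return ac_names, has_fan

# Condensed from the ThermalZone excerpt in this comment.
sample = """
ThermalZone (THRM)
{
    Name (_TZP, 0x0A)
    Method (_CRT, 0, NotSerialized) { Return (KELV (ACRT)) }
    Method (_HOT, 0, NotSerialized) { Return (KELV (AHOT)) }
    Method (_PSV, 0, NotSerialized) { Return (KELV (APSV)) }
}
"""

print(find_active_cooling(sample))
```

A plain text search like this can miss objects defined in tables that were not dumped, so an empty result is a hint rather than proof.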

Can you confirm that you see the same problem if booted
with acpi=off, or at least with CONFIG_ACPI_THERMAL=n
or "thermal.off=1"?

It is interesting that booting Windows causes the fan issue to clear.
Perhaps rebooting Linux would also clear the problem?
If it requires Windows, then there may be some platform-specific
hooks in Windows for this machine.  The system doesn't have
PNP0C14, so they would have to be native hooks and not WMI.

Please also confirm that the system still has this issue
when the kernel has CONFIG_HWMON=n, as sometimes the
sensors drivers can interfere with firmware.

Also, when the system fails, can you poll _TMP
to see what temperature is reported just before the shutdown?
It should really enter processor thermal throttling at 90
and that should slow the system down and prevent it from
getting any hotter -- even if the fans are broken.
Also, 120 is a very high critical trip point.
Hard to say if that is calibrated to reality, but most processors
will shut down automatically before they hit 120.
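
One way to capture the last reported temperature before a thermal shutdown is to log it to disk continuously. This is a minimal Python sketch, not a tested tool: the path is the one from comment 3 and only exists on kernels that expose this procfs interface, and the function names and log file name are hypothetical.

```python
import os
import re
import time

TEMP_FILE = "/proc/acpi/thermal_zone/THRM/temperature"  # path from comment 3

def read_temperature(path=TEMP_FILE):
    """Return degrees C from a 'temperature:   44 C' line, or None."""
    with open(path) as f:
        m = re.search(r"temperature:\s*(-?\d+)\s*C", f.read())
    return int(m.group(1)) if m else None

def log_temperature(log_path, samples, interval=1.0, path=TEMP_FILE):
    """Append timestamped readings, syncing each one to disk."""
    with open(log_path, "a") as log:
        for _ in range(samples):
            stamp = time.strftime("%H:%M:%S")
            log.write(f"{stamp} {read_temperature(path)} C\n")
            log.flush()
            os.fsync(log.fileno())  # so the last line survives a hard power-off
            time.sleep(interval)

# Example (hypothetical log name): one reading per second for an hour.
# log_temperature("thrm.log", samples=3600)
```

After an unexpected shutdown, the final line of the log shows the last temperature the thermal zone reported, which can be compared against the 90, 110 and 120 C trip points.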
Comment 5 Fu Michael 2007-10-18 01:34:34 UTC
ping for response from bug reporter.
Comment 6 Cal Peake 2007-10-21 09:06:17 UTC
My apologies. I've changed jobs recently and have had no time to further debug this. I hope to have the info Len requested by the end of next week. Thanks for the ping!
Comment 7 Fu Michael 2007-12-09 22:26:29 UTC
I have to close this bug for insufficient info. Please reopen if this bug still bothers you.
