Bug 179981
Summary: | coretemp stops reporting new temperatures - Avaton - Intel(R) Atom(TM) CPU C2758 | ||
---|---|---|---|
Product: | Power Management | Reporter: | Alex Forencich (alex) |
Component: | Thermal | Assignee: | Srinivas Pandruvada (srinivas.pandruvada) |
Status: | CLOSED INSUFFICIENT_DATA | ||
Severity: | normal | CC: | lenb, rui.zhang, yu.c.chen |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.8.2 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Alex Forencich
2016-10-22 17:26:43 UTC
Just updated to kernel version 4.10.1; no change.

The temperatures reported by the coretemp driver are read directly from the MSR, so this sounds like a hardware issue to me. Please attach the turbostat output as well when the problem is reproduced.

sensors output:

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +39.0°C (high = +98.0°C, crit = +98.0°C)
Core 1: +39.0°C (high = +98.0°C, crit = +98.0°C)
Core 2: +38.0°C (high = +98.0°C, crit = +98.0°C)
Core 3: +38.0°C (high = +98.0°C, crit = +98.0°C)
Core 4: +38.0°C (high = +98.0°C, crit = +98.0°C)
Core 5: +38.0°C (high = +98.0°C, crit = +98.0°C)
Core 6: +36.0°C (high = +98.0°C, crit = +98.0°C)
Core 7: +36.0°C (high = +98.0°C, crit = +98.0°C)

turbostat output, with turbostat.c edited to force no_MSR_MISC_PWR_MGMT to 1 to avoid an I/O error while reading MSR 0x1aa:

$ sudo ./turbostat --debug
turbostat version 17.04.12 - Len Brown <lenb@kernel.org>
CPUID(0): GenuineIntel 11 CPUID levels; family:model:stepping 0x6:4d:8 (6:77:8)
CPUID(1): SSE3 MONITOR - EIST TM2 TSC MSR ACPI-TM TM
CPUID(6): APERF, No-TURBO, DTS, No-PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu5: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO)
CPUID(7): No-SGX
SLM BCLK: 100.0 Mhz
RAPL: 2185 sec. Joule Counter Range, at 30 Watts
cpu5: MSR_PLATFORM_INFO: 0xc0080001800
12 * 100.0 = 1200.0 MHz max efficiency frequency
24 * 100.0 = 2400.0 MHz base frequency
cpu5: MSR_IA32_POWER_CTL: 0x00000000 (C1E auto-promotion: DISabled)
cpu5: MSR_TURBO_RATIO_LIMIT: 0x00000000
cpu5: MSR_PKG_CST_CONFIG_CONTROL: 0x0000840e (locked: pkg-cstate-limit=14: pc6)
cpu5: POLL: CPUIDLE CORE POLL IDLE
cpu5: C1: MWAIT 0x00
cpu5: C6: MWAIT 0x51
cpu5: cpufreq driver: acpi-cpufreq
cpu5: cpufreq governor: schedutil
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000004 (custom)
cpu0: MSR_RAPL_POWER_UNIT: 0x000a1003 (0.125000 Watts, 0.000015 Joules, 0.000977 sec.)
cpu0: MSR_PKG_POWER_LIMIT: 0x468bb8005b89c4 (UNlocked)
cpu0: PKG Limit #1: ENabled (312.500000 Watts, 10.000000 sec, clamp ENabled)
cpu0: PKG Limit #2: ENabled (375.000000 Watts, 0.009766* sec, clamp DISabled)
cpu0: MSR_PP0_POWER_LIMIT: 0x00020000 (UNlocked)
cpu0: Cores Limit: DISabled (0.000000 Watts, 0.001953 sec, clamp DISabled)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x00620000 (98 C)
cpu0: MSR_IA32_THERM_STATUS: 0x883b0000 (39 C +/- 1)
cpu0: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu1: MSR_IA32_THERM_STATUS: 0x883b0000 (39 C +/- 1)
cpu1: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu2: MSR_IA32_THERM_STATUS: 0x883c0000 (38 C +/- 1)
cpu2: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu3: MSR_IA32_THERM_STATUS: 0x883c0000 (38 C +/- 1)
cpu3: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu4: MSR_IA32_THERM_STATUS: 0x883c0000 (38 C +/- 1)
cpu4: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu5: MSR_IA32_THERM_STATUS: 0x883c0000 (38 C +/- 1)
cpu5: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu6: MSR_IA32_THERM_STATUS: 0x883e0000 (36 C +/- 1)
cpu6: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
cpu7: MSR_IA32_THERM_STATUS: 0x883e0000 (36 C +/- 1)
cpu7: MSR_IA32_THERM_INTERRUPT: 0x000a0507 (88 C, 93 C)
Core  CPU  Avg_MHz  Busy%  Bzy_MHz  TSC_MHz  IRQ   SMI  C1   C6    C1%   C6%    CPU%c1  CPU%c6  CoreTmp  Pkg%pc3  Pkg%pc6  PkgWatt  CorWatt
-     -    33       1.74   1900     2400     7395  0    223  7435  0.10  98.20  0.38    97.88   39       0.00     0.00     0.00     0.00
0     0    24       1.25   1900     2400     719   0    23   710   0.21  98.57  0.44    98.31   39       0.00     0.00     0.00     0.00
1     1    29       1.54   1900     2400     1025  0    18   811   0.05  98.44  0.29    98.16   39
2     2    33       1.76   1900     2400     1010  0    15   1191  0.02  98.27  0.34    97.90   38
3     3    22       1.14   1900     2400     979   0    21   957   0.06  98.84  0.32    98.54   38
4     4    27       1.40   1900     2400     629   0    36   838   0.02  98.62  0.28    98.33   38
5     5    28       1.46   1900     2400     749   0    36   1010  0.02  98.57  0.32    98.22   38
6     6    72       3.78   1900     2400     1233  0    30   925   0.26  96.00  0.57    95.65   36
7     7    30       1.58   1900     2400     1051  0    44   993   0.16  98.31  0.49    97.93   36

(In reply to Alex Forencich from comment #0)
>
> cpuinfo (1 core out of 8):
>
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 77
> model name : Intel(R) Atom(TM) CPU C2758 @ 2.40GHz

#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */

This is an Avaton platform. The turbostat output is consistent with the coretemp driver output. It seems that the real problem is that the MSR stops updating...

Please:
1. run "turbostat --debug --out turbostat.log"
2. stress the CPU to make sure the temperature rises
3. quit turbostat and attach the turbostat.log here

Then we can check whether the other MSRs are updated properly.

Bug closed because there is no response from the bug reporter. Please feel free to reopen it if you can provide the information required in comment #6.
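
For anyone picking this up later, the temperatures in the turbostat log can be cross-checked against the raw register values by hand. The following is a minimal sketch, not the coretemp or turbostat implementation. It assumes the usual Intel layout: TjMax in bits 23:16 of MSR_IA32_TEMPERATURE_TARGET, and in MSR_IA32_THERM_STATUS a "reading valid" flag in bit 31, the resolution in bits 30:27, and the digital readout (degrees below TjMax) in bits 22:16. The function names are hypothetical, and `readout_is_stuck` is only an illustrative helper for comparing raw samples taken at different times.

```python
from typing import NamedTuple, Sequence


class ThermStatus(NamedTuple):
    valid: bool        # bit 31: reading valid flag
    resolution_c: int  # bits 30:27: resolution in degrees C
    readout: int       # bits 22:16: degrees C below TjMax
    temp_c: int        # TjMax minus readout


def tjmax_from_target(temperature_target: int) -> int:
    """Extract TjMax (degrees C) from a raw MSR_IA32_TEMPERATURE_TARGET value."""
    return (temperature_target >> 16) & 0xFF


def decode_therm_status(therm_status: int, tjmax_c: int) -> ThermStatus:
    """Decode a raw MSR_IA32_THERM_STATUS value (assumed bit layout, see above)."""
    readout = (therm_status >> 16) & 0x7F
    return ThermStatus(
        valid=bool((therm_status >> 31) & 1),
        resolution_c=(therm_status >> 27) & 0xF,
        readout=readout,
        temp_c=tjmax_c - readout,
    )


def readout_is_stuck(samples: Sequence[int]) -> bool:
    """True if at least two raw samples were given and all share one readout field."""
    readouts = {(s >> 16) & 0x7F for s in samples}
    return len(samples) > 1 and len(readouts) == 1


if __name__ == "__main__":
    # Raw values copied from the turbostat --debug output in this report.
    tjmax = tjmax_from_target(0x00620000)
    for cpu, raw in [(0, 0x883B0000), (2, 0x883C0000), (6, 0x883E0000)]:
        s = decode_therm_status(raw, tjmax)
        print(f"cpu{cpu}: {s.temp_c} C +/- {s.resolution_c} (valid={s.valid})")
```

With the values from this report, the sketch gives 39, 38 and 36 C for cpu0, cpu2 and cpu6, matching what turbostat and sensors show. If the readout field stays identical across samples taken before and during a CPU stress run, that would support the suspicion that the MSR itself has stopped updating rather than the driver misreporting it.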