Created attachment 305976 [details] Output of 'dmesg' on Kernel 6.8

Hi,

I have a Framework 13 AMD, which had 4 thermal zones detected through ACPI, but starting with kernel 6.8 they no longer appear and the following is printed in dmesg:

```
[ 0.630727] ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
[ 0.630738] ACPI: thermal: [Firmware Bug]: No valid trip points!
[ 0.630819] ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
[ 0.630828] ACPI: thermal: [Firmware Bug]: No valid trip points!
[ 0.630904] ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
[ 0.630913] ACPI: thermal: [Firmware Bug]: No valid trip points!
[ 0.630991] ACPI: thermal: [Firmware Bug]: Invalid critical threshold (-274000)
[ 0.631000] ACPI: thermal: [Firmware Bug]: No valid trip points!
```

For comparison, this is 6.7.9:

```
[ 0.632366] ACPI: thermal: Thermal Zone [TZ00] (42 C)
[ 0.632593] ACPI: thermal: Thermal Zone [TZ01] (41 C)
[ 0.632773] ACPI: thermal: Thermal Zone [TZ02] (39 C)
[ 0.632867] ACPI: thermal: Thermal Zone [TZ03] (57 C)
```

Thanks,
Steve
Would be great if you tried to bisect: https://docs.kernel.org/admin-guide/bug-bisect.html
Hi,

Sorry, I had a busy week. Here's the output after bisecting:

```
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[9c8647224e9fabb765019193aa43c054a638f808] ACPI: thermal: Use library functions to obtain trip point temperature values
```

After some debugging, it appears my device reports trip temps of 483.2K (210°C), but the kernel only accepts the range 218K (-55°C) to 448K (175°C), so it treats the value as invalid.

I'll attach a diff to raise the max to 488K (215°C). I was wondering, though, whether that's enough, or whether a value like 3276K (3003°C) would be better: it's just below the signed 16-bit limit, which seems like it could be used as an invalid value on some devices.

Thanks,
Steve
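For illustration, the range check described in the comment above can be sketched as follows. This is a minimal sketch in Python, not the kernel's code: the function and constant names are hypothetical, the values are assumed to be in tenths of a kelvin, and the -274000 sentinel is simply the number shown in the dmesg output from the report. The bounds are the 218K and 448K limits mentioned above, with 488K as the proposed new maximum.

```python
# Hypothetical names throughout; only the numeric limits come from the report.
ABSOLUTE_ZERO_MILLICELSIUS = -273150   # 0 K expressed in millidegrees Celsius
INVALID_MILLICELSIUS = -274000         # value printed in the 6.8 dmesg output


def decikelvin_to_millicelsius(dk: int) -> int:
    """Convert tenths of a kelvin to millidegrees Celsius."""
    return dk * 100 + ABSOLUTE_ZERO_MILLICELSIUS


def trip_temp_millicelsius(dk: int, min_dk: int = 2180, max_dk: int = 4480) -> int:
    """Return the trip temperature, or the invalid sentinel if out of range."""
    if min_dk <= dk <= max_dk:
        return decikelvin_to_millicelsius(dk)
    return INVALID_MILLICELSIUS


# 483.2 K is above the 448 K limit, so it is rejected.
print(trip_temp_millicelsius(4832))
# With the limit raised to 488 K, the same reading is accepted.
print(trip_temp_millicelsius(4832, max_dk=4880))
```

Under these assumptions, the reported 483.2K trip point is rejected with the current upper bound and accepted once the bound is raised to 488K, which matches the behaviour described in the comment.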
Created attachment 306004 [details] Raise the max temperature of ACPI trip temps to 488K (215°C)
Created attachment 306005 [details] Raise the max temperature of ACPI trip temps to 488K (215°C)

Sorry, I thought I should make it a proper patch rather than a diff.
Let me add this to the regression tracker to ensure it does not fall through the cracks:

#regzbot introduced: 9c8647224e9fabb765019193a
#regzbot title: ACPI: thermal_lib: no ACPI Thermal Zones anymore
#regzbot fix: ACPI: thermal_lib: Continue registering thermal zones even if trip points fail validation
#regzbot monitor: https://lore.kernel.org/all/SY4P282MB3063EE2CC37BD0EF2318B746C5362@SY4P282MB3063.AUSP282.PROD.OUTLOOK.COM/
*** Bug 218652 has been marked as a duplicate of this bug. ***
Created attachment 306063 [details] Test patch for 6.8.x

I tried Mario's patch from bug 218652; unfortunately it doesn't apply cleanly to 6.8.x (it looks like this code changed on master only a week ago). I made a corresponding patch for 6.8.x, and with this patch the sensors are back (though obviously the temperature thresholds are still bogus).

Mario's patch will probably work correctly on master, but I didn't want to run a true bleeding-edge kernel in case other things are broken differently.
Quentin, can you try Stephen's suggestion above in comment 4 instead? I think that's more desirable if that works instead.
Patch: https://patchwork.kernel.org/project/linux-acpi/patch/SY4P282MB3063A002007A252337A416DEC5382@SY4P282MB3063.AUSP282.PROD.OUTLOOK.COM/
(In reply to Mario Limonciello (AMD) from comment #8)
> Quentin, can you try Stephen's suggestion above in comment 4 instead?
>
> I think that's more desirable if that works instead.

To raise the limit a bit? That might fix the Framework 16, but the same problem would exist on any other laptop that has invalid limits. Why is that more desirable?

In any event, it looks like Rafael's patch is queued up for 6.9.