Bug 218244 - K10temp does not support AMD Threadripper 5000WX series (Family 19h, model 0x8)
Summary: K10temp does not support AMD Threadripper 5000WX series (Family 19h, model 0x8)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Hardware Monitoring
Hardware: AMD Linux
Importance: P3 normal
Assignee: Jean Delvare
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-08 21:11 UTC by bindkeys
Modified: 2023-12-13 11:51 UTC
CC List: 1 user

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Modified patch (1.18 KB, application/mbox)
2023-12-11 15:55 UTC, Armin Wolf

Description bindkeys 2023-12-08 21:11:10 UTC
lm-sensors does not read anything other than Tctl:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +37.0°C

In https://github.com/torvalds/linux/blob/master/drivers/hwmon/k10temp.c#L451-L477 there is no check for model 0x8.

# cat /sys/devices/system/cpu/modalias
cpu:type:x86,ven0002fam0019mod0008:feature:,0000,<lots of features>

My processor is a 5955WX but I believe model 0x8 stands for all TR 5000WX series (for example a 5975WX: https://linux-hardware.org/?probe=45a8669840&log=cpuid).
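
For readers who want to pull the family and model out of the modalias string programmatically, here is a minimal sketch. The helper name `parse_cpu_modalias` is hypothetical and not part of k10temp or lm-sensors; it assumes the `venXXXXfamXXXXmodXXXX` layout shown in the output above.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of k10temp): extract the CPU family and
 * model from an x86 modalias string such as
 *   cpu:type:x86,ven0002fam0019mod0008:feature:,...
 * Returns 0 on success, -1 if the string does not match that layout.
 */
static int parse_cpu_modalias(const char *modalias,
                              unsigned int *family, unsigned int *model)
{
    unsigned int vendor;

    /*
     * The %4x width limits matter: 'f' and 'a' in "fam" are themselves
     * hex digits, so an unbounded %x would read past the vendor field.
     */
    if (sscanf(modalias, "cpu:type:x86,ven%4xfam%4xmod%4x",
               &vendor, family, model) != 3)
        return -1;

    return 0;
}
```

Fed the modalias line above, this should yield family 0x19 and model 0x8, the pair named in the bug title.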
Comment 1 Artem S. Tashkinov 2023-12-09 15:04:22 UTC
k10temp is quite limited and that's exactly how it works now.

A totally different CPU:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +35.6°C
Comment 2 bindkeys 2023-12-09 20:13:52 UTC
Which CPU is that "totally different" one (family and model; see the cat command in my original post)? Perhaps it is not supported either?

Anyway, I recompiled k10temp by simply adding a case for 0x8 alongside the rest of the Zen3 stuff:

--- linux-6.5.11/drivers/hwmon/k10temp.c        2023-11-08 15:09:07.000000000 +0200
+++ k10linux-6.5.11/drivers/hwmon/k10temp.c 2023-12-09 18:44:44.945290498 +0200
@@ -455,6 +455,7 @@ static int k10temp_probe(struct pci_dev

                switch (boot_cpu_data.x86_model) {
                case 0x0 ... 0x1:       /* Zen3 SP3/TR */
+               case 0x8:               /* Zen3 TR (5000WX) */
                case 0x21:              /* Zen3 Ryzen Desktop */
                case 0x50 ... 0x5f:     /* Green Sardine */
                        data->ccd_offset = 0x154;


Now I can see this, which is probably correct:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +45.0°C
Tccd3:        +36.0°C
Tccd5:        +38.2°C
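
The effect of the one-line change in the diff above can be sketched in user-space C. This is only an illustration, assuming nothing beyond the case labels visible in that hunk; the function name is hypothetical, the real driver handles further models in branches not shown here, and plain comparisons stand in for the GNU `case a ... b` range syntax the kernel uses.

```c
#include <stdbool.h>

/*
 * Hypothetical helper mirroring only the case labels visible in the diff:
 * returns true if a Family 19h model falls in the group that gets
 * ccd_offset = 0x154 once the patch is applied.
 */
static bool model_in_patched_ccd_group(unsigned int model)
{
    return model <= 0x01 ||                     /* Zen3 SP3/TR */
           model == 0x08 ||                     /* Zen3 TR (5000WX), added by the patch */
           model == 0x21 ||                     /* Zen3 Ryzen Desktop */
           (model >= 0x50 && model <= 0x5f);    /* Green Sardine */
}
```

Without the `model == 0x08` line, model 0x8 matches none of these labels, which is consistent with only Tctl being reported before the patch.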
Comment 3 Artem S. Tashkinov 2023-12-09 22:13:40 UTC
Tctl is the highest temp for the entire die; other temp sensors are quite inconsequential.

Please send the patch via email to the maintainer of the hw sensors subsystem.
Comment 4 bindkeys 2023-12-10 11:20:44 UTC
Actually, Tctl is used for fan control and it might have offsets for consistent fan policies across the platform. Tccd is the actual, *real* temperature.

I tried submitting a patch but failed miserably (never done it before, did not get accepted) and I'd rather not embarrass myself any further, nor should I waste the hwmon guys' time with my attempts.

So, if someone could submit a patch with the 0x8 case (as seen in my earlier post), that would be neat.

Here's how it looks under an all-core stress-ng test, just to confirm the values look sane both in idle and under load:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +74.8°C
Tccd3:        +71.5°C
Tccd5:        +75.8°C
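
To compare idle and load readings like the ones above without eyeballing them, the `sensors` output lines can be parsed. The following is a rough sketch, assuming the `Label:   +NN.N°C` line format shown in this report; `parse_sensor_line` is a hypothetical name, not an lm-sensors API.

```c
#include <stdio.h>

/*
 * Hypothetical helper: split a line such as "Tccd3:        +36.0°C" into
 * its label and its temperature in degrees Celsius.
 * Returns 0 on success, -1 for lines that carry no numeric reading
 * (for example "Adapter: PCI adapter" or the chip name line).
 */
static int parse_sensor_line(const char *line, char label[32], double *celsius)
{
    /* %31[^:] copies up to 31 characters before the colon; the space
     * in the format skips the padding in front of the value. */
    if (sscanf(line, "%31[^:]: %lf", label, celsius) != 2)
        return -1;

    return 0;
}
```

Parsing stops at the degree sign, so the unit suffix does not need special handling.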
Comment 5 Armin Wolf 2023-12-11 15:40:12 UTC
No worries, it's not embarrassing to have a patch rejected; this happens regularly.
I will resubmit the patch for you.
Comment 6 Armin Wolf 2023-12-11 15:55:16 UTC
Created attachment 305586 [details]
Modified patch
Comment 7 Armin Wolf 2023-12-11 15:55:36 UTC
Can you check if the modified patch works?
Comment 8 bindkeys 2023-12-11 17:53:35 UTC
(In reply to Armin Wolf from comment #5)
> No worries, it's not embarrassing to have a patch rejected; this happens
> regularly.
> I will resubmit the patch for you.
Thank you very much!

(In reply to Armin Wolf from comment #7)
> Can you check if the modified patch works?
Yes, I can confirm it works. (I actually did compile the modules, instead of just looking at the patch and saying "yep that'll work").
Comment 9 Armin Wolf 2023-12-13 11:42:28 UTC
The patch has been accepted; your CPU should be supported as of kernel 6.8.
Comment 10 Jean Delvare 2023-12-13 11:51:26 UTC
Thanks Armin. Here's a link to the git commit for reference:

https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git/commit/?h=hwmon-next&id=62a991b3fcc9529c1a905ee42a4d860c31ca606a
