Bug 204807 - Hardware monitoring sensor nct6798d doesn't work unless acpi_enforce_resources=lax is enabled
Summary: Hardware monitoring sensor nct6798d doesn't work unless acpi_enforce_resource...
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Platform_x86
Hardware: All Linux
Importance: P1 high
Assignee: drivers_platform_x86@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-10 18:51 UTC by Artem S. Tashkinov
Modified: 2021-10-14 20:04 UTC
CC List: 53 users

See Also:
Kernel Version: 5.2.11, git
Tree: Mainline
Regression: No


Attachments
acpidump -b (31.30 KB, application/x-xz)
2020-06-29 16:22 UTC, Artem S. Tashkinov
acpidump -b (44.83 KB, application/x-compressed-tar)
2020-08-29 13:48 UTC, myhateisblind
acpidump -b (43.39 KB, application/gzip)
2020-09-06 10:04 UTC, Jaap de Haan
acpidump -b (114.97 KB, application/gzip)
2020-10-09 16:59 UTC, Lars Podszuweit
acpidump -b for 5.9.6 and Asus X570-Plus TUF Gaming (153.33 KB, application/zip)
2020-11-09 14:49 UTC, lemniscattaden
Dmesg after booting kernel 5.9.8 on Asus B550M TUF (86.33 KB, text/plain)
2020-11-18 18:04 UTC, myhateisblind
dmesg after boot 5.9.8 Asus TUF Gaming X570-PLUS (93.29 KB, text/plain)
2020-11-20 21:12 UTC, lemniscattaden
dmesg after boot ASUS PRIME X570-P (108.17 KB, text/plain)
2020-11-21 11:56 UTC, Jaap de Haan
sensors not working acpi conflict (95.26 KB, text/plain)
2020-11-21 12:33 UTC, Facundo
ASUS Prime B460 Plus dmesg (55.86 KB, text/plain)
2020-11-26 11:25 UTC, dflogeras2
dmesg ASUS X570 TUF Gaming Pro (130.99 KB, text/plain)
2021-01-17 15:54 UTC, Thomas Langkamp
acpicump -p ASUS X570 TUF Gaming Pro (79.28 KB, application/gzip)
2021-01-17 16:04 UTC, Thomas Langkamp
dmesg for ASUS ROG CROSSHAIR VIII IMPACT with BIOS : 3204 (135.50 KB, text/plain)
2021-02-14 12:38 UTC, Gregory Duhamel
acpidump for ASUS ROG CROSSHAIR VIII IMPACT with BIOS : 3204 (46.31 KB, application/octet-stream)
2021-02-14 12:41 UTC, Gregory Duhamel
Dmesg for Asus B550M-plus WIFI (84.94 KB, text/plain)
2021-02-22 22:08 UTC, yasin inat
dmesg for Asus ROG STRIX Z490-I (83.17 KB, text/plain)
2021-02-23 20:07 UTC, Michael Coote
dmesg after boot ASUS PRIME X570-P bios 3602 (98.09 KB, text/plain)
2021-03-13 15:18 UTC, Jaap de Haan
acpidump ASUS PRIME X570-P bios 3602 (41.07 KB, application/gzip)
2021-03-13 15:19 UTC, Jaap de Haan
acpidump for Pro B550-C (44.47 KB, application/gzip)
2021-04-21 17:09 UTC, doomwarriorx
Add support for access via Asus WMI to nct6775 (7.48 KB, patch)
2021-04-28 21:46 UTC, Bernhard Seibold
Add support for access via Asus WMI to nct6775 (Rev 2) (22.59 KB, patch)
2021-05-04 22:08 UTC, Bernhard Seibold
POC: Add support for access via Asus WMI to nct6775 by board name detect (21.24 KB, patch)
2021-08-30 20:47 UTC, Denis Pauk
POC: Add support for access via Asus WMI to nct6775 by board/vendor name detect (21.47 KB, patch)
2021-09-04 20:46 UTC, Denis Pauk
POC: Add support for access via Asus WMI to nct6775 use match_string (22.30 KB, patch)
2021-09-07 20:35 UTC, Denis Pauk
System Info after applying latest patch (180.09 KB, application/zip)
2021-09-13 18:03 UTC, Jonathan
dmesg for boot with Denis's patch applied (80.96 KB, text/plain)
2021-09-14 17:31 UTC, Jonathan
POC: Add support for access via Asus WMI to nct6775 with debug (48.65 KB, patch)
2021-09-14 20:39 UTC, Denis Pauk
Add support for access via Asus WMI to nct6775 (2021.09.20) (49.26 KB, patch)
2021-09-20 12:37 UTC, Denis Pauk
Add support for access via Asus WMI to nct6775 (2021.09.21) (57.18 KB, patch)
2021-09-21 14:45 UTC, Denis Pauk
Add support for access via Asus WMI (2021.09.25) (88.42 KB, patch)
2021-09-25 13:33 UTC, Denis Pauk
Add support for access via Asus WMI (2021.10.05) (89.83 KB, patch)
2021-10-05 20:32 UTC, Denis Pauk
Add support for access via Asus WMI (2021.10.10) (48.81 KB, patch)
2021-10-10 10:12 UTC, Denis Pauk

Description Artem S. Tashkinov 2019-09-10 18:51:27 UTC
Without this kernel flag I see this on boot:


nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20190509/utaddress-204)
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver


I'd be glad to provide any required information.

This bug needs to be fixed because

1) It doesn't affect Windows
2) Average people will never know how to deal with this issue
3) I cannot ask my motherboard vendor (ASUS) to fix this issue in BIOS because they don't provide support for Linux - they barely provide any support at all.
Comment 1 Artem S. Tashkinov 2019-09-11 14:01:20 UTC
Even with acpi_enforce_resources=lax I'm getting these messages on boot:

nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20190509/utaddress-204)
ACPI: This conflict may cause random problems and system instability
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Comment 2 Ulf 2020-01-01 17:17:39 UTC
Same in 5.4.6 on an Asus Z170-WS (BIOS 3602 05/24/2019). dmesg says:

nct6775: Found NCT6793D or compatible chip at 0x2e:0x290
ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\_GPE.HWM) (20190816/utaddress-204)
Comment 3 Artem S. Tashkinov 2020-04-20 10:42:57 UTC
Any updates on this one, Rui Zhang?

It's great you've specified HW but it surely looks like no one really cares.
Comment 4 Zhang Rui 2020-06-29 10:15:09 UTC
Please attach the acpidump output.

Well, TBH, there is probably no way to fix this.
The root cause is that ACPI claims some resources that will possibly be used by the ACPI AML code, but the native nct6775 driver also requests the same piece of resource.
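
The conflict described here is an overlap between two inclusive I/O port ranges. As a rough sketch (the helper names `parse_conflict` and `ranges_overlap` are made up for illustration, and the regular expression is only an assumption about the message layout seen in this report), the warning line can be parsed and the overlap checked like this:

```python
import re

# Log line copied from the original report; everything else is illustrative.
WARNING = (
    "ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 "
    "conflicts with OpRegion 0x0000000000000290-0x0000000000000299 "
    "(\\AMW0.SHWM) (20190509/utaddress-204)"
)

PATTERN = re.compile(
    r"SystemIO range 0x([0-9A-Fa-f]+)-0x([0-9A-Fa-f]+) conflicts with "
    r"OpRegion 0x([0-9A-Fa-f]+)-0x([0-9A-Fa-f]+) \((\S+)\)"
)


def parse_conflict(line):
    """Return ((drv_start, drv_end), (acpi_start, acpi_end), region_name)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an ACPI SystemIO conflict warning")
    d0, d1, a0, a1 = (int(g, 16) for g in m.groups()[:4])
    return (d0, d1), (a0, a1), m.group(5)


def ranges_overlap(a, b):
    """Inclusive ranges overlap when each one starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


driver, acpi, name = parse_conflict(WARNING)
print(name, [hex(p) for p in driver], [hex(p) for p in acpi],
      ranges_overlap(driver, acpi))
```

Here the two ports the native driver asks for (0x295 and 0x296) sit entirely inside the ten-port region that the ACPI tables declare, which is why the request is flagged.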
Comment 5 Artem S. Tashkinov 2020-06-29 16:22:27 UTC
Created attachment 289943 [details]
acpidump -b

(In reply to Zhang Rui from comment #4)
> Please attach the acpidump output.
> 
> Well, TBH, there is probably no way to fix this.
> The root cause is that ACPI claims some resources that will possibly be used
> by the ACPI AML code, but the native nct6775 driver also requests the same
> piece of resource.

This doesn't seem right because Windows just works and doesn't require any hacks to be 100% functional on this PC. 99% of users will never know how to enable this option and will have a malfunctioning sensor.
Comment 6 myhateisblind 2020-08-29 13:48:36 UTC
Created attachment 292207 [details]
acpidump -b

Same issue with linux-next-git (5.9-rc2ish at the moment) on ASUS TUF B550M motherboard.
Comment 7 Jaap de Haan 2020-09-05 21:42:08 UTC
Same issue on the latest Ubuntu 20.04.1 LTS with an ASUS Prime X570-P, flashed to BIOS v2606.
Comment 8 Artem S. Tashkinov 2020-09-05 23:22:51 UTC
(In reply to Zhang Rui from comment #4)
> Please attach the acpidump output.
> 
> Well, TBH, there is probably no way to fix this.
> The root cause is that ACPI claims some resources that will possibly be used
> by the ACPI AML code, but the native nct6775 driver also requests the same
> piece of resource.

From what I can see, pretty much all users of recent AMD chipset motherboards are affected. This sounds like something which must be fixed because we are talking about a very broad use case.
Comment 9 Jaap de Haan 2020-09-06 10:04:12 UTC
Created attachment 292371 [details]
acpidump -b

dump from my system ASUS Prime X570-P Bios v2606
Comment 10 Clodoaldo Pinto Neto 2020-10-07 19:57:44 UTC
Same issue on 5.8.13 on Gigabyte B550M DS3H.

out 07 07:09:44 d3.localdomain kernel: it87: Found IT8628E chip at 0xa40, revision 1
out 07 07:09:44 d3.localdomain kernel: it87: Beeping is supported
out 07 07:09:44 d3.localdomain kernel: ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20200528/utaddress-204)
out 07 07:09:44 d3.localdomain kernel: ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
out 07 07:09:44 d3.localdomain kernel: fuse: init (API version 7.31)
out 07 07:09:44 d3.localdomain systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
out 07 07:09:44 d3.localdomain systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
out 07 07:09:44 d3.localdomain systemd[1]: Failed to start Load Kernel Modules.
out 07 07:09:44 d3.localdomain kernel: audit: type=1130 audit(1602065384.576:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-modules-load comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Comment 11 Lars Podszuweit 2020-10-09 16:59:50 UTC
Created attachment 292911 [details]
acpidump -b

Same issue on ASUS TUF Z390M-PRO GAMING BIOS v2808  


nct6775: Enabling hardware monitor logical device mappings.
nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20200528/utaddress-204)
Comment 12 lemniscattaden 2020-11-09 14:49:10 UTC
Created attachment 293593 [details]
acpidump -b for 5.9.6 and Asus X570-Plus TUF Gaming

I have the same problem with 5.9.6 kernel and Asus X570-Plus TUF Gaming motherboard.
Comment 13 Zhang Rui 2020-11-18 14:28:14 UTC
For all the people in this thread who report the same problem: please attach the full dmesg output after boot.
Comment 14 myhateisblind 2020-11-18 18:04:02 UTC
Created attachment 293727 [details]
Dmesg after booting kernel 5.9.8 on Asus B550M TUF
Comment 15 lemniscattaden 2020-11-20 21:12:07 UTC
Created attachment 293751 [details]
dmesg after boot 5.9.8 Asus TUF Gaming X570-PLUS
Comment 16 Jaap de Haan 2020-11-21 11:56:48 UTC
Created attachment 293755 [details]
dmesg after boot ASUS PRIME X570-P

dmesg after boot ASUS PRIME X570-P
Comment 17 Facundo 2020-11-21 12:33:33 UTC
Created attachment 293757 [details]
sensors not working acpi conflict

Sensors not working with chip Nuvoton NCT6798D on Asus Prime X570-PRO
because of ACPI conflict: ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion
Comment 18 dflogeras2 2020-11-26 11:21:03 UTC
Also affected on ASUS Prime B460-Plus motherboard with v1403 BIOS.
Comment 19 dflogeras2 2020-11-26 11:25:04 UTC
Created attachment 293819 [details]
ASUS Prime B460 Plus dmesg

dmesg of ASUS Prime B460 Plus booting
Comment 20 Thomas Langkamp 2021-01-17 15:53:12 UTC
Same with Nuvoton NCT6798D on Asus TUF Gaming X570 PRO and Kernel 5.10.2-2-MANJARO
Comment 21 Thomas Langkamp 2021-01-17 15:54:00 UTC
Created attachment 294703 [details]
dmesg ASUS X570 TUF Gaming Pro
Comment 22 Thomas Langkamp 2021-01-17 16:04:31 UTC
Created attachment 294705 [details]
acpicump -p ASUS X570 TUF Gaming Pro
Comment 23 Gregory Duhamel 2021-02-14 12:37:20 UTC
Same issue on : DMI: ASUS System Product Name/ROG CROSSHAIR VIII IMPACT, BIOS 3204 01/25/2021

ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20200925/utaddress-204)

ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Comment 24 Gregory Duhamel 2021-02-14 12:38:43 UTC
Created attachment 295261 [details]
dmesg for ASUS ROG CROSSHAIR VIII IMPACT with BIOS : 3204
Comment 25 Gregory Duhamel 2021-02-14 12:41:00 UTC
Created attachment 295263 [details]
acpidump for ASUS ROG CROSSHAIR VIII IMPACT with BIOS : 3204
Comment 26 yasin inat 2021-02-22 22:08:37 UTC
Created attachment 295403 [details]
Dmesg for Asus B550M-plus WIFI

I have "acpi_enforce_resources=lax" in kernel line but still have the conflict problem. 

Probably not related but rest of the parameters: add_efi_memmap initrd=\amd-ucode.img mitigations=off video=current iommu=pt
Comment 27 Michael Coote 2021-02-23 20:07:30 UTC
Created attachment 295417 [details]
dmesg for Asus ROG STRIX Z490-I

dmesg for Asus ROG STRIX Z490-I. BIOS 11/30/2020 v1003

[    3.194752] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
[    3.194756] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20200528/utaddress-204)
[    3.194759] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Comment 28 frank 2021-03-05 06:40:12 UTC
Same also on Asus Prime H310I-Plus


Ubuntu 20.04.2 LTS
Kernel: 5.4.0-66-generic
acpi_enforce_resources=lax enabled

dmesg:

[    4.083054] nct6775: Found NCT6796D or compatible chip at 0x2e:0x290
[    4.083058] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20190816/utaddress-204)
[    4.083063] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Comment 29 Jaap de Haan 2021-03-13 15:18:38 UTC
Created attachment 295835 [details]
dmesg after boot ASUS PRIME X570-P bios 3602

ASUS PRIME X570-P after flashing newest bios 3602.
Comment 30 Jaap de Haan 2021-03-13 15:19:39 UTC
Created attachment 295837 [details]
acpidump ASUS PRIME X570-P bios 3602

ACPI Dump ASUS PRIME X570-P BIOS 3602
Comment 31 Zhang Rui 2021-03-18 04:20:47 UTC
I checked the dmesg and acpidump from Jaap,

[    3.497957] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
[    3.497963] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20200528/utaddress-204)
[    3.497969] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

        Device (AMW0)
        {
            Name (_HID, EisaId ("PNP0C14") /* Windows Management Instrumentation Device */)  // _HID: Hardware ID
            Name (_UID, "ASUSWMI")  // _UID: Unique ID
        ...

So the resource conflict happens between the native nct6775 driver and the ACPI asus_wmi driver.
My understanding is that asus_wmi/asus_nb_wmi do the same thing as nct6775 and expose them to hwmon class as well. If this is true, we can simply ignore these warning messages because "Yes, there is an ACPI driver available for this device. And yes, we can use the asus hwmon I/F instead of the native driver".

But this is just my guess. Need Hans to confirm on this.

For the other reporters in this thread, please
1. make sure your kernel is built with CONFIG_ASUS_WMI and CONFIG_ASUS_NB_WMI
2. attach the output of the "sensors" command
because we should be able to see an "asus" sensor in the output that provides functionality similar to what the nct6775 driver offers.
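
For the first check, a small sketch along these lines can read the two options out of a kernel config file. The function name `config_options` and the sample text are invented for illustration; where the config file lives (for example /boot/config-<release> or /proc/config.gz) depends on the distribution and is an assumption, so the sketch only parses text that has already been read.

```python
def config_options(config_text, names):
    """Map each option name to its value ('y', 'm', ...) or None if unset."""
    found = {name: None for name in names}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skips comments such as "# CONFIG_FOO is not set"
        key, sep, value = line.partition("=")
        if sep and key in found:
            found[key] = value
    return found


# Sample text is illustrative, not taken from any reporter's system.
SAMPLE = """\
# CONFIG_EXAMPLE_OPTION is not set
CONFIG_ASUS_WMI=m
CONFIG_ASUS_NB_WMI=m
"""

print(config_options(SAMPLE, ["CONFIG_ASUS_WMI", "CONFIG_ASUS_NB_WMI"]))
```

A value of "m" means the driver was built as a module, so it still has to be loaded before any "asus" entry can show up in the sensors output.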
Comment 32 Zhang Rui 2021-03-18 04:21:29 UTC
Reassign to platform driver category.
Comment 33 Matthew Garrett 2021-03-18 04:32:24 UTC
This isn't a bug - the ACPI tables claim the resource in question, and there's no way we can verify there are no conflicts between ACPI methods that touch that range and the native driver. If you're confident that this is safe on your system then you can boot with acpi_enforce_resources=lax, but we can't make that the default. This will still produce the warning, but the driver will be permitted to load.
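
To see which mode a given boot used, the kernel command line can be inspected. The sketch below assumes the documented values of the parameter (strict, lax, no, with strict as the default when it is absent); the function name `enforce_mode` is hypothetical, and "last occurrence wins" is a simplification made for this example.

```python
def enforce_mode(cmdline):
    """Return the acpi_enforce_resources value, or 'strict' if not given."""
    mode = "strict"  # documented kernel default when the parameter is absent
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "acpi_enforce_resources" and sep:
            mode = value  # assumption of this sketch: the last occurrence wins
    return mode


# Illustrative command lines; on a running system the text would come from
# reading /proc/cmdline.
print(enforce_mode("root=/dev/sda1 ro quiet"))
print(enforce_mode("root=/dev/sda1 ro acpi_enforce_resources=lax"))
```

With "lax" the native driver is allowed to bind and the warning is still logged, which matches the behaviour described in comment 1.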
Comment 34 Artem S. Tashkinov 2021-03-19 15:09:33 UTC
(In reply to Matthew Garrett from comment #33)
> This isn't a bug - the ACPI tables claim the resource in question, and
> there's no way we can verify there are no conflicts between ACPI methods
> that touch that range and the native driver. If you're confident that this
> is safe on your system then you can boot with acpi_enforce_resources=lax,
> but we can't make that the default. This will still produce the warning, but
> the driver will be permitted to load.

This bug needs to be fixed because

1) It doesn't affect Windows
2) Average people will never know how to deal with this issue
3) I cannot ask my motherboard vendor (ASUS) to fix this issue in BIOS because they don't provide support for Linux - they barely provide any support at all.

OoB experience of Linux users should not be "I don't get any sensors output, how to fix that?" Most users don't even know what and how to Google. They don't know about dmesg either.

That's an effing horrible attitude.

I'm CC'ing Linus because I absolutely hate what's going on.
Comment 35 Artem S. Tashkinov 2021-03-19 15:14:14 UTC
This might not be a classic "bug" but **no one on Earth cares**. What people care about is having their systems work and be supported by Linux out of the box with **no cryptic voodoo applied**. You don't ask Windows users to run bcdedit.exe to fix their hardware, do you?

So, why do Linux users have to edit system configuration files to get at least a comparable experience? Don't get me started on the fact that HWiNFO64 shows up to ten times more hardware sensors and their parameters than lm-sensors.
Comment 36 Artem S. Tashkinov 2021-03-19 15:15:19 UTC
Lastly, this problem affects literally hundreds of thousands of systems. It's not some single broken motherboard or broken EFI, we're talking about multiple classes of hardware.
Comment 37 Matthew Garrett 2021-03-19 19:13:59 UTC
Here's the situation. Your ACPI tables declare that your system firmware may access the addresses associated with your IO sensors. We have no idea what your firmware may do here - it may do nothing (in which case accessing the addresses is completely safe), or it may use them for its own internal monitoring. Sensor hardware frequently uses indexed addressing, which means that accessing a sensor requires something like the following:

1) Write the desired sensor to the index register
2) Read the sensor value from the data register

These can't occur simultaneously, so if both the OS and the firmware are accessing it you risk ending up with something like:

1) Write sensor A to the index register (from the OS)
2) Write sensor B to the index register (from the firmware)
3) Read the sensor value from the data register (returns the value of sensor B to the firmware)
4) Read the sensor value from the data register (returns the value of sensor B to the OS)

The OS asked for the value of sensor A, but received the value of sensor B. From the OS side this is probably not a big deal (you get a weird value in your graphing), but if it happens the other way around the firmware may decide that the system is running out of spec and shut it down to avoid damage. This is not a good user experience.
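
The interleaving above can be reproduced with a toy model. This is only a sketch: the class `IndexedSensorChip` and the sensor values are invented, and a real chip is accessed through port I/O rather than a Python object.

```python
class IndexedSensorChip:
    """Toy model of an index/data register pair (values are made up)."""

    def __init__(self, sensors):
        self._sensors = dict(sensors)
        self._index = None

    def write_index(self, name):
        # Selecting a sensor overwrites whatever was selected before.
        self._index = name

    def read_data(self):
        # The data register always reflects the most recent selection.
        return self._sensors[self._index]


chip = IndexedSensorChip({"A": 35, "B": 90})

# The four interleaved steps from the comment above.
chip.write_index("A")              # 1) OS selects sensor A
chip.write_index("B")              # 2) firmware selects sensor B
firmware_value = chip.read_data()  # 3) firmware reads B, as it intended
os_value = chip.read_data()        # 4) OS reads B, although it asked for A

print("firmware got", firmware_value, "and the OS got", os_value)
```

Swapping steps 1 and 2 gives the more worrying case, where the firmware is the one that receives the wrong sensor's value.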

Why does Windows not have the same problem? Well, in the general case there's nothing stopping it from doing so. Vendor tooling usually takes one of two approaches:

1) They don't use the hardware sensors directly, they use firmware interfaces to them. This is alluded to in comment #31 - on Asus systems, the sensors are available via a WMI interface. Using a firmware interface ensures that the firmware knows what the state of the hardware is, and avoids any race conditions. Your board may well support an alternative firmware interface and Linux simply lacks driver support for it. If so, I'm afraid that the correct solution is to add that driver support. Given that this bug has ended up covering boards from multiple vendors, it's no longer the correct place to handle that, though.
2) The vendor knows that the firmware makes no policy decisions based on the sensor values, so it's safe to access the resources even though the firmware declares that it uses them. The problem with this approach is that *we* have no way of knowing that it's safe, and the consequences of it being unsafe include data loss. Given the choice between users being able to look at system temperatures and users not losing data, we choose to prioritise users not losing data.

Looking at your ACPI tables, we see the following:

    Name (IOHW, 0x0290)

    OperationRegion (SHWM, SystemIO, IOHW, 0x0A)
    Field (SHWM, ByteAcc, NoLock, Preserve)
    {
        Offset (0x05), 
        HIDX,   8, 
        HDAT,   8
    }

This means that there's a region of IO ports starting at address 0x290 and 0x0a addresses long. This is the same region of port IO that your sensor chip uses. Within that address range, we declare that 0x295 is called HIDX, and 0x296 is called HDAT. This is consistent with an index and data register as described above, which means that having the OS access this space directly is likely to race with the firmware (ie, it's dangerous).
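
The addresses follow from simple arithmetic on the declaration. As a sketch (the helper `field_addresses` is hypothetical and only handles byte-aligned fields like the two shown here):

```python
def field_addresses(base, start_offset, fields):
    """Return {name: port} for byte-aligned fields laid out back to back."""
    addresses = {}
    bit_position = start_offset * 8  # Offset (n) is given in bytes
    for name, width_bits in fields:
        if bit_position % 8 or width_bits % 8:
            raise ValueError("this sketch handles byte-aligned fields only")
        addresses[name] = base + bit_position // 8
        bit_position += width_bits
    return addresses


IOHW = 0x0290          # Name (IOHW, 0x0290)
REGION_LENGTH = 0x0A   # fourth argument of OperationRegion

ports = field_addresses(IOHW, 0x05, [("HIDX", 8), ("HDAT", 8)])
region_end = IOHW + REGION_LENGTH - 1

print({name: hex(port) for name, port in ports.items()}, hex(region_end))
```

The region therefore spans 0x290 through 0x299, matching the OpRegion range printed in the kernel warning, and the two named fields are exactly the ports the native driver requests.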

Near here are two methods called RHWM and WHWM. At a guess, that's "Read Hardware Monitoring" and "Write Hardware Monitoring". These not only access the sensors via the registers described above, they do some additional hardware access around it. This is further evidence to support there being some handshaking involved to avoid race conditions - the firmware takes a mutex and appears to hit some other register that may also be used to guard against racing against system management mode. We really, *really* want to be using the firmware methods here rather than touching the sensor chip directly. At this point, direct access isn't so much walking past a sign saying "Danger, keep out", it's a sign saying "Proceed no further or you will die slowly and it will hurt the entire time".
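
Why a mutex helps can be shown with the same kind of toy model. This sketch is not the firmware's actual locking scheme; the class `GuardedSensorChip`, the sensor values and the thread roles are all invented. It only illustrates that holding one lock across "select, then read" keeps the two steps together.

```python
import threading

EXPECTED = {"A": 35, "B": 90}  # made-up sensor values


class GuardedSensorChip:
    """Toy index/data pair where one lock covers the whole select+read."""

    def __init__(self, sensors):
        self._sensors = dict(sensors)
        self._index = None
        self._lock = threading.Lock()

    def read_sensor(self, name):
        with self._lock:
            self._index = name                 # write the index register
            return self._sensors[self._index]  # read the data register


chip = GuardedSensorChip(EXPECTED)
results = {}


def reader(who, sensor, rounds=1000):
    # Record whether every read returned the value of the requested sensor.
    results[who] = all(
        chip.read_sensor(sensor) == EXPECTED[sensor] for _ in range(rounds)
    )


threads = [
    threading.Thread(target=reader, args=("os", "A")),
    threading.Thread(target=reader, args=("firmware", "B")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

The protection only holds if every accessor goes through the same lock. A native driver that touches the ports directly bypasses it, which is the situation the ACPI warning is about.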

RHWM is referenced from the WMBD method if the first argument to it is RHWM, and WHWM is referenced if the argument is WHWM. WMBD is the WMI dispatcher for the WMI function with identifier "BD" - looking at your _WDG object, which describes the available WMI interfaces, we have the following:

            Name (_WDG, Buffer (0x50)
            {
                /* 0000 */  0xD0, 0x5E, 0x84, 0x97, 0x6D, 0x4E, 0xDE, 0x11,  // .^..mN..
                /* 0008 */  0x8A, 0x39, 0x08, 0x00, 0x20, 0x0C, 0x9A, 0x66,  // .9.. ..f
                /* 0010 */  0x42, 0x43, 0x01, 0x02, 0xA0, 0x47, 0x67, 0x46,  // BC...GgF
                /* 0018 */  0xEC, 0x70, 0xDE, 0x11, 0x8A, 0x39, 0x08, 0x00,  // .p...9..
                /* 0020 */  0x20, 0x0C, 0x9A, 0x66, 0x42, 0x44, 0x01, 0x02,  //  ..fBD..
                /* 0028 */  0x72, 0x0F, 0xBC, 0xAB, 0xA1, 0x8E, 0xD1, 0x11,  // r.......
                /* 0030 */  0x00, 0xA0, 0xC9, 0x06, 0x29, 0x10, 0x00, 0x00,  // ....)...
                /* 0038 */  0xD2, 0x00, 0x01, 0x08, 0x21, 0x12, 0x90, 0x05,  // ....!...
                /* 0040 */  0x66, 0xD5, 0xD1, 0x11, 0xB2, 0xF0, 0x00, 0xA0,  // f.......
                /* 0048 */  0xC9, 0x06, 0x29, 0x10, 0x4D, 0x4F, 0x01, 0x00   // ..).MO..
            })

The format of _WDG is 16 bytes of GUID, 2 bytes of ID or notification data, 1 byte of instance count and 1 byte of flags. The GUID used by asus-wmi corresponds to the first GUID in this file, 97845ED0-4E6D-11DE-8A39-0800200C9A66. That has an ID of 0x4243, or BC - ie, it's not the GUID we're looking for. The next GUID, however, (466747a0-70ec-11de-8a39-0800200c9a66) has an identifier of 0x4244, or BD. So this is the GUID we're looking for. Unfortunately asus-wmi doesn't handle this GUID, so new code will need to be written.
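
The 20-byte entry layout described above can be decoded mechanically. The following is a sketch only: the function name `parse_wdg` is made up, the buffer is copied from the dump above, and the GUID bytes are assumed to be stored in the little-endian layout that Python's `uuid.UUID(bytes_le=...)` expects.

```python
import struct
import uuid

# Buffer contents copied from the _WDG object quoted above (0x50 bytes).
WDG = bytes([
    0xD0, 0x5E, 0x84, 0x97, 0x6D, 0x4E, 0xDE, 0x11,
    0x8A, 0x39, 0x08, 0x00, 0x20, 0x0C, 0x9A, 0x66,
    0x42, 0x43, 0x01, 0x02, 0xA0, 0x47, 0x67, 0x46,
    0xEC, 0x70, 0xDE, 0x11, 0x8A, 0x39, 0x08, 0x00,
    0x20, 0x0C, 0x9A, 0x66, 0x42, 0x44, 0x01, 0x02,
    0x72, 0x0F, 0xBC, 0xAB, 0xA1, 0x8E, 0xD1, 0x11,
    0x00, 0xA0, 0xC9, 0x06, 0x29, 0x10, 0x00, 0x00,
    0xD2, 0x00, 0x01, 0x08, 0x21, 0x12, 0x90, 0x05,
    0x66, 0xD5, 0xD1, 0x11, 0xB2, 0xF0, 0x00, 0xA0,
    0xC9, 0x06, 0x29, 0x10, 0x4D, 0x4F, 0x01, 0x00,
])

ENTRY_SIZE = 20  # 16 bytes GUID + 2 bytes ID + 1 byte count + 1 byte flags


def parse_wdg(buf):
    """Split a _WDG buffer into a list of entry dictionaries."""
    entries = []
    usable = len(buf) - len(buf) % ENTRY_SIZE  # ignore a trailing partial entry
    for offset in range(0, usable, ENTRY_SIZE):
        guid_le, ident, count, flags = struct.unpack_from("<16s2sBB", buf, offset)
        entries.append({
            "guid": str(uuid.UUID(bytes_le=guid_le)),
            "id": ident,          # two raw bytes, often two ASCII characters
            "instances": count,
            "flags": flags,
        })
    return entries


entries = parse_wdg(WDG)
for entry in entries:
    print(entry["guid"], entry["id"], entry["instances"], hex(entry["flags"]))
```

Run against this buffer, the second entry carries the GUID 466747a0-70ec-11de-8a39-0800200c9a66 with the two ID bytes spelling "BD", which is what ties it to the WMBD dispatcher method.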

I'm going to close this bug again because it's turned into a generic bug covering different motherboard vendors, and there's no one size fits all solution. For your case the correct way to handle it is for someone to write a driver that uses the 466747a0-70ec-11de-8a39-0800200c9a66 interface to expose the sensor data. I'm afraid I don't have relevant hardware so can't do this myself, but please do open another bug for that.

tl;dr - the kernel message you're seeing is correct. Avoiding it requires a new driver to be written. If you *personally* feel safe in ignoring the risks, you can pass the acpi_enforce_resources=lax option, but that can't be the default because it's unsafe in the general case, and so it isn't the solution to the wider problem.
Comment 38 Jaap de Haan 2021-03-20 07:22:46 UTC
CONFIG_ASUS_WMI=m and I confirmed the module is loaded.

I think I saw some improvements lately in the support of temperature sensors, but I am not sure because I have no record of the old state and it has been a long time since I used the UI. I flashed my BIOS recently and hoped that would solve things.

Thanks a lot, Matthew, for this good explanation. For the first time I understood (at an abstract level) what is going on and why. This explanation is really valuable and, IMO, should be kept in a prominent place such as the kernel Documentation or a known-issues text file (in a less Asus-specific form).

I was nearly desperate enough to try the `acpi_enforce_resources=lax` setting, but using it without understanding it felt "hot" to me as an engineer. Now that I really get why it is so, I will keep my fingers away from the setting and hope that someone finds out how to get the fan values in the normal driver.

Many thanks for the clarification.
Comment 39 Matthew Garrett 2021-03-20 07:51:03 UTC
As noted in https://twitter.com/james_hilliard/status/1373178256615211012, there's actually a driver here: https://github.com/electrified/asus-wmi-sensors/ . I did a quick search earlier, but managed to miss this somehow.
Comment 40 myhateisblind 2021-03-20 07:57:37 UTC
Are you sure about that driver? The github page says:

"Note: X570/B550/TRX40 boards do not have the WMI interface and are not supported."

And those seem to be the chipsets of all or almost all boards reported in this bug.

20 mar. 2021 8:51:07 bugzilla-daemon@bugzilla.kernel.org:

> https://bugzilla.kernel.org/show_bug.cgi?id=204807
> 
> --- Comment #39 from Matthew Garrett (mjg59-kernel@srcf.ucam.org) ---
> As noted in https://twitter.com/james_hilliard/status/1373178256615211012,
> there's actually a driver here:
> https://github.com/electrified/asus-wmi-sensors/ . I did a quick search
> earlier, but managed to miss this somehow.
Comment 41 Matthew Garrett 2021-03-20 08:16:05 UTC
Interesting, it looks like it uses the same GUID but has a different set of methods. So yes, this driver probably won't work for a bunch of the boards here - it would need to be adapted to add support for the methods that these ones provide.
Comment 42 Artem S. Tashkinov 2021-03-20 15:28:06 UTC
(In reply to Matthew Garrett from comment #37)
> 97845ED0-4E6D-11DE-8A39-0800200C9A66. That has an ID of 0x4243, or BC - ie,
> it's not the GUID we're looking for. The next GUID, however,
> (466747a0-70ec-11de-8a39-0800200c9a66) has an identifier of 0x4244, or BD.
> So this is the GUID we're looking for. Unfortunately asus-wmi doesn't handle
> this GUID, so new code will need to be written.
> 
> I'm going to close this bug again because it's turned into a generic bug
> covering different motherboard vendors, and there's no one size fits all
> solution. For your case the correct way to handle it is for someone to write
> a driver that uses the 466747a0-70ec-11de-8a39-0800200c9a66 interface to
> expose the sensor data. I'm afraid I don't have relevant hardware so can't
> do this myself, but please do open another bug for that.
> 
> tl;dr - the kernel message you're seeing is correct. Avoiding it requires a
> new driver to be written. If you *personally* feel safe in ignoring the
> risks, you can pass the acpi_enforce_resources=lax option, but that can't be
> the default because it's unsafe in the general case, and so it isn't the
> solution to the wider problem.

That's the problem: we have _multiple_ motherboards with _multiple_ different chipsets from _different_ vendors 

1) all having the same glitch
2) all requiring the same workaround
3) working just fine under Windows with no hacks

> My understanding is that asus_wmi/asus_nb_wmi do the same thing as nct6775
> and expose them to hwmon class as well.

And at the same time you're talking about asus_wmi which covers only _certain_ ASUS motherboards, and no one in this discussion has shown it to work or provide the same set of sensors.

And this driver has nothing to do with sensors, linux/drivers/platform/x86/asus-wmi.c:

 * Asus PC WMI hotkey driver

This is not a driver which even tangentially deals with HW sensors found in motherboards affected by this bug.

I don't know why you're trying to sweep this bug under the rug but I really dislike it. The Linux kernel development has always followed common sense principles and it contains a _huge_ number of workarounds just to enable HW which doesn't work according to specs.

At the very least you could printk() this:

"Your motherboard might not be exposing ACPI resources correctly, so you might not get access to your HW sensors. You could add "acpi_enforce_resources=lax" to kernel boot parameters to enable monitoring at your own risk. Please refer to https://bugzilla.kernel.org/show_bug.cgi?id=204807 for more information".

And this still paints Linux in a very bad light, as users hardly care whether ACPI is implemented according to the specifications or not: what they really care about is whether their hardware works and is supported under Linux out of the box. Most Linux users don't even know `dmesg` exists, so they have no way of knowing how to fix the issue.

Lastly, this bug is not fixed.
Comment 43 Artem S. Tashkinov 2021-03-20 15:33:14 UTC
A small correction of my previous comment:

linux/drivers/platform/x86/asus-nb-wmi.c

/*
 * Asus Notebooks WMI hotkey driver
 *
 * Copyright(C) 2010 Corentin Chary <corentin.chary@gmail.com>
 */

This is not related to lm-sensors in any shape or form. I'm really sad how this situation is getting handled: the bug has been known for over 1.5 years, affects literally hundreds of thousands of devices and you're saying that this kernel option might have unintended consequences yet _everyone_ in this thread has enabled it with _zero_ side effects and Windows seemingly has it enabled by default, as no such messages are getting logged in Windows Event Log either when using HWiNFO64 or vendor specific monitoring software.
Comment 44 Artem S. Tashkinov 2021-03-20 15:43:47 UTC
(In reply to Artem S. Tashkinov from comment #42)
> "Your motherboard might not be exposing ACPI resources correctly, so you
> might
> not get access to your HW sensors. You could add
> "acpi_enforce_resources=lax" to kernel boot parameters to enable monitoring
> at your own risk. Please refer to
> https://bugzilla.kernel.org/show_bug.cgi?id=204807 for more information".
 
This message will at least allow various Linux distros to enable the option by default because many are not aware of the bug.
Comment 45 Matthew Garrett 2021-03-20 16:04:47 UTC
Artem,

Nobody is denying there's an issue here. However, the issue is that an additional driver needs to be written for this hardware. Please file a new bug for that and do not keep reopening this one.
Comment 46 Zhang Rui 2021-03-21 18:39:54 UTC
(In reply to Artem S. Tashkinov from comment #43)
> A small correction of my previous comment:
> 
> linux/drivers/platform/x86/asus-nb-wmi.c
> 
> /*
>  * Asus Notebooks WMI hotkey driver
>  *
>  * Copyright(C) 2010 Corentin Chary <corentin.chary@gmail.com>
>  */
> 
> This is not related to lm-sensors in any shape or form.

asus_nb_wmi_init -> asus_wmi_register_driver -> asus_wmi_probe -> asus_wmi_add -> asus_wmi_hwmon_init

Although the warning messages are printed by ACPI code, this is a conflict between the native nct6775 driver and the Asus WMI driver, because the Asus WMI driver accesses the same resources and provides similar functionality. I'm familiar with neither of them.

> I'm really sad how
> this situation is getting handled: the bug has been known for over 1.5
> years, affects literally hundreds of thousands devices and you're saying
> that this kernel option might have unintended consequences yet _everyone_ in
> this thread has enabled it with _zero_ side affects and Windows seemingly
> has it enabled by default, as no such messages are getting logged in Windows
> Event Log either when using HWiNFO64 or vendor specific monitoring software.

In Linux, at least for now, I don't see a way to enable the native nct6775 driver by default, and this is true for all native drivers that have a resource conflict with the firmware.

IMO, the root cause is that Linux does not support overriding driver A (the native driver in this case) when driver B (the driver that talks to firmware) is loaded, so whenever we know there might be a conflict we have to disable driver A, even if there is only a 0.01% chance that driver B will be loaded.

What we can do is to write driver B to make this statement true
"ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver" and ignore this message.

(In reply to Artem S. Tashkinov from comment #44)
> (In reply to Artem S. Tashkinov from comment #42)
> > "Your motherboard might not be exposing ACPI resources correctly, so you
> > might
> > not get access to your HW sensors. You could add
> > "acpi_enforce_resources=lax" to kernel boot parameters to enable monitoring
> > at your own risk. Please refer to
> > https://bugzilla.kernel.org/show_bug.cgi?id=204807 for more information".
> 
> This message will at least allow various Linux distros to enable the option
> by default because many are not aware of the bug.

Hmmm, what about the following conditions:
1. "acpi_enforce_resources=" is a global switch; there might be platforms with more than one conflict, or with a conflict other than nct6775. We cannot validate all of them.
2. We may get new drivers that talk to firmware later, and then we could no longer use "acpi_enforce_resources=lax".

But thanks for raising this; I think it also shows that the current message is kind of misleading.
It is true that ACPI covers a series of devices as described in the ACPI spec. But at the same time, ACPI is an interface. Many drivers, including vendor-specific drivers, talk to firmware through the ACPI interface. They depend on ACPI, but they're actually not covered by the ACPI specification, nor by the kernel's drivers/acpi code.

"ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver" makes people feel like it is an ACPI problem, but in many cases it is not; I can only triage them.
Comment 47 Hans de Goede 2021-03-21 19:14:46 UTC
So if someone is willing to spend time on making this work, then here is how I believe this could be made to work (for the case which Matthew Garrett analysed):

1. Modify the nct6775 driver, adding a set of nct6775_register_ops to the nct6775_sio_data struct, and have any function which sits "below" the probe() function use only these ops for register accesses. Have sensors_nct6775_init() set these register-ops to the currently used superio register access functions, so that nothing changes for existing users of the driver.

2. Move the nct6775_sio_data struct declaration to a shared header somewhere under include/linux

3. Have a new WMI driver which defines register-ops compatible with the ones expected by the nct6775_sio_data struct, using the RHWM and WHWM methods which Matthew found (note these should be called through their WMI wrappers), and have this driver instantiate a platform device with its platdata set to this new nct6775_sio_data struct, allowing the nct6775 driver to access the registers this way, using the mutual-exclusion mechanism built into the RHWM and WHWM methods.

As the drivers/platform/x86 maintainer I would be more than happy to merge a clean driver for step 3. To me this seems quite doable (for someone with some kernel-dev experience + enough time).

Note I believe that this will not be a whole lot of work (but it's not trivial either).
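To make the register-ops idea in steps 1-3 more concrete, here is a toy model in Python. It is only a sketch of the indirection, not kernel code: `RegisterOps`, `make_port_io_ops`, `make_wmi_ops` and `read_sensor` are hypothetical names, the register file is a plain dictionary, and the register numbers used with it are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class RegisterOps:
    """Hypothetical stand-in for the proposed nct6775_register_ops."""
    read: Callable[[int], int]
    write: Callable[[int, int], None]


def make_port_io_ops(regs: Dict[int, int]) -> RegisterOps:
    """Models today's direct port access; `regs` is a fake register file."""
    def read(reg: int) -> int:
        return regs.get(reg, 0)

    def write(reg: int, value: int) -> None:
        regs[reg] = value & 0xFF  # registers are modelled as 8 bits wide

    return RegisterOps(read=read, write=write)


def make_wmi_ops(regs: Dict[int, int], calls: List[Tuple]) -> RegisterOps:
    """Models access routed through firmware methods (RHWM/WHWM in this thread).

    `calls` only records which method would have been invoked; a real
    driver would go through the WMI wrappers instead.
    """
    def read(reg: int) -> int:
        calls.append(("RHWM", reg))
        return regs.get(reg, 0)

    def write(reg: int, value: int) -> None:
        calls.append(("WHWM", reg, value))
        regs[reg] = value & 0xFF

    return RegisterOps(read=read, write=write)


def read_sensor(ops: RegisterOps, reg: int) -> int:
    """Code 'below probe()' never touches ports; it only uses the ops."""
    return ops.read(reg)
```

The point of the design is that `read_sensor` does not know which backend it was given, so the existing driver logic could stay unchanged while a separate WMI driver supplies a different set of ops.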
Comment 48 Andy Shevchenko 2021-03-22 10:12:49 UTC
Artem,
Matthew gave a really good explanation of the technical background of what's going on. What you really need is to amend the existing driver(s) or provide a new one to get the functionality you want.
Comment 49 Artem S. Tashkinov 2021-03-22 10:51:55 UTC
(In reply to Andy Shevchenko from comment #48)
> Matthew gave a really good explanation on techical background what's going
> on. What you really need is to amend existing driver(s) or provide a new one
> to fulfill the functionality you want to have.

I'm not a programmer, let alone a person who understands the innards of the Linux kernel well enough to even attempt to fix the issue, not to mention that:

> Note I believe that this will not be a whole lot of work (but its not trivial
> either).

Maybe we have ... kernel developers who can do that instead, for instance lm-sensors maintainers. I don't know. I'm confused. I did my best to report the issue. Meanwhile I'll continue to use the hack since I want to monitor my HW right now - not a few years later when someone finally ventures to scratch the itch. Thank you very much ;-)
Comment 50 Hans de Goede 2021-03-22 11:06:10 UTC
> Maybe we have ... kernel developers who can do that instead

You know, kernel developers are humans too, so they need to eat and sleep and stuff too. IOW they don't have unlimited time to spend on helping every Linux user out there without any compensation.

Maybe you have a friend with some kernel-development experience who can help. Or maybe you can find someone who you can pay to fix this for you?
Comment 51 Andy Shevchenko 2021-03-22 11:32:27 UTC
(In reply to Artem S. Tashkinov from comment #49)
> (In reply to Andy Shevchenko from comment #48)
> > Matthew gave a really good explanation on techical background what's going
> > on. What you really need is to amend existing driver(s) or provide a new
> one
> > to fulfill the functionality you want to have.
> 
> I'm not a programmer let alone a person who understand the innards of the
> Linux kernel to even attempt to fix the issue, not to mention that:
> 
> > Note I believe that this will not be a whole lot of work (but its not
> trivial
> > either).
> 
> Maybe we have ... kernel developers who can do that instead, for instance
> lm-sensors maintainers. I don't know. I'm confused. I did my best to report
> the issue. Meanwhile I'll continue to use the hack since I want to monitor
> my HW right now - not a few years later when someone finally ventures to
> scratch the itch. Thank you very much ;-)

Artem,
I feel your pain. Believe me, I have been in similar situations myself, despite actually being a kernel developer! I'm often frustrated, but that's how it works in Linux and in OSS in general. The root cause here is the difference between the production models of the Windows world and the Linux world (and despite downsides like the above, I prefer the latter). For Windows, drivers are made for *THE product*, while in the *nix world drivers try to cover as many products as they can, based on the similarities and compatibility of the corresponding IPs.
That's why people often say "oh, hey, it works in Windows!" Yes, it works, but if and only if you are using the very same *THE product*. A step to the right or left is suicidal in that model. The Windows model is very fragile because of this and requires 10x more resources to develop the code. The OSS community simply does not have such resources, and for economic reasons even Micro$oft has found advantages in the OSS model (but not for drivers, unfortunately). The best help for you and for everyone else is to be on the constructive side. You see, you might even develop a solution yourself and become a (well-paid) kernel developer. Or do it just for fun (look at the example of the Intel IPU3 CIO2 camera glue layer, written to support Windows-only platforms, which was done solely by one guy who declared that he didn't even know the C programming language before!).

So, please, do not blame people here, it's rather the problem of the model.
Comment 52 frostzeux 2021-03-22 11:35:45 UTC
"That's an effing horrible attitude", Artem (#c34).
*leaving this rant*
Comment 53 Hans de Goede 2021-03-22 14:31:29 UTC
I'm also removing myself from the Cc of this bug because the discussion here does not seem to be productive. If anyone wants to implement the solution which I outlined in comment 47, drop me an email at hdegoede@redhat.com.
Comment 54 Mateusz Jończyk 2021-04-11 08:25:36 UTC
Hello,

I was doing some preparatory work to implement the solution in comment 47 - like analysis of source code.

Unfortunately, it seems like this solution would only work for ASUS boards. All the acpidump outputs in this ticket are from ASUS boards. The "RHWM" and "WHWM" methods are from an interface with UID="ASUSWMI", so they look to be asus-specific.

For ASUS boards there exists a better driver:

https://github.com/electrified/asus-wmi-sensors

so there is probably no reason to implement direct access to nct6775.

Are there any benefits from implementing access to nct6775 as outlined above?

Greetings,
Comment 55 Hans de Goede 2021-04-11 09:40:41 UTC
(In reply to Mateusz Jończyk from comment #54)
> For ASUS boards there exists a better driver:
> 
> https://github.com/electrified/asus-wmi-sensors

Interesting, I wonder why that has not been submitted upstream. I'll open an issue on its GitHub page for that.

> so there is probably no reason to implement direct access to nct6775.
> 
> Are there any benefits from implementing access to nct6775 as outlined above?

No, that was just meant as a possible solution for the reported problem. I agree that using the WMI interface, which presumably is what Asus' Windows tools use, is better.
Comment 56 Hans de Goede 2021-04-11 09:46:53 UTC
Hmm,

asus-wmi-sensors is not such a great solution either: it seems the WMI interface is buggy on some boards and causes fans to stop or get stuck at max speed, which is quite bad, see:

https://github.com/electrified/asus-wmi-sensors#known-issues

So it seems that the situation with sensors on these boards simply sucks and Asus is to blame here. If even the "official" method of accessing the sensors is buggy, then Asus needs to get their firmware fixed, and until that is done users are better off without sensors support.
Comment 57 Mateusz Jończyk 2021-04-11 10:18:05 UTC
(In reply to Hans de Goede from comment #56)
> Hmm,
> 
> asus-wmi-sensors also is not such a great solution, it seems the WMI
> interface is buggy on some boards and causes fans to stop or get stuck at
> max speed, which is quite bad, see:
> 
> https://github.com/electrified/asus-wmi-sensors#known-issues

IMHO, this could be caused by access races, not necessarily by a buggy BIOS. The driver may simply not implement correct synchronization methods. It may be necessary to call some ACPI / WMI methods before and after accessing the sensors to avoid resource conflicts.

As is written in the documentation:
> The more frequently the WMI interface is polled the greater the potential for
> this to happen.

I am also not sure if the driver implements correct locking behavior kernel-wise.
Comment 58 Hans de Goede 2021-04-11 10:27:11 UTC
(In reply to Mateusz Jończyk from comment #57)
> (In reply to Hans de Goede from comment #56)
> > Hmm,
> > 
> > asus-wmi-sensors also is not such a great solution, it seems the WMI
> > interface is buggy on some boards and causes fans to stop or get stuck at
> > max speed, which is quite bad, see:
> > 
> > https://github.com/electrified/asus-wmi-sensors#known-issues
> 
> IMHO, this could be caused by access races, not necessarily by a buggy BIOS.
> The driver may simply not implement correct synchronization methods. It may
> be necessary to call some ACPI / WMI methods before and after accessing the
> sensors to avoid resource conflicts.

Perhaps, but usually WMI methods take the locks which they need on entry and release them on exit. I'm not even sure if an ACPI method (which this ultimately is) can hold locks after it exits, I would not be surprised if all acquired locks are automatically dropped on exit from the interpreter.

Also note that the README also states that on some motherboards the problems are fixed in later BIOS versions, which also points to a race inside the AML code and not a bug in the driver.

> As is written in the documentation:
> > The more frequently the WMI interface is polled the greater the potential
> for
> > this to happen.
> 
> I am also not sure if the driver implements correct locking behavior
> kernel-wise.

I did not check, but this should not matter: it may mess up the driver's state, but the WMI code is expected to do its own locking at the AML level, e.g. to protect against similar accesses to the super IO through the ACPI thermal region interface.

Note I'm not claiming that this is definitely not an issue with the driver, it could be. But I've seen a lot of very buggy AML code and I've yet to find a single vendor which does not write very low quality AML code. It seems there is absolutely no code-review done on the AML code and very little QA.
Comment 59 Mateusz Jończyk 2021-04-11 10:30:27 UTC
The Asus X570-Plus TUF Gaming was described in this ticket as not working. It is listed as not supported by this driver on GitHub. So there are some devices without a working WMI interface that would benefit from the handling in comment 47.

> I agree that using the WMI interface, which presumably is what Asus' Windows
> tools use, is better.

It also does not require guessing voltage divider parameters; having to guess them makes raw access to nct6775 much less useful.
Comment 60 Kamil Dudka 2021-04-11 11:20:19 UTC
asus-wmi-sensors was already mentioned in comment #39.  I tried it with ASUS PRIME B360-PLUS but no device was matched by the driver.  It could have been some user error though.
Comment 61 Artem S. Tashkinov 2021-04-12 12:39:57 UTC
(In reply to Matthew Garrett from comment #39)
> As noted in https://twitter.com/james_hilliard/status/1373178256615211012,
> there's actually a driver here:
> https://github.com/electrified/asus-wmi-sensors/ . I did a quick search
> earlier, but managed to miss this somehow.

From its description:

Note: X570/B550/TRX40 boards do not have the WMI interface and are not supported.
Comment 62 Kamil Dudka 2021-04-12 13:25:01 UTC
Yes, my board was neither listed as supported, nor as unsupported/unknown in the mentioned README file.
Comment 63 Sydney Meyer 2021-04-12 22:42:20 UTC
Hello all,

Perhaps this is the wrong place to ask such a question, but after reading many sites on the interwebs about the issue, I am left with the impression that most people (me included) do not actually understand the implications of turning knobs like "acpi_enforce_resources=lax" on or off. I have also read a lot of mostly unclear comments about "hardware damage", and would therefore like to ask: what is actually the recommended way to go about this with the situation as it is now? Is this issue perhaps only relevant for manual fan control? With or without "acpi_enforce_resources=lax" and the nct6775 kernel module loaded, the system appears to adjust the fan speed to the load either way, and there aren't any noticeable differences in CPU temps either. So I guess my question basically boils down to this: is there actually something to worry about, apart from not being able to see/control fan speeds? I have just become a little worried with all the contradictory information out there, having also read (on Phoronix) about this [1] and this [2] a few weeks ago. This is an Asus X570 Gaming-E board with a Ryzen 5950X CPU. As a regular user, am I going to fry my little computer by running Linux on it?

I understand that nobody will guarantee anything, of course. I just felt this might be a good place for a qualified answer because, obviously, I don't understand any of this low-level stuff.

Thanks a bunch.

[1] Linux 5.11 Drops AMD Zen Voltage/Current Reporting Over Lack Of Documentation 
https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.11-Drops-k10temp-V-C
[2] AMD Ryzen 5000 Temperature Monitoring Support Sent In For Linux 5.12
https://www.phoronix.com/scan.php?page=news_item&px=Zen-3-Desktop-CPU-k10temp
Comment 64 Hans de Goede 2021-04-13 06:11:25 UTC
Sydney, I understand that all the discussion can be somewhat confusing.

It should be perfectly safe to run Linux on your computer (but, as you said, there is no warranty). By default, Windows also does not come with any software to monitor the nct6775 sensors. So when installing Linux without making any changes, your computer will run the same way as with a pristine (no extra sw installed) Windows install.

Under Linux you will even be able to monitor the CPU temperature using the CPU's builtin temp-sensors. What does not work is monitoring other temperatures, voltages and fan-speeds, nor controlling fan-speeds. But typically a modern motherboard will automatically control the CPU fan speed based on temperature, without needing the OS to do anything. Also, most users typically use their computer for other things than monitoring the computer's temps and voltages.

Matthew rightly advises against using "acpi_enforce_resources=lax" because that opens races between the firmware and Linux which could result in writing to a different superIO register than intended. This can definitely lead to e.g. stopping the fans even though the CPU is running hot, which is not good, but all modern CPUs have builtin overtemp protection, so at worst the system will simply shut down (1).
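
The kind of race described above can be shown with a toy model, assuming the chip is reached through an index port and a data port, so that one register access takes two separate port writes. This is a deterministic illustration in Python, not real hardware access; `IndexDataChip`, `OS_REG`, `FW_REG` and the value 0x80 are all made up for the example.

```python
class IndexDataChip:
    """Toy chip reached through an index port and a data port."""

    def __init__(self):
        self.index = 0   # register currently selected via the index port
        self.regs = {}   # register number -> last value written

    def write_index(self, reg):
        self.index = reg

    def write_data(self, value):
        # The value goes to whichever register is selected *right now*.
        self.regs[self.index] = value


OS_REG, FW_REG = 0x10, 0x20  # hypothetical register numbers


def serialized():
    """The OS finishes its index+data pair before firmware touches the chip."""
    chip = IndexDataChip()
    chip.write_index(OS_REG)   # OS: select register
    chip.write_data(0x80)      # OS: value lands in OS_REG as intended
    chip.write_index(FW_REG)   # firmware: selects its register afterwards
    return chip.regs


def interleaved():
    """Firmware selects its own register between the OS's two port writes."""
    chip = IndexDataChip()
    chip.write_index(OS_REG)   # OS: select register
    chip.write_index(FW_REG)   # firmware: select a different register
    chip.write_data(0x80)      # OS: value lands in FW_REG instead
    return chip.regs
```

In the serialized order the value ends up in the register the OS selected; in the interleaved order the same write lands in the register the firmware had selected, which is the "wrong register" outcome.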

Theoretically this could also lead to worse outcomes, such as changing your CPU or RAM voltage, which could damage your hardware. I am aware of at least one semi-related case where RAM got seriously overvolted, damaging both the RAM and the CPU; this was not with a Super-IO solution though, but with I2C-attached sensor probing.

1) Repeatedly overheating your CPU to where it automatically shuts down is not good for your CPU's health though and will likely shorten its lifetime.

TL;DR: Don't use "acpi_enforce_resources=lax", otherwise running Linux should be safe and everything should work fine.
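For anyone unsure whether the option is currently set on their system, a small check along these lines may help. It is a sketch that only parses a kernel command line string; `acpi_enforce_resources` is a hypothetical helper name, and taking the last occurrence when the option appears more than once is an assumption of this sketch.

```python
from typing import Optional


def acpi_enforce_resources(cmdline: str) -> Optional[str]:
    """Return the value given to acpi_enforce_resources= or None if absent.

    If the option appears several times, this sketch returns the last one.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "acpi_enforce_resources" and sep:
            value = val
    return value
```

On a running system the string would typically come from /proc/cmdline, e.g. `acpi_enforce_resources(open("/proc/cmdline").read())`; a result of `None` means the option was not passed and the kernel default applies.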
Comment 65 Sydney Meyer 2021-04-13 22:04:13 UTC
Hello Hans,

thank you for taking the time to answer my question.

Your analogy to a "pristine" (if there is such a thing) ~20GB Windows installation does indeed make sense. I had imagined something like this already without understanding it properly, but it is reassuring to hear it from someone who has a much, much better understanding of the subsystem at hand.

Personally, I don't have any need for monitoring voltages or manually adjusting fan curves, etc.

Also, over the past ~15 years, I have not once been let down by following kernel developers' advice, because even if I did not fully understand the issue, or understood it only in hindsight, there has always been deductive (and, as Artem has ascertained, sometimes frustrating) reasoning behind the decisions and defaults, as appears to be the case here (lack of documentation and/or vendor support). A kind of established trust, really. And even if incorrect, _always_ to the best of knowledge and conscience. IMO, this is all a user can ask for, and unfortunately, albeit mostly in the commercial SW ecosystem, not a given anymore. I would take these virtues over a warranty any day.

TL;DR: Much thanks for answering the question this detailed. Flatter/sweet-talk. Will link your post for people with similar concerns. Big thanks again and have a nice week, Hans et al.
Comment 66 Hans de Goede 2021-04-14 07:58:21 UTC
Sydney, thank you for your kind words, you have put a smile on my face, so thank you.
Comment 67 Artem S. Tashkinov 2021-04-15 09:27:53 UTC
(In reply to Hans de Goede from comment #64)
> 
> Matthew rightly advises against using "acpi_enforce_resources=lax" because
> that opens races between the firmware and Linux which could result in
> writing to another superIO register then intended. This can definitely lead
> to e.g. stopping the fans even though the CPU is running hot, which is not
> good but all modern CPUs have builtin overtemp protection, so at the worst
> the system will simply shutdown (1). 
> 

Multiple users use acpi_enforce_resources=lax and I haven't seen a single report that it's ever broken anything.

AFAIK no one has used this hack to control fans using PWM, so that might indeed lead to unintended consequences.
Comment 68 myhateisblind 2021-04-15 09:30:04 UTC
I use it for that, and had no problem... yet.

Comment 69 Hans de Goede 2021-04-15 09:39:48 UTC
(In reply to Artem S. Tashkinov from comment #67)
> Multiple users use acpi_enforce_resources=lax and I haven't seen a single
> report that it's ever broken anything.

<sigh> Yet I have been on the receiving end of a bug-report where I had to explain to a user that the lm_sensors sensors-detect script had overvolted his RAM ruining both his expensive high-end RAM as well as his expensive top of the line CPU. The user was surprisingly relaxed about all this, which I really appreciated.

And that was while the script was not doing anything which we (the developers) considered dangerous. But the motherboard had a funky setup causing a SMbus *read* transaction to change the voltage.

Mucking with this stuff can be dangerous. As Matthew has explained in his thorough analysis of the DSDT, the DSDT is actually accessing the superio, and if that races with a Linux kernel access, a wrong register may be read from, or worse, written to.

Using acpi_enforce_resources=lax simply is dangerous and we are not going to change the default, period, full-stop.

I welcome further discussions here about how we can *safely* solve hwmon access on various motherboards.

Please stop discussing acpi_enforce_resources=lax, that is not a safe option to use and more discussion about it is not productive.
Comment 70 doomwarriorx 2021-04-21 17:09:13 UTC
Created attachment 296451 [details]
acpidump for Pro B550-C

can confirm the issue with ASUS System Product Name/Pro B550M-C, BIOS 0214 10/22/2020

The bug still exists if asus_wmi & eeepc_wmi are blacklisted. Does acpi_wmi still claim the address space even if no consumer/driver is available?
Comment 71 Bernhard Seibold 2021-04-28 21:46:31 UTC
Created attachment 296529 [details]
Add support for access via Asus WMI to nct6775
Comment 72 Hans de Goede 2021-04-28 21:56:44 UTC
(In reply to Bernhard Seibold from comment #71)
> Created attachment 296529 [details]
> Add support for access via Asus WMI to nct6775

Nice, have you submitted this to the kernel's hwmon subsystem maintainer for inclusion into the mainline kernel?
Comment 73 Bernhard Seibold 2021-04-29 10:09:07 UTC
(In reply to Hans de Goede from comment #72)
> (In reply to Bernhard Seibold from comment #71)
> > Created attachment 296529 [details]
> > Add support for access via Asus WMI to nct6775
> 
> Nice, have you submitted this to kernel's hwmon subsystem maintainer for
> inclusion into the mainline kernel ?

No, I just finished writing this. I cannot test if it still works correctly for non-WMI cases, and I think using device address zero for WMI is probably a bit hackish.

Please also note that this patch only adds access via WMI for i/o port 0x295, while superio access is still using the "traditional" method. There are however also WMI methods for superio access, at least on my board, and it would probably be safer to use those as well. However I would propose to first split the superio functionality into a separate driver. Comments in nct6775 seem to imply that this was/is intended to be done at some point.
Comment 74 Hans de Goede 2021-04-29 10:18:00 UTC
> Please also note that this patch only adds access via WMI for i/o port 0x295,
> while superio access is still using the "traditional" method.

Ah I missed that, yes that needs to be resolved before this is suitable for upstream. Thank you for your work on this.
Comment 75 Bernhard Seibold 2021-05-04 22:08:25 UTC
Created attachment 296645 [details]
Add support for access via Asus WMI to nct6775 (Rev 2)

Here's an updated patch that also accesses the Super-I/O ports via WMI. Please note that it also adds an ACPI resource check for that IO region. So it might actually make the driver work on less hardware, although that check should probably have been there in the first place.
Comment 76 Artem S. Tashkinov 2021-05-05 03:12:19 UTC
(In reply to Bernhard Seibold from comment #75)
> Created attachment 296645 [details]
> Add support for access via Asus WMI to nct6775 (Rev 2)
> 
> Here's an updated patch that also accesses the Super-I/O ports via WMI.
> Please note that it also adds an ACPI resource check for that IO region. So
> it might actually make the driver work on less hardware, although that check
> should probably have been there in the first place.

Seems to work here, thank you very much!

nct6775: Found NCT6798D or compatible chip at 0x0:0x290
nct6775: Using Asus WMI to access chip

$ sensors

nct6798-isa-0000
Adapter: ISA adapter

I'm only curious why it continues to say ISA adapter which doesn't seem technically correct.

Anyway, please submit this for Linux 5.13; it shouldn't be too late, as we are now in a merge window.

Tested-by: Artem S. Tashkinov <aros@gmx.com>
Comment 77 Vittorio Roberto Alfieri 2021-07-04 09:45:15 UTC
(In reply to Bernhard Seibold from comment #75)
> Created attachment 296645 [details]
> Add support for access via Asus WMI to nct6775 (Rev 2)
> 
> Here's an updated patch that also accesses the Super-I/O ports via WMI.
> Please note that it also adds an ACPI resource check for that IO region. So
> it might actually make the driver work on less hardware, although that check
> should probably have been there in the first place.

Hi,
I can confirm that this patch works for me too. 

dmesg |grep nct6775
[   43.723698] nct6775: Found NCT6798D or compatible chip at 0x0:0x290
[   43.723890] nct6775: Using Asus WMI to access chip

sensors nct6798-* | head -2
nct6798-isa-0000
Adapter: ISA adapter

mobo: ASUS ROG STRIX B550-F GAMING (WI-FI) - bios version 2403
kernel: 5.13.0 (patched)

Thanks Bernhard!

Have a nice sunday everyone.
Comment 78 Gregory Duhamel 2021-07-25 23:53:39 UTC
Is there any chance this patch will reach upstream? Thanks a lot, guys!
Comment 79 Bernhard Seibold 2021-07-29 18:37:41 UTC
(In reply to Gregory Duhamel from comment #78)
> Is there any chance this patch reach upstream ? Thanks a lot Guys !

I submitted the patch, but it was rejected and I don't intend to continue working on it, sorry.

https://www.spinics.net/lists/linux-hwmon/msg11260.html
Comment 80 Andy Shevchenko 2021-07-30 06:06:28 UTC
(In reply to Bernhard Seibold from comment #79)
> (In reply to Gregory Duhamel from comment #78)
> > Is there any chance this patch reach upstream ? Thanks a lot Guys !
> 
> I submitted the patch, but it was rejected and I don't intend to continue
> working on it, sorry.
> 
> https://www.spinics.net/lists/linux-hwmon/msg11260.html

To be honest, I see no evidence that it was rejected; rather, additional work is needed. It is standard practice in OSS projects, especially big ones like the kernel, to reshape patches a few times until they satisfy all parties.
Comment 81 danglingpointerexception@gmail.com 2021-08-21 16:19:54 UTC
Hi All, I've got the same issue with an ASUS ROG STRIX X570-E Gaming.

We need this patch merged!  Can anyone with influence or clout help push this along so we can get this resolved?

There must be hundreds of thousands affected by this!
Comment 82 Andy Shevchenko 2021-08-21 17:08:09 UTC
(In reply to danglingpointerexception@gmail.com from comment #81)
> Hi All, I've got the same issue with a ASUS ROG STRIX X570-E Gaming.
> 
> We need this patch merged!  Can anyone with influence or clout help push
> this along so we can get this resolved?
> 
> There must be hundreds of thousands affected by this!

The best thing you can do is go to the mailing list and discuss it there with the respective maintainer.
Comment 83 Artem S. Tashkinov 2021-08-21 23:24:16 UTC
(In reply to Andy Shevchenko from comment #82)
> 
> The best what you can do is to go to the mailing list and discuss it there
> with a respective maintainer.

Meanwhile this bug is not resolved and there's no (accepted) patch. The bug status is wrong and misleading but I'm not going to argue :-)
Comment 84 Andy Shevchenko 2021-08-22 08:47:13 UTC
(In reply to Artem S. Tashkinov from comment #83)
> (In reply to Andy Shevchenko from comment #82)
> > 
> > The best what you can do is to go to the mailing list and discuss it there
> > with a respective maintainer.
> 
> Meanwhile this bug is not resolved and there's no (accepted) patch. The bug
> status is wrong and misleading but I'm not going to argue :-)

First of all, read comment #80. Second, do not misinterpret the bug status: it shows exactly the current state of affairs. If nobody wants to work further on the offered change, that is not the community's problem. The Linux kernel does not work on "take it or leave it" terms. So, instead of whining here, roll up your sleeves and finish the job; that would be much better!
Comment 85 Denis Pauk 2021-08-30 20:47:29 UTC
Created attachment 298539 [details]
POC: Add support for access via Asus WMI to nct6775 by board name detect

I have added only a small list of board names (/sys/class/dmi/id/board_name); could you add yours and check?

P.S.: I have not checked with real devices.
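To find the string to report, the board name can be read from the sysfs path mentioned above. The following Python sketch shows one way to read it and compare it against a list of names. `EXAMPLE_BOARDS` is illustrative only (names reported by commenters in this thread, not the list from the patch), and `read_board_name` and `board_is_listed` are hypothetical helpers.

```python
from pathlib import Path
from typing import Iterable, Optional

# Illustrative only: board names reported by commenters in this thread,
# NOT the list contained in the attached patch.
EXAMPLE_BOARDS = (
    "TUF GAMING B550-PLUS",
    "ROG STRIX B550-F GAMING (WI-FI)",
)


def read_board_name(path: str = "/sys/class/dmi/id/board_name") -> Optional[str]:
    """Return the DMI board name, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def board_is_listed(name: Optional[str],
                    boards: Iterable[str] = EXAMPLE_BOARDS) -> bool:
    """Exact, case-sensitive match of the board name against `boards`."""
    return name is not None and name in boards
```

Calling `board_is_listed(read_board_name())` on the affected machine then tells whether the exact string is in the list; since the match is exact, the name should be reported character for character.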
Comment 86 Tom Lloyd 2021-08-31 12:53:45 UTC
Please add my board name "TUF GAMING B550-PLUS".

I'm happy to roll the patch into my kernel and make whatever checks are needed, but I'm new to this so could use some guidance.  Email me if you want me to test on my box?
Comment 87 to.eivind 2021-09-04 10:48:32 UTC
(In reply to Denis Pauk from comment #85)
> Created attachment 298539 [details]
> POC: Add support for access via Asus WMI to nct6775 by board name detect
> 
> I have added only small list of board names(/sys/class/dmi/id/board_name),
> could you add yours and check?
> 
> P.S.: I have not checked with real devices.

Thank you all for your great work.

I added my board "ROG STRIX B550-F GAMING (WI-FI)" and added patch against 5.14.1-arch1-1 BIOS 2423 08/10/2021.

$ dmesg | grep nct6775
[    6.878846] nct6775: Found NCT6798D or compatible chip at 0x0:0x290
[    6.879018] nct6775: Using Asus WMI to access chip

$ sensors
nct6798-isa-0000
Adapter: ISA adapter
in0:                      472.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                      1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                        3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                        3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                      1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      960.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                      280.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                        3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                        3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                      904.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                     304.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                       1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                       1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     408.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     304.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                      714 RPM  (min =    0 RPM)
fan2:                      776 RPM  (min =    0 RPM)
fan3:                      708 RPM  (min =    0 RPM)
fan4:                      870 RPM  (min =    0 RPM)
fan5:                        0 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
fan7:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +28.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +35.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +82.5°C    sensor = thermistor
AUXTIN1:                   +49.0°C    sensor = thermistor
AUXTIN2:                   -60.0°C    sensor = thermistor
AUXTIN3:                   +78.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +35.5°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
PCH_CPU_TEMP:               +0.0°C  
intrusion0:               ALARM
intrusion1:               ALARM
beep_enable:              disabled


Would be very, very nice if someone has the knowledge to make this acceptable for upstream.
Comment 88 Denis Pauk 2021-09-04 20:46:33 UTC
Created attachment 298669 [details]
POC: Add support for access via Asus WMI to nct6775 by board/vendor name detect

Updated version that also checks the vendor name and fixes possible issues with non-ASUS motherboards. I have added the motherboard names mentioned in this bug. I will also look into using function pointers instead of conditional checks against access_wmi. After that I will try to send the patch for review.

(In reply to comment #87)
> I added my board "ROG STRIX B550-F GAMING (WI-FI)" and added patch against
> 5.14.1-arch1-1 BIOS 2423 08/10/2021.

Thank you, I have added your board also.

(In reply to comment #86)
> Please add my board name "TUF GAMING B550-PLUS".

Thank you, I have added it to the list. Which distro do you use?

On Debian, the steps could be:
* git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
* cd linux-stable
* git checkout v5.14
* cp /boot/config-5.10.0-8-amd64 .config
* make CC="ccache gcc" -j 32
* make CC="ccache gcc" -j 32 bindeb-pkg
* sudo dpkg -i ../linux-image-5.14.0+_5.14.0+-1_amd64.deb

See https://wiki.debian.org/BuildADebianKernelPackage
Comment 89 Barnabás Pőcze 2021-09-04 21:07:28 UTC
As a side note: using dkms is much faster than recompiling the whole kernel.
Comment 90 Pär Ekholm 2021-09-05 10:41:57 UTC
Can you please also add my board: "TUF GAMING X570-PLUS".

Thank you for working on the patch!
Comment 91 Artem S. Tashkinov 2021-09-05 10:46:04 UTC
(In reply to pehlm from comment #90)
> Can you please also add my board: "TUF GAMING X570-PLUS".
> 
> Thank you for working on the patch!

I'm pretty sure all X570 based ASUS motherboards with this chip have to work via WMI and adding them one by one will needlessly inflate the patch. Blacklisting could be used instead (though I'm quite sure there will be no SKUs to blacklist).
Comment 92 Andy Shevchenko 2021-09-05 11:23:29 UTC
(In reply to Artem S. Tashkinov from comment #91)
> (In reply to pehlm from comment #90)
> > Can you please also add my board: "TUF GAMING X570-PLUS".
> > 
> > Thank you for working on the patch!
> 
> I'm pretty sure all X570 based ASUS motherboards with this chip have to work
> via WMI and adding them one by one will needlessly inflate the patch.
> Blacklisting could be used instead (though I'm quite sure there will be no
> SKUs to blacklist).

I might even agree with you, but here is a dilemma: do you want to have the patch accepted in mainline (*), or do you want better code right now?

*) this is what maintainer wants.

@Denis Pauk, you might consider keeping the names in an array of strings and calling match_string() instead of using a long chain of strcmp() calls.
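
The table-lookup idea can be modelled in userspace as follows. This is a sketch only: the kernel helper is C (`match_string(array, n, string)`, returning the index of the first exact match or `-EINVAL`), and the Python function below merely imitates that behaviour. `BOARDS` is an illustrative subset of names mentioned in this bug, not the patch's actual table, and `board_uses_wmi` is a hypothetical name.

```python
import errno

# Illustrative subset of names mentioned in this bug; not the patch's actual table.
BOARDS = [
    "TUF GAMING B550-PLUS",
    "TUF GAMING X570-PLUS",
    "ROG STRIX B550-F GAMING (WI-FI)",
]


def match_string(array, string):
    """Userspace model of the kernel helper: index of the first exact match, else -EINVAL."""
    for index, item in enumerate(array):
        if item is None:  # the kernel helper stops at a NULL entry
            break
        if item == string:
            return index
    return -errno.EINVAL


def board_uses_wmi(board_name, boards=BOARDS):
    """A single table lookup replaces one comparison per board."""
    return match_string(boards, board_name) >= 0
```

Because the comparison is exact, "TUF GAMING X570-PLUS" and "TUF GAMING X570-PLUS (WI-FI)" are different entries, so each variant has to be listed separately.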
Comment 93 Denis Pauk 2021-09-07 20:35:57 UTC
Created attachment 298703 [details]
POC: Add support for access via Asus WMI to nct6775 use match_string

* Use match_string to filter boards
* Use function pointers for Super I/O access
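
A rough illustration of the function-pointer approach, assuming nothing about the real driver internals: the access method is picked once, and callers go through the chosen functions instead of testing an access_wmi flag at every call site. All names here (`SuperIoAccess`, `make_fake_backend`, `select_access`, `set_bits`) are hypothetical, and both backends are in-memory fakes standing in for port I/O and WMI calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SuperIoAccess:
    """Bundle of accessors chosen once at probe time (models a struct of function pointers)."""
    name: str
    read: Callable[[int], int]
    write: Callable[[int, int], None]


def make_fake_backend(name: str) -> SuperIoAccess:
    """Build an in-memory backend; real code would do port I/O or a WMI call here."""
    registers: Dict[int, int] = {}

    def read(reg: int) -> int:
        return registers.get(reg, 0)

    def write(reg: int, value: int) -> None:
        registers[reg] = value & 0xFF  # registers are modelled as 8 bits wide

    return SuperIoAccess(name, read, write)


def select_access(use_wmi: bool) -> SuperIoAccess:
    """Decide the access method once, e.g. from the board-name match."""
    return make_fake_backend("wmi" if use_wmi else "port-io")


def set_bits(access: SuperIoAccess, reg: int, mask: int) -> None:
    """Caller code has no per-call 'is this WMI?' branch; it calls through the bundle."""
    access.write(reg, access.read(reg) | mask)
```

The benefit is that adding another access method later touches only the selection step, not every read and write.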
Comment 94 dflogeras2 2021-09-08 00:00:07 UTC
Thanks Denis!  Can confirm the module loads successfully on an Asus PRIME B460-PLUS with the following:

[608513.608260] acpi PNP0C14:02: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[608513.608293] acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[608513.608355] acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[608513.608404] acpi PNP0C14:05: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
[608513.609331] nct6775: Using Asus WMI to access chip
[608513.609437] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290

I'm not sure about the duplicate warnings, maybe because I am still running a kernel with acpi_enforce_resources=lax (did not reboot before inserting the patched module).
Comment 95 Artem S. Tashkinov 2021-09-08 00:16:54 UTC
(In reply to dflogeras2 from comment #94)
> Thanks Denis!  Can confirm the module loads successfully on an Asus PRIME
> B460-PLUS with the following:
> 
> [608513.608260] acpi PNP0C14:02: duplicate WMI GUID
> 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
> [608513.608293] acpi PNP0C14:03: duplicate WMI GUID
> 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
> [608513.608355] acpi PNP0C14:04: duplicate WMI GUID
> 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
> [608513.608404] acpi PNP0C14:05: duplicate WMI GUID
> 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:01)
> [608513.609331] nct6775: Using Asus WMI to access chip
> [608513.609437] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
> 
> I'm not sure about the duplicate warnings, maybe because I am still running
> a kernel with acpi_enforce_resources=lax (did not reboot before inserting
> the patched module).

That's bug 201885 which is not related to this one.
Comment 96 Tom Lloyd 2021-09-08 18:37:20 UTC
(In reply to Denis Pauk from comment #88)
> Created attachment 298669 [details]
> POC: Add support for access via Asus WMI to nct6775 by board/vendor name
> detect
> 
> Updated version with check vendor name and fix for possible issues with non
> ASUS motherboards, added names of motherboards have mentioned in bug. I will
> also check possible way for use functions pointers instead conditional
> checks equal to access_wmi. After that I will try to send patch to review.
> 
> (In reply to comment #87)
> > I added my board "ROG STRIX B550-F GAMING (WI-FI)" and added patch against
> > 5.14.1-arch1-1 BIOS 2423 08/10/2021.
> 
> Thank you, I have added your board also.
> 
> (In reply to comment #86)
> > Please add my board name "TUF GAMING B550-PLUS".
> 
> Thank you, I have added it to list. What the distro do you use? 
> 
> For debian it can be:
> * git clone
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> * cd linux-stable
> * git checkout v5.14
> * cp /boot/config-5.10.0-8-amd64 .config
> * make CC="ccache gcc" -j 32
> * make CC="ccache gcc" -j 32 bindeb-pkg
> * sudo dpkg -i ../linux-image-5.14.0+_5.14.0+-1_amd64.deb
> 
> Look to https://wiki.debian.org/BuildADebianKernelPackage

Denis,

I'm on Gentoo, and already have the kernel sources unpacked and ready to go (5.13.13-gentoo).  I did the following:

# rmmod nct6775
# cd /usr/src/linux
# patch -i ~/nct6775_wmi_v3.patch -p1
# make modules -j12
# mv /lib/modules/5.13.13-gentoo-splig-3-sensors/kernel/drivers/hwmon/nct6775.ko /lib/modules/5.13.13-gentoo-splig-3-sensors/kernel/drivers/hwmon/nct6775.ko.orig
# cp drivers/hwmon/nct6775.ko /lib/modules/5.13.13-gentoo-splig-3-sensors/kernel/drivers/hwmon/nct6775.ko
# modprobe nct6775

"sensors" output remains the same:
nct6798-isa-0290
Adapter: ISA adapter
in0:                      376.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                      1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                        3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                        3.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                        1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      880.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                      256.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                        3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                        3.28 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                      904.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                     264.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                       1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                       1.04 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     368.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     272.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                        0 RPM  (min =    0 RPM)
fan2:                      716 RPM  (min =    0 RPM)
fan3:                      496 RPM  (min =    0 RPM)
fan4:                      327 RPM  (min =    0 RPM)
fan5:                        0 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
fan7:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +33.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +35.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +85.0°C    sensor = thermistor
AUXTIN1:                   +55.0°C    sensor = thermistor
AUXTIN2:                   -61.0°C    sensor = thermistor
AUXTIN3:                   +79.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +34.5°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
PCH_CPU_TEMP:               +0.0°C  
intrusion0:               ALARM
intrusion1:               ALARM
beep_enable:              disabled

dmesg with in-tree module:
[ 3596.867638] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
[ 3596.867642] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20210331/utaddress-204)
[ 3596.867645] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[ 3596.867646] ACPI: OSL: Resource conflict: System may be unstable or behave erratically

dmesg with patched module:
[ 3681.885428] nct6775: Using Asus WMI to access chip
[ 3681.885468] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290

/proc/cmdline:
BOOT_IMAGE=/boot/vmlinuz-5.13.13-gentoo-splig-3-sensors root=/dev/nvme0n1p3 ro module_blacklist=nouveau acpi_enforce_resources=lax


I hope that's of some use.  The differing dmesg output suggests that the patch is helping, but shouldn't there be a change (improvement) to the sensors output?
Comment 97 Ben 2021-09-08 19:49:42 UTC
(In reply to Tom Lloyd from comment #96)
> 
> I hope that's of some use.  The differing dmesg output suggests that the
> patch is helping, but shouldn't there be a change (improvement) to the
> sensors output?
  
  
You already worked around the problem by booting with acpi_enforce_resources=lax, so 'sensors' will produce output with or without the patch.

With the patch, the driver accesses the chip via WMI. Without the patch, sensors only works because you've switched to 'lax'.
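
Since the sensors output looks the same either way, the kernel log is the place to tell the two cases apart. A small sketch, assuming the message texts quoted in this bug (other kernel versions may word them differently, and the ACPI warning can also be triggered by other drivers); `access_path` and `lax_enabled` are hypothetical helper names:

```python
import re

# Message fragments as they appear in the dmesg excerpts posted in this bug.
WMI_MSG = "nct6775: Using Asus WMI to access"
CONFLICT_RE = re.compile(r"ACPI Warning: SystemIO range .* conflicts with OpRegion")


def access_path(dmesg_lines):
    """Return 'wmi', 'direct-io-conflict', or 'unknown' based on the latest relevant message."""
    result = "unknown"
    for line in dmesg_lines:
        if WMI_MSG in line:
            result = "wmi"
        elif CONFLICT_RE.search(line):
            result = "direct-io-conflict"
    return result


def lax_enabled(cmdline):
    """True if the kernel command line (e.g. the contents of /proc/cmdline) sets lax mode."""
    return "acpi_enforce_resources=lax" in cmdline.split()
```

Feeding it the output of `dmesg` split into lines gives the state of the most recent module load, which is what matters after an rmmod/modprobe cycle.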
Comment 98 Artem S. Tashkinov 2021-09-08 20:03:41 UTC
(In reply to Tom Lloyd from comment #96)
> 
> I hope that's of some use.  The differing dmesg output suggests that the
> patch is helping, but shouldn't there be a change (improvement) to the
> sensors output?

The patch being worked on here simply changes the method of accessing the hardware; it does _not_ change the underlying driver which deciphers the sensor inputs.

To improve the output, you have to create a configuration file, e.g. /etc/sensors.d/nct6798.conf, and describe your desired configuration there, but that's outside the scope of this bugzilla. Please read the lm-sensors documentation for that.

I've already done that for the ASUS TUF GAMING X570-PLUS (WI-FI) motherboard: https://github.com/lm-sensors/lm-sensors/pull/216 but my configuration is very incomplete since the documentation for this chip is likely proprietary and not available publicly. Even HWiNFO64 fails to decipher multiple inputs and shows them as is and that's the best application for that under Windows. Then certain inputs may simply not be connected to anything and basically report white noise.

If you have close contacts at ASUS, they may give you the data, but that's far from certain. For some reason, hardware monitoring for motherboards, GPUs and CPUs is veiled in secrecy and protected by patents and NDAs.
Comment 99 Tom Lloyd 2021-09-08 22:16:47 UTC
That all makes sense, thanks both for the clarification.  I can confirm then that the patch works on my hardware "TUF GAMING B550-PLUS".  Good luck getting it into the tree :)
Comment 100 Denis Pauk 2021-09-09 16:19:24 UTC
I have sent https://bugzilla.kernel.org/attachment.cgi?id=298703 version to review: https://lkml.org/lkml/2021/9/8/840
Comment 101 Sahan Fernando 2021-09-11 00:12:25 UTC
```
$ sudo dmidecode   | grep -i B550
	Product Name: ROG STRIX B550-F GAMING
$
```

The patch worked for me after adding my board name, could you please also add "ROG STRIX B550-F GAMING"?

Thank you for working on this patch.
Comment 102 Jonathan 2021-09-13 18:03:57 UTC
Created attachment 298771 [details]
System Info after applying latest patch
Comment 103 Jonathan 2021-09-13 18:07:58 UTC
Hi,
(oh. Could've put my comment in the attachment comment... duh)
I applied Denis Pauk's patch today (how I did it is described in https://gist.github.com/greenbigfrog/26f948c9d86f1cb2fd23bfeaa23ca068 ). While I'm not sure if I did everything correctly, I can see nct6775 pulling in the wmi module now, so I'm fairly certain I'm running the patch.
And yet I'm somehow still getting the acpi access warning, and no further sensor output.
Did I do something wrong?

System: Asus TUF Gaming X570-Plus (Wi-Fi) with 5600X
Comment 104 Igor 2021-09-13 18:52:22 UTC
Hi guys,

I have tried the patch on my "ROG STRIX B550-I GAMING", LGTM:
```
nct6798-isa-0290
Adapter: ISA adapter
Vcore:                    472.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                        1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
AVSB:                       3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
3VCC:                       3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
+12V:                      12.19 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      856.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
+5V:                        1.48 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
3.3V:                       3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
Vbat:                       3.47 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                      896.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     280.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     280.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                      750 RPM  (min =    0 RPM)
CPU Fan:                   585 RPM  (min =    0 RPM)
CHA_FAN1:                    0 RPM  (min =    0 RPM)
SYSTIN:                    +34.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +33.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +80.5°C    sensor = thermistor
AUXTIN1:                   +34.0°C    sensor = thermistor
AUXTIN2:                   +34.0°C    sensor = thermistor
AUXTIN3:                   +86.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +33.0°C  
beep_enable:              disabled
```

I used the TUF-GAMING-X570-PLUS.conf config file from comment 98.
I just wonder: are the high temperatures on AUXTIN0 and AUXTIN3 real, and what could they be measuring? Or could it be that TUF-GAMING-X570-PLUS.conf is a poor fit for a different motherboard model?
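
As comment 98 notes, some inputs may simply not be connected to anything. A quick way to list candidates is to parse the sensors output and flag temperatures outside a plausible range. This is only a sketch: the 0 to 80 °C bounds are arbitrary assumptions, the function names `parse_temps` and `suspicious` are hypothetical, and a flagged reading is a hint to investigate, not proof that the input is floating.

```python
import re

# Matches lines such as "AUXTIN0:                   +80.5°C    sensor = thermistor".
TEMP_RE = re.compile(r"^(?P<label>[^:]+):\s+(?P<value>[+-]\d+(?:\.\d+)?)°C")


def parse_temps(sensors_text):
    """Map each temperature label to its reading in degrees Celsius.

    Labels repeated across chips overwrite each other, so feed one chip at a time.
    """
    temps = {}
    for line in sensors_text.splitlines():
        match = TEMP_RE.match(line)
        if match:
            temps[match.group("label").strip()] = float(match.group("value"))
    return temps


def suspicious(temps, low=0.0, high=80.0):
    """Return labels whose reading falls outside the assumed plausible range."""
    return sorted(label for label, value in temps.items() if value < low or value > high)
```

Inputs that stay implausible regardless of load can then be hidden with an `ignore` statement in a sensors.d configuration file.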
Comment 105 Denis Pauk 2021-09-13 21:16:07 UTC
(In reply to Jonathan from comment #103)
> Hi,
> (oh. Could've put my comment in the attachment comment... duh)
> I applied Denis Pauk patch today, (how I did it described in
> https://gist.github.com/greenbigfrog/26f948c9d86f1cb2fd23bfeaa23ca068 ).
> While I'm not sure if I did everything correctly, I can see nct6775 pulling
> in the wmi module now, so I'm fairly certain I'm running the patch.
> And yet I'm somehow still getting the acpi access warning, and no further
> sensor output.
> Did I do something wrong?
> 
> System: Asus TUF Gaming X570-Plus (Wi-Fi) with 5600X

Could you please check with the original patch from Bernhard Seibold?
Also, what is the value in "/sys/class/dmi/id/board_name"?
Comment 106 Jonathan 2021-09-13 22:29:44 UTC
(In reply to Denis Pauk from comment #105)
> (In reply to Jonathan from comment #103)
> > Hi,
> > (oh. Could've put my comment in the attachment comment... duh)
> > I applied Denis Pauk patch today, (how I did it described in
> > https://gist.github.com/greenbigfrog/26f948c9d86f1cb2fd23bfeaa23ca068 ).
> > While I'm not sure if I did everything correctly, I can see nct6775 pulling
> > in the wmi module now, so I'm fairly certain I'm running the patch.
> > And yet I'm somehow still getting the acpi access warning, and no further
> > sensor output.
> > Did I do something wrong?
> > 
> > System: Asus TUF Gaming X570-Plus (Wi-Fi) with 5600X
> 
> Could you please check with original patch from Bernhard Seibold?
> And check what is value in "/sys/class/dmi/id/board_name" ?

Sure. I'll try the "original" patch tomorrow.

```
cat /sys/class/dmi/id/board_name
TUF GAMING X570-PLUS (WI-FI)
```
Comment 107 Eugene Shalygin 2021-09-14 12:04:12 UTC
(In reply to Denis Pauk from comment #100)
> I have sent https://bugzilla.kernel.org/attachment.cgi?id=298703 version to
> review: https://lkml.org/lkml/2021/9/8/840

Could you please add C8H motherboards to the patch at the next iteration?

"ROG CROSSHAIR VIII HERO"
"ROG CROSSHAIR VIII DARK HERO"
Comment 108 Damir Perisa 2021-09-14 17:11:17 UTC
i can confirm for "ROG CROSSHAIR VIII DARK HERO" (Rev X.0x):

[    3.360424] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
[    3.360433] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\AMW0.SHWM) (20210604/utaddress-204)
Comment 109 Jonathan 2021-09-14 17:31:31 UTC
Created attachment 298799 [details]
dmesg for boot with Denis's patch applied

(In reply to Jonathan from comment #106)
> (In reply to Denis Pauk from comment #105)
> > (In reply to Jonathan from comment #103)
> > > Hi,
> > > (oh. Could've put my comment in the attachment comment... duh)
> > > I applied Denis Pauk patch today, (how I did it described in
> > > https://gist.github.com/greenbigfrog/26f948c9d86f1cb2fd23bfeaa23ca068 ).
> > > While I'm not sure if I did everything correctly, I can see nct6775
> pulling
> > > in the wmi module now, so I'm fairly certain I'm running the patch.
> > > And yet I'm somehow still getting the acpi access warning, and no further
> > > sensor output.
> > > Did I do something wrong?
> > > 
> > > System: Asus TUF Gaming X570-Plus (Wi-Fi) with 5600X
> > 
> > Could you please check with original patch from Bernhard Seibold?
> > And check what is value in "/sys/class/dmi/id/board_name" ?
> 
> Sure. I'll try the "original" patch tomorrow.
> 
> ```
> cat /sys/class/dmi/id/board_name
> TUF GAMING X570-PLUS (WI-FI)
> ```

I've tested both patches now. I had trouble getting Bernhard's to run via dkms, so I built a custom kernel. Worked flawlessly afterwards.
Out of fairness (since I'm really not that sure if my dkms attempts yesterday actually worked), I also built a kernel with Denis's patch. Didn't change much.
I've attached dmesg for Denis's patch.
Comment 110 Denis Pauk 2021-09-14 20:39:22 UTC
Created attachment 298805 [details]
POC: Add support for access via Asus WMI to nct6775 with debug

Adds more debug information about what goes wrong when matching vendor/board names.

(In reply to Jonathan from comment #109)
> Created attachment 298799 [details]
> dmesg for boot with Denis's patch applied

Could you add your board to the list and recheck?
Comment 111 Eugene Shalygin 2021-09-15 00:14:01 UTC
Most of these boards, as you probably know already, do not seem to provide readings for all the available sensors via the Nuvoton chip. For example, water temperature sensors are unavailable. Readings from those sensors are available via the embedded controller registers. Thus I created a little HWMON driver [1] to read them using the WMI method 'BREC' (probably stands for Block Read Embedded Controller). The driver currently supports only three boards (ROG Crosshair VIII Hero, ROG Crosshair VIII Dark Hero, ROG STRIX X570-E GAMING). ROG Crosshair VIII Formula should not differ, but I need someone with the hardware to test it.

While working on these sensors for the Libre Hardware Monitor project, we found that the Nuvoton 6798D chip exposes, in its registers, readings for the QFan sources configured in the BIOS [2]. Maybe those are worth displaying with the nct6775 driver? They can include sensors that are otherwise available only from the embedded controller.

If you want to add support for your boards, feel free to submit patches to either project (and notify me to update the driver for HWMON from LHM if needed, please).

[1] https://github.com/zeule/asus-wmi-ec-sensors
[2] https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/issues/533
Comment 112 matt-testalltheway 2021-09-15 00:19:49 UTC
please add:


cat /sys/class/dmi/id/board_name

ROG STRIX X570-F GAMING


can confirm latest debug.diff is working, many thanks:


Now follows a summary of the probes I have just done.
Just press ENTER to continue: 

Driver `nct6775':
  * ISA bus, address 0x290
    Chip `Nuvoton NCT6798D Super IO Sensors' (confidence: 9)

Driver `k10temp' (autoloaded):
  * Chip `AMD Family 17h thermal sensors' (confidence: 9)

Do you want to overwrite /etc/sysconfig/lm_sensors? (YES/no): 
Unloading i2c-dev... OK

-

Sep 15 01:55:35 desk kernel: nct6775: Using Asus WMI to access 0xc1 chip.
Sep 15 01:55:35 desk kernel: nct6775: Found NCT6798D or compatible chip at 0x2e:0x290

Sep 15 02:02:41 desk systemd[1]: Starting Hardware Monitoring Sensors...
Sep 15 02:02:41 desk kernel: nct6775: Using Asus WMI to access 0xc1 chip.
Sep 15 02:02:41 desk kernel: nct6775: Found NCT6798D or compatible chip at 0x2e:0x290
Sep 15 02:02:41 desk systemd[1]: Finished Hardware Monitoring Sensors.

-

[root@desk testM]# sensors
amdgpu-pci-0700
Adapter: PCI adapter
vddgfx:      950.00 mV 
fan1:         835 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +39.0°C  (crit = +91.0°C, hyst = -273.1°C)
power1:       44.15 W  (cap = 277.00 W)

nct6798-isa-0290
Adapter: ISA adapter
in0:                      888.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                      992.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                        3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                        3.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                        1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      960.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                      256.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                        3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                        3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                      904.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                     480.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                     496.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                       1.03 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     344.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     256.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                      678 RPM  (min =    0 RPM)
fan2:                      575 RPM  (min =    0 RPM)
fan3:                     1050 RPM  (min =    0 RPM)
fan4:                      738 RPM  (min =    0 RPM)
fan5:                        0 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +28.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +33.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +86.0°C    sensor = thermistor
AUXTIN1:                   +28.0°C    sensor = thermistor
AUXTIN2:                   +26.0°C    sensor = thermistor
AUXTIN3:                   +91.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +33.5°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
PCH_CPU_TEMP:               +0.0°C  
intrusion0:               ALARM
intrusion1:               ALARM
beep_enable:              disabled

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +32.6°C  
Tdie:         +32.6°C  
Tccd1:        +39.5°C
Comment 113 Jonathan 2021-09-15 10:02:53 UTC
(In reply to Denis Pauk from comment #110)
> Created attachment 298805 [details]
> POC: Add support for access via Asus WMI to nct6775 with debug
> 
> Add more debug information about what is wrong with match vendor/board names.
> 
> (In reply to Jonathan from comment #109)
> > Created attachment 298799 [details]
> > dmesg for boot with Denis's patch applied
> 
> Could you add your board to list and recheck?

This patch works, after adding "TUF GAMING X570-PLUS (WI-FI)".
(At first it didn't, but then I noticed I did forget the closing bracket on "(WI-FI")
Comment 114 Denis Pauk 2021-09-18 08:55:17 UTC
The patch series has been applied by Guenter Roeck: https://lkml.org/lkml/2021/9/17/1079

Thank you everyone!
Comment 115 Andy Shevchenko 2021-09-18 09:47:29 UTC
To be more precise it's here: https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git/commit/?id=cd0b8e410937

Thank you, Denis!
Comment 116 Pär Ekholm 2021-09-18 15:58:21 UTC
Thank you very much for your work, Denis!
Comment 117 matt-testalltheway 2021-09-19 05:50:30 UTC
(In reply to Denis Pauk from comment #114)
> Patch series are applied by Guenter Roeck.
> https://lkml.org/lkml/2021/9/17/1079
> 
> Thank you everyone!

guess i was a bit late to the party and "ROG STRIX X570-F GAMING" did not get added (as per comment 112).. but hey thanks for this patch :)
Comment 118 Kamil Dudka 2021-09-19 07:31:05 UTC
Thanks!  I confirm the patch works for me with ASUS PRIME B360-PLUS and linux-5.14.5 after adding the board on the white-list:

--- a/drivers/hwmon/nct6775.c
+++ b/drivers/hwmon/nct6775.c
@@ -4986,6 +4986,7 @@ static int __init nct6775_find(int sioaddr, struct nct6775_sio_data *sio_data)
 static struct platform_device *pdev[2];

 static const char * const asus_wmi_boards[] = {
+   "PRIME B360-PLUS",
    "PRIME B460-PLUS",
    "ROG CROSSHAIR VIII DARK HERO",
    "ROG CROSSHAIR VIII HERO",


# sensors nct6796-isa-0290
nct6796-isa-0290
Adapter: ISA adapter
Vcore:                    376.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                        1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
AVCC:                       3.46 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
+3.3V:                      3.42 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                        1.03 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      144.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                      120.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
3VSB:                       3.46 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
Vbat:                       3.22 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                        1.05 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                     152.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                     128.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                     136.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     120.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     136.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     1073 RPM  (min =    0 RPM)
fan2:                     1214 RPM  (min =    0 RPM)
fan3:                        0 RPM  (min =    0 RPM)
fan4:                        0 RPM  (min =    0 RPM)
fan5:                        0 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
fan7:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +28.0°C  (high = +98.0°C, hyst = +95.0°C)  sensor = thermistor
CPUTIN:                    +33.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                  +111.0°C    sensor = thermistor
AUXTIN1:                  +118.0°C    sensor = thermistor
AUXTIN2:                  +117.0°C    sensor = thermistor
AUXTIN3:                  +118.0°C    sensor = thermistor
PECI Agent 0:              +37.0°C  (high = +98.0°C, hyst = +95.0°C)
                                    (crit = +100.0°C)
PECI Agent 0 Calibration:  +32.0°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
intrusion0:               ALARM
intrusion1:               ALARM
beep_enable:              disabled
Comment 119 nutodafozo 2021-09-19 11:33:09 UTC
Does this patch change how the kernel reads the sensor values (e.g. by switching to that buggy ASUS WMI interface), or does it merely do, specifically for the nct67 module, what "acpi_enforce_resources=lax" does?
Comment 120 Artem S. Tashkinov 2021-09-19 11:52:18 UTC
(In reply to nutodafozo from comment #119)
> does this patch change how kernel reads the sensor values (e.g. to that
> buggy asus wmi interface) or

The underlying module/driver which reads sensors data is the same and this patch doesn't touch it.

> it merely sets specifically for nct67 module what
> "acpi_enforce_resources=lax" does?

Exactly.
Comment 121 Robert Swiecki 2021-09-19 13:32:49 UTC
FYI, tested also with "ROG CROSSHAIR VIII FORMULA", works well.

[72758.077595] nct6775: Using Asus WMI to access chip
[72758.077637] nct6775: Found NCT6798D or compatible chip at 0x2e:0x290

nct6798-isa-0290
Adapter: ISA adapter
in0:                      936.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                        1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                        3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                        3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                        1.74 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      592.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                        1.08 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                        3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                        3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                      912.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                      80.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                      96.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                       1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                       1.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                     904.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                      422 RPM  (min =    0 RPM)
fan2:                      991 RPM  (min =    0 RPM)
fan3:                        0 RPM  (min =    0 RPM)
fan4:                        0 RPM  (min =    0 RPM)
fan5:                      626 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
fan7:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +35.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +38.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +22.0°C    sensor = thermistor
AUXTIN1:                  +104.0°C    sensor = thermistor
AUXTIN2:                   +98.0°C    sensor = thermistor
AUXTIN3:                   +31.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +39.0°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
PCH_CPU_TEMP:               +0.0°C  
intrusion0:               ALARM
intrusion1:               ALARM
beep_enable:              disabled
Comment 122 Kamil Pietrzak 2021-09-19 14:38:47 UTC
I confirm the patch works on my "TUF GAMING Z490-PLUS (WI-FI)".

[1295150.017048] nct6775: Found NCT6798D or compatible chip at 0x0:0x290

nct6798-isa-0000
Adapter: ISA adapter
in0:                      835.00 mV (min =  +0.00 V, max =  +1.94 V)
in1:                        5.00 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                        3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                        3.33 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                       12.00 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                      784.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                      1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                        3.38 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                        3.12 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                        1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                       1.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                     960.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                       1.04 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                     1000.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                       1.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                      369 RPM  (min =    0 RPM)
fan2:                      389 RPM  (min =    0 RPM)
fan3:                      398 RPM  (min =    0 RPM)
fan4:                        0 RPM  (min =    0 RPM)
fan5:                        0 RPM  (min =    0 RPM)
fan6:                        0 RPM  (min =    0 RPM)
fan7:                        0 RPM  (min =    0 RPM)
SYSTIN:                    +30.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
CPUTIN:                    +33.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                   +26.0°C    sensor = thermistor
AUXTIN1:                    +7.0°C    sensor = thermistor
AUXTIN2:                   +28.0°C    sensor = thermistor
AUXTIN3:                   +25.0°C    sensor = thermistor
PECI Agent 0 Calibration:  +32.5°C  
PCH_CHIP_CPU_MAX_TEMP:      +0.0°C  
PCH_CHIP_TEMP:              +0.0°C  
PCH_CPU_TEMP:               +0.0°C  
intrusion0:               OK
intrusion1:               ALARM
beep_enable:              disabled


I also noticed that some voltage values reported by nct6775 differ from the ones reported by Asus software on Windows.

I changed voltage scaling factors to those listed below and now voltages are reported like on Asus software on Windows.

/*
 * Some of the voltage inputs have internal scaling, the tables below
 * contain 8 (the ADC LSB in mV) * scaling factor * 100
 */
static const u16 scale_in[15] = {
	888, 4000, 1600, 1600, 9600, 800, 800, 1600, 1600, 1600, 1600, 1600, 800,
	800, 800
};
Comment 123 Denis Pauk 2021-09-19 22:04:14 UTC
(In reply to Kamil Pietrzak from comment #122)
> I confirm the patch works on my "TUF GAMING Z490-PLUS (WI-FI)".
> 
> [1295150.017048] nct6775: Found NCT6798D or compatible chip at 0x0:0x290
....
> 
> I also noticed that some voltage values reported by nct6775 differ from the
> ones reported by Asus software on Windows.
> 
> I changed voltage scaling factors to those listed below and now voltages are
> reported like on Asus software on Windows.
> 
> /*
>  * Some of the voltage inputs have internal scaling, the tables below
>  * contain 8 (the ADC LSB in mV) * scaling factor * 100
>  */
> static const u16 scale_in[15] = {
>       888, 4000, 1600, 1600, 9600, 800, 800, 1600, 1600, 1600, 1600, 1600,
> 800,
>       800, 800
> };

It looks like the code needs to be updated with custom scale values selected by board name. That can be done in future patches.

We also need to look at which functionality is nondestructive and can be merged in from:
* https://gitlab.com/CalcProgrammer1/OpenRGB/-/blob/master/OpenRGB.patch
* https://github.com/zeule/asus-wmi-ec-sensors/blob/master/asus-wmi-ec-sensors.c
* https://github.com/electrified/asus-wmi-sensors/
to cover as many boards as possible.

OpenRGB looks like a good candidate for merging: as far as I can see it uses the i2c bus instead of Asus WMI, and we already have the groundwork for custom logic. It should be possible if we have a list of boards where ASUS implements such access.
Comment 124 mirh 2021-09-19 22:52:35 UTC
(In reply to Eugene Shalygin from comment #111)
> Most of these boards, as you probably know already, seem to not provide
> readings for all the available sensors via the Nuvoton chip. For example, 
> [...] we found that the Nuvoton 6798D chip provides sensors readings for
> configured in the BIOS QFan sources in its registers [2]. Maybe those are
> worth displaying with the nct6775 driver? They can include sensors that
> are otherwise are available from the embedded controller only.

I can't really vouch for high end desktop motherboards, but at least as far as laptops are concerned this has been the case since forever about everywhere (ranging from "somewhat nitpicky" lacks to "kinda important" ones)
https://github.com/daringer/asus-fan/issues/13
https://github.com/daringer/asus-fan/issues/44#issuecomment-487380638

I don't know how dangerous accessing EC could be (be it directly, or through possible ACPI methods.. in some cases datasheets may even be available), but something else that isn't just vanilla WMI is needed. 

https://sourceforge.net/p/acpi4asus/mailman/message/7375427/
Btw following the breadcrumb trail of the asus linux drivers history.. it seems like different/older machines may have used 'ECRW' in its place (or if not any I found that to still be present on my 2016 X756UX).
Comment 125 Denis Pauk 2021-09-20 12:37:12 UTC
Created attachment 298887 [details]
Add support for access via Asus WMI to nct6775 (2021.09.20)

Updated patch with support:
---
+       "PRIME B360-PLUS",
+       "PRIME B460-PLUS",
+       "ROG CROSSHAIR VIII DARK HERO",
+       "ROG CROSSHAIR VIII FORMULA",
+       "ROG CROSSHAIR VIII HERO",
+       "ROG CROSSHAIR VIII IMPACT",
+       "ROG STRIX B550-E GAMING",
+       "ROG STRIX B550-F GAMING",
+       "ROG STRIX B550-F GAMING (WI-FI)",
+       "ROG STRIX X570-F GAMING",
+       "ROG STRIX Z390-E GAMING",
+       "ROG STRIX Z490-I GAMING",
+       "TUF GAMING B550M-PLUS",
+       "TUF GAMING B550M-PLUS (WI-FI)",
+       "TUF GAMING B550-PLUS",
+       "TUF GAMING X570-PLUS",
+       "TUF GAMING X570-PLUS (WI-FI)",
+       "TUF GAMING X570-PRO (WI-FI)",
+       "TUF GAMING Z490-PLUS (WI-FI)",
---
Comment 126 Igor 2021-09-20 13:33:57 UTC
(In reply to Denis Pauk from comment #125)
> Created attachment 298887 [details]
> Add support for access via Asus WMI to nct6775 (2021.09.20)
> 
> Updated patch with support:

Could you please add my motherboard as well?

cat /sys/class/dmi/id/board_name
ROG STRIX B550-I GAMING

I mention it in the comment 104 above.
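
For illustration, a board list like the one in comment 125 could be checked by exact string comparison against the DMI board name. The sketch below is a userspace approximation and not the patch itself: the array name `wmi_boards` and the helper `find_board` are hypothetical, and only a few entries from the 2021.09.20 list are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical excerpt of the 2021.09.20 board list from comment 125. */
static const char *const wmi_boards[] = {
	"PRIME B360-PLUS",
	"PRIME B460-PLUS",
	"ROG STRIX B550-E GAMING",
	"ROG STRIX B550-F GAMING",
	"TUF GAMING Z490-PLUS (WI-FI)",
};

/*
 * Return the index of board_name in wmi_boards, or -1 when there is no
 * exact match. board_name is the string read from
 * /sys/class/dmi/id/board_name with the trailing newline removed.
 */
static int find_board(const char *board_name)
{
	size_t i;

	assert(board_name != NULL);
	for (i = 0; i < sizeof(wmi_boards) / sizeof(wmi_boards[0]); i++) {
		if (strcmp(wmi_boards[i], board_name) == 0)
			return (int)i;
	}
	return -1;
}
```

With exact matching, "ROG STRIX B550-I GAMING" is not found even though "ROG STRIX B550-E GAMING" is, so each board would need its own entry.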
Comment 127 Denis Pauk 2021-09-21 14:45:48 UTC
Created attachment 298905 [details]
Add support for access via Asus WMI to nct6775 (2021.09.21)

Added support by ASUSWMI:
--
+       "PRIME B360-PLUS",
+       "PRIME B460-PLUS",
+       "PRIME X570-PRO",
+       "ROG CROSSHAIR VIII DARK HERO",
+       "ROG CROSSHAIR VIII FORMULA",
+       "ROG CROSSHAIR VIII HERO",
+       "ROG CROSSHAIR VIII IMPACT",
+       "ROG STRIX B550-E GAMING",
+       "ROG STRIX B550-F GAMING",
+       "ROG STRIX B550-I GAMING",
+       "ROG STRIX B550-F GAMING (WI-FI)",
+       "ROG STRIX X570-F GAMING",
+       "ROG STRIX Z390-E GAMING",
+       "ROG STRIX Z490-I GAMING",
+       "TUF GAMING B550M-PLUS",
+       "TUF GAMING B550M-PLUS (WI-FI)",
+       "TUF GAMING B550-PLUS",
+       "TUF GAMING X570-PLUS",
+       "TUF GAMING X570-PLUS (WI-FI)",
+       "TUF GAMING X570-PRO (WI-FI)",
+       "TUF GAMING Z490-PLUS (WI-FI)",
--

I have added i2c adapter code from OpenRGB code:
--
+       "PRIME B450M-GAMING",
+       "PRIME X370-PRO",
+       "PRIME X399-A",
+       "PRIME X470-PRO",
+       "PRIME Z270-A",
+       "PRIME Z370-A",
+       "ROG CROSSHAIR VI HERO",
+       "ROG STRIX B350-F GAMING",
+       "ROG STRIX B450-F GAMING",
+       "ROG STRIX X399-E GAMING",
+       "ROG STRIX Z270-E",
+       "ROG STRIX Z370-E",
+       "ROG STRIX Z490-E GAMING",
+       "TUF B450 PLUS GAMING",
--

Could anyone with one of these boards check that it still works with OpenRGB? It uses a method that is incompatible with ASUSWMI. If it works, I will try to port it to use the AsusWMI code.
Comment 128 Denis Pauk 2021-09-25 13:33:37 UTC
Created attachment 298971 [details]
Add support for access via Asus WMI (2021.09.25)

Support by nct6775:ASUSWMI:
---
+	"PRIME B360-PLUS",
+	"PRIME B460-PLUS",
+	"PRIME X570-PRO",
+	"ROG CROSSHAIR VIII DARK HERO",
+	"ROG CROSSHAIR VIII FORMULA",
+	"ROG CROSSHAIR VIII HERO",
+	"ROG CROSSHAIR VIII IMPACT",
+	"ROG STRIX B550-E GAMING",
+	"ROG STRIX B550-F GAMING",
+	"ROG STRIX B550-F GAMING (WI-FI)",
+	"ROG STRIX B550-I GAMING",
+	"ROG STRIX X570-F GAMING",
+	"ROG STRIX Z390-E GAMING",
+	"ROG STRIX Z490-I GAMING",
+	"TUF GAMING B550M-PLUS",
+	"TUF GAMING B550M-PLUS (WI-FI)",
+	"TUF GAMING B550-PLUS",
+	"TUF GAMING B550-PRO",
+	"TUF GAMING X570-PLUS",
+	"TUF GAMING X570-PLUS (WI-FI)",
+	"TUF GAMING X570-PRO (WI-FI)",
+	"TUF GAMING Z490-PLUS (WI-FI)",
---

Support nct6775:i2c (OpenRGB code):
--
+       "PRIME B450M-GAMING",
+       "PRIME X370-PRO",
+       "PRIME X399-A",
+       "PRIME X470-PRO",
+       "PRIME Z270-A",
+       "PRIME Z370-A",
+       "ROG CROSSHAIR VI HERO",
+       "ROG STRIX B350-F GAMING",
+       "ROG STRIX B450-F GAMING",
+       "ROG STRIX X399-E GAMING",
+       "ROG STRIX Z270-E",
+       "ROG STRIX Z370-E",
+       "ROG STRIX Z490-E GAMING",
+       "TUF B450 PLUS GAMING",
--

Support ASUS WSI asus_wmi_sensors:native (https://github.com/electrified/asus-wmi-sensors):
---
+	"ROG CROSSHAIR VII HERO (WI-FI)",
+	"ROG CROSSHAIR VII HERO",
+	"ROG CROSSHAIR VI HERO (WI-FI AC)",
+	"CROSSHAIR VI HERO",
+	"ROG CROSSHAIR VI EXTREME",
+	"ROG ZENITH EXTREME",
+	"ROG ZENITH EXTREME ALPHA",
+	"PRIME X399-A",
+	"PRIME X470-PRO",
+	"ROG STRIX X399-E GAMING",
+	"ROG STRIX B450-E GAMING",
+	"ROG STRIX B450-F GAMING",
+	"ROG STRIX B450-I GAMING",
+	"ROG STRIX X470-I GAMING",
+	"ROG STRIX X470-F GAMING",
----

Support ASUS WSI asus_wmi_sensors:ec (https://github.com/zeule/asus-wmi-ec-sensors/blob/master):
---
+	[BOARD_R_C8H] = "ROG CROSSHAIR VIII HERO",
+	[BOARD_R_C8DH] = "ROG CROSSHAIR VIII DARK HERO",
+	[BOARD_R_C8F] = "ROG CROSSHAIR VIII FORMULA",
+	[BOARD_RS_X570_E_G] = "ROG STRIX X570-E GAMING",
+	[BOARD_RS_B550_E_G] = "ROG STRIX B550-E GAMING",
---

(In reply to Kamil Pietrzak from comment #122)
> static const u16 scale_in[15] = {
>       888, 4000, 1600, 1600, 9600, 800, 800, 1600, 1600, 1600, 1600, 1600,
> 800,
>       800, 800
> };

@Kamil Pietrzak Could you please check that the scale is applied to your board correctly?

(In reply to Eugene Shalygin from comment #111)
> Thus I created a little HWMON driver [1] to read them using WMI method
> 'BREC'.

@Eugene Shalygin Could you please check that the combined version still works?
Comment 129 Eugene Shalygin 2021-09-25 14:47:30 UTC
(In reply to Denis Pauk from comment #128)

> @Eugene Shalygin Could you please check that the combined version still
> works?

Thank you for your efforts to mainline these drivers! I have a couple of changes and questions to the EC part. Is a review going on somewhere where I can participate? Otherwise here are the main points:

1. I'm pretty sure the B550-E GAMING board has no EC sensors. Other B550 boards I've seen DSDT from provide a dummy BREC() function.
2. The "Water" fan sensor should have been named "Water_pump" or similar.
3. There is probably an AIO fan sensor at (2, 0x00, 0xB8) EC, but I did not yet find time to check. Maybe someone has this header connected and can do a test for us?

I'll try to test with hardware later today. Thank you for your work, Denis!
Comment 130 Kamil Pietrzak 2021-09-25 15:37:12 UTC
(In reply to Denis Pauk from comment #128)

> @Kamil Pietrzak Could you please check that the scale is applied to your board
> correctly?

I confirm voltages defined in "static const u16 scale_in_z490[15]" are applied correctly to my motherboard "TUF GAMING Z490-PLUS (WI-FI)".

Motherboard "TUF GAMING Z490-PLUS (WI-FI)" uses the Nuvoton NCT6798D Super I/O, so probably all motherboards that use the same Nuvoton chip may benefit from those new voltage scaling factors.
Maybe the variable "static const u16 scale_in_z490" could have a more generic name related to NCT6798D.
Here I have to admit that I figured out those voltage scaling factors by trial and error (matching the voltages to those shown in the Asus software on Windows), because I could not find NCT6798D documentation on the Nuvoton website.

Also, I think it is probably safe to add motherboard "TUF GAMING Z490-PLUS" to the supported boards, because as far as I know the only difference between "TUF GAMING Z490-PLUS" and "TUF GAMING Z490-PLUS (WI-FI)" is the Intel Wi-Fi 6 AX201 chip on the latter.
Comment 131 Denis Pauk 2021-09-25 18:51:58 UTC
(In reply to Eugene Shalygin from comment #129)
> Thank you for your efforts to mainline these drivers! I have a couple of
> changes and questions to the EC part. Is a review going on somewhere where I
> can participate? Otherwise here are the main points:
> 
I have not sent it for review yet. I would prefer to have at least one motherboard from each group checked before sending it for review, especially for the i2c adapter.

(In reply to Eugene Shalygin from comment #129)
> 1. I'm pretty sure the B550-E GAMING board has no EC sensors. Other B550
> boards I've seen DSDT from provide a dummy BREC() function.

For me it returned reasonable values on "ROG STRIX B550-E GAMING":
----
asuswmiecsensors-isa-0000
Adapter: ISA adapter
Chipset:      +32.0°C  
CPU:          +22.0°C  
Motherboard:  +22.0°C  
T_Sensor:    +216.0°C  
VRM:          +28.0°C  

k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +25.1°C  
Tdie:         +25.1°C  
Tccd1:        +22.5°C  
Tccd2:        +24.5°C 
----

Maybe it has other valuable sensors; for now I have used some lucky values that look reasonable. According to https://rog.asus.com/motherboards/rog-strix/rog-strix-b550-e-gaming-model/spec, the motherboard certainly has T_Sensor and AIO_PUMP.

(In reply to Kamil Pietrzak from comment #130)
> Motherboard "TUF GAMING Z490-PLUS (WI-FI)" is using Nuvoton NCT6798D Super
> I/O, so probably all motherboards that use same Nuvoton chip may benefit
> from those new voltage scaling factors.  

What do you think about using a kernel module parameter to select the custom values until we have some confirmation that other motherboards with the NCT6798D have the same scale factors?
Comment 132 Andy Shevchenko 2021-09-26 04:54:18 UTC
(In reply to Denis Pauk from comment #131)
> (In reply to Eugene Shalygin from comment #129)
> > Thank you for your efforts to mainline these drivers! I have a couple of
> > changes and questions to the EC part. Is a review going on somewhere where
> I
> > can participate? Otherwise here are the main points:
> > 
> I have not sent it to review yet. I prefer to have checked at least one
> motherboard from each group before send for review. Especially i2c adapter. 

Don't forget to split per logical change (to me it sounds like the new code contains 3 to 5 logical pieces, hence that number of patches).

> (In reply to Kamil Pietrzak from comment #130)
> > Motherboard "TUF GAMING Z490-PLUS (WI-FI)" is using Nuvoton NCT6798D Super
> > I/O, so probably all motherboards that use same Nuvoton chip may benefit
> > from those new voltage scaling factors.  
> 
> What do you think about use kernel mode parameter for use custom value until
> we will have some approve that other motherboards with NCT6798D has same
> scale factors?

Once added, a parameter may not be removed (because we don't break user space). This parameter is not that critical, so I am definitely against adding it.

The compromise would be to name it after the tested board (probably with a comment in the commit message and/or the code that the same values may apply to all NCT6798D chips) and, when confirmed, rename it as Kamil proposed.
Comment 133 Denis Pauk 2021-10-05 20:32:24 UTC
Created attachment 299111 [details]
Add support for access via Asus WMI (2021.10.05)

Patch with same list of supported boards, additionally applied changes from review https://lkml.org/lkml/2021/10/2/189.

(In reply to Kamil Pietrzak from comment #130)
> I confirm voltages defined in "static const u16 scale_in_z490[15]" are
> applied correctly to my motherboard "TUF GAMING Z490-PLUS (WI-FI)".
I'm afraid the next patch will be without scaling :-( https://lkml.org/lkml/2021/10/5/707
Comment 134 Eugene Shalygin 2021-10-05 20:47:29 UTC
Denis, 

thank you for pulling the new changes!

Could you explain, please, why did you merge the asus-wmi-sensors and asus-wmi-ec-sensors drivers? As far as I understand, asus-wmi-sensors can fetch data from all available sensors, including those provided by EC, but the WMI methods it relies upon are removed from the new ASUS boards.
Comment 135 Denis Pauk 2021-10-05 21:00:54 UTC
(In reply to Eugene Shalygin from comment #134)
> Denis, 
> 
> thank you for pulling the new changes!
> 
> Could you explain, please, why did you merge the asus-wmi-sensors and
> asus-wmi-ec-sensors drivers? As far as I understand, asus-wmi-sensors can
> fetch data from all available sensors, including those provided by EC, but
> the WMI methods it relies upon are removed from the new ASUS boards.
Both drivers use the same entry point, and the difference as I see it is this: old boards return a list of sensors with names, while new ones always return zero as the sensor count and require a hardcoded list of sensors. The lists of old and new boards also do not intersect.

In my view, when about 30% of the code is similar, it is better to have one driver for both cases. The driver currently has 1126 lines.

(I have not measured the real size of the shared code, so it could be larger or smaller.)

Would you like to be in the MAINTAINERS list?

Also good news, no EC also will go in next round of updated patches. 
https://github.com/electrified/asus-wmi-sensors/issues/78
Comment 136 Eugene Shalygin 2021-10-05 21:26:45 UTC
(In reply to Denis Pauk from comment #135)
> Both drivers have used same entry point and difference as I see that: old
> boards return some list of sensors with names,

There was a special WMI interface that reads all sensors, both from SIO and EC, kind of implementing what you did for the nct driver and what is implemented in asus-wmi-ec-sensors, but done fully in the DSDT code.

> new one always returns zero as count of sensors

That high-level WMI interface simply does not exist in the new boards.

> and requires some hardcoded list of sensors.

I now think the EC sensors are at the same registers for the old and new boards.  

> And list of old and new boards is not intersected. 

Exactly! So half the driver will be dead code anyway. Would it be better to load only one of the small drivers? Also, for the old boards the nct6775 driver will load and asus-wmi-sensors will provide duplicate readings.
> 
> As for me, when we have 30% of similar code better have one driver for both
> cases. Currently driver has 1126 lines.

The only shared code between those two is the HWMON interface functions, which is more or less the same for many HWMON drivers. 

So, would it be simpler to provide 3 drivers: nct6775, asus-wmi-sensors, asus-wmi-ec-sensors?

> Do you like to be in MAINTAINERS list?

Yes, please. I still have work to do on that (not all available sensors have even been discovered yet).
Comment 137 Kamil Pietrzak 2021-10-05 22:02:44 UTC
(In reply to Denis Pauk from comment #133)
> I'm afraid the next patch will be without scaling :-(
> https://lkml.org/lkml/2021/10/5/707

I am not a kernel developer, but I also think per-motherboard voltage scaling is a bad idea in terms of maintenance. For the same reason, hardcoding any board models in the module code is a rather bad idea; personally I would prefer to use, for example, a use_wmi=y module parameter or similar when a resource conflict occurs on module loading.

With regard to the current voltage scaling factors for the nct6798d chip, they are most likely not correct and will probably require future changes. For example, I can't see +12V and +5V in the sensors output when using the current voltage scaling factors.
Comment 138 Kamil Pietrzak 2021-10-06 11:08:07 UTC
(In reply to Kamil Pietrzak from comment #137)
> With regard to current voltage scaling factors for nct6798d chip, they are
> most likely not correct and probably will require future changes. For
> example I can't see +12V and +5V in sensors output when using current
> voltage scaling factors.

Partially responding to my own comment here.

Due to the lack of publicly available documentation for NCT6798D chip I checked docs for similar chips (NCT6791D, NCT6791D).

Looks like voltages like +12V and +5V can be connected to any one of general purpose voltage inputs on the SuperIO chip. So on one motherboard +12V can be connected to pin VIN0, but on another one with same SuperIO chip it can possibly be connected to other general purpose voltage pin like VIN1, VIN2, VIN3 etc. In that case it will not be possible to properly scale these voltages without hardcoding motherboard models in module code, so scaling should take place in userspace apps like lm_sensors. The only voltages that can be safely scaled in module code are Vcore, AVSB, 3VCC, 3VSB, VBAT. Pins to which they are connected should not change between different motherboards. So it looks like current voltage scaling factors are as accurate as it can be without hardcoding motherboards models.

However, I am still curious about Vcore voltage readings on my TUF Z490 board in BIOS and Asus software. According to docs Vcore should be calculated with formula 

Detected Voltage = Reading * 0.008 V

but Asus in BIOS and in their software on Windows calculate it probably with some additional scaling factor, most likely something like

Detected Voltage = Reading * 0.008 V * 1.11

The only reason that comes to my mind for calculating Vcore in that way is that they (Asus) wanted BIOS/software Vcore readings to be more accurate in relation to voltage readings using for example multimeter near the CPU socket.
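
The arithmetic above can be checked against the scale table from comment 122. The sketch below assumes the table entries mean 8 mV * scaling factor * 100, as the table's own comment states, so a raw reading converts to millivolts as reading * scale / 100. The helper name `in_from_reg_mv` and the raw readings used afterwards are made up for illustration.

```c
#include <assert.h>

/* Scale table proposed in comment 122: 8 (ADC LSB in mV) * factor * 100. */
static const unsigned short scale_in_z490[15] = {
	888, 4000, 1600, 1600, 9600, 800, 800, 1600, 1600, 1600, 1600, 1600, 800,
	800, 800
};

/* Convert a raw ADC reading on input nr to millivolts, rounded to nearest. */
static long in_from_reg_mv(unsigned int reading, unsigned int nr)
{
	assert(nr < 15);
	return ((long)reading * scale_in_z490[nr] + 50) / 100;
}
```

For in0 the entry 888 equals 8 * 1.11 * 100, which is the extra 1.11 factor on Vcore. A hypothetical raw reading of 94 on in0 then yields 835 mV, and a hypothetical raw reading of 125 on in4 (entry 9600, factor 12) yields 12000 mV; both readings are back-computed assumptions chosen to match the in0 and in4 values in the sensors output of comment 122.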
Comment 139 Denis Pauk 2021-10-10 10:12:38 UTC
Created attachment 299159 [details]
Add support for access via Asus WMI (2021.10.10)

Rebased over https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git/log/?h=hwmon-next

Sent to LKML(without unchecked i2c logic): https://lkml.org/lkml/2021/10/10/65
Comment 140 Feliks 2021-10-14 13:26:54 UTC
Can someone add my board please? It is an Asus PRIME X570-P; I guess the sensor should be similar to, if not exactly the same as, the one on the Asus PRIME X570-Pro.
I cannot use Linux with my board since the CPU fan is spinning at maximum speed making an extreme amount of noise, due to that module's sensor readings which are wrong.
Comment 141 Eugene Shalygin 2021-10-14 18:41:59 UTC
I would like to ask for assistance in understanding why reading the EC sensors takes so much time (1 second). Could users of boards with sensors published in EC registers (we are currently aware of the following models: Pro WS X570-ACE, ROG Crosshair VIII Hero, ROG Crosshair VIII Dark Hero, ROG Crosshair VIII Formula, ROG STRIX B550-E GAMING, ROG STRIX X570-E GAMING) please measure how long it takes to read all 256 EC registers on their system and report back the time and the board name?

# modprobe ec_sys
# time cat /sys/kernel/debug/ec/ec0/io > /dev/null

Thank you!
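
For anyone who prefers a self-contained measurement, the same test can be sketched in C. This is only an approximation of the shell commands above: the helper name `time_ec_dump` is hypothetical, and it assumes the ec_sys module is loaded, debugfs is mounted, and the program runs as root.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Read up to 256 bytes from path and return the elapsed wall-clock time
 * in seconds. Returns -1.0 if the file cannot be opened or nothing
 * could be read from it.
 */
static double time_ec_dump(const char *path)
{
	unsigned char buf[256];
	struct timespec start, end;
	size_t n;
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "rb");
	if (!f)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	n = fread(buf, 1, sizeof(buf), f);
	clock_gettime(CLOCK_MONOTONIC, &end);
	fclose(f);

	if (n == 0)
		return -1.0;

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling `time_ec_dump("/sys/kernel/debug/ec/ec0/io")` from a small `main` and printing the result gives a figure comparable to the `time cat` output.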
Comment 142 Andy Shevchenko 2021-10-14 19:54:34 UTC
(In reply to Feliks from comment #140)
> Can someone add my board please? It is an Asus PRIME X570-P, I guess the
> sensor should be, if not exactly the same, as the one from an Asus PRIME
> X570-Pro.
> I cannot use Linux with my board since the CPU fan is spinning at maximum
> speed making an extreme amount of noise, due to that module's sensor
> readings which are wrong.

You have to test it yourself before anybody else will add it.

(In reply to Eugene Shalygin from comment #141)
> I would like to ask for an assistance to understand why reading EC sensors
> takes so much time (1 second). Could, please, users of boards with sensors
> published in EC registers (we currently aware of the following models:  Pro
> WS X570-ACE, ROG Crosshair VIII Hero, ROG Crosshair VIII Dark Hero, ROG
> Crosshair VIII Formula, ROG STRIX B550-E GAMING, ROG STRIX X570-E GAMING)
> measure how long does it take to read all 256 EC registers in their system
> and report back the time and the board name?
> 
> # modprobe ec_sys
> # time cat /sys/kernel/debug/ec/ec0/io > /dev/null

It won't mean anything. Each separate register read may take a long time, since the EC is a separate microcontroller that may be interrupted at any time by a higher-priority task (to be sure, you would have to look at the firmware source code). So I would not be surprised if 1 s is not enough in some cases. Not that I'm against shortening the timeouts, but somebody should really know what they are for and why.
Comment 143 Eugene Shalygin 2021-10-14 20:04:55 UTC
(In reply to Andy Shevchenko from comment #142)

> It won't mean anything. The each register read separately may take a long
> time since EC is a separate uController that may be interrupted at any time
> by any higher priority task (to be sure you have to have a look into the
> firmware source code). So, I'll be not surprised if 1s in some cases is not
> enough. Not that I'm against shortening the timeouts, but somebody should
> really know what they are about and why.

I'm looking for a rough estimate. All my other machines need less than 0.3 s for that, and only the ASUS one never completes in less than 2.7 s.
