Bug 218685

Summary: asus-nb-wmi fails to load due to conflict with amd-pmf
Product: Drivers Reporter: al0uette
Component: Platform_x86Assignee: drivers_platform_x86 (drivers_platform_x86)
Status: RESOLVED CODE_FIX    
Severity: normal CC: jwrdegoede, mario.limonciello, shyam-sundar.s-k
Priority: P3    
Hardware: AMD   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg when asus-nb-wmi fail
acpidump
log of amd-pmf

Description al0uette 2024-04-06 02:34:18 UTC
Created attachment 306090 [details]
dmesg when asus-nb-wmi fail

Description:

asus-nb-wmi module fails with 

[    5.410848] asus-nb-wmi asus-nb-wmi: probe with driver asus-nb-wmi failed with error -17

Reason:

Both amd-pmf and asus-nb-wmi register platform-profile. However according to linux/drivers/acpi/platform_profile.c, there could be only one platform_profile, and the second module who register platform_profile will fail to load with -EEXIST.
Comment 1 Artem S. Tashkinov 2024-04-08 14:52:37 UTC
Hans, could you take a look please?
Comment 2 Hans de Goede 2024-04-08 15:18:11 UTC
Mario, this bug is about amd-pmf and asus-nb-wmi registering a platform_profile handler with the second one trying to register it failing with -EEXIST .

Do you have any idea what to do here. Should one of them always get priority over the other ?
Comment 3 Mario Limonciello (AMD) 2024-04-08 18:52:15 UTC
I think we need to look over an acpidump to better understand it.

But this sounds like a bug in asus-nb-wmi that it's not checking something at init that it should.

I would expect only amd-pmf in a properly configured system.
Comment 4 Hans de Goede 2024-04-09 09:59:29 UTC
al0uette can you provide an acpidump please?

To do this first install the acpica-tools package and then run:

sudo acpidump -o acpidump.txt

and then attach the generated acpidump.txt file here in bugzilla.
Comment 5 al0uette 2024-04-09 15:28:38 UTC
Created attachment 306114 [details]
acpidump
Comment 6 al0uette 2024-04-09 15:29:11 UTC
Yes, definitely
Comment 7 al0uette 2024-04-09 15:37:25 UTC
BTW amd-pmf also throw another error on my machine:

amd-pmf AMDI0102:00: acpi_walk_resources failed :5

(which I guess is AE_NOT_FOUND?), is it because there's a bug in my BIOS? Should I open another issue to report it?
Comment 8 Mario Limonciello (AMD) 2024-04-09 15:48:25 UTC
It looks to me that there is no _CRS as part of AMDI0102 in that acpidump.  However there is an "XCRS" method in SSDT8 which I would guess means that BIOS has a static _CRS entry that it can patch in/out at boot time based on pre-boot BIOS settings by changing the _ to an X:

            Method (XCRS, 0, NotSerialized)
            {
                Name (RBUF, ResourceTemplate ()
                {
                    QWordMemory (ResourceConsumer, PosDecode, MinNotFixed, MaxNotFixed, NonCacheable, ReadOnly,
                        0x0000000000000000, // Granularity
                        0x000000FDAAAAAAAA, // Range Minimum
                        0x000000FDBBBBBBBB, // Range Maximum
                        0x0000000000000000, // Translation Offset
                        0x0000000011111111, // Length
                        ,, , AddressRangeMemory, TypeStatic)
                })
                Return (RBUF) /* \_SB_.PMF_.XCRS.RBUF */
            }

I think that we probably should decrease that message to debug, AFAICT this machine doesn't support smart PC solution builder.
Comment 9 Mario Limonciello (AMD) 2024-04-09 15:51:15 UTC
Can you please share a log with dynamic debug turned on for amd-pmf?  You can rmmod/modprobe it if you want rather than a full boot.
Comment 10 al0uette 2024-04-09 15:57:10 UTC
(In reply to Mario Limonciello (AMD) from comment #9)
> Can you please share a log with dynamic debug turned on for amd-pmf?  You
> can rmmod/modprobe it if you want rather than a full boot.

Can you tell me how to do that? And I'll upload it tomorrow cuz I'm going to sleep now
Comment 11 Mario Limonciello (AMD) 2024-04-09 16:30:20 UTC
sudo dmesg --clear
sudo rmmod amd-pmf
sudo modprobe amd-pmf dyndbg='+p'
sudo dmesg > dmesg.txt
Comment 12 al0uette 2024-04-10 01:15:23 UTC
Created attachment 306115 [details]
log of amd-pmf
Comment 13 Mario Limonciello (AMD) 2024-04-10 04:49:21 UTC
OK, from that log and your acpidump it looks to me that it supports these PMF functions:

APMF_FUNC_GET_SYS_PARAMS
APMF_FUNC_SBIOS_REQUESTS
APMF_FUNC_OS_POWER_SLIDER_UPDATE


The one that is specifically causing the conflict the ASUS driver is APMF_FUNC_OS_POWER_SLIDER_UPDATE.  This is what causes the ACPI power profile to get registered:

[   71.242252] amd-pmf AMDI0102:00: SPS enabled and Platform Profiles registered

When the profile is moved it calls APMF() with an arg of 8, which calls PMF8():

                    Case (0x08)
                    {
                        PMF8 (Arg1)
                    }

PMF8 outputs some debug using M460 but eventually calls APX8():

            Method (PMF8, 1, Serialized)
            {
                M460 ("FEA-ASL-\\_SB.PMF.APMF Function 8 call PMF8\n", Zero, Zero, Zero, Zero, Zero, Zero)
                CreateByteField (Arg0, 0x02, M490)
                M460 ("  Slider Event Notification: 0x%x\n", M490, Zero, Zero, Zero, Zero, Zero)
                If (CondRefOf (\_SB.APX8))
                {
                    M460 ("  Call OEM ACPI APX8\n", Zero, Zero, Zero, Zero, Zero, Zero)
                    \_SB.APX8 (M490)
                }
            }

This updates a "PMTP" field:

        Method (APX8, 1, Serialized)
        {
            PMTP = (0xD6080000 | Arg0)
        }

That PTMP field is part of an Operation Region used for Port I/O:

        OperationRegion (TP80, SystemIO, 0x80, 0x04)
        Field (TP80, DWordAcc, NoLock, Preserve)
        {
            PMTP,   32
        }

To me this looks like the system should probably be responding to power slider events from the amd-pmf driver.  Do you find that various CPU coefficients don't get updated even when you've changed the power profile when amd-pmf is bound?

If they do get updated, then I think the bug should be in asus-wmi.c that it shouldn't be loading a platform profile because amd-pmf is taking this role.  Perhaps the DSTS method has an unknown bit to indicate this?
Comment 14 al0uette 2024-04-10 05:03:02 UTC
(In reply to Mario Limonciello (AMD) from comment #13)
> Do you find that various CPU
> coefficients don't get updated even when you've changed the power profile
> when amd-pmf is bound?

If you're talking about energy_performance_profile, then I do have such a problem. You can see another issue opened by me:

https://bugzilla.kernel.org/show_bug.cgi?id=218686
Comment 15 Mario Limonciello (AMD) 2024-04-10 05:19:57 UTC
No; not that.  I'm talking about things like sPPT and fPPT.
Try reading /sys/kernel/debug/amd_pmf/current_power_limits and then changing the platform profile to power saver or performance.
See if any of those limits change.

For reference; this is what I see on a Framework 13 AMD:

(in balanced):
spl:35000 fppt:35000 sppt:33000 sppt_apu_only:41000 stt_min:28000 stt[APU]:0 stt[HS2]: 0

(in power saver):
spl:15000 fppt:15000 sppt:15000 sppt_apu_only:41000 stt_min:15000 stt[APU]:0 stt[HS2]: 0

(in performance):
spl:30000 fppt:30000 sppt:30000 sppt_apu_only:41000 stt_min:30000 stt[APU]:128 stt[HS2]: 0
Comment 16 al0uette 2024-04-10 05:29:10 UTC
[root@Zephyrus:~]# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance
performance

[root@Zephyrus:~]# cat /sys/kernel/debug/amd_pmf/current_power_limits
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0

[root@Zephyrus:~]# echo "powersave" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
powersave

[root@Zephyrus:~]# cat /sys/kernel/debug/amd_pmf/current_power_limits
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0

Seems that they don't change
Comment 17 Mario Limonciello (AMD) 2024-04-10 05:35:19 UTC
OK - can you please do an experiment where you modify the PMF driver to not register a platform profile but let the asus-wmi one do it?  Try reading that same file in the different states when asus-wmi is controlling it.
Comment 18 Mario Limonciello (AMD) 2024-04-10 05:35:56 UTC
Wait; you're not changing the right thing.  You should be changing /sys/firmware/acpi/platform_profile
Comment 19 al0uette 2024-04-10 05:45:29 UTC
Oh sorry for that, here's the new result(amd-pmf)

[low-power]
sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                    
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0

[balanced]
sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                    
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0

[performance]
sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                                                                                                                      
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
Comment 20 al0uette 2024-04-10 06:05:26 UTC
(with asus-nb-wmi)

[quiet](asus-nb-wmi don't have low-power profile)
spl:35000 fppt:65000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
[balanced]
spl:35000 fppt:65000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
[performance]
spl:35000 fppt:80000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
Comment 21 al0uette 2024-04-10 06:09:16 UTC
(In reply to al0uette from comment #20)
> (with asus-nb-wmi)
> 
> [quiet](asus-nb-wmi don't have low-power profile)
> spl:35000 fppt:65000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0
> [balanced]
> spl:35000 fppt:65000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0
> [performance]
> spl:35000 fppt:80000 sppt:44000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0

The above data is taken when I unplugged power cable, here's the value with power cable connected:

[quiet]
spl:35000 fppt:65000 sppt:55000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
[balanced]
spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
[performance]
spl:80000 fppt:80000 sppt:80000 sppt_apu_only:54000 stt_min:0 stt[APU]:0 stt[HS2]: 0
Comment 22 al0uette 2024-04-10 06:10:16 UTC
(In reply to al0uette from comment #19)
> Oh sorry for that, here's the new result(amd-pmf)
> 
> [low-power]
> sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                    
> spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0
> 
> [balanced]
> sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                    
> spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0
> 
> [performance]
> sudo cat /sys/kernel/debug/amd_pmf/current_power_limits                     
> 
> spl:45000 fppt:65000 sppt:65000 sppt_apu_only:54000 stt_min:0 stt[APU]:0
> stt[HS2]: 0

This result is also taken with power cable connected
Comment 23 Mario Limonciello (AMD) 2024-04-10 11:22:06 UTC
It looks like the BIOS has a bug that it's advertisement of that bit even though it doesn't work.
I guess in Windows this is a no-op.

Hans - there's a few ways to handle this I can think of.
1. Quirk against this system and BIOS to ignore that bit in AMD PMF.
2. Allow multiple drivers to register for platform profile.
3. Do nothing, let Asus fix in BIOS.

What do you want to do?
Comment 24 Mario Limonciello (AMD) 2024-04-10 11:24:41 UTC
And before we decide al0uette did you already update to the latest BIOS available by Asus? Issue still exists?
Comment 25 al0uette 2024-04-10 11:29:19 UTC
(In reply to Mario Limonciello (AMD) from comment #24)
> And before we decide al0uette did you already update to the latest BIOS
> available by Asus? Issue still exists?

Definitely, I've updated to the latest BIOS(version 303)
Comment 26 al0uette 2024-04-10 11:30:09 UTC
And the issue still exists, which is why I'm reporting this issue
Comment 27 Hans de Goede 2024-04-10 11:31:01 UTC
(In reply to Mario Limonciello (AMD) from comment #23)
> Hans - there's a few ways to handle this I can think of.
> 1. Quirk against this system and BIOS to ignore that bit in AMD PMF.
> 2. Allow multiple drivers to register for platform profile.
> 3. Do nothing, let Asus fix in BIOS.
> 
> What do you want to do?

I wish we could rely on 3. but I don't think it is very likely Asus is actually going to fix this and al0uette has just confirmed this still happens with the latest available BIOS :|

2. seems complicated and will likely be messy, so that leaves 1. I'm afraid.
Comment 28 al0uette 2024-04-10 11:35:13 UTC
(In reply to Mario Limonciello (AMD) from comment #23)
> It looks like the BIOS has a bug that it's advertisement of that bit even
> though it doesn't work.
> I guess in Windows this is a no-op.
> 
> Hans - there's a few ways to handle this I can think of.
> 1. Quirk against this system and BIOS to ignore that bit in AMD PMF.
> 2. Allow multiple drivers to register for platform profile.
> 3. Do nothing, let Asus fix in BIOS.
> 
> What do you want to do?

I really doubt whether ASUS will fix this BIOS bug. I contacted ASUS consumer support today for the CPPC problem and they seem unwilling to provide Linux support
Comment 29 Mario Limonciello (AMD) 2024-04-10 11:49:20 UTC
I did do something a few years ago to let multiple drivers react to profile changes. This was in advance of amd pmf driver existing but I think the same concepts can potentially apply.

https://lore.kernel.org/linux-acpi/20211026180535.9096-4-mario.limonciello@amd.com/T/

It "could" be dusted off to find the rough edges instead of quirking.
Comment 30 Hans de Goede 2024-04-10 11:52:13 UTC
(In reply to Mario Limonciello (AMD) from comment #29)
> I did do something a few years ago to let multiple drivers react to profile
> changes. This was in advance of amd pmf driver existing but I think the same
> concepts can potentially apply.
> 
> https://lore.kernel.org/linux-acpi/20211026180535.9096-4-mario.
> limonciello@amd.com/T/
> 
> It "could" be dusted off to find the rough edges instead of quirking.

So how would this work in a case like this where one platform-profile provider supports "quiet" and the other "low-power" (and both support "balanced" + "performance"  ?
Comment 31 Mario Limonciello (AMD) 2024-04-10 11:58:02 UTC
I'm actually unsure what the quiet profile means compared to low power. I would think it's better for Asus driver to use low power.

That's what userspace power profiles daemon uses anyway.


But yes if there are different mapping for different drivers that could be problematic the way it was done in that series. 

There could also be a problem of device ordering.

So I guess the way to solve it would be to make all drivers that want it request it enabled with the profiles they support. 

To userspace the superset would be reported.

Individual driver callbacks if they're requested to switch to an unsupported state would be a noop.
Comment 32 Hans de Goede 2024-04-10 12:28:57 UTC
I'm actually unsure what the quiet profile means compared to low power. I would think it's better for Asus driver to use low power.

Note that the actual CPU power-limits between balanced and quiet do not change on this laptop, so I guess quiet chooses a fan curve with less aggressive ramping up of the fans on higher temps. IOW quiet != low-power.

Anyways this is exactly what I meant with how allowing multiple providers will be messy. I really think just adding a DMI quirk for this hopefully single broken model is best.

And if we see this more often maybe we need a way to prioritize one provider over the other, e.g. pass a priority level when registering and if a higher priority provider shows up unregister the previous one. Although with a complex driver like amd-pmf getting an unregister-platform-profile callback for this called from the core might be tricky to implement.
Comment 33 Mario Limonciello (AMD) 2024-04-10 12:44:38 UTC
OK, in that case let's see dmidecode output to pick the best string to use for a quirk.
Comment 34 Mario Limonciello (AMD) 2024-04-10 14:24:13 UTC
I've posted a quirk based on what was in your dmesg attached to this bug.

https://lore.kernel.org/platform-driver-x86/20240410140956.385-1-mario.limonciello@amd.com/T/#me44972a836808229f63a8afa58aee837146f26ad

Please have a try with that. If it doesn't work, please get dmidecode attached and share the dynamic debug output when amd-pmf is loaded.
Comment 35 al0uette 2024-04-10 14:38:35 UTC
(In reply to Mario Limonciello (AMD) from comment #34)
> I've posted a quirk based on what was in your dmesg attached to this bug.
> 
> https://lore.kernel.org/platform-driver-x86/20240410140956.385-1-mario.
> limonciello@amd.com/T/#me44972a836808229f63a8afa58aee837146f26ad
> 
> Please have a try with that. If it doesn't work, please get dmidecode
> attached and share the dynamic debug output when amd-pmf is loaded.

OK, thank you, I'll try it. Should I dmidecode before or after applying this patch?
Comment 36 Mario Limonciello (AMD) 2024-04-10 14:45:59 UTC
Only need dmidecode if it doesn't work. Doesn't matter when you call it.
Comment 37 al0uette 2024-04-10 14:54:21 UTC
Got it. I'm compiling kernel now
Comment 38 al0uette 2024-04-10 15:22:45 UTC
On 6.9.0-rc1 it works perfectly fine now. Thank you Mario and Hans
Comment 39 Mario Limonciello (AMD) 2024-04-12 11:21:40 UTC
Just to be crystal clear you mean 6.9-rc1 + ghY patch series right?
Comment 40 Mario Limonciello (AMD) 2024-04-12 13:04:04 UTC
*my
Comment 41 al0uette 2024-04-12 13:32:28 UTC
(In reply to Mario Limonciello (AMD) from comment #39)
> Just to be crystal clear you mean 6.9-rc1 + ghY patch series right?

To be accurate, it's 6.9.0-rc1 from tiwai's branch(https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git, commit f17dcc83a813d62cc249ade09167acdbb0c3d5e3), plus your patch series and a patch to fix volume control for cirrus amps. The content of that patch is as follows:

From 28668199b2a6f7df30d9192fe23de48859a988e7 Mon Sep 17 00:00:00 2001
From: "Luke D. Jones" <luke@ljones.dev>
Date: Wed, 3 Apr 2024 12:05:16 +1300
Subject: [PATCH] Test patch for GA403U series

Signed-off-by: Luke D. Jones <luke@ljones.dev>
---
 sound/pci/hda/patch_realtek.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index cdcb28aa9d7b..1c08bf26b15e 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -7467,6 +7467,8 @@ enum {
 	ALC285_FIXUP_CS35L56_I2C_2,
 	ALC285_FIXUP_CS35L56_I2C_4,
 	ALC285_FIXUP_ASUS_GA403U,
+	ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC,
+	ALC285_FIXUP_ASUS_GA403U_I2C_SPEAKER2_TO_DAC1,
 };
 
 /* A special fixup for Lenovo C940 and Yoga Duet 7;
@@ -9690,6 +9692,22 @@ static const struct hda_fixup alc269_fixups[] = {
 		.type = HDA_FIXUP_FUNC,
 		.v.func = alc285_fixup_asus_ga403u,
 	},
+	[ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC] = {
+		.type = HDA_FIXUP_PINS,
+		.v.pins = (const struct hda_pintbl[]) {
+			{ 0x19, 0x03a11050 },
+			{ 0x1b, 0x03a11c30 },
+			{ }
+		},
+		.chained = true,
+		.chain_id = ALC285_FIXUP_ASUS_GA403U_I2C_SPEAKER2_TO_DAC1
+	},
+	[ALC285_FIXUP_ASUS_GA403U_I2C_SPEAKER2_TO_DAC1] = {
+		.type = HDA_FIXUP_FUNC,
+		.v.func = alc285_fixup_speaker2_to_dac1,
+		.chained = true,
+		.chain_id = ALC285_FIXUP_ASUS_GA403U,
+	},
 };
 
 static const struct snd_pci_quirk alc269_fixup_tbl[] = {
@@ -10143,7 +10161,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x1043, 0x1a83, "ASUS UM5302LA", ALC294_FIXUP_CS35L41_I2C_2),
 	SND_PCI_QUIRK(0x1043, 0x1a8f, "ASUS UX582ZS", ALC245_FIXUP_CS35L41_SPI_2),
 	SND_PCI_QUIRK(0x1043, 0x1b11, "ASUS UX431DA", ALC294_FIXUP_ASUS_COEF_1B),
-	SND_PCI_QUIRK(0x1043, 0x1b13, "ASUS U41SV/GA403U", ALC285_FIXUP_ASUS_GA403U),
+	SND_PCI_QUIRK(0x1043, 0x1b13, "ASUS U41SV/GA403U", ALC285_FIXUP_ASUS_GA403U_HEADSET_MIC),
 	SND_PCI_QUIRK(0x1043, 0x1b93, "ASUS G614JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
 	SND_PCI_QUIRK(0x1043, 0x1bbd, "ASUS Z550MA", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1043, 0x1c03, "ASUS UM3406HA", ALC287_FIXUP_CS35L41_I2C_2),
-- 
2.44.0

I guess other parts doesn't really matters, since the major differences are in audio part.