Bug 215683

Summary: Display brightness goes to minimum on first AC power plug event
Product: ACPI Reporter: Werner Sembach [TUXEDO] (wse)
Component: Power-VideoAssignee: acpi_power-video
Status: RESOLVED CODE_FIX    
Severity: normal CC: jwrdegoede, mario.limonciello, rui.zhang
Priority: P1    
Hardware: AMD   
OS: Linux   
Kernel Version: 5.17-rc6 (Mainline), 5.15.0-22 (Ubuntu) Subsystem:
Regression: No Bisected commit-id:
Attachments: ACPI Dump in Binary for Clevo NL50RU BIOS 1.07.08TTR3
ACPI dump as binary for TongFang PF5NU1G bios version N.1.07.A05

Description Werner Sembach [TUXEDO] 2022-03-14 17:18:33 UTC
After boot the first time AC power barrel blug is plugied in or out of a Clevo NL50RU or NL50NU laptop, the display brightness goes to minimum. It can Still be adjusted afterwards. It does not change again on subsequent AC power plug events until next reboot.

It does not happen when acpi_backlight boot parameter is explicitly set to any available value.

I noticed that the first call to acpi_video_get_backlight_type() happens by acpi_video_bus_add()-> acpi_video_bus_register_backlight(). Here the default value "acpi_backlight_video" is returned because backlight_device_get_by_type(BACKLIGHT_RAW) returns a NULL.
Later during boot, acpi_video_get_backlight_type() is called again, this time by acpi_video_backlight_notify_work(). But here it returns "acpi_backlight_native", because backlight_device_get_by_type(BACKLIGHT_RAW) now returns a pointer.

When setting the boot parameter explicitly this change in return value does not happen, because then the boot parameter value is returned in both cases.
Comment 1 Werner Sembach [TUXEDO] 2022-03-14 18:32:05 UTC
Forgot to add the devices are AMD APUs only devices with a 4700U and a 5500U respectifly.

Pleas let me know which information can be helpfull for debugging.
Comment 2 Werner Sembach [TUXEDO] 2022-03-15 14:21:29 UTC
Some more information after some digging and printf debugging:

This functions registers some stuff when acpi_backlight=video: https://elixir.bootlin.com/linux/v5.16.14/source/drivers/acpi/acpi_video.c#L1859

When acpi_backlight!=video this function deregisters this stuff again, if registered: https://elixir.bootlin.com/linux/v5.16.14/source/drivers/acpi/acpi_video.c#L2259

When setting acpi_backlight explicitly, it either stays registered or it is never registered so the second function is a noop.

The actuall call to reduce screen brigtnes happens in this https://elixir.bootlin.com/linux/v5.16.14/source/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L818 respectifly this https://elixir.bootlin.com/linux/v5.16.14/source/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c#L448 function.

When on acpi_backlight=video the backlicht seems to be controlled via this atif_handler/SBIOS requests (I don't know what atif or SBIOS stands for). When on acpi_backlight!=video only the ATIF_DGPU_DISPLAY_EVENT is triggered by the SBIOS.

However when the registering and unregistering mentioned above happened, the first call to amdgpu_atif_handler catches some dangling SBIOS change brightness event not properly deleted by the unregister it seems. This sets the brightness to a value of 4, which is the observed bug in the first post.
Comment 3 Werner Sembach [TUXEDO] 2022-03-15 14:22:37 UTC
dmesg output of dangling sbios event with drm.debug=0x2 and backlight.dyndbg=+p :

[ 44.596105] [drm:amdgpu_acpi_event [amdgpu]] event, device_class = video, type = 0x81
[ 44.596351] [drm:amdgpu_acpi_event [amdgpu]] SBIOS pending requests: 0x90
[ 44.596487] [drm:amdgpu_acpi_event [amdgpu]] ATIF: 2 pending SBIOS requests
[ 44.596615] [drm:amdgpu_acpi_event [amdgpu]] Changing brightness to 4
[ 44.596891] backlight: set brightness to 4
Comment 4 Mario Limonciello (AMD) 2022-07-11 23:50:56 UTC
Can this still reproduce on 5.19-rc6?

Something very similar was reported on some other OEM's Ryzen 6000 systems.  It was fixed in their SBIOS.  Can you share an acpidump?  Maybe we can match the events to see where this is happening.

But most likely this also sounds like it's an SBIOS bug, that it's sending an extra event, and it should be reported up to the OEM to be fixed.
Comment 5 Werner Sembach [TUXEDO] 2022-07-12 10:00:10 UTC
Created attachment 301401 [details]
ACPI Dump in Binary for Clevo NL50RU BIOS 1.07.08TTR3

ofc here is the acpi dump.
TTR3 is a version of the bios that supports s3 instead of modern standby. However the behavior is the same between this bios and the vanilla Clevo bios. So the issue is unrelated.

I also found several TongFang AMD devices having the exact same issue. So it's not specific to one OEM, but maybe a bug in the code that is provided by AMD to the OEMs?
Comment 6 Werner Sembach [TUXEDO] 2022-07-12 10:02:03 UTC
*the 5.19-rc6 kernel already has a workaround quirk for NL50RU and NL50NU included.

But i will test a TongFang deice with the same issue and report back.
Comment 7 Werner Sembach [TUXEDO] 2022-07-12 10:20:55 UTC
Created attachment 301402 [details]
ACPI dump as binary for TongFang PF5NU1G bios version N.1.07.A05

Same issue with 5.19-rc6 on the TongFang device
Comment 8 Mario Limonciello (AMD) 2022-07-12 15:57:06 UTC
> So it's not specific to one OEM, but maybe a bug in the code that is provided
> by AMD to the OEMs?

AMD provides code to IBVs, and IBVs provide code to OEMs.  It's more "likely" to be a problem with a particular IBV here.

> *the 5.19-rc6 kernel already has a workaround quirk for NL50RU and NL50NU
> included.

Can you please share more details about this workaround quirk?  I'm not familiar with it.
Comment 9 Werner Sembach [TUXEDO] 2022-07-12 16:35:21 UTC
Here is the patch: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c844d22fe0c0b37dc809adbdde6ceb6462c43acf

It does the same as setting the acpi_backlight=... parameter.

Like I wrote above, it doesn't matter to what, as all 3 interfaces are fully functional. Just switching after boot causes problems, but that is the default behaviour of the current implementation.

Sorry for my noob question: What does IBV stand for?
Comment 10 Mario Limonciello (AMD) 2022-07-12 17:11:58 UTC
Ah thanks for sharing.  So it seems that calling acpi_video_unregister_backlight is what actually "causes" this problem then.

The comments in the code make it sound pretty risky to switch the behavior around.

I wonder if maybe we can get a new heuristic added to that for ACPI_OSI_WIN_10 or newer with higher preference above registering and unregistering acpi video?
Comment 11 Mario Limonciello (AMD) 2022-07-12 17:12:14 UTC
> IBV
Independent BIOS vendor
Comment 12 Mario Limonciello (AMD) 2022-07-12 20:32:03 UTC
So you know - there is a series working on overhauling this registration stuff.  Can you see how things behave with that series?

https://patchwork.freedesktop.org/series/104087/#rev2
Comment 13 Werner Sembach [TUXEDO] 2022-07-13 10:47:40 UTC
(In reply to Mario Limonciello (AMD) from comment #12)
> So you know - there is a series working on overhauling this registration
> stuff.  Can you see how things behave with that series?
> 
> https://patchwork.freedesktop.org/series/104087/#rev2

The link seems to be down atm, but looking at the mailing list I think I know which patchseries you are refering to. Will try it later.

Thanks for the IBV explanation.

For completness and reference: The Clevo Barebones use an Insyde BIOS while TongFang is using an AMI BIOS.
Comment 14 Mario Limonciello (AMD) 2022-07-13 14:47:18 UTC
I see that series is reverting your quirks too.  If it works for you, you should add a link to this bug report and a Tested-by for it.
Comment 15 Hans de Goede 2022-07-13 16:19:35 UTC
(In reply to Mario Limonciello (AMD) from comment #14)
> I see that series is reverting your quirks too.  If it works for you, you
> should add a link to this bug report and a Tested-by for it.

Right, that series removes the ping-pong where we first register /sys/class/backlight/acpi_video0 and then remove it again when e.g. /sys/class/backlight/amdgpu_bl shows up. And the whole point of the quirk was also to remove that ping-pong since that was causing trouble on the quirked models. So now that the ping-pong is generally fixed the quirks should no longer be necessary.

Note I did add wse to the Cc of the patch removing the quirk.
Comment 16 Werner Sembach [TUXEDO] 2022-07-13 16:45:34 UTC
(In reply to Hans de Goede from comment #15)
> (In reply to Mario Limonciello (AMD) from comment #14)
> > I see that series is reverting your quirks too.  If it works for you, you
> > should add a link to this bug report and a Tested-by for it.
> 
> Right, that series removes the ping-pong where we first register
> /sys/class/backlight/acpi_video0 and then remove it again when e.g.
> /sys/class/backlight/amdgpu_bl shows up. And the whole point of the quirk
> was also to remove that ping-pong since that was causing trouble on the
> quirked models. So now that the ping-pong is generally fixed the quirks
> should no longer be necessary.

Can confirm.

I have one device left to test. Then I will formaly answer on the mailing list.

But I guess the patch series is only for upcomming kernel and not for stable?
So distros using a lts kernel still need the quirk?

> 
> Note I did add wse to the Cc of the patch removing the quirk.

Yes the mail arived shortly after Mario mentioned the patchset, that's why I found it so fast when the link wasn't working for some reason.

Just had some issues compiling drm-tip today (undefined variable because of some revert unrelated to the patchset).
Comment 17 Mario Limonciello (AMD) 2022-07-13 16:50:38 UTC
> So distros using a lts kernel still need the quirk?

With so many subsystems touched I think it's unlikely that this can come back to stable.

Hans,

As that patch series is still under review and how many different maintainers need to review and ACK, I would guess this isn't going to make 5.20.

What do you think about adding a quirk for the missing Tong Fang device for 5.20 that can CC stable, and then in your revert patch drop the Tong Fang device too?
Comment 18 Werner Sembach [TUXEDO] 2022-07-13 16:59:24 UTC
(In reply to Mario Limonciello (AMD) from comment #17)
> > So distros using a lts kernel still need the quirk?
> 
> With so many subsystems touched I think it's unlikely that this can come
> back to stable.
> 
> Hans,
> 
> As that patch series is still under review and how many different
> maintainers need to review and ACK, I would guess this isn't going to make
> 5.20.
> 
> What do you think about adding a quirk for the missing Tong Fang device for
> 5.20 that can CC stable, and then in your revert patch drop the Tong Fang
> device too?

For reference: The TongFang quirks are already on the mailing list: https://lore.kernel.org/linux-acpi/20220707180953.605246-1-wse@tuxedocomputers.com/T/#u
Comment 19 Hans de Goede 2022-07-13 17:23:30 UTC
> But I guess the patch series is only for upcomming kernel and not for stable?

Correct.

> So distros using a lts kernel still need the quirk?

Yes.

> What do you think about adding a quirk for the missing Tong Fang device for
> 5.20 that can CC stable, and then in your revert patch drop the Tong Fang
> device too?

That sounds good to me, go for it.

> For reference: The TongFang quirks are already on the mailing list:
> https://lore.kernel.org/linux-acpi/20220707180953.605246-1-wse@tuxedocomputers.com/T/#u

Thanks for the pointer, I'll give these my Reviewed-by.
Comment 20 Werner Sembach [TUXEDO] 2022-10-19 16:05:02 UTC
I found more TF devices with this problem. I sent the patch directly to stable, but that might have been wrong?

How do you post commits that are only required on stable and not on upstream?
Comment 21 Hans de Goede 2022-10-19 19:59:07 UTC
(In reply to wse from comment #20)
> I found more TF devices with this problem. I sent the patch directly to
> stable, but that might have been wrong?
> 
> How do you post commits that are only required on stable and not on upstream?

You need to explain in the commit message that they are only applicable to stable and why this is being fixed in a different way in the stable series.

Since this is just adding DMI quirks I expect the stable series maintainers to be fine with this, but you do need to explain why things are done this way; and this explanation needs to be in the commit message, so that other people will know where the changes come from / why there is no upstream commit id referenced.

If you want you can send an updated version of the patches directly to me first and then I can check the commit message. At a minimum for the next version / submission to stable@... please Cc me then I can give my Reviewed-by to the patches.
Comment 22 Mario Limonciello (AMD) 2022-10-20 02:47:16 UTC
As Hans' series is landed in 6.1-rc1, I think we can close this issue now too.