Bug 195455 - Bisected: Notebook display enabled despite closed lid and external displays
Summary: Bisected: Notebook display enabled despite closed lid and external displays
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Lid (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: Lv Zheng
URL:
Keywords:
: 197361 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-04-18 17:39 UTC by Steffen Weber
Modified: 2017-10-23 16:05 UTC (History)
6 users (show)

See Also:
Kernel Version: 4.10.11
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
acpidump of HP EliteBook 840 G1 (753.42 KB, text/plain)
2017-04-24 07:38 UTC, Steffen Weber
Details
Disassembled DSDT of Thinkpad X200 (510.42 KB, text/x-csrc)
2017-04-25 17:21 UTC, Julian Wiedmann
Details
[PATCH] Reduce kernel side wrong usages of lid status (3.61 KB, patch)
2017-04-26 01:56 UTC, Lv Zheng
Details | Diff
[PATCH] ACPI: button: Obselete acpi_lid_open() invocations (1.69 KB, patch)
2017-04-26 06:39 UTC, Lv Zheng
Details | Diff
[PATCH] ACPI: button: Fix lid notification (2.77 KB, patch)
2017-04-26 06:39 UTC, Lv Zheng
Details | Diff
[PATCH] ACPI: button: Do not relay fake events (2.11 KB, patch)
2017-04-26 06:40 UTC, Lv Zheng
Details | Diff
[PATCH] ACPI: button: Use different strategy for user/kernel spaces (2.39 KB, patch)
2017-04-27 08:51 UTC, Lv Zheng
Details | Diff
PATCH] ACPI: button: Always notify kernel drivers using _LID returning value (1.32 KB, patch)
2017-05-04 08:43 UTC, Lv Zheng
Details | Diff
[PATCH] ACPI: button: Some usage models still require button.lid_init_state=method (2.03 KB, patch)
2017-05-04 08:43 UTC, Lv Zheng
Details | Diff
dmesg output for test-case 0 (39.59 KB, text/plain)
2017-05-04 13:31 UTC, Steffen Weber
Details
dmesg output for test-case 1 (38.57 KB, text/plain)
2017-05-04 13:31 UTC, Steffen Weber
Details
dmesg output for test-case 2 (39.75 KB, text/plain)
2017-05-04 13:31 UTC, Steffen Weber
Details
dmesg output for test-case 3 (39.78 KB, text/plain)
2017-05-04 13:32 UTC, Steffen Weber
Details

Description Steffen Weber 2017-04-18 17:39:45 UTC
I'm using an HP EliteBook 840 G1 attached to a docking station with two external monitors. The notebook lid is closed, I only use the external monitors.

When I boot my notebook with Linux 4.10.9 then everything is working fine: The the two external monitors are automatically used as "Primary" + "Secondary" and the built-in display is shown as "LID Closed" in the Gnome Display settings.

But when I boot my notebook with Linux 4.10.10+ then the built-in display is used, too (in addition to the two external monitors) although its lid is closed.

I can workaround this issue by using "button.lid_init_state=method" and thereby essentially reverting the 4.10.10 commit 8d5dd97f55563634ad830ea47c709bc96606ad65 (upstream commit 77e9a4aa9de10cc1418bf9a892366988802a8025).

Is this intended behaviour?

I'm using Gentoo Linux with Systemd 233, X.Org 1.19.3 and Gnome 3.22.2.
Comment 1 Lv Zheng 2017-04-19 03:09:58 UTC
Can "button.lid_init_state=ignore" work this around?
Comment 2 Lv Zheng 2017-04-19 03:13:09 UTC
IMO, this is not a regression.

In fact, we need to make "button.lid_init_state=ignore" the default value.
But that requires systemd guys to change accordingly.
So it is finally become a compromization to use "button.lid_init_state=open" as the default value in order to make other bug reports muted.

You cannot say "hey, there is a bug in kernel" while actually the bug is in the userspace. It's endless. ;)

Thanks
Lv
Comment 3 Steffen Weber 2017-04-19 11:08:03 UTC
Thank you for the feedback!

Using "button.lid_init_state=ignore" does not help. But I'm fine with using the workaround until systemd is updated. Just wanted to ensure that this behaviour is intentional (for the moment).

Is it correct that the systemd developers are aware of this issue? Can you point me to a systemd commit or bug report?
Comment 4 Lv Zheng 2017-04-24 07:28:01 UTC
> Using "button.lid_init_state=ignore" does not help.

That's weird. Could you upload acpidump output for further investigation?

Anyway we need to document any possible application problems for different use cases and workarounds.

Thanks
Lv
Comment 5 Steffen Weber 2017-04-24 07:38:37 UTC
Created attachment 255975 [details]
acpidump of HP EliteBook 840 G1

Please see the attached acpidump output.

(And btw, I've tried "button.lid_init_state=ignore" again, unfortunately it really does not work for me.)
Comment 6 Lv Zheng 2017-04-25 08:12:36 UTC
        Device (LID)
        {
            Name (_HID, EisaId ("PNP0C0D"))  // _HID: Hardware ID
            Method (_LID, 0, NotSerialized)  // _LID: Lid Status
            {
                Store (\_SB.PCI0.LPCB.EC0.CLID, Local0)
                Return (Local0)
            }
        }

It's notified in _Q0A:

                    Method (_Q0A, 0, NotSerialized)  // _Qxx: EC Query
                    {
                        Store (\_GPE.VLET (), Local4)
                        Notify (\_SB.LID, 0x80)
                        \_SB.PCI0.ACEL.AJAL ()
                        \_SB.WMID.WGWE (0x08, Add (\_SB.LID._LID (), 0x01))
                    }

I guess _Q0A is not triggered after being resumed.
So the only lid state notification the userspace can receive after being resumed is an open event, which breaks this usage model.

Since this cannot be worked around by button.lid_init_state=ignore, couldn't we just think this is a wrong usage model? Or a usage model shouldn't be implemented by the ACPI control method lid device?

Thanks and best regards
Lv
Comment 7 Lv Zheng 2017-04-25 08:54:34 UTC
That behavior should be similar as button.lid_init_state=ignore.
So I still don't know why it cannot work.

Could you:
1. boot the kernel with dyndbg=\"file ec.c +p\" button.lid_init_state=ignore and try this case and post post-resume dmesg output here.
2. boot the kernel with dyndbg=\"file ec.c +p\" button.lid_init_state=open and try this case and post post-resume dmesg output here.

Let's see if _Q0A is triggered in the 2 tests.

Thanks
Lv
Comment 8 Julian Wiedmann 2017-04-25 17:21:43 UTC
Created attachment 256021 [details]
Disassembled DSDT of Thinkpad X200

Same issue, =ignore doesn't help either.
Comment 9 Lv Zheng 2017-04-26 01:56:33 UTC
Created attachment 256045 [details]
[PATCH] Reduce kernel side wrong usages of lid status
Comment 10 Lv Zheng 2017-04-26 01:58:18 UTC
In theory, graphics driver should be able to bring monitors back without knowing the ACPI lid status.
So this sounds more like a usage model problem - graphics drivers or userspace tools have chosen to trust some unreliable information to make the decision.

Is that possible to file a bug on freedesktop.org to fix the user side rather than reverting the lid driver back to the wrong behavior to trigger other problems?

Here I have a commit that may reduce possibilities to trigger problems from the kernel side. You can apply it and try again.
Comment 11 Lv Zheng 2017-04-26 06:36:07 UTC
I guess the reporters are all using i915.
It looks nouveau developers have already been aware of this problem.
In nouveau drivers, external monitor users should use nouveau_ignorelid module parameter.
http://www.linuxquestions.org/questions/linux-newbie-8/blank-screen-on-lid-close-nomatter-what-i-do-4175445108/

So probably we should just raise this to freedesktop.org to let i915 to change.

Thanks
Lv
Comment 12 Lv Zheng 2017-04-26 06:39:01 UTC
Created attachment 256047 [details]
[PATCH] ACPI: button: Obselete acpi_lid_open() invocations

I split the attachment 256045 [details] into 3 commits.
So that we can judge which one is useful.
So please use this and the follow up 2 commmits instead.
Comment 13 Lv Zheng 2017-04-26 06:39:45 UTC
Created attachment 256049 [details]
[PATCH] ACPI: button: Fix lid notification

This patch contains improvements.
Comment 14 Lv Zheng 2017-04-26 06:40:25 UTC
Created attachment 256051 [details]
[PATCH] ACPI: button: Do not relay fake events
Comment 15 Lv Zheng 2017-04-26 07:04:36 UTC
IMO, the best choice for both nouveau and i915 lvds drivers is:
1. apply the patches in this bug;
2. register lid notifier, and if there is no lid notification recieved after resuming, the drivers should think there is a broken (possibly, but maybe it's just no such usage model) acpi lid and automatically fall into a behavior that is similar as ignorelid;
3. remove nouveau_ignorelid module parameter as it can be automatically detected via lid notifier.
Comment 16 Steffen Weber 2017-04-26 07:53:27 UTC
First of all, thank you for looking into this issue!

I'm a little confused about Comment 7 (and the email you sent me via CC). Both mention suspend + resume. For me, this issue has nothing to do with suspend + resume. I'm experiencing this issue on a regular boot or reboot. (Maybe it happens with suspend + resume, too, I have not tried that.)

Furthermore, the issue is _not_ that "the external monitor couldn't be lit on" (quote from email). The external displays are working fine in both cases (Linux 4.10.10- and Linux 4.10.11+). The issue is that on Linux 4.10.11+ the _internal_ display is lit on even when its lid is closed (unless I use the "button.lid_init_state=method" workaround).

So to clarify, the situation is as follows: My notebook is docked with its lid _closed_ and two external monitors are attached to dock. When I now switch the notebook on its _internal_ display is lit on. This does not make sense because the lid is closed. This issue exists since Linux 4.10.11.

Anyways, for testing the patches, I've removed "button.lid_init_state=method" from my grub config. I've then applied patches 1+2 to Linux 4.10.12. Unfortunately, that does not make any difference: the internal display is lit on despite its lid being closed (i.e. no difference from vanilla 4.10.12).

When I try to compile Linux 4.10.12 with patches 1+2+3 I get an error:

  CC      drivers/acpi/button.o
drivers/acpi/button.c: In function ‘acpi_lid_initialize_state’:
drivers/acpi/button.c:378:9: error: too few arguments to function ‘acpi_lid_update_state’
   (void)acpi_lid_update_state(device);
         ^
drivers/acpi/button.c:358:12: note: declared here
 static int acpi_lid_update_state(struct acpi_device *device,
            ^
make[2]: *** [scripts/Makefile.build:295: drivers/acpi/button.o] Error 1
make[1]: *** [scripts/Makefile.build:553: drivers/acpi] Error 2
make: *** [Makefile:988: drivers] Error 2
Comment 17 Lv Zheng 2017-04-26 08:43:53 UTC
> Furthermore, the issue is _not_ that "the external monitor couldn't be lit
> on" (quote from email). The external displays are working fine in both cases
> (Linux 4.10.10- and Linux 4.10.11+). The issue is that on Linux 4.10.11+ the
> _internal_ display is lit on even when its lid is closed (unless I use the
> "button.lid_init_state=method" workaround).

OK, I see. The other case is here: Bug 187271
But root cause should be same.

> Anyways, for testing the patches, I've removed "button.lid_init_state=method"
> from my grub config. I've then applied patches 1+2 to Linux 4.10.12.
> Unfortunately, that does not make any difference: the internal display is lit
> on despite its lid being closed (i.e. no difference from vanilla 4.10.12).

I see, as you said button.lid_init_state=ignore cannot work this around.
Which is similar as the change - do not send notifications.
So applying these patches won't fix your issues.
But if graphics drivers are going to change, IMO, they should based on these changes, - without the additional "open" event.

> When I try to compile Linux 4.10.12 with patches 1+2+3 I get an error:
OK, I'll check and update the patches.

Thanks
Lv
Comment 18 Lv Zheng 2017-04-26 08:49:51 UTC
Hi, Steffen:

I couldn't see problems of in the patches.
This block surely contains the correct change for the line you reported:
@@ -368,7 +371,7 @@ static void acpi_lid_initialize_state(st
 {
 	switch (lid_init_state) {
 	case ACPI_BUTTON_LID_INIT_OPEN:
-		(void)acpi_lid_notify_state(device, 1);
+		(void)acpi_lid_notify_state(device, 1, false);
 		break;
 	case ACPI_BUTTON_LID_INIT_IGNORE:
 	default:
The error indicates that you failed to apply this block.
Please check again and we need the test result of 1+2+3.

Thanks
Lv
Comment 19 Lv Zheng 2017-04-26 08:53:59 UTC
Forgot to mentione:
I'm using linux-pm.git/linux-next branch.
You can find it here:
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/?h=linux-next

Maybe it contains something more than the upstream kernel.
Could you try this branch?

Thanks
Comment 20 Steffen Weber 2017-04-26 11:43:03 UTC
An occurence of "acpi_lid_update_state(device)" inside a "case ACPI_BUTTON_LID_INIT_METHOD" a few lines above needed to be changed. I'm not sure whether "true" or "false" should be passed as a second argument but for this test this should be irrelevant (because its inside "case ACPI_BUTTON_LID_INIT_METHOD") and I've made it to compile by passing "false".

I then rebooted with Linux 4.10.12 and patches 1+2+3 but unfortunately that has not solved the problem.

When using the "button.lid_init_state=method" workaround I see a boot message "ACPI : button: Notify initial lid state with _LID return value". This message does not appear when I boot with patches 1+2+3 and without the workaround. It seems like to lid is assumed to be open in this case.

Would it still make sense to try the linux-pm.git/linux-next branch or was that just a suggestion to solve the compile error?
Comment 21 Lv Zheng 2017-04-27 08:38:49 UTC
> An occurence of "acpi_lid_update_state(device)" inside a "case
> ACPI_BUTTON_LID_INIT_METHOD" a few lines above needed to be changed. I'm not
> sure whether "true" or "false" should be passed as a second argument but for
> this test this should be irrelevant (because its inside "case
> ACPI_BUTTON_LID_INIT_METHOD") and I've made it to compile by passing "false".

ACPI_BUTTON_LID_INIT_METHOD is not in the upstream.
I surely want to use "false" and force graphics side to make change, but you may want to use "true" in order to still be able to use the quirk. :)

> I then rebooted with Linux 4.10.12 and patches 1+2+3 but unfortunately that
> has not solved the problem.

Yes, I see, as button.lid_init_state=ignore cannot solve this.

> When using the "button.lid_init_state=method" workaround I see a boot message
> "ACPI : button: Notify initial lid state with _LID return value". This
> message does not appear when I boot with patches 1+2+3 and without the
> workaround. It seems like to lid is assumed to be open in this case.

You can try to use "true" in the above code, and see if you still can use the quirk. I think the answer is yes but may need your feedback here.

Thanks and best regards
Lv
Comment 22 Lv Zheng 2017-04-27 08:40:20 UTC
To: Steffen & Julian

Could you tell me your kernel graphics driver?
nouveau or i915?

Maybe for nouveau, you can just use the quirk already implemented by it.
See
http://www.linuxquestions.org/questions/linux-newbie-8/blank-screen-on-lid-close-nomatter-what-i-do-4175445108/
for reference.

Thanks
Lv
Comment 23 Lv Zheng 2017-04-27 08:51:52 UTC
Created attachment 256089 [details]
[PATCH] ACPI: button: Use different strategy for user/kernel spaces

To: Steffen

Even without graphics guys' support, you won't need to restore the METHOD commit, IMO, you only need this commit on top of 1+2+3.

Thanks
Lv
Comment 24 Steffen Weber 2017-04-27 14:18:13 UTC
> Could you tell me your kernel graphics driver?

CONFIG_DRM_I915=y
# CONFIG_DRM_I915_ALPHA_SUPPORT is not set
CONFIG_DRM_I915_CAPTURE_ERROR=y
CONFIG_DRM_I915_COMPRESS_ERROR=y
CONFIG_DRM_I915_USERPTR=y
# CONFIG_DRM_I915_GVT is not set

> Even without graphics guys' support, you won't need to restore the METHOD
> commit, IMO, you only need this commit on top of 1+2+3.

I've downloaded "linux-pm-acpi-4.11-final.tar.gz" and applied patches 1+2+3+4.

When booting _with_ "button.lid_init_state=method" an error is logged ("Booting kernel: `method' invalid for parameter `button.lid_init_state'") and the problem is still there.

When booting _without_ "button.lid_init_state=method" no error is logged and the problem is still there, too. The new message introduced by patch 4 ("Notify initial lid state to users space as open and kernel drivers using _LID returnig value") is _not_ logged. I'm not sure if it is supposed to show up (I guess not because my initial lid state is supposed to be "close", not "open").
Comment 25 Steffen Weber 2017-04-27 14:18:50 UTC
# CONFIG_DRM_NOUVEAU is not set
Comment 26 Julian Wiedmann 2017-04-27 19:54:24 UTC
Lv,

the GPU uses i915. Testing the patches will take a few days, sorry...
Comment 27 Lv Zheng 2017-05-04 06:22:01 UTC
Opened a freedesktop.org ticket here:
https://bugs.freedesktop.org/show_bug.cgi?id=100923

I'll continue to help to find a better choice before it is fixed there.
BTW, have you guys tried to use "nomodeset" boot parameter with the affected kernels?
Comment 28 Lv Zheng 2017-05-04 08:31:22 UTC
To: Steffen

> When booting _without_ "button.lid_init_state=method" no error is logged and
> the problem is still there, too. The new message introduced by patch 4
> ("Notify initial lid state to users space as open and kernel drivers using
> _LID returnig value") is _not_ logged. I'm not sure if it is supposed to show
> up (I guess not because my initial lid state is supposed to be "close", not
> "open").

You can try both "button.lid_init_state=quirk" and "nomodeset", or either of them, and upload the 3 dmesg outputs here.

The patch 4 is not supposed to stay reporting open to both kernel/user space. but is trying to still report _LID returning value to lvds drivers (for this usage model), in the mean while report open to userspace (for systemd usage model).
However I can help to prepare a series with less changes.
Maybe I did too many changes and some of them messed the things up.
Comment 29 Lv Zheng 2017-05-04 08:43:15 UTC
Created attachment 256189 [details]
PATCH] ACPI: button: Always notify kernel drivers using _LID returning value

Update patch 4
Comment 30 Lv Zheng 2017-05-04 08:43:49 UTC
Created attachment 256191 [details]
[PATCH] ACPI: button: Some usage models still require button.lid_init_state=method

PATCH 5
Comment 31 Lv Zheng 2017-05-04 08:49:45 UTC
You can try to use PATCH 1-5. (Note PATCH 4 is updated).
And try:

1. boot with nomodeset only, tell us what happens then
2. boot with button.lid_init_state=open, tell us what happens then and upload the dmesg output here
3. boot with button.lid_init_state=close, tell us what happens then and upload the dmesg output here

Thanks
Lv
Comment 32 Lv Zheng 2017-05-04 08:51:23 UTC
Please ignore comment 28 and focus on comment 31. Sorry for my changing mind and the comment noise.

Thanks
Lv
Comment 33 Steffen Weber 2017-05-04 13:30:51 UTC
I've compiled "linux-pm-acpi-4.11-final.tar.gz" with patches 1+2+3+4(new)+5.

0. boot without any boot parameters: bug (internal display lit on even when docked)

1. boot with nomodeset only: X.org failed to start with "open /dev/dri/card0: No such file or directory" and "no screens found". Maybe this cannot work with the "modesetting" X.org driver? I've tried switching to the Intel/i915 X.org driver but for some reason that failed with "modprobe: FATAL: Module i915 not found in directory /lib/modules/4.11.0-rc7" (I don't understand why it's looking for a kernel module when I compiled with CONFIG_DRM_I915=y).

2. boot with button.lid_init_state=open: bug (internal display lit on even when docked)

3. boot with button.lid_init_state=close: works! The internal display is not used when docked. It is used when not docked. :-)

I'll attach the dmesg outputs.

Please note that I won't have access to the dock / external monitors for 1-2 weeks and therefore won't be able to test further patches until then.
Comment 34 Steffen Weber 2017-05-04 13:31:28 UTC
Created attachment 256193 [details]
dmesg output for test-case 0
Comment 35 Steffen Weber 2017-05-04 13:31:39 UTC
Created attachment 256195 [details]
dmesg output for test-case 1
Comment 36 Steffen Weber 2017-05-04 13:31:50 UTC
Created attachment 256197 [details]
dmesg output for test-case 2
Comment 37 Steffen Weber 2017-05-04 13:32:01 UTC
Created attachment 256199 [details]
dmesg output for test-case 3
Comment 38 Lv Zheng 2017-05-05 00:21:13 UTC
> 1. boot with nomodeset only: X.org failed to start with "open /dev/dri/card0:
> No such file or directory" and "no screens found". Maybe this cannot work
> with the "modesetting" X.org driver? I've tried switching to the Intel/i915
> X.org driver but for some reason that failed with "modprobe: FATAL: Module
> i915 not found in directory /lib/modules/4.11.0-rc7" (I don't understand why
> it's looking for a kernel module when I compiled with CONFIG_DRM_I915=y).

I'm not sure, we can ask this on freedesktop.org

> 2. boot with button.lid_init_state=open: bug (internal display lit on even
> when docked)
> 3. boot with button.lid_init_state=close: works! The internal display is not
> used when docked. It is used when not docked. :-)

The difference between the 2 modes is: notification sent to userspace is different, kernel space is same (_LID returning value).
So I guess your issue is because of some userspace programs.
I don't know which program is. so this isn't a kernel problem.
I suggest you to monitor this on freedesktop.org.
We'll restore some quirk mode for you to use your current usage model.
I'm not sure which will be restored - method mode or the new close mode.
I'll discuss this internally with our teammates.

Maybe Julian's problem is different. Let's see whether his problem is from user space or from kernel space.
So keep this opened.

Thanks
Lv
Comment 39 Julian Wiedmann 2017-05-09 00:53:20 UTC
Thanks Lv!
button.lid_init_state=close is what my system needs as well.
Comment 40 Lv Zheng 2017-05-09 08:55:50 UTC
Thanks for the test.

In my patches, I tried to unified kernel space notifications, to fix both suspend/resume loop problem and external lid problem in 1 lid driver mode.
But that seems impossible as distinguishing kernel space/user space events doesn't help. Both usage models are implemented in the user space, 1 from freedesktop.org, 1 from systemd.
The kernel driver is not possible to make 2 contradictory usage models working without introducing platform quirks.

So IMO, users should also help us to raise this issue to the related community.
I'll mark this as resolved.

Thanks
Lv
Comment 41 Steffen Weber 2017-05-11 20:31:01 UTC
Will the code enabling button.lid_init_state=close be part of a future Linux kernel release?
Comment 42 Joachim Frieben 2017-05-22 03:50:01 UTC
This looks like a duplicate of bug 187271 where this behaviour had already been reported and discussed, e.g., see https://bugzilla.kernel.org/show_bug.cgi?id=187271#c47 where I also pointed out that this issue is by no means limited to Intel video devices: it also occurs on AMD Radeon video devices , namely model HD 3470 (RV620). A test kernel including patch https://bugzilla.kernel.org/show_bug.cgi?id=187271#c49 had no effect on my system.
Comment 43 Lv Zheng 2017-07-04 01:24:36 UTC
Closing...
Comment 44 Frédéric Parrenin 2017-10-23 16:05:25 UTC
*** Bug 197361 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.