Bug 195455
Description
Steffen Weber
2017-04-18 17:39:45 UTC
Can "button.lid_init_state=ignore" work this around? IMO, this is not a regression. In fact, we need to make "button.lid_init_state=ignore" the default value. But that requires systemd guys to change accordingly. So it is finally become a compromization to use "button.lid_init_state=open" as the default value in order to make other bug reports muted. You cannot say "hey, there is a bug in kernel" while actually the bug is in the userspace. It's endless. ;) Thanks Lv Thank you for the feedback! Using "button.lid_init_state=ignore" does not help. But I'm fine with using the workaround until systemd is updated. Just wanted to ensure that this behaviour is intentional (for the moment). Is it correct that the systemd developers are aware of this issue? Can you point me to a systemd commit or bug report? > Using "button.lid_init_state=ignore" does not help.
That's weird. Could you upload acpidump output for further investigation?
Anyway we need to document any possible application problems for different use cases and workarounds.
Thanks
Lv
Created attachment 255975 [details]
acpidump of HP EliteBook 840 G1
Please see the attached acpidump output.
(And btw, I've tried "button.lid_init_state=ignore" again, unfortunately it really does not work for me.)
Device (LID) { Name (_HID, EisaId ("PNP0C0D")) // _HID: Hardware ID Method (_LID, 0, NotSerialized) // _LID: Lid Status { Store (\_SB.PCI0.LPCB.EC0.CLID, Local0) Return (Local0) } } It's notified in _Q0A: Method (_Q0A, 0, NotSerialized) // _Qxx: EC Query { Store (\_GPE.VLET (), Local4) Notify (\_SB.LID, 0x80) \_SB.PCI0.ACEL.AJAL () \_SB.WMID.WGWE (0x08, Add (\_SB.LID._LID (), 0x01)) } I guess _Q0A is not triggered after being resumed. So the only lid state notification the userspace can receive after being resumed is an open event, which breaks this usage model. Since this cannot be worked around by button.lid_init_state=ignore, couldn't we just think this is a wrong usage model? Or a usage model shouldn't be implemented by the ACPI control method lid device? Thanks and best regards Lv That behavior should be similar as button.lid_init_state=ignore. So I still don't know why it cannot work. Could you: 1. boot the kernel with dyndbg=\"file ec.c +p\" button.lid_init_state=ignore and try this case and post post-resume dmesg output here. 2. boot the kernel with dyndbg=\"file ec.c +p\" button.lid_init_state=open and try this case and post post-resume dmesg output here. Let's see if _Q0A is triggered in the 2 tests. Thanks Lv Created attachment 256021 [details]
Disassembled DSDT of Thinkpad X200
Same issue, =ignore doesn't help either.
Created attachment 256045 [details]
[PATCH] Reduce kernel side wrong usages of lid status
In theory, graphics driver should be able to bring monitors back without knowing the ACPI lid status. So this sounds more like a usage model problem - graphics drivers or userspace tools have chosen to trust some unreliable information to make the decision. Is that possible to file a bug on freedesktop.org to fix the user side rather than reverting the lid driver back to the wrong behavior to trigger other problems? Here I have a commit that may reduce possibilities to trigger problems from the kernel side. You can apply it and try again. I guess the reporters are all using i915. It looks nouveau developers have already been aware of this problem. In nouveau drivers, external monitor users should use nouveau_ignorelid module parameter. http://www.linuxquestions.org/questions/linux-newbie-8/blank-screen-on-lid-close-nomatter-what-i-do-4175445108/ So probably we should just raise this to freedesktop.org to let i915 to change. Thanks Lv Created attachment 256047 [details] [PATCH] ACPI: button: Obselete acpi_lid_open() invocations I split the attachment 256045 [details] into 3 commits. So that we can judge which one is useful. So please use this and the follow up 2 commmits instead. Created attachment 256049 [details]
[PATCH] ACPI: button: Fix lid notification
This patch contains improvements.
Created attachment 256051 [details]
[PATCH] ACPI: button: Do not relay fake events
IMO, the best choice for both nouveau and i915 lvds drivers is: 1. apply the patches in this bug; 2. register lid notifier, and if there is no lid notification recieved after resuming, the drivers should think there is a broken (possibly, but maybe it's just no such usage model) acpi lid and automatically fall into a behavior that is similar as ignorelid; 3. remove nouveau_ignorelid module parameter as it can be automatically detected via lid notifier. First of all, thank you for looking into this issue! I'm a little confused about Comment 7 (and the email you sent me via CC). Both mention suspend + resume. For me, this issue has nothing to do with suspend + resume. I'm experiencing this issue on a regular boot or reboot. (Maybe it happens with suspend + resume, too, I have not tried that.) Furthermore, the issue is _not_ that "the external monitor couldn't be lit on" (quote from email). The external displays are working fine in both cases (Linux 4.10.10- and Linux 4.10.11+). The issue is that on Linux 4.10.11+ the _internal_ display is lit on even when its lid is closed (unless I use the "button.lid_init_state=method" workaround). So to clarify, the situation is as follows: My notebook is docked with its lid _closed_ and two external monitors are attached to dock. When I now switch the notebook on its _internal_ display is lit on. This does not make sense because the lid is closed. This issue exists since Linux 4.10.11. Anyways, for testing the patches, I've removed "button.lid_init_state=method" from my grub config. I've then applied patches 1+2 to Linux 4.10.12. Unfortunately, that does not make any difference: the internal display is lit on despite its lid being closed (i.e. no difference from vanilla 4.10.12). When I try to compile Linux 4.10.12 with patches 1+2+3 I get an error: CC drivers/acpi/button.o drivers/acpi/button.c: In function ‘acpi_lid_initialize_state’: drivers/acpi/button.c:378:9: error: too few arguments to function ‘acpi_lid_update_state’ (void)acpi_lid_update_state(device); ^ drivers/acpi/button.c:358:12: note: declared here static int acpi_lid_update_state(struct acpi_device *device, ^ make[2]: *** [scripts/Makefile.build:295: drivers/acpi/button.o] Error 1 make[1]: *** [scripts/Makefile.build:553: drivers/acpi] Error 2 make: *** [Makefile:988: drivers] Error 2 > Furthermore, the issue is _not_ that "the external monitor couldn't be lit > on" (quote from email). The external displays are working fine in both cases > (Linux 4.10.10- and Linux 4.10.11+). The issue is that on Linux 4.10.11+ the > _internal_ display is lit on even when its lid is closed (unless I use the > "button.lid_init_state=method" workaround). OK, I see. The other case is here: Bug 187271 But root cause should be same. > Anyways, for testing the patches, I've removed "button.lid_init_state=method" > from my grub config. I've then applied patches 1+2 to Linux 4.10.12. > Unfortunately, that does not make any difference: the internal display is lit > on despite its lid being closed (i.e. no difference from vanilla 4.10.12). I see, as you said button.lid_init_state=ignore cannot work this around. Which is similar as the change - do not send notifications. So applying these patches won't fix your issues. But if graphics drivers are going to change, IMO, they should based on these changes, - without the additional "open" event. > When I try to compile Linux 4.10.12 with patches 1+2+3 I get an error: OK, I'll check and update the patches. Thanks Lv Hi, Steffen: I couldn't see problems of in the patches. This block surely contains the correct change for the line you reported: @@ -368,7 +371,7 @@ static void acpi_lid_initialize_state(st { switch (lid_init_state) { case ACPI_BUTTON_LID_INIT_OPEN: - (void)acpi_lid_notify_state(device, 1); + (void)acpi_lid_notify_state(device, 1, false); break; case ACPI_BUTTON_LID_INIT_IGNORE: default: The error indicates that you failed to apply this block. Please check again and we need the test result of 1+2+3. Thanks Lv Forgot to mentione: I'm using linux-pm.git/linux-next branch. You can find it here: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/?h=linux-next Maybe it contains something more than the upstream kernel. Could you try this branch? Thanks An occurence of "acpi_lid_update_state(device)" inside a "case ACPI_BUTTON_LID_INIT_METHOD" a few lines above needed to be changed. I'm not sure whether "true" or "false" should be passed as a second argument but for this test this should be irrelevant (because its inside "case ACPI_BUTTON_LID_INIT_METHOD") and I've made it to compile by passing "false". I then rebooted with Linux 4.10.12 and patches 1+2+3 but unfortunately that has not solved the problem. When using the "button.lid_init_state=method" workaround I see a boot message "ACPI : button: Notify initial lid state with _LID return value". This message does not appear when I boot with patches 1+2+3 and without the workaround. It seems like to lid is assumed to be open in this case. Would it still make sense to try the linux-pm.git/linux-next branch or was that just a suggestion to solve the compile error? > An occurence of "acpi_lid_update_state(device)" inside a "case > ACPI_BUTTON_LID_INIT_METHOD" a few lines above needed to be changed. I'm not > sure whether "true" or "false" should be passed as a second argument but for > this test this should be irrelevant (because its inside "case > ACPI_BUTTON_LID_INIT_METHOD") and I've made it to compile by passing "false". ACPI_BUTTON_LID_INIT_METHOD is not in the upstream. I surely want to use "false" and force graphics side to make change, but you may want to use "true" in order to still be able to use the quirk. :) > I then rebooted with Linux 4.10.12 and patches 1+2+3 but unfortunately that > has not solved the problem. Yes, I see, as button.lid_init_state=ignore cannot solve this. > When using the "button.lid_init_state=method" workaround I see a boot message > "ACPI : button: Notify initial lid state with _LID return value". This > message does not appear when I boot with patches 1+2+3 and without the > workaround. It seems like to lid is assumed to be open in this case. You can try to use "true" in the above code, and see if you still can use the quirk. I think the answer is yes but may need your feedback here. Thanks and best regards Lv To: Steffen & Julian Could you tell me your kernel graphics driver? nouveau or i915? Maybe for nouveau, you can just use the quirk already implemented by it. See http://www.linuxquestions.org/questions/linux-newbie-8/blank-screen-on-lid-close-nomatter-what-i-do-4175445108/ for reference. Thanks Lv Created attachment 256089 [details]
[PATCH] ACPI: button: Use different strategy for user/kernel spaces
To: Steffen
Even without graphics guys' support, you won't need to restore the METHOD commit, IMO, you only need this commit on top of 1+2+3.
Thanks
Lv
> Could you tell me your kernel graphics driver? CONFIG_DRM_I915=y # CONFIG_DRM_I915_ALPHA_SUPPORT is not set CONFIG_DRM_I915_CAPTURE_ERROR=y CONFIG_DRM_I915_COMPRESS_ERROR=y CONFIG_DRM_I915_USERPTR=y # CONFIG_DRM_I915_GVT is not set > Even without graphics guys' support, you won't need to restore the METHOD > commit, IMO, you only need this commit on top of 1+2+3. I've downloaded "linux-pm-acpi-4.11-final.tar.gz" and applied patches 1+2+3+4. When booting _with_ "button.lid_init_state=method" an error is logged ("Booting kernel: `method' invalid for parameter `button.lid_init_state'") and the problem is still there. When booting _without_ "button.lid_init_state=method" no error is logged and the problem is still there, too. The new message introduced by patch 4 ("Notify initial lid state to users space as open and kernel drivers using _LID returnig value") is _not_ logged. I'm not sure if it is supposed to show up (I guess not because my initial lid state is supposed to be "close", not "open"). # CONFIG_DRM_NOUVEAU is not set Lv, the GPU uses i915. Testing the patches will take a few days, sorry... Opened a freedesktop.org ticket here: https://bugs.freedesktop.org/show_bug.cgi?id=100923 I'll continue to help to find a better choice before it is fixed there. BTW, have you guys tried to use "nomodeset" boot parameter with the affected kernels? To: Steffen
> When booting _without_ "button.lid_init_state=method" no error is logged and
> the problem is still there, too. The new message introduced by patch 4
> ("Notify initial lid state to users space as open and kernel drivers using
> _LID returnig value") is _not_ logged. I'm not sure if it is supposed to show
> up (I guess not because my initial lid state is supposed to be "close", not
> "open").
You can try both "button.lid_init_state=quirk" and "nomodeset", or either of them, and upload the 3 dmesg outputs here.
The patch 4 is not supposed to stay reporting open to both kernel/user space. but is trying to still report _LID returning value to lvds drivers (for this usage model), in the mean while report open to userspace (for systemd usage model).
However I can help to prepare a series with less changes.
Maybe I did too many changes and some of them messed the things up.
Created attachment 256189 [details]
PATCH] ACPI: button: Always notify kernel drivers using _LID returning value
Update patch 4
Created attachment 256191 [details]
[PATCH] ACPI: button: Some usage models still require button.lid_init_state=method
PATCH 5
You can try to use PATCH 1-5. (Note PATCH 4 is updated). And try: 1. boot with nomodeset only, tell us what happens then 2. boot with button.lid_init_state=open, tell us what happens then and upload the dmesg output here 3. boot with button.lid_init_state=close, tell us what happens then and upload the dmesg output here Thanks Lv Please ignore comment 28 and focus on comment 31. Sorry for my changing mind and the comment noise. Thanks Lv I've compiled "linux-pm-acpi-4.11-final.tar.gz" with patches 1+2+3+4(new)+5. 0. boot without any boot parameters: bug (internal display lit on even when docked) 1. boot with nomodeset only: X.org failed to start with "open /dev/dri/card0: No such file or directory" and "no screens found". Maybe this cannot work with the "modesetting" X.org driver? I've tried switching to the Intel/i915 X.org driver but for some reason that failed with "modprobe: FATAL: Module i915 not found in directory /lib/modules/4.11.0-rc7" (I don't understand why it's looking for a kernel module when I compiled with CONFIG_DRM_I915=y). 2. boot with button.lid_init_state=open: bug (internal display lit on even when docked) 3. boot with button.lid_init_state=close: works! The internal display is not used when docked. It is used when not docked. :-) I'll attach the dmesg outputs. Please note that I won't have access to the dock / external monitors for 1-2 weeks and therefore won't be able to test further patches until then. Created attachment 256193 [details]
dmesg output for test-case 0
Created attachment 256195 [details]
dmesg output for test-case 1
Created attachment 256197 [details]
dmesg output for test-case 2
Created attachment 256199 [details]
dmesg output for test-case 3
> 1. boot with nomodeset only: X.org failed to start with "open /dev/dri/card0: > No such file or directory" and "no screens found". Maybe this cannot work > with the "modesetting" X.org driver? I've tried switching to the Intel/i915 > X.org driver but for some reason that failed with "modprobe: FATAL: Module > i915 not found in directory /lib/modules/4.11.0-rc7" (I don't understand why > it's looking for a kernel module when I compiled with CONFIG_DRM_I915=y). I'm not sure, we can ask this on freedesktop.org > 2. boot with button.lid_init_state=open: bug (internal display lit on even > when docked) > 3. boot with button.lid_init_state=close: works! The internal display is not > used when docked. It is used when not docked. :-) The difference between the 2 modes is: notification sent to userspace is different, kernel space is same (_LID returning value). So I guess your issue is because of some userspace programs. I don't know which program is. so this isn't a kernel problem. I suggest you to monitor this on freedesktop.org. We'll restore some quirk mode for you to use your current usage model. I'm not sure which will be restored - method mode or the new close mode. I'll discuss this internally with our teammates. Maybe Julian's problem is different. Let's see whether his problem is from user space or from kernel space. So keep this opened. Thanks Lv Thanks Lv! button.lid_init_state=close is what my system needs as well. Thanks for the test. In my patches, I tried to unified kernel space notifications, to fix both suspend/resume loop problem and external lid problem in 1 lid driver mode. But that seems impossible as distinguishing kernel space/user space events doesn't help. Both usage models are implemented in the user space, 1 from freedesktop.org, 1 from systemd. The kernel driver is not possible to make 2 contradictory usage models working without introducing platform quirks. So IMO, users should also help us to raise this issue to the related community. I'll mark this as resolved. Thanks Lv Will the code enabling button.lid_init_state=close be part of a future Linux kernel release? This looks like a duplicate of bug 187271 where this behaviour had already been reported and discussed, e.g., see https://bugzilla.kernel.org/show_bug.cgi?id=187271#c47 where I also pointed out that this issue is by no means limited to Intel video devices: it also occurs on AMD Radeon video devices , namely model HD 3470 (RV620). A test kernel including patch https://bugzilla.kernel.org/show_bug.cgi?id=187271#c49 had no effect on my system. Closing... *** Bug 197361 has been marked as a duplicate of this bug. *** |