Bug 191211
Summary: | Razer Blade Stealth laptop fails to send lid open on suspend resume | ||
---|---|---|---|
Product: | ACPI | Reporter: | Jason Dewayne Clinton (me) |
Component: | EC | Assignee: | Lv Zheng (lv.zheng) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | rui.zhang |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.9 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | [BROKEN] Attempt to clear EC after resume on Razer laptops |
Description
Jason Dewayne Clinton
2016-12-26 18:20:59 UTC
Also affects earlier Razer Blade (14" units) and Razer Blade Stealth (Skylake): https://www.reddit.com/r/razer/comments/447vrn/razer_blade_stealth_linux/czp1qw3/ . I'm looking at a DMI quirk in ACPI ec.c to see if that will fix it. Created attachment 249701 [details]
[BROKEN] Attempt to clear EC after resume on Razer laptops
Much to my disappointment, this patch doesn't work. Stranger:
pr_debug("Detected system needing EC poll on resume.\n");
never shows up after resume so I'm not sure what I'm doing wrong (or if this code path ever worked?).
I've reached the limit of my knowledge of ACPI without having intimate platform knowledge. Hopefully someone can pick this up who knows what might be wrong here.
I don't think clear_on_resume quirk is useful now, it's quite old, invented for old code base. Looking at the code: if (!test_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) advance_transaction(ec); The EC driver polls events right after resuming just because of the above code in __acpi_ec_enable_event(), regardless of the quirk: acpi_ec_resume acpi_ec_enable_event __acpi_ec_enable_event You should try to boot the kernel with button.lid_init_state=open to see if this is the known bug related to an old kernel issue and an old userspace workaround. Please find detailed description in Documentation/acpi/acpi-lid.txt. (In reply to Jason Dewayne Clinton from comment #2) > I've reached the limit of my knowledge of ACPI without having intimate > platform knowledge. Hopefully someone can pick this up who knows what might > be wrong here. There won't be any new clear_on_resume quirks upstreamed to the Linux kernel. There should be a cleanup deleting all clear_on_resume quirks from the Linux kernel upstream. Thanks Lv > A work-around is to close-and-reopen the lid a second time immediately after > opening the lid before userspace re-initiates suspend. This is an invalid workaround, your brain may be deceived. It doesn't mean something (post-resume lid open event) is missing. As long as the system is not suspended, it's likely that the lid event can be triggered and delivered to the OSPM. But the post-resume lid open event is a different case. The event is already handled by BIOS to wake the system up. And it's up to BIOS to deliver such event to the OSPM because Windows cares no about the existence of this event. You can call it a BIOS bug, but I don't think so. It's simply a Windows designed behavior. For such kind of issue, there is simply no mystery in LID. For resume from S3, considering the case that the lid event is triggered by EC. From comment 3, we know that EC driver already ensures that no events will be missing after resuming. So if you cannot see such a lid open event after resuming, it simply means there is no such event. > pr_debug("Detected system needing EC poll on resume.\n"); This line can only be seen during boot, you won't see the message again after resuming. And as long as it is shown during boot, it affects EC_FLAGS_CLEAR_ON_RESUME behavior after resuming: static void acpi_ec_enable_event(struct acpi_ec *ec) { ... if (EC_FLAGS_CLEAR_ON_RESUME) acpi_ec_clear(ec); } static int acpi_ec_resume(struct device *dev) { struct acpi_ec *ec = acpi_driver_data(to_acpi_device(dev)); acpi_ec_enable_event(ec); return 0; } Thanks and best regards Lv > A work-around is to close-and-reopen the lid a second time
> immediately after opening the lid before userspace re-initiates suspend.
Another case is:
After closing lid, lid notification is triggered
1. some BIOSen choose to store hardware lid status in the event handler, and return the cached value via _LID;
2. some BIOSen choose to do nothing in the event handler, and return the hardware lid status via _LID.
For the latter, when OSPM evaluates _LID, since you opened it quickly, the result could be open if the _LID is implemented to read the real hardware status.
So userspace can see an open event.
This is also why we need this logic in drivers/acpi/button.c:
if (button->last_state == !!state &&
ktime_after(ktime_get(), next_report)) {
It's not a proof that the BIOS has implemented a lid open event triggering source.
Thanks
Lv
(In reply to Lv Zheng from comment #3) > I don't think clear_on_resume quirk is useful now, it's quite old, invented > for old code base. > > Looking at the code: > if (!test_bit(EC_FLAGS_QUERY_PENDING, &ec->flags)) > advance_transaction(ec); > The EC driver polls events right after resuming just because of the above > code in __acpi_ec_enable_event(), regardless of the quirk: > acpi_ec_resume > acpi_ec_enable_event > __acpi_ec_enable_event On the other hand, the quirk is harmful. There are platforms, if OSPMs send EC query command to them without seeing EV_SCI bit set in the EC status register, the EC firmware crashes. So I believe the above code is much more useful than the quirk as it is safer, we needn't worry about that it could break specific EC firmware. Thanks Lv Let me close the bug as I couldn't see anything doable in it. If the workaround is not working, feel free to re-open it and let us know. Thanks Lv button.lid_init_state=open does work around the problem. Would you like me to provide a patch that bakes this quirk into the kernel so that this work out-of-box for other people? Ah, you mean this bug? Could you upload some logs to persuade me that the quirk is useful. IMO, it is not useful. Let me write down some debugging steps. Thanks Lv Do you mean attachment 249701 [details]? It's not working according to your comment. So it seems we have communication problem there. If attachment 249701 [details] is not working: Could you post the working patch here? If attachment 249701 [details] working: Please add dyndbg="file ec.c +p" to your kernel command line. And help to do the following 2 tests: Test 1: 1. do not apply attachment 249701 [details] 2. perform the suspend/resume 3. upload the dmesg here Test 2: 1. apply attachment 249701 [details] 2. perform the suspend/resume 3. upload the dmesg here Thanks and best regards Lv Also please upload the acpidump output here. And if possible, please do the test using latest upstream kernel. Thanks Lv Sorry, I wasn't clear in comment #9. We know that: 1. all Razer Blade laptops are affected by this problem and that 2. using button.lid_init_state=open works around the problem can I give you a patch that checks DMI info for this hardware name and automatically activates button.lid_init_state=open when running on a Razer Blade laptops? If not, that's fine. I just thought that I would offer to write that patch. Feel free to close this bug. OK. I'm closing this bug. The philosophy is: This bug looks like a gap between the kernel and the userspace. And BIOS looks correct. So if we start to introduce DMI tables into the kernel, the table surely will grow infinitely... I've submitted a patch to make "button.lid_init_state=open" to make it the default behavior, which should be able to reduce similar reports: https://patchwork.kernel.org/patch/9512307/ Thanks Lv |