Bug 195897
Summary: | S3 isn't supported and hangs, mem_sleep should mapped to s2idle by default - Dell Latitude 7275 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Jérôme de Bretagne (jerome.debretagne) |
Component: | Power-Sleep-Wake | Assignee: | Rafael J. Wysocki (rjw) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | jerome.debretagne, lenb, rjw, rui.zhang, srinivas.pandruvada, yu.c.chen |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
URL: | http://en.community.dell.com/techcenter/os-applications/f/4613/t/20012666 | ||
Kernel Version: | 4.13 4.12 4.11 4.10 4.9 3.16 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
acpidump for Dell Latitude 7275 with BIOS 1.1.31
dmesg after pm_tracing kernel 4.12-rc2
dmesg on normal boot - kernel 4.12-rc2
output of lspci for Dell Latitude 7275 - kernel 4.12-rc2
acpidump for Dell Latitude 7275 with BIOS 1.1.20 |
Description
Jérôme de Bretagne
2017-05-28 15:36:56 UTC
Created attachment 256749 [details]
acpidump for Dell Latitude 7275 with BIOS 1.1.31
Here is the result of acpidump.
Created attachment 256751 [details]
dmesg after pm_tracing kernel 4.12-rc2
Showing the Magic number section:
[ 2.866144] Magic number: 5:463:177
[ 2.866237] acpi device:0d: hash matches
This run was done with the minimal set of modules loaded, on kernel 4.12-rc2.
Created attachment 256753 [details]
dmesg on normal boot - kernel 4.12-rc2
And here is the output of dmesg on a normal boot with kernel 4.12-rc2.
There are quite a few error messages, some of them ACPI-related:
$ dmesg | grep -i "error\|exception\|warning" | grep -v "load for iwlwifi"
[ 0.068078] ACPI Error: [\_SB_.PCI0.SAT1] Namespace lookup failure, AE_NOT_FOUND (20170303/dswload-210)
[ 0.068093] ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20170303/psobject-241)
[ 0.068261] ACPI Exception: AE_NOT_FOUND, (SSDT:IdeTable) while loading table (20170303/tbxfload-228)
[ 0.087414] ACPI Error: 1 table load failures, 7 successful (20170303/tbxfload-246)
[ 0.585932] acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM
[ 2.631369] i8042: Warning: Keylock active
[ 4.056735] EXT4-fs (nvme0n1p6): re-mounted. Opts: errors=remount-ro
[ 4.202120] ACPI Warning: SystemMemory range 0x00000000FE028000-0x00000000FE0281FF conflicts with OpRegion 0x00000000FE028000-0x00000000FE028207 (\_SB.PCI0.GEXP.BAR0) (20170303/utaddress-247)
[ 4.204558] ACPI Warning: \_SB.IETM._ART: Return Package type mismatch at index 0 - found Integer, expected Reference (20170303/nspredef-297)
[ 4.204587] ACPI Warning: \_SB.IETM._TRT: Return Package has no elements (empty) (20170303/nsprepkg-130)
[ 4.228131] intel-lpss: probe of INT3446:00 failed with error -16
[ 4.247849] soc_button_array: probe of INT33D2:00 failed with error -2
[ 4.309380] Error: Driver 'pcspkr' is already registered, aborting...
I don't know if one of them could give a hint about the source of the issue.
Created attachment 256755 [details]
output of lspci for Dell Latitude 7275 - kernel 4.12-rc2
Identical behavior tested and reproduced on kernel 4.12-rc3, giving the exact same Magic number after pm_tracing:
[ 2.916076] Magic number: 1:782:177
[ 2.916177] acpi device:0d: hash matches

Hi, could you please boot into a minimal shell by appending "init=/bin/bash" to the kernel command line, and test with the different pm_test modes?
# cat /sys/power/pm_test
[none] core processors platform devices freezer
# echo freezer > /sys/power/pm_test
# echo mem > /sys/power/state
Wait for 5 seconds. If it succeeds, try the next pm_test mode:
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
and so on until the "none" mode, to see if it works. Please do not enable pm_trace when testing.

Hi,
Thanks for these instructions. Here are the results when booting 4.12-rc3 into a minimal shell (with pm_trace not enabled, as you suggested):
# echo freezer > /sys/power/pm_test
# echo mem > /sys/power/state
Success
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
Success
# echo platform > /sys/power/pm_test
# echo mem > /sys/power/state
Success
# echo processors > /sys/power/pm_test
# echo mem > /sys/power/state
Success
# echo core > /sys/power/pm_test
# echo mem > /sys/power/state
Success
# echo none > /sys/power/pm_test
# echo mem > /sys/power/state
Failure, the device can't be resumed. The power button only allows forcing the system off with a long press (about 6s, it seems); the system can then be rebooted with another medium-long press (about 1s) that gives a small vibration.

Hi,
I've investigated other parts, with the intuition that the behavior of the power button / EC may be the source of the issue. So here is some more feedback, hoping it can help.
When suspending in "s2idle" mode instead of the default "deep" value with:
# echo freeze > /sys/power/state
the system can resume reliably but only with a *very* long press of the power button, for about 6 to 7 seconds.
It was quite surprising at first, especially since the system wakes up instantaneously from sleep on Windows with a usual short press. I don't know whether this long-press trigger could also be expected when suspending in the "deep" sleep state, but this time with the long press being interpreted as a forced shutdown instead. Are there other ways to trigger a resume from "deep" sleep, to check whether it is the power button wake-up event that has an issue, and not the suspend step as I've suspected so far?
Thanks,
Jérôme
P.S. I have tried to resume using an RTC alarm clock, but it doesn't seem to wake up from sleep (either "s2idle" or "deep"), while it works fine from the suspend-to-disk state:
rtc_cmos 00:01: RTC can wake from S4

Hi again,
I've also seen the recent s2idle-dell-test branch created by Rafael J. Wysocki and it reminded me of changes I've seen in past BIOS updates for this system. In BIOS version 1.1.25 in particular, Dell updated some behaviors as described here:
https://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=FKMC9
Fixes: [...]
- Fix the system being reset if carrying in bag.
Enhancements: [...]
- Enhance power button behavior to avoid system reset triggered while putting in the bag.
There is no specific description of the actual modifications though. Since it seemed related to my assumption that there may be an issue with the power button resume event, I've done some tests by adapting Rafael's patch to the Latitude 7275 DMI values to see its effects. I've then compared the behaviors when running the latest BIOS 1.1.31 and when running 1.1.20 (the version preceding 1.1.25, which introduced the above changes). I had to apply 1.1.20 from a recovery USB key, by the way, as the downgrade from 1.1.25 was not supported using the official .exe...
Here are the results:
* s2idle tests - BIOS 1.1.20
  - kernel 4.12-rc3
    Wake up on power long press of about 6-7s
  - kernel 4.12-rc3 including the s2idle-dell-test patch (modified for the Latitude 7275)
    Wake up on power short press <-- main change detected
* s2idle tests - BIOS 1.1.31
  - kernel 4.12-rc3
    Wake up on power long press of about 6-7s
  - kernel 4.12-rc3 including the s2idle-dell-test patch (modified for the Latitude 7275)
    Wake up on power long press of about 6-7s
So there is no visible change with or without the patch on BIOS 1.1.31, but there was a positive difference on 1.1.20, as the patch made the s2idle wake-up trigger on a more usual short press. Too bad it doesn't work as-is on the latest BIOS revision.
Now, coming back to the original bug report, I wonder if there are ways to update / adapt the EC-based wakeup from ("deep") suspend-to-RAM, as this suspend state never worked in any of the 4 scenarios above. Are there any logs / other inputs that would be useful to investigate this assumption?
Yu, should Rafael or someone else be CCed on this bug report maybe?
Thanks,
Jérôme

Same issue on the Latitude 7275 confirmed by another user on the Dell community forum: "it hangs pretty badly if you suspend it". He got the same Magic number result when testing with pm_trace. URL added for reference.

CCing Rafael to share the feedback in #9 about the 's2idle-dell-test' branch (modified for this Latitude 7275 of course) and maybe to get some other ideas/directions about how to investigate this suspend-to-RAM issue. The most interesting feedback so far seems to be the one in #8 about the s2idle power button behavior, but that may be a wrong lead.
I'm willing to provide any other useful inputs, just let me know! The best would be for me to go through a git-bisect session, but I haven't found a single working kernel version up to now, and I've tried many...
Thanks,
Jérôme

Hi,
I've finally found this similar thread for the Dell XPS 13 9365: https://bugzilla.kernel.org/show_bug.cgi?id=192591 , after reading Rafael's comment in his recent EC-based wakeup patch set for some Dell systems stating:
"on the 9365 ACPI S3 (suspend-to-RAM) is not expected to be used at all (the OS these systems ship with never exercises the ACPI S3 path) and suspend-to-idle is the only viable system suspend mechanism in there."
In reference to #82 in this 9365 bug report, I can confirm that the ACPI_S0_LOW_POWER_IDLE FADT flag is also set on this 7275 system:
# grep "Low Power" facp.dsl
Low Power S0 Idle (V5) : 1
indicating that S0 should (must?) be used instead of S3 on the Latitude 7275 model. One main difference though is that S3 still seems to work properly on the Dell 9365 while it hangs the Dell 7275.
Srinivas, could you maybe try to get the same confirmation for the Dell 7275 system as you've shared in #80 for the Dell 9365? That would be great.
If S3 is indeed not supported and never going to work on that model, what about changing the /sys/power/mem_sleep value to "[s2idle] deep" instead of "s2idle [deep]"? What about even making s2idle the new default for devices setting the ACPI_S0_LOW_POWER_IDLE FADT flag perhaps?
Thanks,
Jérôme

Please test Rafael's patch on the Dell 7275 to see if suspend-to-idle works. If the low power S0 idle bit is set, we should also add this platform to the list. Ultimately all platforms can rely on this bit, but until we have test data from several devices, it may not be safe.

Hi Srinivas,
I did already in fact, please see all the details in comment #9 above. S3 is not safe currently on this system, while suspend-to-idle works fine. It just needs a very long press of about 6-7s to trigger the wake-up. This may be a good idea in some scenarios, by the way (when closing the lid, for example), on this 2-in-1 model since it has the power button on the tablet-portion side, so still on the outside even when the lid is closed.
The default behavior should still be the usual short-press one, as is the case on Windows.
To sum up my testing (based on 4.12-rc3 at the time), Rafael's patch made no difference on this Dell 7275 when running the latest BIOS 1.1.31 but worked as intended when running the older BIOS 1.1.20 (= short press to wake up)! I'll try to test again on rc4 in the coming days.
Let me know what kind of other logs / inputs would be useful to better understand the events triggered by the power button on this system, and I'll share them.

Can you comment directly on the mailing list with the result? There are some folks from Dell there, who may comment on why this platform has to be different.

Done, here it is for reference: http://marc.info/?l=linux-pm&m=149687680603166&w=2
It gives exactly the same result on 4.12-rc4 as it did with 4.12-rc3 in comment #9.

Created attachment 256913 [details]
acpidump for Dell Latitude 7275 with BIOS 1.1.20
Here is the result of acpidump once downgraded to BIOS 1.1.20.
The previous acpidump was taken with the current BIOS 1.1.31.
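
The "Low Power S0 Idle" FADT flag discussed earlier in this report can be read from either acpidump once the FADT has been disassembled into a facp.dsl file (for instance with the ACPICA tools acpixtract and iasl). The following is only a minimal sketch: the helper name fadt_low_power_s0_idle is hypothetical, and it assumes the disassembly contains a line of the form "Low Power S0 Idle (V5) : 1" as quoted in the grep output above.

```sh
# Hypothetical helper: print the value (0 or 1) of the "Low Power S0 Idle"
# FADT flag. Input is the disassembled FADT text (facp.dsl) on stdin.
# Prints nothing if no such line is found.
fadt_low_power_s0_idle() {
    # Line of interest looks like: "Low Power S0 Idle (V5) : 1"
    sed -n 's/.*Low Power S0 Idle[^:]*:[[:space:]]*\([01]\).*/\1/p' | head -n 1
}

# Usage, assuming facp.dsl was produced from an acpidump beforehand:
#   fadt_low_power_s0_idle < facp.dsl
```

A value of 1 matches what was reported for BIOS 1.1.31 in this thread; comparing the result for the two dumps would show whether the flag differs between BIOS versions.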
For those facing the same issue, here is the reference to the interesting follow-up email discussion:
http://marc.info/?l=linux-pm&m=149705422318487&w=2
To sum up:
- Mario Limonciello was able to get confirmation that this system officially supports only Connected Standby / Modern Standby ("CS/MS"), i.e. suspend-to-idle (low power S0 idle), but not suspend-to-RAM (S3)
- its latest BIOS 1.1.31 exposes both Low Power S0 Idle and S3 through ACPI
- when both are declared in ACPI, Linux currently defaults to "deep" suspend-to-RAM (S3), as visible in /sys/power/mem_sleep, as opposed to Windows 8.1 / 10, which defaults to low power S0 idle mode
- when trying to use this default suspend-to-RAM (= S3) on Linux, the system hangs and cannot resume. This happens with various (all?) BIOS versions up to the current 1.1.31.
- an internal Dell inquiry has been triggered to see if the S3 ACPI declaration could be removed in a future BIOS update, which would fix the issue by h

(previous comment sent too quickly)
... which would fix the issue by making Linux default to suspend-to-idle on that system, which works overall (with the issue that a long press on the power button is needed to wake up).
- Another possible fix in the medium term, within the kernel this time, would be to default to suspend-to-idle instead of suspend-to-RAM when the ACPI_S0_LOW_POWER_IDLE FADT flag is set. It could be implemented either for all systems by default at some point, or at least for systems known to have major issues with suspend-to-RAM:
"Eventually this will change, but for now first we need to sort out the problems with the systems that do S2I."
In the meantime, a user-side fix is to manually change the default Linux behavior by setting /sys/power/mem_sleep to "[s2idle] deep" instead of "s2idle [deep]" with:
# echo s2idle > /sys/power/mem_sleep
or to suspend manually from the command line with:
# echo freeze > /sys/power/state
which uses the suspend-to-idle mode.
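
The user-side workaround above could be wrapped in a small guard that only writes a mode the kernel actually offers. This is a sketch under assumptions, not an established tool: the function name set_mem_sleep and its optional file argument are hypothetical (the second argument exists only so the logic can be tried on a copy of the file), and writing to the real /sys/power/mem_sleep requires root.

```sh
# Hypothetical helper: select a suspend mode only if it is offered.
# $1: wanted mode (e.g. s2idle)
# $2: control file, defaults to /sys/power/mem_sleep
set_mem_sleep() {
    wanted=$1
    file=${2:-/sys/power/mem_sleep}
    # The file lists the offered modes with the active one in brackets,
    # e.g. "s2idle [deep]"; strip the brackets to compare plain names.
    for mode in $(tr -d '[]' < "$file"); do
        if [ "$mode" = "$wanted" ]; then
            echo "$wanted" > "$file"
            return 0
        fi
    done
    echo "mode '$wanted' is not offered in $file" >&2
    return 1
}

# Usage (as root): set_mem_sleep s2idle
```

The setting does not survive a reboot. To my knowledge the kernel also accepts a mem_sleep_default=s2idle boot parameter for a persistent default, but that was not tested in this report.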
After suspending to idle (with Rafael's patch modified for the Latitude 7275) on top of BIOS 1.1.31, I have to hold the power button for 6 seconds to wake up. Well, I think there is still a problem in that it takes so much time to resume. Another test might be: how about waking up the system from s2idle with rtcwake?

(In reply to Chen Yu from comment #20)
> After suspended to idle(with Rafael's patch modified for the Latitude 7275)
> on top of BIOS 1.1.31, have to hold the power button for 6 second to
> wakeup.Well, I think there is still problem that it takes so much time to
> resume. Another test might be, how about waking up the system from s2idle by
> rtcwake?
AFAIK, the BIOS is not involved during s2idle. Is it possible that the system has already resumed but the graphics did not show up? Maybe this can be verified by pinging the 7275 across an s2idle test cycle.

Hi Chen,
Having to hold the power button for 6 seconds has been discussed at length in the thread mentioned in comment #18; the good news is that we are starting to have a good idea of the root cause.
I will open a separate bug entry about this to avoid mixing 2 totally different issues, since defaulting to suspend-to-RAM / S3 on this machine remains a real issue for end users.

We should blacklist this machine as not supporting S3.

I've searched a little bit but I haven't yet found code implementing a similar blacklisting for other models not supporting S3. Where would be a good place to implement this blacklisting? When loading the compatible Sx ACPI modes maybe?
In case that would be useful, here are some DMI values on this machine:
DMI_SYS_VENDOR : "Dell Inc."
DMI_PRODUCT_NAME : "Latitude 7275"

I can confirm that the following commit:
"ACPI / PM: Prefer suspend-to-idle over S3 on some systems"
proposed for linux-next in the linux-pm tree fixes this issue on the Dell Latitude 7275 system. Indeed, mem_sleep is now properly mapped to s2idle by default:
$ cat /sys/power/mem_sleep
[s2idle] deep
Thanks again to Rafael.
I'll mark this bug as RESOLVED once the commit makes it into Linus' tree.

Thanks for testing. This is the Bugzilla process we're using: the bug should be marked as Resolved once a patch is proposed for upstream and has been confirmed to solve the problem, and the bug will be marked as Closed once the patch goes upstream. So mark the bug as resolved.

The commit mentioned in comment #25 is now upstream, part of Linux 4.14, here for reference:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e870c6c87cf9484090d28f2a68aa29e008960c93
So mark this bug as closed. |
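
For anyone checking whether a given kernel carries the fix, the mem_sleep check from comment #25 can be reduced to extracting the bracketed entry. A minimal sketch, where the function name active_mem_sleep is hypothetical and the input is assumed to be in the "[s2idle] deep" format shown in this report:

```sh
# Hypothetical helper: print the active mode from a mem_sleep line,
# i.e. the entry enclosed in brackets. Prints nothing if none is found.
active_mem_sleep() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage on a live system:
#   active_mem_sleep "$(cat /sys/power/mem_sleep)"
```

On the Latitude 7275 the expected result with the fix is s2idle, and deep without it.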