Bug 63171 - SilverSeraph: suspend to ram does no more work with recent kernels.
Summary: SilverSeraph: suspend to ram does no more work with recent kernels.
Status: RESOLVED CODE_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-16 20:07 UTC by Elmar Stellnberger
Modified: 2022-04-18 20:07 UTC
CC List: 5 users

See Also:
Kernel Version: 3.11.3-1.1-default
Subsystem:
Regression: No
Bisected commit-id:


Attachments
/proc/acpi/wakeup (283 bytes, text/plain)
2013-10-17 21:24 UTC, Elmar Stellnberger
patch against hang on windowing PCI bridge (4.44 KB, patch)
2014-01-02 10:24 UTC, Elmar Stellnberger
errors compiling kernel v3.0 with the patch applied (2.68 KB, application/octet-stream)
2014-01-02 12:08 UTC, Elmar Stellnberger
no_console_suspend + acpi_sleep=nonvs: low memory corruption (85.58 KB, image/jpeg)
2014-06-30 16:08 UTC, Elmar Stellnberger
no_console_suspend without acpi_sleep=nonvs: low memory corruption (87.86 KB, image/jpeg)
2014-06-30 16:12 UTC, Elmar Stellnberger
debug patch (799 bytes, patch)
2014-07-01 01:08 UTC, Zhang Rui
diff for commit e396c9d8 which was found working by accident (2.6.37.6-1) (607.75 KB, application/x-bzip)
2014-07-01 12:52 UTC, Elmar Stellnberger
error messages trying to compile 2.6.35-492-g72d7c3b (16.20 KB, text/plain)
2014-07-01 16:13 UTC, Elmar Stellnberger
error messages trying to compile 2.6.35-492-g72d7c3b_HEAD~1 (22.99 KB, text/plain)
2014-07-01 16:17 UTC, Elmar Stellnberger
ptrace.patch used to compile 2.6.35-492-g72d7c3b (excluding ";"-patch) (874 bytes, patch)
2014-07-01 17:27 UTC, Elmar Stellnberger
some errors trying to install modules from 2.6.35-492-g72d7c3b (5.63 KB, text/plain)
2014-07-01 22:30 UTC, Elmar Stellnberger
errors trying to make modules of 2.6.35-492-g72d7c3b without -i (996 bytes, text/plain)
2014-07-01 22:31 UTC, Elmar Stellnberger
kernel config file used for trying to compile 2.6.35-492 (95.04 KB, text/plain)
2014-07-01 22:44 UTC, Elmar Stellnberger
errors trying to make 2.6.35-492-g72d7c3b_head~1 without -i (74.04 KB, text/plain)
2014-07-01 22:51 UTC, Elmar Stellnberger
2.6.37.6-1-default vga=0 modeset=0 debug ignore_loglevel: s2ram has worked (93.99 KB, text/plain)
2014-07-02 08:40 UTC, Elmar Stellnberger
3.11.6-4-desktop vga=0 modeset=0 debug ignore_loglevel: s2ram has not worked (38.58 KB, text/plain)
2014-07-02 08:44 UTC, Elmar Stellnberger
2.6.38-1 vga=0 modeset=0 debug ignore_loglevel: does not work (66.18 KB, text/plain)
2014-07-02 12:19 UTC, Elmar Stellnberger
2.6.37-1 (2d90508) resume image: m.=0 v.=0 debug ignore_loglevel no_console_suspend (77.57 KB, image/jpeg)
2014-07-02 16:25 UTC, Elmar Stellnberger
linux3.17.1-4_img1_BIOS-resume_without_uhci_hcd.JPG (154 bytes, text/plain)
2014-10-26 19:08 UTC, Elmar Stellnberger
linux3.17.1-4_img1_BIOS-resume_without_uhci_hcd.JPG (65.92 KB, image/jpeg)
2014-10-26 19:10 UTC, Elmar Stellnberger
linux3.17.1-4_img2_BIOS-corruption-backtrace.JPG (78.28 KB, image/jpeg)
2014-10-26 19:13 UTC, Elmar Stellnberger
nolapic noapic debug ignore_loglevel - wakeup1 (2.74 MB, image/jpeg)
2015-03-19 22:34 UTC, Elmar Stellnberger
nolapic noapic debug ignore_loglevel - wakeup2 (3.06 MB, image/jpeg)
2015-03-19 22:36 UTC, Elmar Stellnberger
nolapic noapic debug ignore_loglevel - wakeup3 (3.12 MB, image/jpeg)
2015-03-19 22:40 UTC, Elmar Stellnberger
dmesg with debug ignore_loglevel log_buf_len=1M (42.17 KB, text/plain)
2015-03-20 11:04 UTC, Elmar Stellnberger
Linux 4.0.1-1, ScreenShot 1 (no_console_suspend, vga=0) (129.32 KB, image/jpeg)
2015-05-20 15:52 UTC, Elmar Stellnberger
Linux 4.0.1-1, ScreenShot 2 (no_console_suspend, vga=0) (116.44 KB, image/jpeg)
2015-05-20 15:53 UTC, Elmar Stellnberger
Linux 4.1.3-1, ScreenShot 1 (no_console_suspend, vga=0) (100.00 KB, image/jpeg)
2015-08-16 11:25 UTC, Elmar Stellnberger
Linux 4.1.3-1, ScreenShot 2 (no_console_suspend, vga=0) (91.00 KB, image/jpeg)
2015-08-16 11:26 UTC, Elmar Stellnberger
Linux 4.1.3-1, ScreenShot 3 (no_console_suspend, vga=0) (92.50 KB, image/jpeg)
2015-08-16 11:27 UTC, Elmar Stellnberger
dmidecode for the SilverSeraph (8.75 KB, text/plain)
2016-08-27 05:19 UTC, Elmar Stellnberger
freeze just hangs with 4.8.0-rc2 (screenshot) (464.00 KB, image/jpeg)
2016-08-27 10:55 UTC, Elmar Stellnberger
s2ram gives some funny colors with 4.8.0-rc2 (screenshot) (489.50 KB, image/jpeg)
2016-08-27 10:56 UTC, Elmar Stellnberger
standby with 4.8.0-rc2 (screenshot 1 of 3) (508.50 KB, image/jpeg)
2016-08-27 10:57 UTC, Elmar Stellnberger
standby with 4.8.0-rc2 (screenshot 2 of 3) (485.50 KB, image/jpeg)
2016-08-27 11:01 UTC, Elmar Stellnberger
kernel 4.9.6-1-arch (screenshot 1 of 3) (472.50 KB, image/jpeg)
2017-02-14 15:21 UTC, Elmar Stellnberger
kernel 4.9.6-1-arch (screenshot 2 of 3) (524.50 KB, image/jpeg)
2017-02-14 15:22 UTC, Elmar Stellnberger
kernel 4.9.6-1-arch (screenshot 3 of 3) (506.00 KB, image/jpeg)
2017-02-14 15:23 UTC, Elmar Stellnberger
test with vanilla 4.10.0-rc8+, full s2ram (no no_console_suspend), no drm & no nouveau module loaded (286.00 KB, image/jpeg)
2017-02-16 10:30 UTC, Elmar Stellnberger

Description Elmar Stellnberger 2013-10-16 20:07:42 UTC
If I do an 'echo mem >/sys/power/state', the machine no longer wakes up from s2ram, even if no KMS module like nouveau is loaded and even if I additionally add the vga=0 parameter. The SysRq keys no longer work either.
  Because of another, already resolved bug with my SilverSeraph notebook (Bug 42853), I could not find any working kernel version before 2.6.37.1-1.2-default. What can we do?
Comment 1 Lan Tianyu 2013-10-17 06:51:47 UTC
The symptom is that the machine can't awake from s2ram, right? 

Have you tried 3.12-rc5 kernel?

Please provide the output of acpidump and "cat /proc/acpi/wakeup".
Comment 2 Elmar Stellnberger 2013-10-17 21:24:38 UTC
Created attachment 111491 [details]
/proc/acpi/wakeup

same results with kernel 3.12.0-rc5-1-default (with and without vga=0).
Comment 3 Aaron Lu 2013-11-28 02:00:26 UTC
From your description, it seems to be: the system wakes up, but the screen is black. Is it the case?
Comment 4 Elmar Stellnberger 2013-11-28 10:24:27 UTC
No, unfortunately the kernel seems to crash: no sysrq keys and no num lock switching on my usb-keyboard.
Comment 5 Aaron Lu 2013-11-29 01:14:11 UTC
Please follow the kernel document basic-pm-debugging.txt to do some debug:
https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt

In short, you can:
# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state
To see whether any errors are reported.
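
The two commands above can be wrapped in a small loop that walks through every pm_test level in turn. This is only a sketch: the function name `run_pm_tests` and the `SYS_POWER` override are made up for illustration (the override merely allows a dry run against a scratch directory), while the level names are the ones listed in basic-pm-debugging.txt. On a real system it has to run as root.

```shell
#!/bin/sh
# Sketch: step through the pm_test levels from basic-pm-debugging.txt.
# SYS_POWER is a hypothetical override so the loop can be dry-run
# against a scratch directory; on a real system it stays /sys/power.
SYS_POWER="${SYS_POWER:-/sys/power}"

run_pm_tests() {
    # Levels ordered from least to most invasive.
    for level in freezer devices platform processors core; do
        echo "testing pm_test level: $level"
        echo "$level" > "$SYS_POWER/pm_test" || return 1
        # In test mode the kernel suspends up to the chosen level,
        # waits a few seconds and resumes on its own.
        echo mem > "$SYS_POWER/state" || {
            echo "suspend failed at level: $level"
            return 1
        }
    done
    # Switch test mode off again so the next suspend is a real one.
    echo none > "$SYS_POWER/pm_test"
}

# Usage (as root):  run_pm_tests
```

If the machine hangs, the last "testing pm_test level" line printed names the level at which it got stuck.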
Comment 6 Elmar Stellnberger 2013-11-29 14:46:47 UTC
Well, it works with the 'devices' option but not with the 'platform' option; i.e. the machine's ACPI is likely bogus. I just wonder how older kernels have managed to circumvent this problem.
Comment 7 Aaron Lu 2013-12-02 09:02:09 UTC
What's the error message for platform mode test?
Comment 8 Elmar Stellnberger 2013-12-02 09:35:19 UTC
How shall I extract an error message if the machine crashes and the screen stays dark?
Comment 9 Aaron Lu 2013-12-03 03:11:24 UTC
Boot into console mode, with kernel cmdline option "nomodeset no_console_suspend". That may give us some information.
Comment 10 Elmar Stellnberger 2013-12-04 09:26:40 UTC
Sorry; even with the given kernel options the only thing that happens is that a blinking cursor remains in the upper left corner of a blackened screen. It does not resume on pressing the power button, nor does it react to SysRq keys.
Comment 11 Aaron Lu 2013-12-11 03:47:52 UTC
From your comment #1, do you mean v2.6.37 works and all later kernel versions break?
Comment 12 Elmar Stellnberger 2013-12-11 11:04:10 UTC
Yes, everything after v2.6.37 breaks. It does not break because of s2ram but because of Bug 42853 (error windowing PCI bridge). i.e. there will be no chance to bisect because of interference with another bug.
Comment 13 Aaron Lu 2013-12-19 06:56:59 UTC
I don't see how we can proceed without doing bisect...
Comment 14 Elmar Stellnberger 2013-12-19 10:31:59 UTC
Well we could bisect if I could apply the windowing PCI bridge fix on all checked-out kernels even on those that were before the official incorporation of this patch. However I fear that applying the patch to kernels before may likely fail or need additional work to get it incorporated so that I will have to rely on kernels assembled or compiled by you.
Comment 15 Elmar Stellnberger 2013-12-31 22:04:59 UTC
NEEDINFO? BISECT? Can you give me a patch that I could apply on kernels after 2.6.37.1-1.2-default that would resolve issue/bug 42853? - or perhaps a git rebased-branch? None of these kernels will boot otherwise and bisecting will be worthless.
Comment 16 Aaron Lu 2014-01-02 01:15:37 UTC
I have no idea which patch fixed your PCI bridge problem, sorry.
Comment 17 Elmar Stellnberger 2014-01-02 10:24:27 UTC
Created attachment 120631 [details]
patch against hang on windowing PCI bridge

I think I have got the patch though bisecting on a 2001 notebook will take just a while ...
Comment 18 Elmar Stellnberger 2014-01-02 12:08:43 UTC
Created attachment 120641 [details]
errors compiling kernel v3.0 with the patch applied

Rafael; unfortunately I got an error when trying to compile kernel v3.0 with the patch though git apply did succeed without any (error) message first (trying to apply it a second time has of course thrown some errors.).
Comment 19 Elmar Stellnberger 2014-01-02 18:33:47 UTC
AFAICT: 3.0.0-1 default is bad ( compiled all of it by accident and it took a whole day ...), 2.6.37.1-1.2-default is still the last known good kernel.
Comment 20 Elmar Stellnberger 2014-01-03 20:07:31 UTC
last known good: v2.6.37.6-1
first known bad: v2.6.37-rc1
Comment 21 Zhang Rui 2014-06-03 05:41:24 UTC
(In reply to Elmar Stellnberger from comment #20)
> last known good: v2.6.37.6-1
> first known bad: v2.6.37-rc1

how do you get this? v2.6.37-rc1 should be earlier than v2.6.37.6-1, no?

BTW, does "acpi_sleep=nonvs" help in the latest kernel?
Comment 22 Elmar Stellnberger 2014-06-30 16:08:19 UTC
Created attachment 141501 [details]
no_console_suspend + acpi_sleep=nonvs: low memory corruption

Having retested previous settings with acpi_sleep=nonvs I have made an interesting discovery: If you wait a while after resume with no_console_suspend a readable dmesg trace appears on the screen (low memory corruption).
Comment 23 Elmar Stellnberger 2014-06-30 16:12:00 UTC
Created attachment 141511 [details]
no_console_suspend without acpi_sleep=nonvs: low memory corruption

A similar dmesg appears with no_console_suspend even if I do not use acpi_sleep=nonvs. I should likely have been more patient before ...
Comment 24 Elmar Stellnberger 2014-06-30 16:13:26 UTC
Still it is a miracle why it worked with v2.6.37.6-1 but not with v2.6.37-rc1. This is likely due to a reduced kernel configuration.
Comment 25 Zhang Rui 2014-07-01 01:08:07 UTC
Created attachment 141591 [details]
debug patch

can you please try this debug patch and see if the problem still exists?
Comment 26 Zhang Rui 2014-07-01 01:18:47 UTC
I also suspect the problem is caused by commit 72d7c3b. In order to verify this, please
1. git checkout 72d7c3b, build the kernel and see if the problem still exists.
2. if yes, please do "git reset --hard HEAD~1", rebuild the kernel, and see if the problem still exists.
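
As commands, the two steps above might look like the sketch below. The helper names (`run`, `prepare_suspect`, `prepare_parent`) and the `DRY_RUN` switch are made up for illustration, and the plain `make` stands in for whatever build and install procedure is used locally. Note that git only rewrites the files that differ between the two commits, so the second build does not have to start from scratch; how much gets rebuilt depends on which headers those files touch.

```shell
#!/bin/sh
# With DRY_RUN=1 the commands are only printed, which helps to review
# them before starting a day-long build.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: check out the suspected commit and build it.
# "make" is a placeholder for the full local build/install sequence.
prepare_suspect() {
    run git checkout "$1" && run make
}

# Step 2 (only if step 1 still shows the problem): move to the direct
# parent of that commit and rebuild incrementally.
prepare_parent() {
    run git reset --hard HEAD~1 && run make
}

# Usage, with a reboot and an s2ram test after each step:
#   prepare_suspect 72d7c3b
#   prepare_parent
```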
Comment 27 Elmar Stellnberger 2014-07-01 12:52:24 UTC
Created attachment 141721 [details]
diff for commit e396c9d8 which was found working by accident (2.6.37.6-1)

Here is a diff on e396c9d8 (2.6.37.6-1) for the kernel which I have found, by accident, to be working with regard to s2ram. I have checked out what was reported as the last commit by git log -n 1 and then issued a 'diff -cr -x .git linux-stable ../git2/linux-stable >mypatches.diff'. However, I cannot remember having patched as many files as differ from a fresh checkout of the same commit, so it remains a mystery why that kernel was working and how it emerged (that kernel still works though it is from a while after what we have found not to work).

  P.S. I have just started compiling with the patch from attachment 141591 (debug patch) and will respond to your requests shortly.
Comment 28 Zhang Rui 2014-07-01 13:57:41 UTC
I saw there are a lot of differences in the kernel config file, why?
Can you use the same config file for both kernels?
Comment 29 Elmar Stellnberger 2014-07-01 14:23:24 UTC
  Yes there are and my first impression was that this particular kernel was only working because it had a different config (I had to reconfigure the kernel heavily in order to leverage shorter compile times for the bisection; otherwise bisecting would have been impossible with a compile time of more than a half day). However when I compiled the most recent kernel 3.16-rc2 with the same config I had to notice that it did work as little as all the others (i.e. it did not work; all new options were set to their default).
  Zhang, let us now have a look at what your patch will yield since compilation has just finished.
Comment 30 Elmar Stellnberger 2014-07-01 15:58:17 UTC
  Unfortunately the debug.patch did not yield any other result (attachment 141591 [details]: with no_console_suspend I get exactly the same message as in the attached jpeg; without it simply does not resume from s2ram).
  I am currently compiling commit 72d7c3b as you have suggested in comment 26 to see whether it makes a difference to the previous commit.
Comment 31 Elmar Stellnberger 2014-07-01 16:13:26 UTC
Created attachment 141731 [details]
error messages trying to compile 2.6.35-492-g72d7c3b
Comment 32 Elmar Stellnberger 2014-07-01 16:17:25 UTC
Created attachment 141751 [details]
error messages trying to compile 2.6.35-492-g72d7c3b_HEAD~1

Now these error messages look somewhat similar to what I had already patched in the past on my own. Nonetheless, please provide me with some verified patch to make this compile reproducibly.
Comment 33 Elmar Stellnberger 2014-07-01 17:27:28 UTC
Created attachment 141761 [details]
ptrace.patch used to compile 2.6.35-492-g72d7c3b (excluding ";"-patch)
Comment 34 Elmar Stellnberger 2014-07-01 21:03:43 UTC
concerning comment 26: Do I really need to do a git reset ** --hard ** ? Compiling from scratch will last another entire day. However if I could do sth. like a soft reset or simple checkout that would complete within minutes as only a few files would have to be rebuilt.
Comment 35 Elmar Stellnberger 2014-07-01 22:28:58 UTC
2.6.35-492-g72d7c3b does not want to compile. Apart from the fact that I had to add a ';' in arch/x86/kernel/e820.c and apply the ptrace.patch, it has only compiled with 'make -i', 'make -i modules'. However, the System.map-2.6.35-4-desktop+ file stayed empty due to the errors in the following attachment. That system would not have been bootable without modules.
Comment 36 Elmar Stellnberger 2014-07-01 22:30:04 UTC
Created attachment 141781 [details]
some errors trying to install modules from 2.6.35-492-g72d7c3b
Comment 37 Elmar Stellnberger 2014-07-01 22:31:03 UTC
Created attachment 141791 [details]
errors trying to make modules of 2.6.35-492-g72d7c3b without -i
Comment 38 Zhang Rui 2014-07-01 22:40:13 UTC
please attach the full kernel config file you're using
Comment 39 Elmar Stellnberger 2014-07-01 22:44:06 UTC
Created attachment 141801 [details]
kernel config file used for trying to compile 2.6.35-492
Comment 40 Elmar Stellnberger 2014-07-01 22:51:11 UTC
Created attachment 141811 [details]
errors trying to make 2.6.35-492-g72d7c3b_head~1 without -i

... and that are the classical errors for make without -i (same/similar for previous commit).
Comment 41 Zhang Rui 2014-07-01 22:53:40 UTC
does the problem still exist if you boot with kernel parameter "memmap=8K$0x0"?
Comment 42 Zhang Rui 2014-07-01 22:58:45 UTC
Elmar, then please give up building that kernel.

I will ask the author of that commit for help to find a cleanup to verify this.

Hi, Yinghai,

can you help look at this issue please? It seems that there is memory corruption detected after resume, but I'm wondering why this is a regression. We tried to check whether it is commit 72d7c3b that introduced this problem but failed, because it cannot be reverted, even in the 2.6.37-rc1 kernel.
Comment 43 Elmar Stellnberger 2014-07-01 23:07:58 UTC
Unfortunately it does not want to boot with memmap=8K$0x0 (the only thing a boot attempt with this parameter will show is a blinking cursor in the left upper edge (3.16.0-rc2-4-desktop+)).
Comment 44 Yinghai Lu 2014-07-02 06:28:41 UTC
(In reply to Zhang Rui from comment #42)

> can you help look at this issue please? It seems that there is memory
> corruption detected after resume, but I'm wondering why this is a
> regression. We tried to check if it is commit 72d7c3b that introduce this
> problem but failed, because it can not be reverted, even in 2.6.37-rc1
> kernel.

that huge patch is based on 2.6.35?

That patch only could affect real_mode_blob position.

Please boot kernel with "debug ignore_loglevel" and compare the value between working kernel and non-working one from bootlog.
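
One way to compare the two boot logs is to strip the timestamps and diff only the lines of interest. The sketch below assumes both logs were saved to files; the function names, the file names and the grep pattern in the usage comment are placeholders, since the exact wording of the relevant boot messages differs between kernel versions.

```shell
#!/bin/sh
# Remove the leading "[    0.000000] " timestamp from dmesg lines so
# that two boots can be compared line by line.
strip_ts() {
    sed -e 's/^\[ *[0-9]*\.[0-9]*] //'
}

# Print a unified diff of the lines matching a pattern in two logs.
# $1 = pattern, $2 = log of the working kernel, $3 = log of the bad one
# Returns 0 if the selected lines are identical, non-zero otherwise.
compare_bootlogs() {
    good=$(mktemp) && bad=$(mktemp) || return 1
    grep -i -e "$1" "$2" | strip_ts > "$good"
    grep -i -e "$1" "$3" | strip_ts > "$bad"
    diff -u "$good" "$bad"
    status=$?
    rm -f "$good" "$bad"
    return $status
}

# Usage (file names and pattern are placeholders):
#   compare_bootlogs e820 dmesg-good.txt dmesg-bad.txt
```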
Comment 45 Elmar Stellnberger 2014-07-02 08:40:46 UTC
Created attachment 141891 [details]
2.6.37.6-1-default vga=0 modeset=0 debug ignore_loglevel: s2ram has worked
Comment 46 Elmar Stellnberger 2014-07-02 08:44:04 UTC
Created attachment 141901 [details]
3.11.6-4-desktop vga=0 modeset=0 debug ignore_loglevel: s2ram has not worked

Now that is a very different kernel. If you like that better I can recompile 2.6.37-rc1 until tomorrow (I had to delete some old non working kernels as my hard disk got filled up.).
Comment 47 Elmar Stellnberger 2014-07-02 12:19:56 UTC
Created attachment 141911 [details]
2.6.38-1 vga=0 modeset=0 debug ignore_loglevel: does not work
Comment 48 Elmar Stellnberger 2014-07-02 16:25:56 UTC
Created attachment 141921 [details]
2.6.37-1 (2d90508) resume image: m.=0 v.=0 debug ignore_loglevel no_console_suspend

With debug and ignore_loglevel no_console_suspend does also yield other results; perhaps they are interesting for you (see for the jpeg taken with 2.6.37 2d90508).
Comment 49 Yinghai Lu 2014-07-02 18:20:08 UTC
That already gets to a rather late stage.

Please check out

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/power/s2ram.txt?id=refs/tags/v3.16-rc3

to find the offending driver.
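
The PM_TRACE procedure from that document boils down to roughly the sketch below. It assumes a kernel built with CONFIG_PM_TRACE and root privileges; the function names and the `SYS_POWER` override are only illustrative. Be aware that pm_trace stores its marker in the RTC, so the clock will be wrong after the test.

```shell
#!/bin/sh
# SYS_POWER is a hypothetical override for dry runs; normally /sys/power.
SYS_POWER="${SYS_POWER:-/sys/power}"

# Step 1: arm the tracer and suspend. If the machine hangs on resume,
# reboot it promptly; the last trace point survives in the RTC.
trace_suspend() {
    echo 1 > "$SYS_POWER/pm_trace" || return 1
    echo mem > "$SYS_POWER/state"
}

# Step 2 (after the reboot): show the source locations and devices
# whose hash matches the saved trace point. Reads a log on stdin.
hash_matches() {
    grep 'hash matches'
}

# Usage:
#   trace_suspend            # before the hang
#   dmesg | hash_matches     # after rebooting
```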
Comment 50 Elmar Stellnberger 2014-07-02 18:36:20 UTC
You mean the problem is due to a bogus driver? Was there anything you could see in the logs?
How can the dmesg contain anything after I have pressed the power button? - I thought a reset would inescapably clear that buffer. Do I need to mount a root partition for that and look into /var/log/messages after rebooting?
Gonna use 3.16.0-rc2 (PM_TRACE and PM_DEBUG are already enabled there).
Comment 51 Elmar Stellnberger 2014-07-03 14:11:12 UTC
Now I know what is causing the s2ram problems! 
It is the patch of attachment #2 ('patch against hang on windowing PCI bridge'). If I compile kernel 2.6.37.6-1 with this patch, s2ram will not work. If I revert the patch, it works. However, the patch is an absolute requirement for later kernels, as my machine won't boot without it. Please have a look at it and how it could interfere with s2ram.
Comment 52 Elmar Stellnberger 2014-07-03 14:16:30 UTC
Rafael, can you still remember about your patch? - and how we could make it fit for s2ram.
Comment 53 Zhang Rui 2014-07-04 00:17:36 UTC
Okay, here is the status
1. commit cc2893b, shipped in 2.6.34-rc6, fixed a problem where the power of some PCI devices was not restored after resume.
2. commit db288c9, shipped in 3.6-rc1, aka the patch you applied manually, fixed a problem in commit cc2893b, which may change the device's power unnecessarily, multiple times. And it breaks s2ram if you apply it manually on 2.6.37-rc1.
3. the vanilla kernel stops booting on your machine, but commit db288c9 can work around it.

(In reply to Elmar Stellnberger from comment #51)
> Now I know what is causing the s2ram problems! 
> It is the patch of attachement #2 ('patch against hang on windowing PCI
> bridge'). If I compile kernel 2.6.37.6-1 with this patch s2ram will not
> work. If I disapply the patch then it works. However the patch is an
> absolute requirement for later kernels as my machine won`t boot without it
> otherwise. Please have a look on it and how it could interfere with s2ram.

Please define "later kernel". It seems that 2.6.37-rc1 works well without that patch, right? So which kernel stops booting on your machine?
Comment 54 Elmar Stellnberger 2014-07-04 15:39:07 UTC
last working kernel: 2.6.37 (37c2ac7) 13.Jan.2011
first non-working kernel: 2.6.37 (891cc22) 14.Jan.2011

The last working kernel boots right away and can s2ram without problems.
The first non-working kernel does not boot without patch #2 (windowing pci bridge), and as soon as the patch is applied it boots but can no longer s2ram (the problems are the same as in the .jpegs above: vga=0 modeset=0 ignore_loglevel debug init=/bin/bash -> early resume, clocksource tsc unstable -xxx).
Comment 55 Zhang Rui 2014-10-23 11:41:39 UTC
Hi, Elmar,
what is the status in the latest upstream kernel? as I saw there are several changes similar to patch #2 merged in Linus tree.
Comment 56 Elmar Stellnberger 2014-10-26 19:08:53 UTC
Created attachment 155241 [details]
linux3.17.1-4_img1_BIOS-resume_without_uhci_hcd.JPG
Comment 57 Elmar Stellnberger 2014-10-26 19:10:51 UTC
Created attachment 155251 [details]
linux3.17.1-4_img1_BIOS-resume_without_uhci_hcd.JPG

Now that has revealed a very different error: uhci_hcd wakeup seems to be disabled by the ACPI ... (tested with v3.17.1-4 vanilla from the 15th Oct. 2014)
Comment 58 Elmar Stellnberger 2014-10-26 19:13:52 UTC
Created attachment 155261 [details]
linux3.17.1-4_img2_BIOS-corruption-backtrace.JPG

... tested both with vga=0 modeset=0 init=/bin/bash debug ignore_loglevel and no_console_suspend.
Comment 59 Elmar Stellnberger 2014-10-26 19:17:48 UTC
Amazing that all of it had worked with early 2.6.37 kernels. It takes a few seconds until the BIOS corruption (second picture) is displayed.
Comment 60 Zhang Rui 2014-12-02 07:32:48 UTC
 (In reply to Elmar Stellnberger from comment #59)
> Amazing that all of it had worked with early 2.6.37 kernels.

what do you mean? 2.6.37-rc1 also works now?

> It takes a few
> seconds until the BIOS corruption (second picture) is displayed.
Comment 61 Elmar Stellnberger 2014-12-03 17:09:11 UTC
"had" means a long long time ago: kernel 2.6.37.6-0.11 and earlier (as stated in Bug 42853).
Comment 62 Elmar Stellnberger 2015-03-19 22:07:37 UTC
Retested with Linux archiso 3.18.6-1-ARCH #1 SMP PREEMPT Sat Feb 7 08:59:29 CET 2015 i686 GNU/Linux, with exactly the same result as in attachment #5 (no_console_suspend without acpi_sleep=nonvs: low memory corruption, image/jpeg, 2014-06-30 16:12 UTC): the same memory regions were corrupted by the same values.
Comment 63 Elmar Stellnberger 2015-03-19 22:34:38 UTC
Created attachment 171251 [details]
nolapic noapic debug ignore_loglevel - wakeup1
Comment 64 Elmar Stellnberger 2015-03-19 22:36:07 UTC
Created attachment 171261 [details]
nolapic noapic debug ignore_loglevel - wakeup2
Comment 65 Elmar Stellnberger 2015-03-19 22:40:46 UTC
Created attachment 171271 [details]
nolapic noapic debug ignore_loglevel - wakeup3

all tested with the same kernel (3.18.6-1-ARCH #1 SMP PREEMPT Sat Feb 7 08:59:29 CET 2015 i686). I had forgotten about a 3rd screenshot the last time; otherwise nolapic and noapic (which are said to resolve some s2ram issues) do not seem to have made any difference (well, there was one small difference, 'switching clocksource', in screenshot 1).
Comment 66 Yinghai Lu 2015-03-19 23:47:47 UTC
Can you send out whole boot log with "debug ignore_loglevel"?
Comment 67 Elmar Stellnberger 2015-03-20 11:04:37 UTC
Created attachment 171481 [details]
dmesg with debug ignore_loglevel log_buf_len=1M

kernel parameters: dmesg with debug ignore_loglevel log_buf_len=1M vga=0 nouveau.modeset=0 blacklist.noveau=1 no_console_suspend

Rafael J. Wysocki, Yinghai Lu; please also have a look at Bug 95141 or ask someone to do so! It is of high importance for me and my work.
Comment 68 Elmar Stellnberger 2015-05-20 15:52:31 UTC
Created attachment 177511 [details]
Linux 4.0.1-1, ScreenShot 1 (no_console_suspend, vga=0)

Now re-tested with Linux 4.0.1-1-ARCH (nouveau blacklisted, vga=0, no_console_suspend). Now what it displays is somewhat different.
Comment 69 Elmar Stellnberger 2015-05-20 15:53:16 UTC
Created attachment 177521 [details]
Linux 4.0.1-1, ScreenShot 2 (no_console_suspend, vga=0)

... and after a while these messages.
Comment 70 Zhang Rui 2015-06-26 01:02:37 UTC
Yinghai,
do you have any updates on this?
Comment 71 Elmar Stellnberger 2015-06-26 19:25:50 UTC
Not yet; the computer (SilverSeraph) is still around here somewhere. Do you think I should re-test with a newer kernel?
Comment 72 Zhang Rui 2015-08-10 03:36:37 UTC
yes, please.
Comment 73 Elmar Stellnberger 2015-08-10 08:45:41 UTC
OK, I will see what I can do. We are currently living at lake Ossiach and do not have cable internet. Nonetheless it should be possible to fetch an arch cd iso or so wirelessly via hsdpa if that should be recent enough. It may take some time until I can fetch the SilverSeraph up from the Gerlitzen as well.
Comment 74 Elmar Stellnberger 2015-08-16 11:25:35 UTC
Created attachment 185061 [details]
Linux 4.1.3-1, ScreenShot 1 (no_console_suspend, vga=0)
Comment 75 Elmar Stellnberger 2015-08-16 11:26:19 UTC
Created attachment 185071 [details]
Linux 4.1.3-1, ScreenShot 2 (no_console_suspend, vga=0)
Comment 76 Elmar Stellnberger 2015-08-16 11:27:57 UTC
Created attachment 185081 [details]
Linux 4.1.3-1, ScreenShot 3 (no_console_suspend, vga=0)

Linux 4.1.3-1: not so much different though it now shows a different number with the hung kworker task.
Comment 77 Elmar Stellnberger 2015-11-11 18:53:04 UTC
No apparent differences with kernel 4.2.5-1-ARCH #1 SMP PREEMPT Tue Oct 27 08:28:41 CET 2015 i686 GNU/Linux.
Comment 78 Elmar Stellnberger 2016-08-27 05:19:24 UTC
Created attachment 230431 [details]
dmidecode for the SilverSeraph

Who has disabled s2ram for the SilverSeraph?
cat /sys/power/state solely shows disk without mem any more.
I guess it would have worked well since the following patch: 
commit 65ea11ec6a82b1d44aba62b59e9eb20247e57c6e (Ville Syrjälä - x86/hweight: Don't clobber %rdi).
Any way to re-enable this by a patch?
Comment 79 Elmar Stellnberger 2016-08-27 05:24:58 UTC
Oops; mem was just disabled by my last boot options. Please ignore my previous message; I am going to re-test as soon as I have a suitable i586 kernel.
Comment 80 Elmar Stellnberger 2016-08-27 10:55:24 UTC
Created attachment 230461 [details]
freeze just hangs with 4.8.0-rc2 (screenshot)
Comment 81 Elmar Stellnberger 2016-08-27 10:56:28 UTC
Created attachment 230471 [details]
s2ram gives some funny colors with 4.8.0-rc2 (screenshot)
Comment 82 Elmar Stellnberger 2016-08-27 10:57:30 UTC
Created attachment 230481 [details]
standby with 4.8.0-rc2 (screenshot 1 of 3)
Comment 83 Elmar Stellnberger 2016-08-27 11:01:05 UTC
Created attachment 230491 [details]
standby with 4.8.0-rc2 (screenshot 2 of 3)

  Unfortunately I did not hit the third screenshot containing more information because I had already taken the sdcard out of my camera in order to read it with my computer. Anyone who would be interested in working on this may ask me for the third screenshot ...
  ... now that looks different than before because we have a full backtrace here.
Comment 84 Elmar Stellnberger 2016-08-29 19:13:58 UTC
Don't know if it has anything to do with it, but the kernel complains that irq 11 is unhandled on that system.
Comment 85 Zhang Rui 2016-12-22 01:03:33 UTC
please attach the acpidump output.
Comment 86 Zhang Rui 2017-02-14 03:23:39 UTC
Well, what an old bug. Unfortunately, it seems the problem is still there, right?

You have verified that /sys/power/pm_test works with "devices" mode, but failed with "platform" mode, can you confirm this in the latest upstream kernel?

I think we should ignore the low memory corruption at the moment, and focus on what makes the difference.
Comment 87 Elmar Stellnberger 2017-02-14 15:19:39 UTC
  Today re-tested with 4.9.6-1_arch from Thu. 2017-01-26 09:41:20:
'echo freezer/devices/platform/processors/core >/sys/power/pm_test' appears to do nothing (no crash, hang or error)
  
  However when I do 'echo mem >/sys/power/state' I will get three screens of errors similar to how it was before (going to upload images shortly).
Comment 88 Elmar Stellnberger 2017-02-14 15:21:48 UTC
Created attachment 254749 [details]
kernel 4.9.6-1-arch (screenshot 1 of 3)
Comment 89 Elmar Stellnberger 2017-02-14 15:22:42 UTC
Created attachment 254751 [details]
kernel 4.9.6-1-arch (screenshot 2 of 3)
Comment 90 Elmar Stellnberger 2017-02-14 15:23:30 UTC
Created attachment 254753 [details]
kernel 4.9.6-1-arch (screenshot 3 of 3)
Comment 91 Elmar Stellnberger 2017-02-14 15:30:10 UTC
  Just noted that the nouveau module was still loaded though I had booted with 'vga=0 nouveau.modeset=0 nouveau.blacklist=1 no_console_suspend debug ignore_loglevel init=/bin/bash'. However I would believe that it would not have minded since we have been using no_console_suspend.
  If you need a test with an absolutely recent kernel I would need to read into how to compile 4.9.x kernels first. The 4.9.x kernels appear to require some gpg-signed key; otherwise they seem to refuse compilation.
Comment 92 Zhang Rui 2017-02-15 06:29:41 UTC
(In reply to Elmar Stellnberger from comment #91)
>   Just noted that the nouveau module was still loaded though I had booted
> with 'vga=0 nouveau.modeset=0 nouveau.blacklist=1 no_console_suspend debug
> ignore_loglevel init=/bin/bash'. However I would believe that it would not
> have minded since we have been using no_console_suspend.

I'm not sure, but it's better to confirm if the problem is still there with nouveau module totally removed...

(In reply to Elmar Stellnberger from comment #87)
>   Today re-tested with 4.9.6-1_arch from Thu. 2017-01-26 09:41:20:
> 'echo freezer/devices/platform/processors/core >/sys/power/pm_test' appears
> to do nothing (no crash, hang or error)
>   
>   However when I do 'echo mem >/sys/power/state' I will get three screens of
> errors similar to how it was before (going to upload images shortly).

so it seems that the symptom is different, pm_test always works, but suspend-to-mem fails, right?

can you please verify if freeze mode works or not? (echo freeze > /sys/power/state)
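
Before writing to /sys/power/state it can help to check which sleep states the running kernel actually offers, since a state missing from that file cannot be entered. A small sketch; the function name `state_supported` and the `SYS_POWER` override are made up for illustration:

```shell
#!/bin/sh
# SYS_POWER is a hypothetical override for dry runs; normally /sys/power.
SYS_POWER="${SYS_POWER:-/sys/power}"

# Succeeds if the given state (freeze, standby, mem, disk) is listed
# in the state file, which holds the supported states separated by
# spaces.
state_supported() {
    for s in $(cat "$SYS_POWER/state"); do
        [ "$s" = "$1" ] && return 0
    done
    return 1
}

# Usage (as root):
#   state_supported freeze && echo freeze > /sys/power/state
```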
Comment 93 Elmar Stellnberger 2017-02-15 07:35:03 UTC
I have tested pm_test with all combinations: freezer, devices, platform, processors and core. None of these options did yield any (negative) effect.
Comment 94 Elmar Stellnberger 2017-02-16 10:27:17 UTC
  Today I have re-tested with kernel 4.10.0-rc8+ from Wed 2017-02-15 15:47:58 CET with no drm and nouveau in the initrd and the results were pleasant:
* vga=0 no_console_suspend debug ignore_loglevel init=/bin/bash: machine awakes fine from echo mem >/sys/power/state (printing two lines of messages).
* vga=0 debug ignore_loglevel init=/bin/bash: machine first awakes from s2ram printing 2x2 unreadable message lines but then keeps hanging (the textmode cursor still blinking at the bottom)

i.e. it already works with no_console_suspend; however it hangs on restoring the video output which is still garbled to some extent (see for the image I will upload).
Comment 95 Elmar Stellnberger 2017-02-16 10:30:58 UTC
Created attachment 254787 [details]
test with vanilla 4.10.0-rc8+, full s2ram (no no_console_suspend), no drm & no nouveau module loaded
Comment 96 Zhang Rui 2017-04-12 06:25:24 UTC
so let's stick with latest upstream kernel,
1. what do you get without any workaround?
2. what do you get with boot option nomodeset?
3. what do you get with boot options nomodeset no_console_suspend?

In any of these cases, if the screen does not come back, please try to log in to the machine remotely to see if the system hangs completely, or if it is just the screen that does not come back.

If it is a system hang, please redo the test with init=/bin/bash for all the cases.
Suspend via the console and check whether the screen comes back; if not, please type "reboot" blindly and see if the machine reboots successfully.
Comment 97 Chen Yu 2017-04-24 05:13:42 UTC
ping.
Comment 98 Zhang Rui 2017-06-16 03:00:01 UTC
I'd say this is a tough bug, and we had better check the symptom in the latest upstream kernel to see whether it has improved or not.
Bug temporarily closed as there is no response so far.
Please feel free to re-open it once you can provide the information requested in comment #96.
Comment 99 Elmar Stellnberger 2022-04-18 19:40:58 UTC
  This bug is resolved in kernel 5.15.32-desktop586-1.mg8. I have tested standby, mem (s2ram) and hibernation. Under init=/bin/bash and nomodeset, no_console_suspend is required for mem/s2ram. Nonetheless the screen powers off well in spite of this option being given. Standby worked at the first try. Freeze apparently did not, but I have not examined that. Under nouveau/the GUI it came up from s2ram even without no_console_suspend; standby is not supported by systemctl, so it went untested there. Suspend to disk has worked without problems.
Comment 100 Elmar Stellnberger 2022-04-18 20:07:38 UTC
I would not recommend no_console_suspend with nouveau/X11 on that machine. It shows screen distortion after wakeup and does not always wake up successfully. Without it everything should go fine.
