Hello,

I have had my desktop for about 5 years and suspend to RAM has never worked. I never tried to make it work until now. I tried everything I found on the web with no luck.

The first 2 times, suspend / resume works perfectly, but the third time the PC freezes. The PC can freeze while going to suspend, which it never seems to reach (it stays on), or, more frequently, on resume, ending up with a black screen or a frozen desktop. Very often the keyboard doesn't respond (Caps Lock LED).

Here is what I tried:
- it works with Windows
- tried to suspend on init 3
- disabled a lot of things in the BIOS
- tried to blacklist all kernel modules
- tried nouveau and nvidia drivers (GT430)
- tried another graphics card (HD 5750) with the radeon driver
- tried other SATA ports for my SSD (Samsung 840)
- tried to downgrade / upgrade the motherboard BIOS
- tried with only one RAM module, and in other RAM slots too
- unplugged everything I could, externally and internally
- tried Manjaro (KDE 5.7), Fedora 24 (Gnome 3.20), Fedora 25 (live), Ubuntu 12.04, 14.04, 16.04, 16.10 (all live)
- tried to follow some parts of https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues and https://www.kernel.org/doc/Documentation/power/s2ram.txt
- tried to unbind / bind some drivers with a systemd-suspend script
- etc...

The motherboard is this one: http://www.gigabyte.eu/products/product-page.aspx?pid=3766&dl=1#ov

I will be happy to provide any info you need to help me make suspend to RAM work.

Thanks !
Created attachment 242041 [details] journalctl -r Here is a log of "journalctl -r" with a successful suspend/resume cycle and a failed one
Created attachment 242051 [details] blacklist mod Here is a blacklist module conf file that I tried
Created attachment 242061 [details] systemd-suspend Here is a systemd-suspend script that I tried
Created attachment 242191 [details] lspci
Using this script to suspend

#!/bin/sh
sync
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

gives this in dmesg after reboot

[ 0.903446] Magic number: 0:422:762
[ 0.903450] hash matches drivers/base/power/main.c:818
[ 0.903456] atkbd serio0: hash matches

and "cat /sys/power/pm_trace_dev_match" gives

input
serio

I tried to rmmod serio_raw but that doesn't change anything. It seems atkbd can't be unloaded.
I tried the "freezer", "devices", "platform", "processors" and "core" tests with this script and it seems that every test worked

#!/bin/sh
sync
echo 1 > /sys/power/pm_trace
echo core > /sys/power/pm_test
echo mem > /sys/power/state

cat /sys/kernel/debug/suspend_stats
success: 26
fail: 0
failed_freeze: 0
failed_prepare: 0
failed_suspend: 0
failed_suspend_late: 0
failed_suspend_noirq: 0
failed_resume: 0
failed_resume_early: 0
failed_resume_noirq: 0
failures:
  last_failed_dev:
  last_failed_errno: 0
                     0
  last_failed_step:

What does that mean ?
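A note on reading the counters above. The pm_test levels stop before the machine really enters the sleep state, so "success: 26, fail: 0" only says that the kernel-side suspend steps ran without a reported error. It says nothing about the final entry into S3 or the wake-up through the firmware, which matches the later suspicion in this thread that the BIOS or the wake-up itself is involved. A minimal sketch for pulling a single counter out of the file follows. The function name suspend_stat is made up for illustration, and the real file needs root and a mounted debugfs.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print one counter from a
# suspend_stats dump.
#   $1 = counter name, e.g. "success" or "fail"
#   $2 = file to read (default: /sys/kernel/debug/suspend_stats)
suspend_stat() {
    # Match the whole first field so "fail" does not match "failed_freeze".
    awk -v key="$1:" '$1 == key { print $2; exit }' \
        "${2:-/sys/kernel/debug/suspend_stats}"
}
```

Called as "suspend_stat fail" after each test cycle, this makes it easy to see whether the kernel itself ever recorded a failure.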
I tried with "acpi=off" boot option but that doesn't change anything.
please attach the dmesg after two good suspend cycles.
Created attachment 242901 [details] dmesg after two good suspend cycles Here it is. Thank you !
Hello Rui (Zhang),

I need assistance from Rafael (Wysocki), email: rjw@wysocki.net, also rafael.j.wysocki@intel.com . You are INTEL also, I 100% know. You need to contact him in your own department, INTEL OTC, since this is something I never saw before!

We have here INTEL possible ugly problem, with CORE2 family Sandy Bridge. The data are here:
Sandy Bridge: CPUID = 0x206A7, ark.intel.com: http://ark.intel.com/products/52227/Intel-Core-i7-2820QM-Processor-8M-Cache-up-to-3_40-GHz

So, we have failing log for PM suspend from kernel.org derived kernel... And log is here: https://bugzilla.kernel.org/attachment.cgi?id=242051

Here are very confusing lines from it:

[ 181.127124] PM: noirq resume of devices complete after 11.258 msecs
[ 181.127463] PM: early resume of devices complete after 0.311 msecs
[ 181.127509] usb usb1: root hub lost power or was reset
[ 181.127702] usb usb2: root hub lost power or was reset
[ 181.131402] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
[ 181.131419] rtc_cmos 00:01: System wakeup disabled by ACPI
[ 181.131568] pcieport 0000:00:1c.2: System wakeup disabled by ACPI
[ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported

Please, do note the following two lines:
[ 181.131402] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
[ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported

The cache line size in x86/x86_64 (IA32 and 64) is ONLY/SOLELY 64 bytes (you should know this).

I also need this info, ASAP!

Thank you,
_nobody_
(In reply to Niemand from comment #10)
> Hello Rui (Zhang),
>
> I need assistance from Rafael (Wysocki), email: rjw@wysocki.net, also
> rafael.j.wysocki@intel.com . You are INTEL also, I 100% know. You need to
> contact him in your own department, INTEL OTC, since this is something I
> never saw before!

Yes, I work with Rafael, and both of us work on all kinds of suspend/resume related issues. :)

>
> We have here INTEL possible ugly problem, with CORE2 family Sandy Bridge.
> The data are here:
> Sandy Bridge: CPUID = 0x206A7, ark.intel.com:
> http://ark.intel.com/products/52227/Intel-Core-i7-2820QM-Processor-8M-Cache-
> up-to-3_40-GHz
>

Usually, you just need to file a new bug report here, so the problem will be on our radar.

> So, we have failing log for PM suspend from kernel.org derived kernel...

What kernel version are you using? Please try the latest upstream kernel, say, 4.9-rc3.

> And
> log is here: https://bugzilla.kernel.org/attachment.cgi?id=242051
>

This looks like the driver blacklist, are you sure this is the right link?
Hmm, it seems that you already have a bug report in the kernel bugzilla, right? Can you tell me the bug id?

> Here are very confusing lines from it:
>
> [ 181.127124] PM: noirq resume of devices complete after 11.258 msecs
> [ 181.127463] PM: early resume of devices complete after 0.311 msecs
> [ 181.127509] usb usb1: root hub lost power or was reset
> [ 181.127702] usb usb2: root hub lost power or was reset
> [ 181.131402] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
> [ 181.131419] rtc_cmos 00:01: System wakeup disabled by ACPI
> [ 181.131568] pcieport 0000:00:1c.2: System wakeup disabled by ACPI
> [ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported
>
> Please, do note the following two lines:
> [ 181.131402] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
> [ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported
>

This seems like a device/driver problem to me.
What's the symptom of the problem? Again, please show me the bug report that describes the problem in detail.
(In reply to paviluf from comment #9)
> Created attachment 242901 [details]
> dmesg after two good suspend cycles
>

[ 4.174888] nvidia: module license 'NVIDIA' taints kernel.
[ 4.174891] Disabling lock debugging due to kernel taint
[ 4.178792] nvidia: module verification failed: signature and/or required key missing - tainting kernel

Please retest without any out-of-tree driver.
Created attachment 243341 [details] dmesg without any out of tree driver I already tried with nouveau and even an other graphic card (HD 5750) with radeon driver with the same result. Here is dmesg without any out of tree driver.
Hello Rui,

I see that you completely missed my points in my comment 10. All good, I'll take another approach, so you can understand me much better.

I am not talking here about another (use) case or another driver; I am trying to support paviluf's case using paviluf's logs. I supplied logs from his blacklisted log in my comment 10. But now I supply his normal logs from https://bugzilla.kernel.org/show_bug.cgi?id=178641#c13 - Comment 13

And here, again, in his normal log: attachment 243341 [details], we have the following:

[ 78.858081] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
[ 78.858201] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported

I am not saying that these two lines are causing the resume hang-up, but I NEVER saw (I'll repeat) such a message: "cache line size of 4 is not supported"! If you start googling "cache line size of 4 is not supported", you never find one on the net (or I did not look carefully/in-depth enough)!

You should try on ANY INTEL platform on ANY Linux to execute the following commands:

$ cat /proc/cpuinfo | grep cache_alignment
$ getconf LEVEL1_ICACHE_LINESIZE
$ getconf LEVEL1_DCACHE_LINESIZE

And tell me what number you see. You should see ONLY one number, guess which? ;-)

_nobody_
(In reply to Niemand from comment #14)

Hello Niemand,

Thanks for helping me. I don't know if this is good.

$ cat /proc/cpuinfo | grep cache_alignment
cache_alignment : 64
cache_alignment : 64
cache_alignment : 64
cache_alignment : 64

$ getconf LEVEL1_ICACHE_LINESIZE
64

$ getconf LEVEL1_DCACHE_LINESIZE
64
Yes, paviluf, this is exactly what the magic number is... 64 (bytes)! Neither 16, 32, or 128, even 256... And INTEL does know this (they had created this cache alignment in HW long time ago)! ;-) So, let us see if Arjan van de Ven <arjan@linux.intel.com> (INTEL kernel maintainer) knows answers to my quest?! ;-) Cache line size of 4 is not supported https://bugzilla.redhat.com/show_bug.cgi?id=1390298 So, I have created the new thread. And I expect that mainly INTEL people (OTC) will come back to me on this one... Don't they? ;-) _nobody_
(In reply to Zhang Rui from comment #12) > please retest without any out of tree driver. I hope you can look at this problem. Thanks !
(In reply to Niemand from comment #10)
> [ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported
>
> Please, do note the following two lines:
> [ 181.131402] ehci-pci 0000:00:1a.0: cache line size of 4 is not supported
> [ 181.131588] ehci-pci 0000:00:1d.0: cache line size of 4 is not supported
>
> The cache size in x86/x86_64 (IA32 and 64) are ONLY/SOLELY 64 bytes (you
> should know this).
>

It looks like all the PCI devices in your system reported that they only support a cache line size of 4 bytes, which is read from PCI config space. The I/O space might already be broken.
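For anyone who wants to look at the value discussed above, the Cache Line Size register of a PCI device can be read from config space with setpci (from pciutils). The sketch below assumes the register counts 32-bit words, as the PCI specification defines it, so the raw value has to be multiplied by 4 to get bytes. The helper name cls_to_bytes is made up for illustration, and nothing here was run on the reporter's machine.

```shell
#!/bin/sh
# Hypothetical helper: convert a raw Cache Line Size register value
# (hex, as printed by setpci) to bytes, assuming the register counts
# 32-bit words.
cls_to_bytes() {
    echo $(( 0x$1 * 4 ))
}

# Possible usage on the affected machine (may need root):
#   cls_to_bytes "$(setpci -s 00:1a.0 CACHE_LINE_SIZE)"
#   cls_to_bytes "$(setpci -s 00:1d.0 CACHE_LINE_SIZE)"
```

Comparing the value before suspend and after a successful resume would show whether the register content changes across the sleep cycle.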
> It looks like all the pci devices in your system reported that they only > support cache line size of 4 bytes, which is read from pci config space. > The io space might already be broken. Hello Yu (Chen), I need clarification on your statement (since your statement is very vague). Namely, the following questions to be answered? [1] Did you ever experience that pci devices on INTEL platform reported that they only support cache line size of 4 bytes? If YES, I need very hard/real proof of that! Hard evidences! Net pointers... Use cases, presented. You name it! [2] The statement: "The IO space might already be broken" needs further explanation. I'll give you the examples: [A] Broken by factory (silicon invalidated/useless) assembly/process? [B] Broken by in operational use (platform use)? [C] Broken by FW/SW use? [D] You name it! Are you sure you have enough expertise about this problem??? Please, give me some assurances?! Thank you, _nobody_
(In reply to Chen Yu from comment #18) > The io space might already be broken. What do you mean ? My computer work very well on Linux since years except this problem.
Any news ?
It will be great if we can find a fix. Windows works very well for example. I hope someone is interested in the challenge.
paviluf, I have proposal to you. Could you, please, upgrade your system to Fedora 25? And see if this solves your problem? As I remember, you are with Fedora 24 (I again went through logs to confirm that). Maybe this will solve you problem, but I doubt. Thank you, _nobody_
Thanks for the proposal but, as I said in my first post, I already tried with Fedora 25 live and it was the same :(
This is an excellent info (sorry I am forgetting the details), do NOT forget, that this Bugzilla has limited lifespan, and within 6 months from today will be closed without the proper solution (when Fedora 24 gets outdated). And you want to keep it alive (I want to keep it alive, since it is interesting use case). And my part: Cache line size of 4 is not supported https://bugzilla.redhat.com/show_bug.cgi?id=1390298 I WILL (for 100%) keep alive, for sure! I suggest, you keep this one alive as well. :-) _nobody_
Thank you Niemand ! So, developers, is there anything I can do to help making this work ? I remember you that it work on Windows. Thanks !
I will sell this computer soon, so if you want to fix this bug it's now. Thanks.
Jeremy, I had lot of issues with INTEL (past decades). They (INTEL) are ignorant and arrogant. They are not going to help you. Not at all... Their managers are getting $200K+ USD/per year for NOTHING (for bare bull shit). INTEL First Level Managers (INTEL FLM) do not care. And INTEL OTC, as you are one of the kind, will not care. You'll loose couple of hundred $$ USD. INTEL with me, proud to announce, lost already at least $10M USD. For Good. ;-) They'll loose more. Much more. I commit! :thumb up: _nobody_
Paviluf, please let me confirm:

1. Are you testing on the latest vanilla kernel (Linus' tree at kernel.org)? The kernel bugzilla mainly focuses on the upstream version.
2. When the system hangs during resume, is it possible to 'ping' it? (You can try either usb-mac-switcher or just your native mac interface.)
3. Attachment 242041 [details] does not give us much information; we should add "no_console_suspend ignore_loglevel" to get more.
4. Since pm_test works well according to Comment 6, I suspect this might be either related to the BIOS, or the system just cannot be woken up.
5. Do not touch /sys/power/pm_trace for now (it might scribble the RTC).

So please use the following steps to test:

1. Boot the system with "no_console_suspend ignore_loglevel".
2. Make sure your network is OK.
3. Provide the output of /proc/acpi/wakeup before suspending.
4. Test: rtcwake -m freeze -s 30
5. If step 4 returns successfully after 30 seconds, provide the dmesg log.
6. Continue to test: rtcwake -m mem -s 30
   If it does not wake up after 30s, please try to ping it, or reboot the system and provide the journalctl output.
7. If step 6 fails, please test with your graphics driver disabled (you mentioned that during resume the desktop is there but does not respond?).
Hello Chen, Thank you for looking at this ! Unfortunately I didn't had time but has soon as I can I will give the infos needed. Thanks !
(In reply to paviluf from comment #30) > Hello Chen, > > Thank you for looking at this ! Unfortunately I didn't had time but has soon > as I can I will give the infos needed. > thanks. PS: usually, we will close the bug report if we don't have a valid response from the reporter for longer than a month, so that we can focus on the active ones. In this case, please feel free to reopen it at any time, once the reporter can provide the information required.
Created attachment 257063 [details] dmesg
Created attachment 257065 [details] journalctl
(In reply to Chen Yu from comment #29)
> 1. Are you testing on latest vanilla kernel?( which is of Linus tree at
> kernel.org), as bugzilla kernel mainly focus on that upstream version.

I test with the kernels provided by the distros I have tested.

> 2. When the system hangs during resume, is it possible to 'ping' it?(you can
> try either usb-mac-switcher or just your native mac interface)

I don't know how to do that.

> So please use the following step to test:
>
> 1. boot up the system with
> "no_console_suspend ignore_loglevel"

Done.

> 2. make sure your network is OK
> 3. provide the output of /proc/acpi/wakeup before suspended

cat /proc/acpi/wakeup
Device  S-state   Status     Sysfs node
PCI0      S5      *disabled  no-bus:pci0000:00
PEX0      S5      *disabled  pci:0000:00:1c.0
PEX1      S5      *disabled  pci:0000:00:1c.1
PEX2      S5      *disabled  pci:0000:00:1c.2
PEX3      S5      *disabled  pci:0000:00:1c.3
PEX4      S5      *disabled
PEX5      S5      *disabled
PEX6      S5      *disabled
PEX7      S5      *disabled
HUB0      S5      *disabled
UAR1      S3      *disabled  pnp:00:02
USBE      S3      *enabled   pci:0000:00:1d.0
USE2      S3      *enabled   pci:0000:00:1a.0
AZAL      S5      *disabled  pci:0000:00:1b.0

> 4. test: rtcwake -m freeze -s 30

That works well.

> 5. if step 4 succeed to return after 30 seconds, provide the dmesg log

attachment 257063 [details]

> 6. continue to test: rtcwake -m mem -s 30

That doesn't work on the first try. The system suspends; after 30s it starts to wake up but the motherboard starts to beep continuously. According to the manual, that corresponds to a power problem (remember that suspend / wake up works perfectly well on Windows).

> if it does not woken up after 30s, please try to ping it, or
> reboot the system and provide the journalctl

attachment 257065 [details] (journalctl after a reboot)

> 7. if step 6 failed, please test with your graphic driver disabled(you
> mentioned during resume the desktop is there but no response?)

Not sure what to do.

Thanks !
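In the table above only the two USB controllers (USBE and USE2, the ehci-pci devices at 00:1d.0 and 00:1a.0) are allowed to wake the system. A small sketch for listing the enabled wake sources is given below; the function name enabled_wakeup_devices is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the names of devices whose wakeup status is
# "*enabled".
#   $1 = file to read (default: /proc/acpi/wakeup)
enabled_wakeup_devices() {
    # NR > 1 skips the header line; the status is the third column.
    awk 'NR > 1 && $3 == "*enabled" { print $1 }' "${1:-/proc/acpi/wakeup}"
}
```

Writing a device name to the file as root (for example "echo USBE > /proc/acpi/wakeup") toggles that entry, which gives one more variable to test. Nobody in this thread tried it, so treat it as an experiment rather than a fix.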
(In reply to paviluf from comment #34)
> (In reply to Chen Yu from comment #29)
> > 1. Are you testing on latest vanilla kernel?( which is of Linus tree at
> > kernel.org), as bugzilla kernel mainly focus on that upstream version.
>
> I test with the kernel provided by the distros I have tested.

Then please either use the latest Linus git tree, or download the latest vanilla kernel source from kernel.org, rebuild your kernel and see if the problem still exists.

>
> > 2. When the system hangs during resume, is it possible to 'ping' it?(you
> can
> > try either usb-mac-switcher or just your native mac interface)
>
> I don't know how to do that.
>
> > So please use the following step to test:
> >
> > 1. boot up the system with
> > "no_console_suspend ignore_loglevel"
>
> Done.
>
> > 2. make sure your network is OK
> > 3. provide the output of /proc/acpi/wakeup before suspended
>
> cat /proc/acpi/wakeup
> Device S-state Status Sysfs node
> PCI0 S5 *disabled no-bus:pci0000:00
> PEX0 S5 *disabled pci:0000:00:1c.0
> PEX1 S5 *disabled pci:0000:00:1c.1
> PEX2 S5 *disabled pci:0000:00:1c.2
> PEX3 S5 *disabled pci:0000:00:1c.3
> PEX4 S5 *disabled
> PEX5 S5 *disabled
> PEX6 S5 *disabled
> PEX7 S5 *disabled
> HUB0 S5 *disabled
> UAR1 S3 *disabled pnp:00:02
> USBE S3 *enabled pci:0000:00:1d.0
> USE2 S3 *enabled pci:0000:00:1a.0
> AZAL S5 *disabled pci:0000:00:1b.0
>
> > 4. test: rtcwake -m freeze -s 30
>
> That works well.
>
> > 5. if step 4 succeed to return after 30 seconds, provide the dmesg log
>
> attachment 257063 [details]
>
> > 6. continue to test: rtcwake -m mem -s 30
>
> That doesn't work on first try. The system suspend, after 30s it start to
> wake up but the motherboard start to beep continuously. According to the
> manual that correspond to a power problem (remember that suspend / wake up
> work perfectly well on Windows).
>
> > if it does not woken up after 30s, please try to ping it, or
> > reboot the system and provide the journalctl
>
> attachment 257065 [details] (journalctl after a reboot)
>
> > 7. if step 6 failed, please test with your graphic driver disabled(you
> > mentioned during resume the desktop is there but no response?)
>
> Not sure what to do.
>

A simple way is to make sure the i915 driver is built as a module, and then rename this file:
/lib/modules/your_kernel_name/kernel/drivers/gpu/drm/i915/i915.ko
ping...
Don't worry, I won't forget this :) I just had little time and I don't know how to build a kernel. Is it ok if I use the linux-mainline package from arch aur ? > a simple way is to make sure the i915 driver is build as module, and then > rename this file /lib/modules/your_kernel_name/kernel/drivers/gpu/drm > /i915/i915.ko I have a Nvidia card (no video output on motherboard - can test with an Amd one) so I don't think it's the right way.
(In reply to paviluf from comment #37)
> Don't worry, I won't forget this :) I just had little time and I don't know
> how to build a kernel. Is it ok if I use the linux-mainline package from
> arch aur ?
>
> > a simple way is to make sure the i915 driver is build as module, and then
> > rename this file /lib/modules/your_kernel_name/kernel/drivers/gpu/drm
> > /i915/i915.ko
>
> I have a Nvidia card (no video output on motherboard - can test with an Amd
> one) so I don't think it's the right way.

OK, then please blacklist your nvidia driver and test with the latest upstream kernel. I'm not sure if the package from Arch is a pure kernel without any other modification, but I think it would also be easy to install a new kernel from scratch from source code:

1. Download the source code from https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.13.tar.xz
2. Copy the kernel config at /boot/config-xxxx to the root directory of the source code and rename it to .config (it is then a hidden file).
3. Run 'make menuconfig' in the source code directory and disable CONFIG_DEBUG_INFO by unchecking the option at:
   Kernel hacking --->
     Compile-time checks and compiler options --->
       [ ] Compile the kernel with debug info
4. make all -j4; make modules_install; make install
5. When it has finished, reboot into the new kernel.

First, test the normal suspend to mem with
rtcwake -m mem -s 30
If it fails, test
echo core > /sys/power/pm_test
echo deep > /sys/power/mem_sleep
echo mem > /sys/power/state
and wait for 5 seconds.
Please attach the kernel config file. Please also try to echo the USB root hub wakeup setting via sysfs, and attach the dmesg after a successful suspend/resume.
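The request above does not say which value to write, so the following is only a guess at what was meant: each USB root hub has a power/wakeup attribute under /sys/bus/usb/devices/usbN that accepts "enabled" or "disabled". The function name set_roothub_wakeup is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: set the wakeup policy of every USB root hub
# (usb1, usb2, ...). Must be run as root on a real system.
#   $1 = "enabled" or "disabled"
#   $2 = sysfs USB device directory (default: /sys/bus/usb/devices)
set_roothub_wakeup() {
    for f in "${2:-/sys/bus/usb/devices}"/usb*/power/wakeup; do
        [ -e "$f" ] || continue      # no root hub matched the pattern
        echo "$1" > "$f"
        echo "$f: $(cat "$f")"       # show the value that is now in effect
    done
}
```

One way to use it would be "set_roothub_wakeup disabled", then a suspend/resume cycle, then the same with "enabled", attaching the dmesg from each run.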
ping ..
Sorry, I didn't found the time to provide the needed infos. I will try ASAP. Thank you.
Let me close this for now. Please feel free to reopen if this issue can be reproduced on latest vanilla kernel.
Hello,

I finally found some time ! So I installed Ubuntu 17.10 to start fresh.

- I tested "rtcwake -m mem -s 30" and it failed on the 4th attempt.
- I built kernel 4.13.16 as you indicated, rebooted into it and tested "rtcwake -m mem -s 30", and it failed on the 4th attempt too.
- I tried
echo core > /sys/power/pm_test
echo deep > /sys/power/mem_sleep
echo mem > /sys/power/state
and that worked.

I don't know how to echo the USB roothub wakeup via sysfs, but here is the dmesg after a successful suspend/resume and the kernel config file.
Created attachment 273595 [details] dmesg after a succeed suspend/resume
Created attachment 273597 [details] kernel config file
(In reply to paviluf from comment #43)
> Hello,
>
> I finally found some time ! So I installed Ubuntu 17.10 to start fresh.
>
> - I tested "rtcwake -m mem -s 30" and it failed on 4th attempt.
> - I built kernel 4.13.16 as you indicate, rebooted on it and tested "rtcwake
> -m mem -s 30" and it failed on 4th attempt too.
> - I tried
> echo core > /sys/power/pm_test
> echo deep > /sys/power/mem_sleep
> echo mem > /sys/power/state
> and that worked.
>
> I don't know how to echo the USB roothub wakeup via sysfs but I here is
> dmesg after a succeed suspend/resume and the kernel config file.

There's no obvious error/warning in the log, and since the test mode 'core' works well, let's bypass the BIOS, test suspend-to-freeze and see what happens:

1. Append "no_console_suspend ignore_loglevel" in grub.
2. echo freeze > /sys/power/state
3. Press any key on the keyboard (or the power button) to wake the system up.
That work
Do you need more infos ?
(In reply to paviluf from comment #48)
> Do you need more infos ?

Hmm, I don't have much of a clue now.

1. Please check if you have a serial port on your desktop. If so, we can root cause this issue within 1 day :-)
2. As you described in Comment 43, you have tried all the pm_test modes. Could you confirm that you have tested each pm_test mode at least 4 times (because you said rtcwake fails on the 4th try)? Be aware that you should echo none > /sys/power/pm_test before trying rtcwake.
3. In most cases the graphics drivers are the offenders, so please blacklist them (nouveau or the nvidia native driver) and try again.

Always provide the dmesg after each test, with no_console_suspend ignore_loglevel appended in the grub file.
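The repeated pm_test runs requested in point 2 can be scripted. The sketch below is an illustration only: the function name pm_test_cycles and the optional directory argument are made up, it must run as root on a real system, and it assumes each write to /sys/power/state returns once the test cycle has completed.

```shell
#!/bin/sh
# Hypothetical helper: run one pm_test level several times, then put
# pm_test back to "none" so that a later rtcwake performs a real suspend.
#   $1 = pm_test level (freezer, devices, platform, processors, core)
#   $2 = number of cycles
#   $3 = power directory (default: /sys/power)
pm_test_cycles() {
    level=$1; count=$2; dir=${3:-/sys/power}
    echo "$level" > "$dir/pm_test" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        sync
        # In test mode the kernel resumes by itself after a few seconds.
        echo mem > "$dir/state" || { echo "cycle $i of $level failed"; break; }
        echo "cycle $i of $level returned"
        i=$((i + 1))
    done
    echo none > "$dir/pm_test"
}
```

For example, "pm_test_cycles core 5" followed by saving the dmesg output covers the "at least 4 times" condition for one level; repeat for the other levels.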
(In reply to Chen Yu from comment #49) > (In reply to paviluf from comment #48) > > Do you need more infos ? > > Humm, I don't have much clue now > 1. Please check if you have serial port on your desktop? If this, we can > root cause this issue > within 1 day :-) > 2. As you described in Comment 43, you have tried all the pm_test mode, > could you confirm if you have tested each pm_test mode for at least 4 > times?(because you said the rtcwake will failed at 4th try) Be aware, you > should echo none > /sys/power/pm_test because trying rtcwake. > > 3. In most cases the graphic drivers might be the offenders, so please > blacklist them nouveau/or nvidia native driver and try again. > You might get black screen after resumed, please check if your keyboard is working by pressing Num Lock/Caps Lock, also the PS/2 keyboard is preferred to USB keyboard. > always provide the dmesg after each test, with no_console_suspend > ignore_loglevel appended in grub file
Created attachment 273775 [details] echo freeze > /sys/power/state
Created attachment 273777 [details] echo mem > /sys/power/state
Created attachment 273779 [details] echo freeze > /sys/power/state
> Humm, I don't have much clue now
> 1. Please check if you have serial port on your desktop? If this, we can
> root cause this issue within 1 day :-)

I don't have a serial port.

> 2. As you described in Comment 43, you have tried all the pm_test mode,
> could you confirm if you have tested each pm_test mode for at least 4
> times?(because you said the rtcwake will failed at 4th try) Be aware, you
> should echo none > /sys/power/pm_test because trying rtcwake.
> 3. In most cases the graphic drivers might be the offenders, so please
> blacklist them nouveau/or nvidia native driver and try again.

So I blacklisted the nouveau and nvidia drivers. I did "echo none > /sys/power/pm_test". I again did:

echo core > /sys/power/pm_test
echo deep > /sys/power/mem_sleep
echo mem > /sys/power/state

5 times and that worked. I also tried "echo freeze > /sys/power/state" 5 times and that worked.

> You might get black screen after resumed, please check if your keyboard is
> working by pressing Num Lock/Caps Lock, also the PS/2 keyboard is preferred
> to USB keyboard.

I don't have a PS/2 keyboard. I tried "rtcwake -m mem -s 30" again, but now it fails on the first try. I mean suspend is OK, and on wake up I get a black screen and the keyboard doesn't work.

> always provide the dmesg after each test, with no_console_suspend
> ignore_loglevel appended in grub file

I provided them.
> 1. Please check if you have serial port on your desktop? If this, we can > root cause this issue within 1 day :-) Well I have one internal serial port. Don't know if I can use it but if it's possible what should I do ?
(In reply to paviluf from comment #55)
> > 1. Please check if you have serial port on your desktop? If this, we can
> > root cause this issue within 1 day :-)
>
> Well I have one internal serial port. Don't know if I can use it but if it's
> possible what should I do ?

OK, according to Comment 54, do you mean:

1. With the nouveau and nvidia drivers blacklisted.
   Q: can you confirm with lsmod that nouveau/nvidia/i915 have been blacklisted?
2. Based on 1, with the 'core' pm_test mode, all 5 suspend-to-mem cycles work.
3. Based on 1, with pm_test set to none, rtcwake -m mem -s 30 gives a black screen and no response after resume.
4. Have you tested: based on 1, with pm_test set to none, echo mem > /sys/power/state and wake up via the power button or another key?

OK, you have a serial port. Please append 'no_console_suspend ignore_loglevel console=ttyS0,115200 console=tty' to the kernel boot command line and try to gather the output from it. Then try step 3 above and record the serial port log; let's find out at which stage it fails.
(In reply to Chen Yu from comment #56)
> OK, according to Comment 54, do you mean:
> 1. with nouveau and nvidia driver blacklisted,
> Q: can you confirm if nouveau/nvidia/i915 has been blacklisted? lsmod

I blacklisted nouveau/nvidia and will provide lsmod. Should I also blacklist i915 even if it's not used (no video output on the motherboard) ?

> 2. based on 1, with 'core' pm_test mode, all 5 times of suspend to mem work.

Yes

> 3. based on 1, pm_test set to none, rtcwake -m mem -s 30 gets black screen
> and no respond after resumed.

Yes

> 4. Has you tested: based on 1, pm_test set to none, and echo mem >
> /sys/power/state and wake up via power button or other key?

I get a black screen and no response after resume too.

> OK, you have a serial port, please append 'no_console_suspend
> ignore_loglevel console=ttyS0,115200 console=tty' in kernel boot commandline
> and try to gather the output from it. Then try step 3 above and record the
> serial port log, let' find out at which stage it failed.

Do you have some info on how to do that ? I don't know if I will be able to use that internal port since it's just some pins (not a proper serial port).
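Since the question of how to capture a serial console comes up here, the following is a rough sketch under several assumptions: the distribution reads /etc/default/grub (as Ubuntu does), the internal header is exposed as ttyS0 through a bracket cable, and a second machine is connected with a null-modem cable or a USB serial adapter that appears as /dev/ttyUSB0. The sketch writes console=tty0, the usual spelling for the virtual console, where the comment above wrote console=tty.

```shell
# Fragment for /etc/default/grub on the machine under test
# (run update-grub afterwards and reboot).
GRUB_CMDLINE_LINUX_DEFAULT="no_console_suspend ignore_loglevel console=ttyS0,115200 console=tty0"

# On the second machine, one way to watch and record the output
# (device name is an assumption; -L makes screen write a log file):
#   screen -L /dev/ttyUSB0 115200
```

With this in place, the last lines printed before the freeze would show how far the resume path got.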
Created attachment 273795 [details] lsmod
(In reply to Chen Yu from comment #56)
> OK, you have a serial port, please append 'no_console_suspend
> ignore_loglevel console=ttyS0,115200 console=tty' in kernel boot commandline
> and try to gather the output from it. Then try step 3 above and record the
> serial port log, let' find out at which stage it failed.

Well, I spent one day trying to gather the kernel output and that didn't work at all.

I have already spent far too much time on this. If we don't quickly find why it isn't working when it works on Windows, I will drop Linux and only use Windows or buy a Mac. I'm so sick of the fact that on GNU/Linux there is always something that doesn't work...
OK, no need to enable the serial port output if it is too hard. Please bear in mind that if you want to benefit from the flexibility/high performance of Linux, then you are also taking the risk of using it. Actually, a workaround is to use suspend to freeze, which is close to your requirement.

Reading Comment 5 again, you mentioned that you are using pm_trace, which might be a clue. Do you have a 'reset' button on your desktop, I mean, a hot reboot without power off? This is important, as we can use it for blind debugging. If yes, please disable all the graphics drivers, then do the pm_trace test and provide the dmesg after a hot reboot; let's check whether the offender is atkbd.

And, most importantly, please always try the latest upstream kernel from https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.15.tar.xz
Thank you for your help, and sorry, but this is so annoying and time consuming !

So, I have a 'reset' button on my desktop. I built kernel 4.15, I disabled all the graphics drivers (nvidia, nouveau, i915) and I did:

sync
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

That fails on the first try (can't wake up - black screen and keyboard unresponsive).

Here is the "cat /sys/power/pm_trace_dev_match" output

bdi

I will attach the dmesg after a hot reboot.
Created attachment 273937 [details] dmesg after hot reboot
I'm done, I managed to get a motherboard, processor and ram for free. The hardware is older and less powerful but that work. I'm really disappointed.
Well, the replacement motherboard died... I'm back to the motherboard that have the suspend problem...
OK, let's continue, and stick to 4.15.

[ 1.257020] Magic number: 0:320:41
[ 1.257094] bdi 7:6: hash matches

This suggests there might be something wrong with the block device driver, and this might be a new problem...

And sorry, I forgot to mention: before doing any test, please
1. remove 'quiet splash vt.handoff=7' from your grub command line
2. echo 0 > /sys/power/pm_async
to disable async suspend, so that we can confirm the offender.

Please provide dmesg and cat /sys/power/pm_trace_dev_match again. Let me think about adding a hack debug patch based on pm_trace, if it is still bdi.
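For reference, the pm_trace result lines quoted above can be filtered out of a full kernel log with a one-line helper. The function name pm_trace_lines is made up for illustration. Keep in mind that pm_trace stores only a small hash in the RTC, so a "hash matches" line names a candidate device and different devices can share a hash; the kernel's s2ram documentation warns about such false positives.

```shell
#!/bin/sh
# Hypothetical helper: print the pm_trace result lines ("Magic number" and
# "hash matches") from a kernel log read on stdin.
pm_trace_lines() {
    grep -E 'Magic number:|hash matches'
}

# Possible usage right after the hot reboot:
#   dmesg | pm_trace_lines
#   cat /sys/power/pm_trace_dev_match
```

Collecting this output over several failed resumes would show whether the reported device stays the same or changes from run to run.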
I tried KDE neon, so I'm doing the tests on a clean install (no blacklist or anything else) with kernel 4.13, which comes with the distro, for now.

So I removed "quiet splash" from the grub command line, added "no_console_suspend ignore_loglevel" to it, and updated grub.

I did "echo 0 > /sys/power/pm_async". I rebooted and did:

sync
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state

That fails on the first try.

"cat /sys/power/pm_trace_dev_match" output

acpi

So I blacklisted acpi, rebuilt the initramfs and retried.

pm_trace_dev_match output

block
rtc_cmos

So I blacklisted block and rtc_cmos, rebuilt the initramfs and retried.

pm_trace_dev_match output is sometimes

bdi

and sometimes

block
rtc_cmos
Created attachment 274283 [details] hot reboot dmesg
Created attachment 274285 [details] dmesg excerpt on fail (pc frozen)
I use Windows 10 on this computer, so I'm closing this... Thanks.