Bug 105041 - System constantly freezes up when waking from suspend2ram
Summary: System constantly freezes up when waking from suspend2ram
Status: ASSIGNED
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 high
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-26 18:22 UTC by eversor
Modified: 2016-02-05 19:39 UTC
CC List: 3 users

See Also:
Kernel Version: 4.*
Subsystem:
Regression: No
Bisected commit-id:


Attachments
journalctl and lspci output (21.45 KB, application/zip)
2015-09-26 18:22 UTC, eversor

Description eversor 2015-09-26 18:22:10 UTC
Created attachment 188591 [details]
journalctl and lspci output

System constantly freezes up when waking from suspend2ram.
I first ran into this problem almost a year ago, after buying a new laptop.
As a workaround I migrated to linux-lts (3.*), and everything worked well enough.
Recently the linux-lts package in the Arch Linux repos was upgraded to 4.1.8-1-lts, and now I have the same annoying problem again.
A working suspend feature is enormously important on a laptop.

Steps to reproduce:
0) Get the same hardware and Linux 4.*
1) Suspend to RAM
2) Try to wake the machine up with the power button, the keyboard, or anything else

Actual results:
 A black screen with a cursor (or getty, depending on where you were before suspend) appears; the HDD and fans work hard, and the system is unresponsive.
 Sometimes Alt+SysRq+B works (S and U probably do not).

Expected results:
 Working system

Environment:
 Archlinux
 Linux 4.1.8-1-lts #1 SMP Tue Sep 22 17:49:49 CEST 2015 x86_64 GNU/Linux
 The same thing happens with any 4.* kernel and with the latest linux-libre
 The problem does not occur on 3.10, for example

Additional information:
I'm using
 * a dm-crypt'ed root
 * a custom wireless module, bought on eBay (I don't think it is the source of the trouble, since everything works on older kernels)
Comment 1 eversor 2015-09-27 16:09:35 UTC
Same behaviour with an Arch Linux live USB and with e.g. Trisquel 7, so it probably isn't related to the encrypted root or to the installed software.
--
P.S. Just realised that downloading a zip archive isn't really convenient, so:
 laptop: Acer ex-2510g-54tk
 Intel i5-4210U
 GeForce 820M
 Atheros AR9565
 RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
Comment 2 Aaron Lu 2015-09-28 01:47:25 UTC
Please follow this document:
https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt
to do some debugging and see whether it is a driver-related issue.
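
The document linked above describes the /sys/power/pm_test interface, which stops a suspend cycle at a chosen level and then resumes instead of actually sleeping. A minimal Python sketch of stepping through those levels follows. It assumes a kernel built with CONFIG_PM_DEBUG and root privileges; the function names and the `power_dir` parameter are illustrative and not part of any kernel tooling.

```python
from pathlib import Path

# Test levels listed in basic-pm-debugging.txt, shallowest first.
PM_TEST_LEVELS = ["freezer", "devices", "platform", "processors", "core"]


def run_pm_test(level: str, power_dir: Path = Path("/sys/power")) -> None:
    """Arm one pm_test level, then start a suspend-to-RAM cycle.

    With pm_test armed, the kernel goes only as deep as the chosen
    level, waits a few seconds and resumes.
    """
    if level not in PM_TEST_LEVELS:
        raise ValueError(f"unknown pm_test level: {level}")
    (power_dir / "pm_test").write_text(level)
    (power_dir / "state").write_text("mem")


def disarm_pm_test(power_dir: Path = Path("/sys/power")) -> None:
    """Switch back to normal suspend behaviour."""
    (power_dir / "pm_test").write_text("none")
```

Running the levels in order and noting the first one that hangs narrows the fault down to a phase of the suspend sequence. `power_dir` is a parameter only so the functions can be tried against a scratch directory first.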
Comment 3 Zhang Rui 2015-09-28 08:02:37 UTC
This sounds like a regression.
eversor,
Do you still remember the previously working setup?
First of all, I think we need to identify whether this is a kernel or a userspace problem. Does the problem still exist with an old working kernel and new userspace, and does it still exist with a new kernel and old working userspace?
Comment 4 eversor 2015-09-29 07:16:05 UTC
Aaron Lu,
 with 4.*, the freeze occurs in "platform" mode and above.
 Right now I'm on 3.14.52-1-lts, and it passes all the tests.
Zhang Rui,
 I'm pretty new to *nix and not sure that I understand you correctly.
 Could you please describe what specifically I should do to answer your question?
Comment 5 eversor 2015-10-05 06:54:40 UTC
Just tried 3.18 and got the same behaviour
Comment 6 Aaron Lu 2015-10-08 07:28:01 UTC
(In reply to eversor from comment #4)
> Aaron Lu,
>  with 4.* freezings occurs on "platform" mode and above.
>  Right now i'm with 3.14.52-1-lts and it passes all the tests.

Maybe some device driver failed the suspend. Can you please boot into console mode by adding "init=/bin/bash", make sure as few drivers as possible are loaded, and then do the test again?
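
To confirm that only a few drivers are loaded after such a boot, the module list can be read from /proc/modules, where the first field of each line is the module name. A minimal sketch, in which the function name and the path parameter are illustrative:

```python
from pathlib import Path


def loaded_modules(proc_modules: Path = Path("/proc/modules")) -> list[str]:
    """Return the names of the currently loaded modules, sorted."""
    names = []
    for line in proc_modules.read_text().splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.append(fields[0])
    return sorted(names)
```

Comparing this list between the minimal boot and the regular system gives the set of modules that still need to be checked.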
Comment 7 eversor 2015-10-09 19:47:43 UTC
(In reply to Aaron Lu from comment #6)
> (In reply to eversor from comment #4)
> > Aaron Lu,
> >  with 4.* freezings occurs on "platform" mode and above.
> >  Right now i'm with 3.14.52-1-lts and it passes all the tests.
> 
> Maybe some device driver failed the suspend, can you please boot into
> console mode by adding "init=/bin/bash" and then make sure as few as
> possible drivers are loaded and then do the test again?

You were right. It resumes properly if I boot with init=/bin/sh.
I tried to modprobe nouveau and i915, and it still worked.
What's next?
Comment 8 Aaron Lu 2015-10-10 01:49:47 UTC
That's a good question :-)

I would start with the other PCI drivers and see whether the problem occurs.
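
One way to get the list of PCI drivers to try is to read the `driver` symlink that sysfs exposes for every PCI device with a bound driver. The sketch below assumes the usual /sys/bus/pci/devices layout; the function name and the `devices_dir` parameter are illustrative.

```python
import os
from pathlib import Path


def pci_drivers(devices_dir: Path = Path("/sys/bus/pci/devices")) -> dict[str, str]:
    """Map each PCI address to the name of the driver bound to it."""
    bound = {}
    for dev in sorted(devices_dir.iterdir()):
        link = dev / "driver"
        # Devices without a bound driver have no "driver" symlink.
        if link.is_symlink():
            bound[dev.name] = os.path.basename(os.readlink(link))
    return bound
```

This should be run on the regular system, where all drivers are loaded, so that the result can serve as a checklist for the minimal boot.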
Comment 9 Aaron Lu 2015-12-16 02:59:45 UTC
Any update?
Comment 10 eversor 2015-12-21 14:01:24 UTC
Nope.
I tried a few modules, but I didn't dig deep.
If I had some idea which specific module it could be,
I would check it, but I can't iterate through all of them right now.
For now I'm just sticking with 3.14
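
Iterating through every module one by one is not strictly necessary: if a single module breaks resume on its own, halving the candidate list finds it in roughly log2(N) suspend cycles. The sketch below shows that idea under that assumption. `resume_fails` is a hypothetical callback standing in for the manual step (boot minimally, load only the given modules, suspend, report whether resume hung). If load order or module interactions matter, this simple search can give a wrong answer.

```python
from typing import Callable, Sequence


def find_culprit(modules: Sequence[str],
                 resume_fails: Callable[[Sequence[str]], bool]) -> str:
    """Halve the candidate list until a single module remains.

    Assumes exactly one module in `modules` breaks resume by itself.
    """
    candidates = list(modules)
    if not candidates:
        raise ValueError("no modules to test")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the hang.
        candidates = half if resume_fails(half) else candidates[len(half):]
    return candidates[0]
```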
Comment 11 Aaron Lu 2015-12-22 01:50:36 UTC
Maybe worth a try: add "initcall_debug no_console_suspend" to the kernel command line, then boot and suspend. If we are lucky, we may see some error messages on the screen.
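
Before suspending, it is worth checking that both parameters actually reached the running kernel, since the booted command line is readable from /proc/cmdline. A small sketch, with an illustrative function name:

```python
WANTED = ("initcall_debug", "no_console_suspend")


def missing_params(cmdline: str, wanted=WANTED) -> list[str]:
    """Return the wanted parameters that are absent from the command line."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]
```

On a live system it could be called as `missing_params(open("/proc/cmdline").read())`; an empty list means both parameters are in effect.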
Comment 12 eversor 2016-01-24 23:09:01 UTC
Oh guys, I feel really sorry.
I can't figure out how it happened, but I didn't notice that the trouble was in nouveau...
I bet it was the first module I tried.
There's some strange behaviour: if I modprobe the modules in a certain order, they don't cause the problem (maybe that's why I skipped nouveau).
Brute-forcing possible combinations tonight, I somehow modprobed all 153 modules I have in my regular system (including nouveau), and suspend/wakeup still worked.
But if I modprobe nouveau right at boot and then suspend, it gets stuck.

I have just blacklisted nouveau, and I'm admiring suspend/wake on this beautiful 4.3.3-3
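
Blacklisting is normally done with a `blacklist nouveau` line in a .conf file under /etc/modprobe.d. A small sketch for checking that such a line is in place; the function name and the `conf_dir` parameter are illustrative, and the check does not treat `-` and `_` in module names as interchangeable the way modprobe does.

```python
from pathlib import Path


def is_blacklisted(module: str, conf_dir: Path = Path("/etc/modprobe.d")) -> bool:
    """Return True if any *.conf file contains 'blacklist <module>'."""
    for conf in sorted(conf_dir.glob("*.conf")):
        for line in conf.read_text().splitlines():
            # Ignore anything after a '#' comment marker.
            words = line.split("#", 1)[0].split()
            if words[:2] == ["blacklist", module]:
                return True
    return False
```

Note that a blacklist entry only stops the module from being loaded automatically through its aliases; an explicit modprobe still loads it.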
Comment 13 Aaron Lu 2016-01-25 02:02:39 UTC
Then I'm moving this bug to Drivers/DRM-non-intel for nouveau people to take a look.
Comment 14 eversor 2016-02-05 19:39:15 UTC
Unfortunately, it's not that simple...
With nouveau I had a guaranteed deadlock on resume.
Without it I get random freezes. Sometimes suspend/resume works, but sometimes the laptop doesn't suspend at all: the LED indicates active mode, but the laptop is unresponsive and the screen is black (this behaviour differs from what I originally described).
Several possibly unrelated problems at once...
http://pastebin.com/tW9WtbpZ
