Bug 60714 - resume from suspend-to-ram does not work any more
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: intel-gfx-bugs@lists.freedesktop.org
 
Reported: 2013-08-07 18:47 UTC by MDBürkle
Modified: 2013-08-24 14:30 UTC

Kernel Version: 3.10.0
Regression: Yes

Attachments
my kernel .config for 3.10.0-gentoo (101.15 KB, application/octet-stream)
2013-08-07 18:47 UTC, MDBürkle

Description MDBürkle 2013-08-07 18:47:50 UTC
Created attachment 107143
my kernel .config for 3.10.0-gentoo

Hi,

since kernel version 3.10.0(-gentoo), resume from suspend-to-ram does not work any more for me: instead of continuing operations, only a graphical pointer is shown and the system does not react any more.

I will have time to do a bisect after August 31st.

Any questions, meanwhile?

Kind regards,
mdbuerkle
Comment 1 Rafael J. Wysocki 2013-08-07 23:19:53 UTC
On Wednesday, August 07, 2013 06:47:50 PM bugzilla-daemon@bugzilla.kernel.org wrote:
>
> since kernel version 3.10.0(-gentoo), resume from suspend-to-ram does not
> work any more for me: instead of continuing operations, only a graphical
> pointer is shown and the system does not react any more.

What was the newest kernel you tried?
Comment 2 MDBürkle 2013-08-08 16:18:09 UTC
(In reply to Rafael J. Wysocki from comment #1)

> What was the newest kernel you tried?

Newest was 3.10.2-gentoo, re-tested the day before yesterday (iirc).


Surprisingly, the lid-close behaviour was affected by i915 operations (cf. bug#60643):

 - with consoles visible, closing and opening the lid led to syslog messages
   "acpi button/LID close+open unhandled"
   and the system continued operating

 - with consoles invisible, closing and opening the lid led to a total hang
   (with the fan rotating at nearly max rpm),
   with no reaction at all any more (except to a 3 sec press of the power button)

During these tests, the fglrx kernel module was renamed to .ko_off and X (with fglrx) was not started.

Note to self: check if that behaviour is the same in 3.9.x kernels.
Comment 3 MDBürkle 2013-08-10 19:22:46 UTC
(In reply to Rafael J. Wysocki from comment #1)

> What was the newest kernel you tried?

Just tested 3.10.5-gentoo-r1 with cmdline

 root=/dev/sda8 vga=0x0f04 drm.debug=0xe acpi_osi="!Windows 2012" acpi_backlight=vendor

and had the same behaviour as the other 3.10.x kernels (3.10.0-gentoo, 3.10.1-gentoo, 3.10.2-gentoo).

:-(
Comment 4 MDBürkle 2013-08-11 09:33:04 UTC
(In reply to MDBürkle from comment #2)

> Note to self: check if that behaviour is the same in 3.9.x kernels.

Tested in 3.9.11-gentoo-r1:

The behaviour is the same (system hangs after lid-open), at least when the cmdline contains `acpi_osi="!Windows 2012" acpi_backlight=vendor' (or just one of the two) together with `drm.debug=0xe'.

I forgot to test without these cmdline additions. Does anyone suspect that these have an effect?
Comment 5 MDBürkle 2013-08-11 09:36:07 UTC
(continuing comment#4:)

Whether or not the fglrx.ko kernel module was loaded made no difference at all.

There was always a GREY console after "processing uevents", and after opening the lid the screen flashed and turned black (backlight off).
Comment 6 Rafael J. Wysocki 2013-08-11 21:11:35 UTC
And what's the last working kernel?
Comment 7 MDBürkle 2013-08-12 18:46:47 UTC
(In reply to Rafael J. Wysocki from comment #6)
> And what's the last working kernel?

Latest working (and which I'm currently using) is 3.9.11-gentoo-r1.

Check for updates... nope, no new versions (except 3.10.6) available.
Will try 3.10.6, maybe today.
Comment 8 MDBürkle 2013-08-12 23:16:59 UTC
Tried 3.10.6-gentoo: hangs at resuming like the other 3.10.x versions. :-(
Comment 9 MDBürkle 2013-08-12 23:30:17 UTC
Just verified if suspend-to-disk still works with 3.9.11-gentoo-r1: it does.
Comment 10 MDBürkle 2013-08-18 21:23:31 UTC
Tried to do a git bisect of 3.10.0 (bad) against 3.9.0/2/3/6/7/10/11 (good) but had problems (#ifdef KERNEL_VERSION >= 3.10.0 in some ati-drivers [fglrx] source file, which conflicts with Version being "3.9.0+").

Tried 3.9.11-r1, 3.10.0 and 3.9.0+ with "fglrx.ko_off", but all three had the same behaviour: echo mem > /sys/power/state and resuming succeeded, but afterwards the backlight (cf. bug#60643) was turned off, so I had to type blindly.

ati-drivers is now patched and compiled against "3.9.0+" so I will have another try for the first bisect step. :-)
Comment 11 MDBürkle 2013-08-20 20:05:51 UTC
(from the script-log; I wonder if I really wrote "git bisect" and not "git bisect good/bad"; if there has to be a good or bad then there was a "goo" history search at that command...:)

bisecting ended with

lapmdb-hpl linux-stable #>> git bisect 
24576d23976746cb52e7700c4cadbf4bc1bc3472 is the first bad commit
commit 24576d23976746cb52e7700c4cadbf4bc1bc3472
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Mar 26 09:25:45 2013 -0700

    drm/i915: enable VT switchless resume v3
    
    With the other bits in place, we can do this safely.
    
    v2: disable backlight on suspend to prevent premature enablement on resume
    v3: disable CRTCs on suspend to allow RTD3 (Kristen)
    
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 72fb279ebc4c48824015f43eac459d046dde3204 6797fd617ffd75b6e2a457aaabf5e90f9a4021c0 M      drivers
lapmdb-hpl linux-stable #>> 

lapmdb-hpl linux-stable #>> git bisect log
git bisect start
# good: [c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1] Linux 3.9
git bisect good c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1
# good: [57049bb1dd0461d8423c3feceea36148d4335317] Linux 3.9.2
git bisect good 57049bb1dd0461d8423c3feceea36148d4335317
# good: [4bb08696fab71294c8f1c134a21be9159f82ba08] Linux 3.9.3
git bisect good 4bb08696fab71294c8f1c134a21be9159f82ba08
# good: [4b73febd1ba302268aabe370de25601eaa884b25] Linux 3.9.6
git bisect good 4b73febd1ba302268aabe370de25601eaa884b25
# good: [485f25fcc014f2744754f22de395f745f2c7e492] Linux 3.9.7
git bisect good 485f25fcc014f2744754f22de395f745f2c7e492
# good: [0c2dc4da120bacc62d6d3f7cdaed11ca18e4d410] Linux 3.9.10
git bisect good 0c2dc4da120bacc62d6d3f7cdaed11ca18e4d410
# good: [896f5009ed1fbaec43f360c4ebf022639cd61d5f] Linux 3.9.11
git bisect good 896f5009ed1fbaec43f360c4ebf022639cd61d5f
# bad: [8bb495e3f02401ee6f76d1b1d77f3ac9f079e376] Linux 3.10
git bisect bad 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376
# good: [20b4fb485227404329e41ad15588afad3df23050] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect good 20b4fb485227404329e41ad15588afad3df23050
# bad: [eac84105cddf8686440aaa9fbcb58093e37e4180] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect bad eac84105cddf8686440aaa9fbcb58093e37e4180
# bad: [9992ba72327fa0d8bdc9fb624e80f5cce338a711] Merge tag 'sound-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 9992ba72327fa0d8bdc9fb624e80f5cce338a711
# good: [c8d8566952fda026966784a62f324c8352f77430] Merge tag 'for-linus-v3.10-rc1' of git://oss.sgi.com/xfs/xfs
git bisect good c8d8566952fda026966784a62f324c8352f77430
# bad: [33896bf3207d8812274747b8b96776d27082e580] udl: bind the framebuffer to the correct device.
git bisect bad 33896bf3207d8812274747b8b96776d27082e580
# bad: [7ddcc7364a93d18b80967b3a9b3f6aea107323f6] drm/exynos: hdmi: move mode_fixup to drm common hdmi
git bisect bad 7ddcc7364a93d18b80967b3a9b3f6aea107323f6
# bad: [baba133ae50e563c5896d39e150b6617857a9d8e] drm/i915: clean up plane bpp confusion
git bisect bad baba133ae50e563c5896d39e150b6617857a9d8e
# good: [67d5a50c0480d5d41e0423e6fa55984f9fd3381e] drm/i915: handle walking compact dma scatter lists
git bisect good 67d5a50c0480d5d41e0423e6fa55984f9fd3381e
# good: [f9c513e9d6d25fec3404a97c9b0f03b2eb858315] drm/i915: Always call fence-lost prior to removing the fence
git bisect good f9c513e9d6d25fec3404a97c9b0f03b2eb858315
# bad: [ed23abdd648358e69c1a94e0c70e45f6a23a7aab] Revert "drm/i915: set dummy page for stolen objects"
git bisect bad ed23abdd648358e69c1a94e0c70e45f6a23a7aab
# bad: [fa00abe00e379a0e9b070616baee58692576f29e] DRM/i915: Remove valleyview_hpd_irq_setup.
git bisect bad fa00abe00e379a0e9b070616baee58692576f29e
# good: [5e1bac2ff7376a823a6eedd1dd3815ac9ae250e6] drm/i915: add sprite restore function v3
git bisect good 5e1bac2ff7376a823a6eedd1dd3815ac9ae250e6
# bad: [24576d23976746cb52e7700c4cadbf4bc1bc3472] drm/i915: enable VT switchless resume v3
git bisect bad 24576d23976746cb52e7700c4cadbf4bc1bc3472
# good: [b5644d0554f37016763f615bd65cd68af96aa509] drm/i915: restore cursor and sprite state when forcing a config restore v2
git bisect good b5644d0554f37016763f615bd65cd68af96aa509
lapmdb-hpl linux-stable #>> 
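
A bisect log like the one above can be fed back to git to recreate the session, for example on another checkout of the same tree. The following is only a sketch: `replay_bisect_log` is a hypothetical helper name (not part of git), and it assumes the log was saved to a file and that the installed git supports the `-C` option.

```shell
#!/bin/sh
# Sketch only: replay_bisect_log is a hypothetical helper, not a git command.
#
# $1 = path to the git tree (e.g. the linux-stable checkout)
# $2 = absolute path to a file holding saved `git bisect log` output
#
# `git bisect replay` re-runs the "git bisect ..." commands found in the file,
# so the tree ends up in the same bisect state as when the log was written.
replay_bisect_log() {
    git -C "$1" bisect replay "$2"
}
```

Usage might look like `replay_bisect_log /usr/src/linux-stable /tmp/bisect.log` (both paths are assumptions). The `# good:` / `# bad:` comment lines in the log can stay in the file; only the `git bisect` command lines are acted on. When finished, `git bisect reset` returns the tree to the branch it was on before bisecting.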

(useless) full log from something like the third step onwards available upon request (1.9 MB, zipped 303 kB).


Can anyone tell me how I can get a diff (git diff _what_?) from that information? I'd like to verify that the hangs go away in 3.10.x if that commit is reverted...
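
One possible answer to the "git diff _what_?" question, as a minimal sketch: the diff of a single commit is the difference between its parent (`<commit>^`) and the commit itself, and `git revert --no-commit` applies the inverse of that diff to the working tree. The function names below are hypothetical helpers, not git commands, and whether commit 24576d23976746cb52e7700c4cadbf4bc1bc3472 reverts cleanly on top of a given 3.10.x tree has not been checked here.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical helpers, not part of git.

# Print the commit message plus the patch introduced by one commit.
# $1 = path to the git tree, $2 = commit id
show_commit() {
    git -C "$1" show "$2"
}

# Print only the diff between the commit's parent and the commit itself.
# $1 = path to the git tree, $2 = commit id
commit_diff() {
    git -C "$1" diff "$2^" "$2"
}

# Undo the commit in the working tree and index without committing,
# so the kernel can be rebuilt and the resume test repeated.
# $1 = path to the git tree, $2 = commit id
revert_for_test() {
    git -C "$1" revert --no-commit "$2"
}
```

For this bug, one would check out the 3.10.x version to be tested, run `revert_for_test . 24576d23976746cb52e7700c4cadbf4bc1bc3472` inside the tree, then rebuild and retry suspend/resume. If the revert stops with conflicts, `git revert --abort` restores the previous state.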
Comment 12 Rafael J. Wysocki 2013-08-20 20:45:38 UTC
Reassigning to DRM/Intel.
Comment 13 Jesse Barnes 2013-08-20 21:02:27 UTC
Hm so this is a hybrid gfx machine?  Maybe you need to have fglrx loaded, but it can't support VT-switchless resume?
Comment 14 MDBürkle 2013-08-20 21:47:41 UTC
(In reply to Jesse Barnes from comment #13)
> Hm so this is a hybrid gfx machine?

Yes, this is a core-i5-2430M-graphics-and-amd-6470(M)-gpu machine.

From lspci -vmmnn:

Slot:   00:02.0
Class:  VGA compatible controller [0300]
Vendor: Intel Corporation [8086]
Device: 2nd Generation Core Processor Family Integrated Graphics Controller [0116]
SVendor:        Hewlett-Packard Company [103c]
SDevice:        Device [1670]
Rev:    09

Slot:   01:00.0
Class:  VGA compatible controller [0300]
Vendor: Advanced Micro Devices [AMD] nee ATI [1002]
Device: Seymour [Radeon HD 6400M/7400M Series] [6760]
SVendor:        Hewlett-Packard Company [103c]
SDevice:        Device [1670]

AMD renamed the chips from 6470 to 7470 or the like, lately.


> Maybe you need to have fglrx loaded,
> but it can't support VT-switchless resume?

I do have fglrx loaded as my X configuration is based on it. However, I had a working setup without fglrx, before...

I was not aware that suspending/resuming in pre-3.10.x kernels uses VT switching (usually vt7/vt8). So that flickering of the screen (e.g. in 3.9.11) is the VT switching. Which vt is it switching to (the controlling tty? nothing is written there, only a black screen), or do I guess right that it's just the mode of vt7/vt8 that changes from graphics to text?

In one of the many hangs I experienced over the last few days, I was able to move the mouse pointer before the system froze (i.e. suspend by closing the laptop, wait for the power LED to blink slowly, resume by opening the laptop, see the mouse pointer move some 3 cm, see the mouse pointer (and system) frozen a fraction of a second later).
Is that of importance?

TIA
Comment 15 MDBürkle 2013-08-21 20:35:23 UTC
Tried 3.10.9-gentoo, result was a hang.
Comment 16 MDBürkle 2013-08-24 09:17:03 UTC
(In reply to Jesse Barnes from comment #13)
> Maybe you need to have fglrx loaded,
> but it can't support VT-switchless resume?

Hmmm, looks like fglrx is the bad boy here:(

Running X11 with the intel device driver lets the system continue after resume from suspend-to-ram (tested with 3.10.9-gentoo).

Looks like the discrete gpu is just good for producing heat; will try to run "intel+radeon" dual-X11 setup next (probably after August 31st)...
Comment 17 Daniel Vetter 2013-08-24 14:30:43 UTC
fglrx isn't supported by us (nor are blob drivers in general); please poke AMD about this.
