Bug 44391 - After resume, Nvidia Optimus GPU is ON even when turned off with switcheroo
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(Other)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-other
URL:
Keywords:
Duplicates: 45061
Depends on:
Blocks:
 
Reported: 2012-07-10 14:25 UTC by Peter Wu
Modified: 2016-07-15 14:03 UTC
CC List: 7 users

See Also:
Kernel Version: 3.6
Subsystem:
Regression: No
Bisected commit-id:

Description Peter Wu 2012-07-10 14:25:55 UTC
This issue is not new; it already existed a year ago, but I'm reporting it now so that it is tracked somewhere.

Problem description:
After suspend/resume, the Nvidia GPU is turned on while vgaswitcheroo thinks it's off. As a result, the command "ON" must be sent to vgaswitcheroo, followed by "OFF", to put the device back to sleep.
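
The ON-then-OFF workaround described above could be wrapped in a small shell function. This is only a sketch: the function name resync_gpu_off and the overridable SWITCH variable are made up for illustration, debugfs is assumed to be mounted at /sys/kernel/debug, and the function must run as root.

```shell
# Path of the vga_switcheroo control file. It can be overridden, which
# allows trying the function against a scratch file first.
SWITCH=${SWITCH:-/sys/kernel/debug/vgaswitcheroo/switch}

# Send ON, then OFF, so that the state vga_switcheroo tracks and the
# actual power state of the discrete GPU agree again after resume.
# (resync_gpu_off is a hypothetical helper name, not part of any tool.)
resync_gpu_off() {
    echo ON > "$SWITCH" || return 1
    echo OFF > "$SWITCH"
}
```

After a resume it would be called as: resync_gpu_off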

Steps to reproduce:
1. Load "nouveau" if it wasn't already
2. Turn the card off: echo OFF > /sys/kernel/debug/vgaswitcheroo/switch
3. Tests:

   a) Test switcheroo state: cat /sys/kernel/debug/vgaswitcheroo/switch
   b) Test for the card state: watch powertop power usage and/or check the PCI configuration space
   c) Check temperature: sensors
4. Suspend/resume
5. Repeat step 3

Expected result:
The card should still be off after resume.
The tests in steps 3 and 5 should give the same results:
a) Output: 0:IGD:+:Pwr:0000:00:02.0
           1:DIS: :Off:0000:01:00.0
b) Power usage should be similarly low, PCI configuration space should be unreadable (all bits on, ff ff ff ...)
c) Temperature for nouveau is reported as -1 degrees Celsius.

Actual result:
The card is on after resume.
The output of (a) is as expected, but the PCI configuration space reports sensible information because the card is on. This can also be seen in the increased power usage in (b). Also, the temperature in (c) is reported correctly, which should not be the case if the card is off.
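
The mismatch between tests (a) and (b) can be detected from a script. The sketch below makes several assumptions: the switcheroo output has the format shown in (a), the discrete GPU sits at 0000:01:00.0 as in this report, and a powered-off card reads back ff for the first four config-space bytes. All function names are hypothetical.

```shell
SWITCH=${SWITCH:-/sys/kernel/debug/vgaswitcheroo/switch}
# sysfs view of the PCI configuration space; the address is taken from
# the report and will differ on other machines.
PCI_CONFIG=${PCI_CONFIG:-/sys/bus/pci/devices/0000:01:00.0/config}

# Print the state vga_switcheroo reports for the discrete GPU,
# e.g. "Off" or "Pwr" (fourth colon-separated field of the DIS line).
switcheroo_dis_state() {
    awk -F: '$2 == "DIS" { print $4 }' "$SWITCH"
}

# Succeed if the first four config-space bytes (vendor and device ID)
# read back as ff ff ff ff, i.e. the device does not respond.
pci_config_unreadable() {
    [ "$(od -An -tx1 -N4 "$PCI_CONFIG" | tr -d ' \n')" = "ffffffff" ]
}

# Succeed if switcheroo claims Off while the config space is readable,
# which is the inconsistent state described in this report.
gpu_state_mismatch() {
    [ "$(switcheroo_dis_state)" = "Off" ] && ! pci_config_unreadable
}
```

Running gpu_state_mismatch as root after resume should succeed on an affected machine and fail on a machine where the card really is off.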

Hardware details:
- NVIDIA GT 425M (Device ID 0df0)
- "Optimus" _DSM UUID

Previously, I had a hacky patch in the driver (nouveau) code that enabled the nvidia card on suspend and disabled it on resume. However, this unnecessarily slows down suspend (by turning the nvidia card on).
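
For illustration, the same enable-on-suspend, disable-on-resume idea can be approximated from userspace instead of in the driver. This is not the patch mentioned above, only a sketch. It assumes a sleep hook mechanism that passes "pre" before suspend and "post" after resume as its first argument (as systemd's system-sleep hooks do), and handle_sleep_phase is a hypothetical name. It also assumes the card is always wanted off after resume, so it does not remember the earlier state.

```shell
SWITCH=${SWITCH:-/sys/kernel/debug/vgaswitcheroo/switch}

# $1 is the sleep phase: "pre" (about to suspend) or "post" (resumed).
handle_sleep_phase() {
    case "$1" in
        # Power the card up so hardware and switcheroo agree across suspend.
        pre)  echo ON > "$SWITCH" ;;
        # Power it down again once the system is back.
        post) echo OFF > "$SWITCH" ;;
        # Ignore any other phase.
        *)    return 0 ;;
    esac
}
```

A hook script would call it as: handle_sleep_phase "$1"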

Checklist for a fix:

- Disable the nvidia GPU on resume if the vga_switcheroo state wants it.
- Do not enable the nvidia GPU on suspend (which is what bbswitch[1] does)
- Still work with the other switcheroo clients (like radeon).
- Still work for audio clients (should be fine without additional modifications)
- Work for the two nvidia DSM UUIDs

Does this bug only occur on the Optimus _DSM machines? E.g. are radeon ATPX and "nvidia" DSM machines also affected?

 [1]: https://github.com/Bumblebee-Project/bbswitch
Comment 1 Joaquín Aramendía 2012-07-10 21:54:51 UTC
I can confirm this bug affects proto-optimus nvidia DSM machines:
NVIDIA GT218 [GeForce 310M]

The card is ON after resume, but switcheroo thinks it's OFF. If I issue

# tee /sys/kernel/debug/vgaswitcheroo/switch <<< OFF

nothing happens. If I then issue

# tee /sys/kernel/debug/vgaswitcheroo/switch <<< ON

it hangs. The laptop is still responsive from a TTY, but X will hang (and will freeze the laptop).
Comment 2 rocko 2012-09-09 12:05:18 UTC
3.6-rc5 with X server 1.13 still hangs my laptop (i7-2630QM CPU with discrete nvidia 540M) if I try a suspend/resume cycle with nouveau loaded and if I have turned off the nvidia card via vgaswitcheroo. 

When I tried it, the laptop appeared to suspend properly, but on resume it came up with an unresponsive black screen (with the backlight on).

The last message in Xorg.0.log was:

[  3920.996] (II) AIGLX: Suspending AIGLX clients for VT switch 
[  3920.996] (II) NOUVEAU(G0): NVLeaveVT is called.

and in syslog:

Sep  9 18:28:54 sierra kernel: [ 3920.078844] PM: Syncing filesystems ... done.
Sep  9 18:28:54 sierra kernel: [ 3920.311083] PM: Preparing system for mem sleep
Comment 3 Peter Wu 2012-10-04 17:01:26 UTC
One observation about suspend/resume under Windows on my Optimus laptop: before suspend, the card is also turned on (the LED indicator switches to "discrete"). On resume, the LED indicator is initially "integrated", but then jumps to "discrete" and back to "integrated".

So my approach of turning the device on before suspend and disabling it on resume was not that bad. I am not sure about ATI laptops though, and I am very interested in laptops with ATI hybrid graphics. Could anyone with ATI hybrid graphics provide details on that?
Comment 4 Peter Wu 2012-10-05 15:32:35 UTC
I have posted a patch for nouveau before:
https://bugzilla.kernel.org/show_bug.cgi?id=15845#c7

Another duplicate bug, but for ATI hybrids:
https://bugzilla.kernel.org/show_bug.cgi?id=45061
Comment 5 Alessandro Pignotti 2012-10-05 19:22:05 UTC
*** Bug 45061 has been marked as a duplicate of this bug. ***
Comment 6 Paolo Leoni 2013-03-13 13:41:07 UTC
I confirm this bug for ATI-Intel hybrid graphics with Radeon 7670M.

After resume, the discrete card can't be switched off.
Comment 7 Facundo Aguilera 2013-10-25 20:58:00 UTC
I can confirm this on a Sony VAIO with an ATI Radeon 7500 Series. Additionally, the X server hangs when switching to any console or restarting X if the discrete GPU is off.
Comment 8 Peter Wu 2016-07-15 14:03:19 UTC
With nouveau runpm=1 (or runpm=-1 on an Optimus system), the card is properly disabled after suspend. The same applies to the radeon module.
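
Whether runtime power management is in effect can be checked from a shell. This sketch assumes the nouveau module exposes its runpm parameter under /sys/module and that the discrete GPU is at 0000:01:00.0 as in this report. The function names are hypothetical, and the files may need root to read.

```shell
RUNPM_PARAM=${RUNPM_PARAM:-/sys/module/nouveau/parameters/runpm}
RUNTIME_STATUS=${RUNTIME_STATUS:-/sys/bus/pci/devices/0000:01:00.0/power/runtime_status}

# Succeed unless runpm is unreadable or explicitly 0 (disabled).
# Note that -1 only enables runtime PM on an Optimus system.
runpm_not_disabled() {
    [ -r "$RUNPM_PARAM" ] && [ "$(cat "$RUNPM_PARAM")" != "0" ]
}

# Succeed if the PCI core reports the GPU as runtime-suspended.
gpu_runtime_suspended() {
    [ "$(cat "$RUNTIME_STATUS" 2>/dev/null)" = "suspended" ]
}
```

After a suspend/resume cycle with the GPU idle, both functions would be expected to succeed on a system where the fix works.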