Bug 156331 - [Regression]: Nvidia GT740M GPU dead after resume.
Status: RESOLVED INVALID
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Rafael J. Wysocki
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-08 08:11 UTC by Maik Freudenberg
Modified: 2016-09-09 14:01 UTC

See Also:
Kernel Version: >=3.15
Subsystem:
Regression: No
Bisected commit-id:



Description Maik Freudenberg 2016-09-08 08:11:36 UTC
To track this regression down, I built the kernel without any drivers for the GPU and booted to a text console.
Up to and including linux-3.14.x, resume worked fine. 3.15 through 4.2.7 do not work.
After a fresh boot,
>lspci -v -s 07:00.0
gives me
07:00.0 3D controller: NVIDIA Corporation GK208M [GeForce GT 740M] (rev a1)
	Subsystem: Lenovo GK208M [GeForce GT 740M]
	Flags: bus master, fast devsel, latency 0
	Memory at b3000000 (32-bit, non-prefetchable) [size=16M]
	Memory at a0000000 (64-bit, prefetchable) [size=256M]
	Memory at b0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at 4000 [disabled] [size=128]
	Expansion ROM at <ignored> [disabled]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19

I can then load bbswitch and turn the GPU on and off as I like; that works.
Suspend and resume using
>systemctl suspend
><press any key to resume>
With linux-3.14.x, lspci and bbswitch work as before. From 3.15 on, lspci gives me:
07:00.0 3D controller: NVIDIA Corporation GK208M [GeForce GT 740M] (rev ff) (prog-if ff)
	!!! Unknown header type 7f
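
The `rev ff`, `prog-if ff` and `header type 7f` values are what one would expect if every configuration-space read returns 0xff, which usually means the device no longer responds on the bus (lspci masks the multi-function bit, and 0xff & 0x7f is 0x7f). Below is a minimal Python sketch of that check. It assumes the standard sysfs `config` file for the device; the function names are hypothetical and the choice of bytes simply mirrors what lspci printed above.

```python
from pathlib import Path

# Standard PCI configuration header offsets.
REVISION_ID = 0x08
PROG_IF = 0x09
HEADER_TYPE = 0x0E


def read_config(bdf="0000:07:00.0", sysfs_root="/sys"):
    """Read raw config space; the default address is the one from this report."""
    return (Path(sysfs_root) / "bus/pci/devices" / bdf / "config").read_bytes()


def header_type(config):
    """Header layout type with the multi-function flag (bit 7) masked off."""
    return config[HEADER_TYPE] & 0x7F


def looks_unresponsive(config):
    """True if the header bytes lspci reported all read back as 0xff."""
    if len(config) <= HEADER_TYPE:
        raise ValueError("need at least the first 15 bytes of config space")
    return all(config[off] == 0xFF for off in (REVISION_ID, PROG_IF, HEADER_TYPE))
```

Calling `looks_unresponsive(read_config())` before and after a suspend cycle would show whether the device dropped off the bus, without depending on lspci's output format.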

Loading bbswitch and trying to switch gives:
Device refused to change power state. Currently in D3

3.15 brought many power management changes aimed at faster resume, notably patches using async suspend/resume, so I experimented with that, but without success.
With CONFIG_PM_ADVANCED_DEBUG=y set,
/sys/devices/pci0000:00/0000:00:01.1/0000:07:00.0/power/async is set to enabled.
This is odd in itself, as the default value should be disabled and no driver is loaded that could set that value. What sets this?
Long story short, neither setting this to disabled nor setting
/sys/power/pm_async
to 0 changes anything; resume still doesn't work.
I'm out of ideas now on where to look, please advise.
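
For anyone repeating the async experiment, here is a minimal Python sketch that reads and disables the two knobs mentioned above. The two paths are the ones quoted in this report; the function names and the `sysfs_root` parameter are hypothetical, and writing to the real files requires root and CONFIG_PM_ADVANCED_DEBUG for the per-device attribute.

```python
from pathlib import Path

# Paths quoted in the report, relative to the sysfs mount point.
DEVICE_ASYNC = "devices/pci0000:00/0000:00:01.1/0000:07:00.0/power/async"
GLOBAL_ASYNC = "power/pm_async"


def read_async_settings(sysfs_root="/sys"):
    """Return the per-device and global async suspend/resume settings."""
    root = Path(sysfs_root)
    return {
        "device": (root / DEVICE_ASYNC).read_text().strip(),
        "global": (root / GLOBAL_ASYNC).read_text().strip(),
    }


def disable_async(sysfs_root="/sys"):
    """Turn async suspend/resume off for the device and globally."""
    root = Path(sysfs_root)
    (root / DEVICE_ASYNC).write_text("disabled\n")
    (root / GLOBAL_ASYNC).write_text("0\n")
```

Running `disable_async()` followed by `read_async_settings()` before suspending confirms that both values actually took effect for the test.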
PCI-ID: 10de:1292
Regards.
Comment 1 Maik Freudenberg 2016-09-08 09:19:56 UTC
linux-3.15-rc1 doesn't work either.
cat /sys/kernel/debug/suspend_stats reports zero failures.
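
A minimal Python sketch for checking those counters programmatically. It assumes the file consists of top-level `name: count` lines followed by an indented `failures:` detail section; that layout is an assumption and may differ between kernel versions. The function names are hypothetical.

```python
def parse_suspend_stats(text):
    """Collect the top-level 'name: integer' counters from suspend_stats."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip section headers and indented last_failed_* detail lines.
        if sep and not line[:1].isspace() and value.lstrip("-").isdigit():
            counters[name.strip()] = int(value)
    return counters


def failure_counters(counters):
    """Return only the fail* counters that are non-zero."""
    return {k: v for k, v in counters.items() if k.startswith("fail") and v}
```

An empty result from `failure_counters(parse_suspend_stats(text))` corresponds to the "zero failures" reading here, which only says that no suspend or resume step reported an error, not that every device came back.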
Comment 2 Maik Freudenberg 2016-09-08 09:56:20 UTC
Next oddity: with ACPI debugging enabled, I get this just before the 'Device refused to change power state...' line:
device_pm-0164 device_set_power      : Device [PEGP] already in D0
Comment 3 Maik Freudenberg 2016-09-09 14:01:16 UTC
OK, no bug. Bisecting led me to the obvious acpi_osi="!Windows 2013". This was my first guess, but I never tried that setting on its own; specifying it together with e.g. "!Windows 2012" didn't fix it.
Sorry for the noise.
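
Since the outcome depended on exactly which acpi_osi= values were passed, it can help to check what the running kernel actually received. Below is a minimal Python sketch that extracts them from a kernel command line such as the contents of /proc/cmdline. Using shell-style quoting rules via `shlex` is an assumption that holds for the double-quoted form shown above; the function name is hypothetical.

```python
import shlex


def acpi_osi_args(cmdline):
    """Return the value of every acpi_osi= parameter, in command-line order."""
    values = []
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if sep and key == "acpi_osi":
            values.append(value)
    return values
```

For example, `acpi_osi_args(open("/proc/cmdline").read())` returning only `["!Windows 2013"]` would confirm that this setting is being tested on its own.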
