Bug 48981 (cybercyst)
Summary: | commit breaks the ability to turn on and off my nvidia optimus card | ||
---|---|---|---|
Product: | Power Management | Reporter: | Forrest Loomis (cybercyst) |
Component: | Run-Time-PM | Assignee: | Huang Ying (ying.huang) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | florian, kam1kaz3, lenb, peter, rui.zhang, ying.huang |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.6 and up | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Forrest Loomis
2012-10-17 22:54:48 UTC
there have been some changes recently, can you verify that this is still a problem in 3.7-rc1?

(In reply to comment #0)
> I recently upgraded to linux 3.6 in the testing repo in Arch Linux and to my
> dismay found that bbswitch was no longer able to power on and off my NVIDIA
> optimus card.
>
> I bisected the kernel and found that it stopped working at commit
> 71a83bd727cc31c5fe960c3758cb396267ff710e
>
> I posted a bug report with the programmers behind bumblebee / bbswitch at
> https://github.com/Bumblebee-Project/bbswitch/issues/35 and we could use some
> help finding out why we are seeing this error!

Can you try to disable the runtime PM for PCIe bridge of NVIDIA card? ...

Could you provide instructions to do that?

Linux 3.4-rc1 also does not allow bbswitch to turn on and off my NVIDIA card.

Whoops, I mean 3.7-rc1...

FIXED:
I changed /etc/laptop-mode/conf.d/runtime-pm.conf and changed the line

CONTROL_RUNTIME_PM="auto"

to:

CONTROL_RUNTIME_PM="0"

Everything works as expected now.

Laptop-mode-tools' default settings were to blame here...
Thanks Huang Ying, your post helped me dig up the solution!

(In reply to comment #6)
> FIXED:
> I changed /etc/laptop-mode/conf.d/runtime-pm.conf and changed the line
>
> CONTROL_RUNTIME_PM="auto"
>
> to:
>
> CONTROL_RUNTIME_PM="0"
>
> Everything works as expected now.
>
> Laptop-mode-tools' default settings were to blame here...
> Thanks Huang Ying, your post helped me dig up the solution!

I still think this maybe a bumblebee / bbswitch issue. It need to resume device before operating on the device. Can you suggest that?

> I still think this maybe a bumblebee / bbswitch issue. It need to resume
> device before operating on the device. Can you suggest that?
bbswitch (the kernel module, bumblebee is just a user) does not claim a device, it just does a one-shot power on/off action.
The device is woken by calling the _PS0 ACPI method. The code that brings the card back to life is:
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
pci_enable_device(pdev);
pci_set_master(pdev);
And off:
pci_save_state(pdev);
pci_clear_master(pdev);
pci_disable_device(pdev);
pci_set_power_state(pdev, PCI_D3hot);
(In my unpushed repo I have changed this to PCI_D3cold since the device actually sleeps that deep, is that a sensible thing to do?)
My Nvidia video card is connected through the PCIe port according to sysfs:
0000:01:00.0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
lspci
00:01.0 PCI bridge [0604]: Intel Corporation Core Processor PCI Express x16 Root Port [8086:0045] (rev 02)
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 425M] [10de:0df0] (rev a1)
The following holds regardless of having the nvidia card powered on or off.
Once I enable runtime PM for 00:01.0 (the PCIe port), `lspci` reports the correct values, but /proc/bus/pci/01/00 reports all bits on. bbswitch tries to determine whether the card is on or off by reading the PCI config space: if it returns all bits on, the nvidia card is assumed to be off. The code for that is:
pci_read_config_dword(dis_dev, 0, &cfg_word);
Apparently, this fails when run-time PM is enabled for the PCIe port. The bbswitch module does not really operate the device, it only turns the card on and off. When a driver like nouveau/nvidia is loaded, bbswitch refuses to disable the nvidia card. Hence I do not attempt to register as a driver, since that would make it impossible for other drivers to register, right? Any suggestions?
When I load the nouveau driver, reading the PCI config space works fine (that is, not all bits are on).
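The "all bits on" check described above can be approximated from user space, which may help when comparing what /proc/bus/pci and sysfs report for the same device. The following is only a sketch and not bbswitch's code: bbswitch calls pci_read_config_dword() inside the kernel, whereas this reads the first dword from a config-space file. The names probe_config and probe_result are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum probe_result {
    PROBE_ERROR = -1,   /* file missing, or fewer than 4 bytes readable */
    PROBE_PRESENT = 0,  /* first dword holds something other than all ones */
    PROBE_ALL_ONES = 1  /* first dword is 0xFFFFFFFF, i.e. "all bits on" */
};

/*
 * Read the first dword (vendor and device ID) of a PCI config-space file
 * and report whether every bit is set. An empty or short read is reported
 * as an error rather than guessed at.
 */
static enum probe_result probe_config(const char *path)
{
    uint32_t word = 0;
    size_t n;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (!f)
        return PROBE_ERROR;
    n = fread(&word, 1, sizeof(word), f);
    fclose(f);
    if (n != sizeof(word))
        return PROBE_ERROR;
    /* All ones reads the same regardless of byte order. */
    return word == 0xFFFFFFFFu ? PROBE_ALL_ONES : PROBE_PRESENT;
}
```

Calling probe_config("/sys/bus/pci/devices/0000:01:00.0/config") would probe the card from the lspci output above through sysfs; the matching file under /proc/bus/pci can be passed the same way to see whether the two views disagree.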
(In reply to comment #8)
> > I still think this maybe a bumblebee / bbswitch issue. It need to resume
> > device before operating on the device. Can you suggest that?
>
> bbswitch (the kernel module, bumblebee is just a user) does not claim a
> device,
> it just does a one-shot power on/off action.
>
> The device is woken by calling the _PS0 ACPI method. The code that brings the
> card back to life is:
>
> pci_set_power_state(pdev, PCI_D0);
> pci_restore_state(pdev);
> pci_enable_device(pdev);
> pci_set_master(pdev);
>
> And off:
>
> pci_save_state(pdev);
> pci_clear_master(pdev);
> pci_disable_device(pdev);
> pci_set_power_state(pdev, PCI_D3hot);

Why not use pm_runtime_suspend/pm_runtime_resume for the device? If that is impossible, you need to suspend/resume the parent of pdev (bridge) manually too.

I believe that pm_runtime_{suspend,resume} is not possible for this driver because it binds the device. Is that a correct observation from me? I went the manual way by calling pm_runtime_{get,put}_sync on the bus device (assuming that is the parent).

So, to make this bug more relevant to the Linux kernel instead of my project, this runtime PM behavior breaks /proc/bus/pci/??/??.?. Before 3.6, this file could be read even if there is no driver available. After this commit, this file reads empty. Note that /sys/bus/pci/devices/0000:??:??.?/config returns the correct values.

(In reply to comment #10)
> I believe that pm_runtime_{suspend,resume} is not possible for this driver
> because it binds the device. Is that a correct observation from me? I went
> the
> manual way by calling pm_runtime_{get,put}_sync on the bus device (assuming
> that is the parent).

Yes. That should works.

> So, to make this bug more relevant to the Linux kernel instead of my project,
> this runtime PM behavior breaks /proc/bus/pci/??/??.?. Before 3.6, this file
> could be read even if there is no driver available. After this commit, this
> file reads empty. Note that /sys/bus/pci/devices/0000:??:??.?/config returns
> the correct values.

Oh, yes, that is a bug. I will fix it. But I think you need open another bugzilla item?

I see you have fixed that /proc/acpi bug http://www.spinics.net/lists/linux-pci/msg18282.html
This issue can be closed as FIXED!

*** Bug 49031 has been marked as a duplicate of this bug. ***

A patch referencing this bug report has been merged in Linux v3.7-rc5:

commit b3c32c4f9565f93407921c0d8a4458042eb8998e
Author: Huang Ying <ying.huang@intel.com>
Date: Thu Oct 25 09:36:03 2012 +0800

PCI/PM: Fix proc config reg access for D3cold and bridge suspending

Ying, can this bug be closed?

(In reply to comment #15)
> Ying, can this bug be closed?

Yes. I think so.
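For anyone wanting to check whether a given kernel still shows the symptom reported in this thread (the /proc/bus/pci file reading empty while the sysfs config file returns values), a small user-space comparison of the two files could look like the sketch below. It is an illustration under assumptions, not part of the merged patch: the names compare_config, read_header and cfg_cmp are hypothetical, and only the 64-byte standard configuration header is compared.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CFG_HEADER_LEN 64  /* size of the standard PCI configuration header */

/* Hypothetical result codes for this sketch. */
enum cfg_cmp {
    CFG_ERROR = -1,      /* a file is missing, or the sysfs read was short */
    CFG_MATCH = 0,       /* both files return the same header bytes */
    CFG_PROC_SHORT = 1,  /* procfs file returned fewer than 64 bytes */
    CFG_DIFFER = 2       /* both returned 64 bytes, but the bytes differ */
};

/* Read up to CFG_HEADER_LEN bytes; returns (size_t)-1 if the open fails. */
static size_t read_header(const char *path, unsigned char *buf)
{
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return (size_t)-1;
    n = fread(buf, 1, CFG_HEADER_LEN, f);
    fclose(f);
    return n;
}

/* Compare the config header as seen through procfs and through sysfs. */
static enum cfg_cmp compare_config(const char *proc_path, const char *sys_path)
{
    unsigned char proc_buf[CFG_HEADER_LEN];
    unsigned char sys_buf[CFG_HEADER_LEN];
    size_t np, ns;

    assert(proc_path != NULL && sys_path != NULL);
    np = read_header(proc_path, proc_buf);
    ns = read_header(sys_path, sys_buf);
    if (np == (size_t)-1 || ns != CFG_HEADER_LEN)
        return CFG_ERROR;
    if (np < CFG_HEADER_LEN)
        return CFG_PROC_SHORT;
    return memcmp(proc_buf, sys_buf, CFG_HEADER_LEN) == 0 ? CFG_MATCH
                                                          : CFG_DIFFER;
}
```

A CFG_PROC_SHORT result corresponds to the "reads empty" behavior described above, and CFG_DIFFER to the earlier observation that procfs reports all bits on while lspci shows correct values. The paths passed in are left to the caller, since the exact device addresses differ per machine.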