Created attachment 107381 [details]
acpidump for Acer V5-573G (from Felixll)
Another "vendor challenges ACPI" bug, this time the Acer V5-573G has an issue where the wrong ACPI handle is detected for the NVIDIA graphics card.
Previous related bugs that must be considered when fixing this bug: bug 60561, bug 42696 (Lenovo).
- Launchpad (lspci, dmidecode, acpidump)
Created attachment 107382 [details]
dmesg for Acer V5-573G (from Felixll)
- acpixtract from the latest iasl source seems to be broken, it does not extract two of the five SSDTs.
- \_SB.PCI0.RP05.PXSX (from DSDT) is detected where \_SB.PCI0.RP05.PEGP (from SSDT) is correct. Both have _ADR 0, but PXSX only has one other method, _PSW.
It's more of a driver-specific issue to me; perhaps we should solve it in the specific driver instead? The ACPI core doesn't know which handle is useful when more than one has the same _ADR encoding, but the specific driver does. So Peter, is it possible to solve these problems in Bumblebee (I don't know much about this project)? If Bumblebee has a kernel module, it can check all ACPI devices that are children of the PCI root port RP05 and then decide which handle is the right one. Having found the right handle, the rebind can be done.
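To make the driver-side idea concrete, here is a minimal user-space sketch in C. It is not kernel code: `struct model_node`, `has_method()` and `find_child_with_method()` are hypothetical names invented for illustration, and the choice of `_DSM` as the distinguishing method is an assumption based on the methods mentioned in this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an ACPI child node under a PCI root port. */
struct model_node {
    const char *name;            /* e.g. "PXSX" or "PEGP" */
    unsigned long long adr;      /* value returned by _ADR */
    const char *const *methods;  /* names of the methods the node defines */
    size_t n_methods;
};

/* Return true if the node defines a method with the given name. */
static bool has_method(const struct model_node *node, const char *method)
{
    for (size_t i = 0; i < node->n_methods; i++) {
        if (strcmp(node->methods[i], method) == 0)
            return true;
    }
    return false;
}

/*
 * Among all children sharing the same _ADR, return the first one that
 * defines the method the driver needs, or NULL if none does.
 */
static const struct model_node *
find_child_with_method(const struct model_node *children, size_t n,
                       unsigned long long adr, const char *method)
{
    for (size_t i = 0; i < n; i++) {
        if (children[i].adr == adr && has_method(&children[i], method))
            return &children[i];
    }
    return NULL;
}
```

With two children at _ADR 0, one defining only `_PSW` and the other defining `_STA` and `_DSM`, a lookup for `_DSM` would return the second node regardless of enumeration order.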
Yes, this is also my idea. I am working on this. I think the ACPI core should provide some information about multiple ACPI nodes that share the same _ADR, or a way to get the next ACPI node with the same _ADR.
NVIDIA has stated that they do not have specific logic to check ACPI handles and rely on the OS to provide the correct ones.
It is possible to work around shortcomings in the ACPI layer, but since Windows works fine on these devices, the issue should be solved in the Linux implementation of ACPI.
Would it make sense to pick the last detected Device? (guesswork) Vendors may provide a generic DSDT and add SSDTs for more specific configurations.
(In reply to Peter from comment #5)
> NVIDIA has stated that they do not have specific logic to check ACPI handles
> and rely on the OS to provide the correct ones.
Good to know this. So it basically means Windows picks the last ACPI handle among those that have the same _ADR.
> It is possible to work around shortcomings in the ACPI layer, but since
> Windows works fine on these devices, the issue should be solved in the Linux
> implementation of ACPI.
Yes, Linux ACPI can do this to follow Windows' behavior.
> Would it make sense to pick the last detected Device? (guesswork) Vendors
> may provide a generic DSDT and add SSDTs for more specific configurations.
I think so if Windows has this behavior.
My take on this problem is: from the ACPI spec's point of view, there is no way for Linux ACPI to pick a handle when multiple handles have the same _ADR, but the specific driver can (like the way you did in Bumblebee by checking for the handle name PEGP). But if Windows has a rule (how can we be sure of this?), it is worth following.
Something just occurred to me: the previous commits introduced a check for _STA. I cannot find a _STA method for \_SB.PCI0.RP05.PXSX, while \_SB.PCI0.RP05.PEGP._STA returns 0x0f. That alone should make the ACPI core return the PEGP device, right?
Nope. No _STA means "present and enabled".
And I really am not sure what Windows does here. Perhaps it just gives all of them to the driver.
It would be good to know what *exactly* NVidia expects to be done by the OS. Are we supposed to give all of them to the driver? Or are we supposed to combine them into one?
Or perhaps Windows simply replaces existing objects from the DSDT by new ones from the tables loaded later if they happen to have the same _ADR?
Without knowing exactly what the expectation is we can only guess and that's likely to get things wrong.
Well, we could make a rule like "prefer objects having _STA that says 'present and enabled' to objects that have no _STA".
Would that help here?
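As an illustration of that rule, here is a minimal user-space sketch in C. It is not the patch attached to this report: `struct model_dev`, `score_candidate()` and `pick_by_adr()` are hypothetical names, and the scoring values are arbitrary. The only facts it relies on are that a missing _STA means "present and enabled" and that bits 0 and 1 of the _STA result are the "present" and "enabled" flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an ACPI device object; not a kernel type. */
struct model_dev {
    const char *path;        /* namespace path, for display only */
    unsigned long long adr;  /* value returned by _ADR */
    bool has_sta;            /* whether the object defines _STA */
    unsigned int sta;        /* _STA result, valid only if has_sta */
};

#define STA_PRESENT 0x01u    /* _STA bit 0 */
#define STA_ENABLED 0x02u    /* _STA bit 1 */

/*
 * Score a candidate:
 *   -1  _STA exists but does not say "present and enabled"
 *    1  no _STA (implicitly present and enabled)
 *    2  _STA exists and says "present and enabled"
 */
static int score_candidate(const struct model_dev *dev)
{
    const unsigned int want = STA_PRESENT | STA_ENABLED;

    if (!dev->has_sta)
        return 1;
    if ((dev->sta & want) != want)
        return -1;
    return 2;
}

/* Return the best-scoring object with the given _ADR, or NULL. */
static const struct model_dev *
pick_by_adr(const struct model_dev *devs, size_t n, unsigned long long adr)
{
    const struct model_dev *best = NULL;
    int best_score = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].adr != adr)
            continue;
        int score = score_candidate(&devs[i]);
        if (score > best_score) {
            best = &devs[i];
            best_score = score;
        }
    }
    return best;
}
```

Applied to this machine's case (PXSX without _STA, PEGP with _STA returning 0x0f, both at _ADR 0), the sketch would select PEGP whichever of the two is enumerated first.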
Oh right, missing _ADR implies presence.
So far, the nvidia driver has been expecting only one device, no need for merging multiple devices. Only one of them can be correct.
Preferring devices with _STA over non-_STA will help here. I cannot believe that MS would use a reverse loop to find the device, that is not logical at all.
(In reply to Peter from comment #10)
> Oh right, missing _ADR implies presence.
Surely missing _STA?
> So far, the nvidia driver has been expecting only one device, no need for
> merging multiple devices. Only one of them can be correct.
> Preferring devices with _STA over non-_STA will help here. I cannot believe
> that MS would use a reverse loop to find the device, that is not logical at
> all.
No, it is not, but it can do fixups when a new table is loaded.
I'll attach a patch to test in a while.
Created attachment 107416 [details]
ACPI / bind: Prefer device objects with _STA present
I wonder if this helps (untested).
Can you please try the patch in comment #12? Thanks.
I'm one of the laptop owners from the Bumblebee thread on Github. I'm running Ubuntu 13.04 on my laptop and applied your patch to a 3.11 kernel. Bumblebee works fine with these modifications and is able to switch between the graphics cards. Dmesg still prints some ACPI warnings on boot and especially when running an application on the discrete graphics adapter:
ACPI Warning: \_SB_.PCI0.RP05.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)
I'm not sure if you already know but this issue started to appear on kernel version 3.8.0. On kernel 3.7.10 Bumblebee works fine without any modifications and there are also no ACPI warnings. On 3.7.10 I can also boot without the "nomodeset" boot parameter, which results in a black screen on later kernels (including the patched one).
I've uploaded the complete dmesg log files to our issue thread at GitHub: https://github.com/Bumblebee-Project/Bumblebee/issues/460#issuecomment-23860731
Thank you for your help!
I somehow missed the confirmation from Felix on Github. It appears to work (as reported in comment 14 here).
The _DSM warnings can be ignored as NVIDIA did something wrong there, requiring a Buffer for optimal compatibility with different BIOSes. See also this discussion.
Your black screen issue is probably unrelated to this issue.
OK, I'll submit the patch to the list, but it would be good to verify that it doesn't introduce regressions for the users in bug #60561 and bug #42696.
I've asked affected users of those bugs to test that patch.
No regressions reported for bug 42696.
No regressions reported for bug 60561 either.
Works for Ubuntu 13.10 beta with nvidia-prime (no bumblebee): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1222838
OK, thanks! I'll queue this up as a fix for 3.12.
Fixed by commit 11b88ee ACPI / bind: Prefer device objects with _STA to those without it.
I just tried with the latest Ubuntu testing CD image that has the 3.13 kernel and the problem remains. Is it possible that the fix got somehow altered on the way to the upstream kernel and was broken?
Kernel log has
Jan 25 18:30:57 ubuntu kernel: [ 56.188578] vgaarb: this pci device is not a vga device
Jan 25 18:30:57 ubuntu kernel: [ 56.209120] nvidia 0000:01:00.0: irq 68 for MSI/MSI-X
Jan 25 18:30:57 ubuntu kernel: [ 56.215731] NVRM: failed to copy vbios to system memory.
Jan 25 18:30:57 ubuntu kernel: [ 56.218302] NVRM: RmInitAdapter failed! (0x30:0xffffffff:720)
Jan 25 18:30:57 ubuntu kernel: [ 56.218309] NVRM: rm_init_adapter failed for device bearing minor number 0
Jan 25 18:30:57 ubuntu kernel: [ 56.218327] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5
Meanwhile I experimented with the acpi-handle-hack kernel module, which
has been used to work around this problem, and it did the job for me
on the 3.11 kernel (it didn't build for 3.13).
Seems I can't reopen this bug, can someone else do it or should I file a new one?
Never mind. Turns out the 3.13 patch for the nvidia driver floating around
on the nvidia forums was incomplete and was confused by
the disappearance of the DEVICE_ACPI_HANDLE() macro - it built just
fine because of #ifdef checks but then failed to work.
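For anyone hitting the same thing, the pitfall can be sketched in plain C. This is a user-space illustration only, not the nvidia driver's code: the `MODEL_` macros and both functions are hypothetical stand-ins, with `MODEL_DEVICE_ACPI_HANDLE` playing the role of the removed macro and `MODEL_ACPI_HANDLE` the role of whatever accessor a newer kernel provides.

```c
#include <stddef.h>

typedef void *model_handle;

static int model_dummy_object;

/* Pretend this is the newer kernel: only the new accessor is defined. */
#define MODEL_ACPI_HANDLE(dev) ((model_handle)&model_dummy_object)

/*
 * Fragile pattern: if the old macro is gone, the code still compiles,
 * but the lookup is silently skipped and no handle is returned.
 */
static model_handle fragile_get_handle(void *dev)
{
#ifdef MODEL_DEVICE_ACPI_HANDLE
    return MODEL_DEVICE_ACPI_HANDLE(dev);
#else
    (void)dev;
    return NULL;
#endif
}

/*
 * Sturdier pattern: try each known accessor and refuse to build
 * if none of them is available.
 */
static model_handle robust_get_handle(void *dev)
{
    (void)dev;
#if defined(MODEL_DEVICE_ACPI_HANDLE)
    return MODEL_DEVICE_ACPI_HANDLE(dev);
#elif defined(MODEL_ACPI_HANDLE)
    return MODEL_ACPI_HANDLE(dev);
#else
#error "no known ACPI handle accessor"
#endif
}
```

The point of the `#error` branch is that a missing accessor turns into a build failure instead of a module that loads and then fails at runtime.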