Bug 60829 - Wrong ACPI handle is detected for Acer V5-573G
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Video
Hardware: All Linux
Importance: P1 normal
Assignee: acpi_power-video
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-01 21:05 UTC by Peter Wu
Modified: 2014-01-30 18:41 UTC
CC List: 8 users

See Also:
Kernel Version: 3.8 and later (3.10.10, 3.11-rc6 still affected)
Subsystem:
Regression: No
Bisected commit-id:


Attachments
acpidump for Acer V5-573G (from Felixll) (386.96 KB, application/octet-stream)
2013-09-01 21:05 UTC, Peter Wu
dmesg for Acer V5-573G (from Felixll) (62.66 KB, application/octet-stream)
2013-09-01 21:06 UTC, Peter Wu
ACPI / bind: Prefer device objects with _STA present (2.86 KB, patch)
2013-09-04 15:24 UTC, Rafael J. Wysocki

Description Peter Wu 2013-09-01 21:05:35 UTC
Created attachment 107381
acpidump for Acer V5-573G (from Felixll)

Another "vendor challenges ACPI" bug: this time the Acer V5-573G has an issue where the wrong ACPI handle is detected for the NVIDIA graphics card.

Previous related bugs that must be considered when fixing this bug: bug 60561, bug 42696 (Lenovo).

Reported at:
- Launchpad[1] (lspci, dmidecode, acpidump)
- Github[2]

 [1]: https://bugs.launchpad.net/lpbugreporter/+bug/752542/+attachment/3796348/+files/Acer-Aspire_V5-573G.tar.gz
 [2]: https://github.com/Bumblebee-Project/Bumblebee/issues/460
Comment 1 Peter Wu 2013-09-01 21:06:26 UTC
Created attachment 107382
dmesg for Acer V5-573G (from Felixll)
Comment 2 Peter Wu 2013-09-01 21:08:54 UTC
Notes:

- acpixtract from the latest iasl source seems to be broken: it does not extract two of the five SSDTs[1].
- \_SB.PCI0.RP05.PXSX (from DSDT) is detected, whereas \_SB.PCI0.RP05.PEGP (from SSDT) is the correct one. Both have _ADR 0, but PXSX only has one other method, _PSW.

 [1]: https://github.com/Bumblebee-Project/Bumblebee/issues/460#issuecomment-23625238
Comment 3 Aaron Lu 2013-09-03 00:49:48 UTC
To me this is more of a driver-specific issue, so perhaps we should solve it in the specific driver instead? The ACPI core does not know which handle is the useful one when more than one has the same _ADR encoding; the specific driver does. So Peter, is it possible to solve these problems in Bumblebee (I don't know much about this project)? If Bumblebee has a kernel module, it can check all ACPI devices that are children of the PCI root port RP05 and then decide which handle is the right one. Once the right handle is found, the rebind can be done.
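
A minimal sketch of the driver-side selection suggested above, written as a plain Python model rather than kernel code. The dictionary layout and the function name `pick_handle_for_driver` are hypothetical. The method lists come from this report: PXSX has only _ADR and _PSW, while PEGP carries _STA and _DSM. The sketch assumes the driver knows which method it needs (here _DSM).

```python
from typing import Dict, Optional, Set


def pick_handle_for_driver(children: Dict[str, Set[str]],
                           required_method: str = "_DSM") -> Optional[str]:
    """Return the first child path that defines the method the driver needs."""
    for path, methods in children.items():
        if required_method in methods:
            return path
    return None


# Children of root port RP05, keyed by ACPI path (hypothetical data layout).
rp05_children = {
    r"\_SB.PCI0.RP05.PXSX": {"_ADR", "_PSW"},
    r"\_SB.PCI0.RP05.PEGP": {"_ADR", "_STA", "_DSM"},
}

print(pick_handle_for_driver(rp05_children))
```

With the data above, only PEGP defines _DSM, so that is the path the sketch selects.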
Comment 4 Lan Tianyu 2013-09-03 01:29:41 UTC
Yes, this is also my idea, and I am working on it. I think the ACPI core should provide some information about multiple ACPI nodes with the same _ADR, or a way to get the next ACPI node with the same _ADR.
Comment 5 Peter Wu 2013-09-03 09:08:21 UTC
NVIDIA has stated that they do not have specific logic to check ACPI handles and rely on the OS to provide the correct ones.

It is possible to work around shortcomings in the ACPI layer[1], but since Windows works fine on these devices, the issue should be solved in the Linux implementation of ACPI.

Would it make sense to pick the last detected Device? (guesswork) Vendors may provide a generic DSDT and add SSDTs for more specific configurations.

 [1]: https://github.com/Bumblebee-Project/bbswitch/blob/hack-lenovo/acpi-handle-hack.c
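
The "pick the last detected Device" guess can be sketched as follows. This is only an illustration of the heuristic, not of what Windows or Linux actually does. The function name `pick_last_match` and the list layout are hypothetical, and the load order (DSDT first, then SSDTs) is an assumption.

```python
from typing import List, Optional, Tuple


def pick_last_match(devices: List[Tuple[str, int]], adr: int) -> Optional[str]:
    """Return the path of the last device, in load order, whose _ADR matches."""
    match = None
    for path, device_adr in devices:
        if device_adr == adr:
            match = path  # a later table overrides an earlier one
    return match


# Load order assumed here: DSDT first, then SSDTs.
load_order = [
    (r"\_SB.PCI0.RP05.PXSX", 0),  # from DSDT
    (r"\_SB.PCI0.RP05.PEGP", 0),  # from SSDT
]

print(pick_last_match(load_order, 0))
```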
Comment 6 Aaron Lu 2013-09-04 00:46:42 UTC
(In reply to Peter from comment #5)
> NVIDIA has stated that they do not have specific logic to check ACPI handles
> and rely on the OS to provide the correct ones.

Good to know. So it basically means Windows picks the last ACPI handle when several handles share the same _ADR.

> 
> It is possible to work around shortcomings in the ACPI layer[1], but since
> Windows works fine on these devices, the issue should be solved in the Linux
> implementation of ACPI.

Yes, Linux ACPI can do this to follow Windows' behavior.

> 
> Would it make sense to pick the last detected Device? (guesswork) Vendors
> may provide a generic DSDT and add SSDTs for more specific configurations.

I think so if Windows has this behavior.

My take on this problem is: from the ACPI spec's point of view, there is no way for Linux ACPI to pick a handle when multiple handles have the same _ADR, but the specific driver can (like the way you did in Bumblebee by checking for the handle name PEGP). But if Windows has a rule (how can we be sure of this?), it is worth following.
Comment 7 Peter Wu 2013-09-04 09:19:29 UTC
Something just occurred to me: the previous commits introduced a check for _STA. I cannot find a _STA method for \_SB.PCI0.RP05.PXSX, while the _STA method of \_SB.PCI0.RP05.PEGP returns 0x0f. That alone should make the ACPI core return the PEGP device, right?
Comment 8 Rafael J. Wysocki 2013-09-04 12:26:35 UTC
Nope.  No _STA means "present and enabled".

And I really am not sure what Windows does here.  Perhaps it just gives all of them to the driver.

It would be good to know what *exactly* NVidia expects to be done by the OS. Are we supposed to give all of them to the driver? Or are we supposed to combine them into one?

Or perhaps Windows simply replaces existing objects from the DSDT by new ones from the tables loaded later if they happen to have the same _ADR?

Without knowing exactly what the expectation is we can only guess and that's likely to get things wrong.
Comment 9 Rafael J. Wysocki 2013-09-04 12:28:11 UTC
Well, we could make a rule like "prefer objects having _STA that says 'present and enabled' to objects that have no _STA".

Would that help here?
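
A rough model of the proposed rule, as a self-contained Python sketch. It is not the attached patch. The `AcpiNode` class, `score`, and `find_companion` are hypothetical names, and ranking a node whose _STA reports "not present or not enabled" below a node without _STA is an assumption of this sketch. The bit values follow the ACPI definition of _STA (bit 0 = present, bit 1 = enabled).

```python
from dataclasses import dataclass
from typing import List, Optional

# _STA bits per the ACPI specification: bit 0 = present, bit 1 = enabled.
STA_PRESENT = 0x01
STA_ENABLED = 0x02


@dataclass
class AcpiNode:
    """Hypothetical stand-in for an ACPI device object (not a kernel type)."""
    path: str
    adr: int
    sta: Optional[int] = None  # None models "no _STA method defined"


def score(node: AcpiNode) -> int:
    """2 = _STA says present and enabled, 1 = no _STA, 0 = _STA says otherwise."""
    if node.sta is None:
        return 1
    wanted = STA_PRESENT | STA_ENABLED
    return 2 if (node.sta & wanted) == wanted else 0


def find_companion(children: List[AcpiNode], adr: int) -> Optional[AcpiNode]:
    """Return the best-ranked child with a matching _ADR; ties keep the first found."""
    best = None
    for node in children:
        if node.adr != adr:
            continue
        if best is None or score(node) > score(best):
            best = node
    return best


rp05_children = [
    AcpiNode(r"\_SB.PCI0.RP05.PXSX", adr=0),            # DSDT, no _STA
    AcpiNode(r"\_SB.PCI0.RP05.PEGP", adr=0, sta=0x0F),  # SSDT, _STA returns 0x0f
]

print(find_companion(rp05_children, 0).path)
```

Under this rule PEGP wins because its _STA explicitly reports present and enabled, while PXSX is still used when it is the only candidate.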
Comment 10 Peter Wu 2013-09-04 14:46:46 UTC
Oh right, missing _ADR implies presence.

So far, the nvidia driver has been expecting only one device, no need for merging multiple devices. Only one of them can be correct.

Preferring devices with _STA over non-_STA will help here. I cannot believe that MS would use a reverse loop to find the device, that is not logical at all.
Comment 11 Rafael J. Wysocki 2013-09-04 15:20:48 UTC
(In reply to Peter from comment #10)
> Oh right, missing _ADR implies presence.

Surely missing _STA?

> So far, the nvidia driver has been expecting only one device, no need for
> merging multiple devices. Only one of them can be correct.
> 
> Preferring devices with _STA over non-_STA will help here. I cannot believe
> that MS would use a reverse loop to find the device, that is not logical at
> all.

No, it is not, but it can do fixups when a new table is loaded.

I'll attach a patch to test in a while.
Comment 12 Rafael J. Wysocki 2013-09-04 15:24:32 UTC
Created attachment 107416
ACPI / bind: Prefer device objects with _STA present

I wonder if this helps (untested).
Comment 13 Aaron Lu 2013-09-09 02:57:38 UTC
Hi Peter,

Can you please try the patch in comment #12? Thanks.
Comment 14 Felix Lisczyk 2013-09-09 08:04:40 UTC
Hi Aaron,

I'm one of the laptop owners from the Bumblebee thread on Github. I'm running Ubuntu 13.04 on my laptop and applied your patch to a 3.11 kernel. Bumblebee works fine with these modifications and is able to switch between the graphics cards. Dmesg still prints some ACPI warnings on boot and especially when running an application on the discrete graphics adapter:

ACPI Warning: \_SB_.PCI0.RP05.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20130517/nsarguments-95)

I'm not sure if you already know but this issue started to appear on kernel version 3.8.0. On kernel 3.7.10 Bumblebee works fine without any modifications and there are also no ACPI warnings. On 3.7.10 I can also boot without the "nomodeset" boot parameter, which results in a black screen on later kernels (including the patched one).

I've uploaded the complete dmesg log files to our issue thread at GitHub: https://github.com/Bumblebee-Project/Bumblebee/issues/460#issuecomment-23860731

Thank you for your help!

Greetings
Felix
Comment 15 Peter Wu 2013-09-09 08:20:55 UTC
Hi Aaron,

I somehow missed the confirmation from Felix on Github. It appears to work (as reported in comment 14 here).

The _DSM warnings can be ignored as NVIDIA did something wrong there, requiring a Buffer for optimal compatibility with different BIOSes. See also this discussion[1].

Your black screen issue is probably unrelated to this issue.

 [1]: https://github.com/Bumblebee-Project/bbswitch/commit/ee0591b
Comment 16 Rafael J. Wysocki 2013-09-09 10:48:09 UTC
OK, I'll submit the patch to the list, but it would be good to verify that it doesn't introduce regressions for the users in bug #60561 and bug #42696.
Comment 17 Peter Wu 2013-09-09 10:51:40 UTC
I've asked affected users of those bugs to test that patch.
Comment 18 Rafael J. Wysocki 2013-09-09 10:53:14 UTC
Great, thanks!
Comment 19 Peter Wu 2013-09-09 13:53:11 UTC
No regressions reported for bug 42696[1].

 [1]: https://github.com/Bumblebee-Project/bbswitch/issues/2#issuecomment-24068693
Comment 20 Peter Wu 2013-09-09 17:46:14 UTC
No regressions reported for bug 60561 either[1].

 [1]: https://github.com/Bumblebee-Project/bbswitch/issues/65#issuecomment-24097710
Comment 21 b jih 2013-09-09 20:04:02 UTC
Works for Ubuntu 13.10 beta with nvidia-prime (no bumblebee): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1222838
Comment 22 Rafael J. Wysocki 2013-09-09 20:46:41 UTC
OK, thanks!  I'll queue this up as a fix for 3.12.
Comment 23 Rafael J. Wysocki 2013-09-15 04:33:19 UTC
Fixed by commit 11b88ee ACPI / bind: Prefer device objects with _STA to those without it.
Comment 24 erno 2014-01-26 10:40:07 UTC
I just tried the latest Ubuntu testing CD image, which has the 3.13 kernel, and the problem remains. Is it possible that the fix was somehow altered
on its way to the upstream kernel and broke?

Kernel log has

Jan 25 18:30:57 ubuntu kernel: [   56.188578] vgaarb: this pci device is not a vga device
Jan 25 18:30:57 ubuntu kernel: [   56.209120] nvidia 0000:01:00.0: irq 68 for MSI/MSI-X
Jan 25 18:30:57 ubuntu kernel: [   56.215731] NVRM: failed to copy vbios to system memory.
Jan 25 18:30:57 ubuntu kernel: [   56.218302] NVRM: RmInitAdapter failed! (0x30:0xffffffff:720)
Jan 25 18:30:57 ubuntu kernel: [   56.218309] NVRM: rm_init_adapter failed for device bearing minor number 0
Jan 25 18:30:57 ubuntu kernel: [   56.218327] NVRM: nvidia_frontend_open: minor 0, module->open() failed, error -5

Meanwhile I experimented with the acpi-handle-hack kernel module,
https://github.com/Bumblebee-Project/bbswitch/tree/hack-lenovo that
has been used to work around this problem and it did the job for me
on the 3.11 kernel (didn't build for 3.13).

Seems I can't reopen this bug, can someone else do it or should I file a new one?
Comment 25 erno 2014-01-26 19:20:29 UTC
Never mind. It turns out the 3.13 patch for the nvidia driver floating around
on the nvidia forums was incomplete and was confused by
the disappearance of DEVICE_ACPI_HANDLE(): it built just
fine because of #ifdef checks, but then failed to work.
