Bug 104791

Summary: ACPI errors on Lenovo C50-30 (AE_AML_INFINITE_LOOP, argument type mismatch)
Product: Drivers
Reporter: James Ettle (james)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED CODE_FIX
Severity: normal
CC: aaron.lu, peter
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.1.6
Subsystem:
Regression: No
Bisected commit-id:
Attachments: acpidump output, dmesg output

Description James Ettle 2015-09-19 08:12:02 UTC
Created attachment 187991 [details]
acpidump output

I'm seeing a handful of occasional ACPI errors on a Lenovo C50-30. I'm currently using kernel 4.1.6-201.fc22.x86_64, but to my knowledge they've always been there.


[    1.154595] ACPI Warning: \_SB_.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150410/nsarguments-95)
[    1.154710] ACPI Warning: \_SB_.PCI0.RP05.PEGN._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150410/nsarguments-95)
[    1.155068] ACPI Warning: \_SB_.PCI0.RP05.PEGN._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150410/nsarguments-95)

[    1.167601] ACPI Exception: AE_NOT_FOUND, Evaluating _DOD (20150410/video-1290)

[    6.721508] ACPI Warning: \_SB_.PCI0.RP05.PEGN._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20150410/nsarguments-95)
[    7.284445] ACPI Error: Method parse/execution failed [\_SB_.PCI0.RP05.PEGN.NVOP] (Node ffff8802260a7be0), AE_AML_INFINITE_LOOP (20150410/psparse-536)
[    7.284455] ACPI Error: Method parse/execution failed [\_SB_.PCI0.RP05.PEGN._DSM] (Node ffff8802260a7d98), AE_AML_INFINITE_LOOP (20150410/psparse-536)
[    7.284463] ACPI: \_SB_.PCI0.RP05.PEGN: failed to evaluate _DSM (0x3021)
[    7.284465] ACPI: \_SB_.PCI0.RP05.PEGN: failed to evaluate _DSM


Full dmesg and acpidump attached. I think PCI0.RP05 is something to do with the Nvidia chip in an Optimus set-up. I'm not using this GPU. I've not encountered any stuck kacpid or other functional problems, but would appreciate knowing whether this is a Linux bug or a broken BIOS that needs reporting.
Comment 1 James Ettle 2015-09-19 08:12:52 UTC
Created attachment 188001 [details]
dmesg output
Comment 2 James Ettle 2015-09-25 08:03:20 UTC
Still present with 4.1.7. An additional observation: this method is invoked whenever I switch between a VT and X, with a corresponding spike in CPU usage and a new report in the kernel buffer.

Is there some way that AML methods that misbehave like this could be blacklisted? I presume it's not really doing what it ought to anyway...
Comment 3 James Ettle 2015-09-25 21:30:58 UTC
For the time being, I can avoid this by disabling nouveau.
Comment 4 James Ettle 2015-09-26 08:53:44 UTC
Just to waffle on a bit more... The stuck AML method contains a loop of the form

  while(true) {
    T_0 = Arg2
    if(T_0 == 0) {
      // do some stuff
      return Local0
    } else {
      if(T_0 == 0x1A) {
         // do some more stuff
         return Local0
      } else {
        if(T_0 == 0x10)
          return GOBT(Arg3)
      }
    }
    // no break here
  }

There's no "break" at the end of the while, as there are in other while(true) loops in this DSDT -- is this a common AML trick to get a local context? (In that case, if a loop's not needed, why not an if(true)?)

None of the "do stuff" bits appear to modify T_0 or Arg2, as far as I understand them, so whether the loop exits depends only on Arg2's value at the start. I'm guessing under Windows this method is invoked at the behest of the Nvidia driver, which sets Arg2 to match one of the exit conditions.
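To make the termination argument concrete, here is a minimal C sketch of the loop as described above. It is not the firmware's AML: the function name nvop_loop_terminates and the max_iterations budget are hypothetical, and the budget only stands in for whatever limit makes the interpreter give up with AE_AML_INFINITE_LOOP. The "do stuff" bodies are left out because, as noted, they don't appear to change T_0 or Arg2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the while(true) loop in the stuck method.
 * Returns true if the loop hits one of its return statements, false if
 * the iteration budget runs out first (the analogue of the interpreter
 * aborting with AE_AML_INFINITE_LOOP).
 */
bool nvop_loop_terminates(uint32_t arg2, unsigned max_iterations)
{
    assert(max_iterations > 0);

    for (unsigned i = 0; i < max_iterations; i++) {
        uint32_t t0 = arg2;            /* T_0 = Arg2 */

        if (t0 == 0x00)
            return true;               /* "do some stuff", return Local0 */
        if (t0 == 0x1A)
            return true;               /* "do some more stuff", return Local0 */
        if (t0 == 0x10)
            return true;               /* return GOBT(Arg3) */

        /* No break here: nothing changed t0 or arg2, so the next pass
         * is identical to this one. */
    }
    return false;
}
```

With this model, nvop_loop_terminates(0x1A, 1000) returns on the first pass, while any Arg2 outside the three handled values can only end by exhausting the budget.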

I wonder if the nouveau people should have a look at this one, too...
Comment 5 Aaron Lu 2015-10-22 06:39:49 UTC
Moving to GPU for nouveau people to take a look.
Comment 6 Peter Wu 2016-05-14 16:54:49 UTC
The problem is that nouveau_switcheroo_optimus_dsm() unconditionally calls function 0x1B ("NOUVEAU_DSM_OPTIMUS_FLAGS") without checking that this function is actually supported. Lenovo's firmware decides to run into an infinite loop in this case instead of returning an error code.
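The idea of "check before calling" can be sketched in plain C as below. This is an illustration only, not the proposed patch and not kernel or ACPICA API: struct fake_dsm, fake_dsm_eval, dsm_supports and set_optimus_flags_checked are all hypothetical names, and the firmware is modelled as a simple bitmask. The one thing taken from the _DSM convention is that function 0 reports which function indices are implemented, one bit per index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DSM_FUNC_QUERY         0x00u /* _DSM convention: returns supported-function bitmask */
#define DSM_FUNC_OPTIMUS_FLAGS 0x1Bu /* the function index discussed in this comment */

/* Hypothetical stand-in for a firmware _DSM method. */
struct fake_dsm {
    uint32_t supported; /* bit N set => function N is implemented */
    bool hung;          /* set when an unimplemented function is called */
};

/* Evaluate one function index on the fake firmware. */
uint32_t fake_dsm_eval(struct fake_dsm *fw, unsigned func)
{
    assert(func < 32);

    if (func == DSM_FUNC_QUERY)
        return fw->supported;
    if (!(fw->supported & (UINT32_C(1) << func)))
        fw->hung = true; /* models the firmware spinning forever */
    return 0;
}

/* Ask function 0 whether a given function index is implemented. */
bool dsm_supports(struct fake_dsm *fw, unsigned func)
{
    assert(func < 32);
    return (fake_dsm_eval(fw, DSM_FUNC_QUERY) & (UINT32_C(1) << func)) != 0;
}

/* Only call function 0x1B when the firmware advertises it. */
bool set_optimus_flags_checked(struct fake_dsm *fw)
{
    if (!dsm_supports(fw, DSM_FUNC_OPTIMUS_FLAGS))
        return false; /* skipped, firmware never sees the call */
    fake_dsm_eval(fw, DSM_FUNC_OPTIMUS_FLAGS);
    return true;
}
```

A fake firmware whose mask only has bits 0x00, 0x10 and 0x1A set (an assumption based on the loop in comment 4, not read from the acpidump) is never sent function 0x1B by set_optimus_flags_checked, so its hung flag stays clear.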

Proposed patch:
https://lkml.kernel.org/r/1463244575-3515-1-git-send-email-peter@lekensteyn.nl
Comment 7 James Ettle 2016-05-20 19:13:12 UTC
(In reply to Peter Wu from comment #6)

> Proposed patch:
> https://lkml.kernel.org/r/1463244575-3515-1-git-send-email-peter@lekensteyn.nl

Sorry I didn't immediately spot your response. (Bad Google spam filter.) I can't get the above link to work, please clarify and I'll try and remember how to build a patched kernel (it's been a while...).

Thanks.
Comment 8 Peter Wu 2016-05-20 20:14:46 UTC
Apparently the mail is not found on that list, try this one:
https://lists.freedesktop.org/archives/nouveau/2016-May/025039.html
Comment 9 James Ettle 2016-05-22 07:08:55 UTC
This patch does stop the infinite loop on my machine.

(Although now with or without it nouveau seems to cause delays starting GNOME 3, but that's another bug report.)
Comment 10 Peter Wu 2016-05-22 09:52:18 UTC
(In reply to James from comment #9)
> This patch does stop the infinite loop on my machine.

Cool, can I add your Tested-by: James <...@googlemail.com> to the patch?

> (Although now with or without it nouveau seems to cause delays starting
> GNOME 3, but that's another bug report.)

If it is a second or so, that is likely the card switching on or off. With or without this patch you can disable the power saving by setting the module option nouveau.runpm=0 (kernel cmdline) or "options nouveau runpm=0" (modprobe.conf snippet). Do you still experience a delay after that?
Comment 11 James Ettle 2016-05-22 13:02:41 UTC
(In reply to Peter Wu from comment #10)
> (In reply to James from comment #9)
> > This patch does stop the infinite loop on my machine.
> 
> Cool, can I add your Tested-by: James <...@googlemail.com> to the patch?

Yes you may. Please note I tested this patch against the current Fedora 23 kernel, 4.4.9-300.fc23.x86_64.

> > (Although now with or without it nouveau seems to cause delays starting
> > GNOME 3, but that's another bug report.)
> 
> If it is a second or so, that is likely the card switching on or off. With
> or without this patch you can disable the power saving by setting the module
> option nouveau.runpm=0 (kernel cmdline) or "options nouveau runpm=0"
> (modprobe.conf snippet). Do you still experience a delay after that?

It was more like 10 seconds. However the delay did disappear after using runpm=0.

(I generally don't use the nVidia chip on this machine even under Windows; it's an 820A on 4 PCIe lanes. I've not seen it show any performance advantage over the built-in Intel HD5500, and I don't know why Lenovo bothered with it.)
Comment 12 James Ettle 2020-06-10 23:20:19 UTC
I've not seen this for a while now, and using the Nvidia 820A on this machine seems to be working OK. Closing; tested with kernel-5.6.16-300.fc32.x86_64.