Bug 57571 - Dell Inspiron N5010: freeze on adjusting brightness unless acpi_backlight=vendor is specified on boot
Summary: Dell Inspiron N5010: freeze on adjusting brightness unless acpi_backlight=ven...
Status: CLOSED DOCUMENTED
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Video (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Aaron Lu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-04 19:59 UTC by Marwan Tanager
Modified: 2013-06-17 02:58 UTC (History)
5 users (show)

See Also:
Kernel Version: 3.5
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
lspci output (29.49 KB, text/plain)
2013-05-04 20:01 UTC, Marwan Tanager
Details
acpidump output. (171.46 KB, text/plain)
2013-05-04 20:01 UTC, Marwan Tanager
Details
dmesg output after boot with acpi_backlight=vendor (59.56 KB, text/plain)
2013-05-04 20:02 UTC, Marwan Tanager
Details
dmesg output after boot without acpi_backlight=vendor (57.25 KB, text/plain)
2013-05-04 20:03 UTC, Marwan Tanager
Details
fluctuate_bl.sh: A test script for changing the brightness using the sysfs interface. (740 bytes, application/x-shellscript)
2013-05-06 08:41 UTC, Marwan Tanager
Details
Output of dmidecode. (14.36 KB, text/plain)
2013-05-07 18:27 UTC, Marwan Tanager
Details
DSDT.hex (335.07 KB, text/x-hex)
2013-05-17 06:21 UTC, Lan Tianyu
Details
debug.patch (501 bytes, patch)
2013-05-24 07:32 UTC, Lan Tianyu
Details | Diff
DSDT.hex (335.07 KB, text/plain)
2013-05-27 14:00 UTC, Lan Tianyu
Details
debug.patch (1.49 KB, patch)
2013-05-30 06:37 UTC, Lan Tianyu
Details | Diff
debug.patch (510 bytes, patch)
2013-06-08 08:06 UTC, Lan Tianyu
Details | Diff

Description Marwan Tanager 2013-05-04 19:59:50 UTC
When adjusting the display brightness using the keyboard native brightness adjustment buttons, the laptop freezes totally, and the only way to bring it back, is to power it down by holding down the power button for 3-4 seconds, and then power it back up.

The only way around this issue seems to be by adding the kernel boot parameter acpi_backlight=vendor.

I attached the output of the following commands:

    - lspci -nnvvv
    - acpidump
    - dmesg (after booting with and without acpi_backlight)

If there is any more info or testing you need, just let me know.
Comment 1 Marwan Tanager 2013-05-04 20:01:12 UTC
Created attachment 100711 [details]
lspci output
Comment 2 Marwan Tanager 2013-05-04 20:01:48 UTC
Created attachment 100721 [details]
acpidump output.
Comment 3 Marwan Tanager 2013-05-04 20:02:57 UTC
Created attachment 100731 [details]
dmesg output after boot with acpi_backlight=vendor
Comment 4 Marwan Tanager 2013-05-04 20:03:28 UTC
Created attachment 100741 [details]
dmesg output after boot without acpi_backlight=vendor
Comment 5 Jonathan Nieder 2013-05-04 20:19:50 UTC
References: http://bugs.debian.org/706603
Comment 6 Lan Tianyu 2013-05-06 00:56:12 UTC
Please test this on the latest upstream kernel.

How about changing the backlight via "echo (brightness value) > /sys/class/backlight/acpi_video0/brightness"?
Comment 7 giri 2013-05-06 01:49:37 UTC
Yes i have done that...the above command helps me assign a fixed brightness value to my screen.
but lets say i wish to vary the brightness using the laptop functional keys...if i do so without "acpi_brightness=vendor" to the startup command line the laptop freezes randomly.
Comment 8 Lan Tianyu 2013-05-06 07:33:54 UTC
So changing backlight via "/sys/class/backlight/acpi_video0/brightness" will not cause freeze. This prove ACPI backlight control is ok. Using functional keys will involve hotkey producing. Could you test with a terminal running "acpi_listen" to check whether the brightness changing hotkey is sent to user space before being freezing?
Comment 9 giri 2013-05-06 07:53:35 UTC
When acpi_listen is performed i get the following on the terminal

video/brightnessdown BRTDN 00000087 00000000 K
video/brightnessup BRTUP 00000086 00000000 K

for the respective hotkeys pressed.

when i commented out "acpi_backlight=vendor" the terminal continues to show the above output but after sometime the system randomly freezes and hence no more output on screen.
Comment 10 giri 2013-05-06 07:56:28 UTC
and to be more accurate in my description...the button press after which freeze occurs is registered on the screen.
Comment 11 Lan Tianyu 2013-05-06 08:17:14 UTC
(In reply to comment #9)
> When acpi_listen is performed i get the following on the terminal
> 
> video/brightnessdown BRTDN 00000087 00000000 K
> video/brightnessup BRTUP 00000086 00000000 K
> 
This means brightness hotkeys also work.

> for the respective hotkeys pressed.
> 
> when i commented out "acpi_backlight=vendor" the terminal continues to show
> the
> above output but after sometime the system randomly freezes and hence no more
> output on screen.
Can you connect the machine via ssh after "freeze"? I suggest it is blank screen.
Comment 12 giri 2013-05-06 08:21:20 UTC
(In reply to comment #11)

> Can you connect the machine via ssh after "freeze"? I suggest it is blank
> screen.
I am sorry I do not have access to another machine at this time to try to ssh onto mine.
Comment 13 Marwan Tanager 2013-05-06 08:41:38 UTC
Created attachment 100781 [details]
fluctuate_bl.sh: A test script for changing the brightness using the sysfs interface.
Comment 14 Marwan Tanager 2013-05-06 08:44:24 UTC
I've just finished building the 3.9 kernel and testing the issue on it, but the
same behavior persists; laptop freezes when adjusting brightness using keyboard
keys; no freeze with "acpi_backlight=vendor".

As for testing using the sysfs interface, I've come up with a simple test
script (attached) that basically fluctuates the brightness back and forth and
takes as arguments:

    1. The driver
    2. The amount of delay between successive adjustments.

In case of the kernel running without acpi_backlight=vendor, there is two
options for the driver (at least on my system):

    - intel_backlight
    - dell_backlight

Without acpi_backlight=vendor, there is only one available driver:

    - acpi_video0

So, here is what I've got:

- With acpi_backlight=vendor, and intel_backlight:

    ./fluctuate_bl intel_backlight 0

  doesn't cause any freeze at all regardless of the delay value (0 here is 
  taken as full speed).

- With acpi_backlight=vendor, and dell_backlight:

    ./fluctuate_bl dell_backlight <delay>

  What I've noticed here, is that the laptop will eventually freeze, _but_ with
  higher probability as <delay> gets smaller. For example, a delay of 0.5 will
  cause the system to freeze sooner than a delay of 1 sec, and a delay of 0
  will freeze it very soon. 

- Without acpi_backlight=vendor:

    ./fluctuate_bl acpi_video0 <delay>

  The same as with dell_backlight.

So, whether the keyboard native keys is used or not, is irrelevant here (at
least on my system), and the only driver that seems to get it right is that of
intel. Also, the fact that the chance of a freeze is proportional to the delay
suggests, I guess, that the issue might be a race condition somewhere in the
drivers code that leads eventually to a deadlock.
Comment 15 Marwan Tanager 2013-05-06 10:07:10 UTC
(In reply to comment #14)

> In case of the kernel running without acpi_backlight=vendor, there is two
                                ^^^^^^^
> options for the driver (at least on my system):
> 
>     - intel_backlight
>     - dell_backlight

Sorry, I meant In case of the kernel running _with_ acpi_backlight=vendor...
Comment 16 Marwan Tanager 2013-05-06 10:23:57 UTC
(In reply to comment #8)
> So changing backlight via "/sys/class/backlight/acpi_video0/brightness" will
> not cause freeze. This prove ACPI backlight control is ok.

Based on the testing above, changing the backlight via acpi_video0 _will_ cause a freeze.
Comment 17 Marwan Tanager 2013-05-06 17:41:03 UTC
(In reply to comment #11)

> Can you connect the machine via ssh after "freeze"? I suggest it is blank
> screen.

- With the laptop (IP 192.168.1.117) being freezed:

    marwan@marwan-desktop:~$ ssh 192.168.1.117
    ssh: connect to host 192.168.1.117 port 22: No route to host

- In the normal case:

    marwan@marwan-desktop:~$ ssh 192.168.1.117
    Welcome to Ubuntu 12.10 (GNU/Linux 3.9.0 x86_64)

     * Documentation:  https://help.ubuntu.com/
     *
     * 0 packages can be updated.
     * 0 updates are security updates.
     *
     * New release '13.04' available.
     * Run 'do-release-upgrade' to upgrade to it.
     *
     * No mail.
     * Last login: Sat Apr 27 04:09:30 2013

The freeze is triggered by changing the brightness using the acpi_video0 sysfs interface from the test script:

    marwan@host:~$ cat /proc/cmdline 
    BOOT_IMAGE=/boot/vmlinuz-3.9.0 root=UUID=<uuid> ro
    sudo ./fluctuate_bl.sh acpi_video0 0
Comment 18 Lan Tianyu 2013-05-07 06:17:14 UTC
Hi Marwan:
           Thanks for test. Great job. From result of your test, it confirms system is really frozen. But the code path of acpi_video0 sysfs interface should be no problem because it works on the other machines and it calls standard acpi aml method which is provided by Bios. So this may be a bios problem. Since we can survive by using vendor driver. I'd like to add a quirk in the kernel for these machines(default to using vendor driver).

Please provide the output of dmidecode.
Comment 19 Marwan Tanager 2013-05-07 18:25:27 UTC
(In reply to comment #18)
> Hi Marwan:
>            Thanks for test. Great job. From result of your test, it confirms
> system is really frozen. But the code path of acpi_video0 sysfs interface
> should be no problem because it works on the other machines and it calls
> standard acpi aml method which is provided by Bios. So this may be a bios
> problem. Since we can survive by using vendor driver. I'd like to add a quirk
> in the kernel for these machines(default to using vendor driver).

If there is no problem with the apci_video0 code, then why does it work on
older kernels?

Here is what I've got:

- On kernel 2.6.32 with "acpi_backlight=vendor":

        root@host:/# uname -r
        2.6.32-24-generic
        root@host:/# cat /proc/cmdline
        BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=<uuid> ro \
   acpi_backlight=vendor 1
        root@host:/# ls -a /sys/class/backlight/
        .  ..
        
- On kernel 2.6.32 without "acpi_backlight=vendor":

        root@host:/# uname -r
        2.6.32-24-generic
        root@host:/# cat /proc/cmdline
        BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-generic root=UUID=<uuid> ro 1
        root@host:/# ls /sys/class/backlight/
        acpi_video0
        root@host:/# ./fluctuate_bl.sh acpi_video0 0
        ^C
        root@host:/# No freeze has happened.

- On kernel 3.9 without "acpi_backlight=vendor":

        root@host:/# uname -r
        3.9.0
        root@host:/# cat /proc/cmdline
        BOOT_IMAGE=/boot/vmlinuz-3.9.0 root=UUID=<uuid> ro 1
        root@host:/# ls /sys/class/backlight/
        acpi_video0  intel_backlight
        root@host:/# ./fluctuate_bl.sh intel_backlight 0
        ^C
        root@host:/# # Let's freeze the system!
        root@host:/# ./fluctuate_bl.sh acpi_video0 0
        
All these tests are, of course, performed on the same N5010 system, so I doubt 
that the problem is in the BIOS.  I think that some change in the acpi_video0
code might have introduced this bug.




> Please provide the output of dmidecode.
Comment 20 Marwan Tanager 2013-05-07 18:27:05 UTC
Created attachment 100961 [details]
Output of dmidecode.
Comment 21 Marwan Tanager 2013-05-07 18:38:11 UTC
(In reply to comment #18)
 
> Please provide the output of dmidecode.

Attached.
Comment 22 Lan Tianyu 2013-05-09 07:44:16 UTC
Hi Marwan:
      Sorry, I didn't know this worked on 2.6.32. So this means the bug is a regression. Could you do a bisect to find which commit or version the bug took place first?

Please do your test with following patch. The patch will make acpi video driver not to invoke bios ACPI method and then we can find the kernel code or Bios code causes freezing. Thanks in advance.

diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c
index c3932d0..78271ab 100644
--- a/drivers/acpi/video.c
+++ b/drivers/acpi/video.c
@@ -357,6 +357,7 @@ acpi_video_device_lcd_set_level(struct acpi_video_device *device, int level)
        struct acpi_object_list args = { 1, &arg0 };
        int state;
 
+       return 0;
        arg0.integer.value = level;
 
        status = acpi_evaluate_object(device->dev->handle, "_BCM",(In reply to
Comment 23 Marwan Tanager 2013-05-09 21:23:42 UTC
(In reply to comment #22)
> Could you do a bisect to find which commit or version the bug took
> place first?

All right. I will report back as soon as I finish bisecting (it's going to be a long marathon!).
Comment 24 Marwan Tanager 2013-05-15 04:27:45 UTC
Finally... I've just finished bisecting between 3.2 (since this is the kernel
version used in the Debian bug report which has been forwarded to this one) and
2.6.32, and it turns out that:

        - This regression is caused by commit
          84e478c6f1eb9c4bfa1fff2f8108e9a061b46428 which enables, _by default_,
          a new implementation of an NMI-based watchdog via
          CONFIG_NMI_WATCHDOG.

        - There is nothing wrong with the acpi_video0 driver code excluding the
          BIOS calls. I've verified this using the patch in comment #22; No
          freeze happens with CONFIG_NMI_WATCHDOG=y.

        - The freeze happens as a result of the BIOS call while the new
          nmi_watchdog is enabled (and AFAICT, it happens exactly the moment
          the NMI is triggered). This _doesn't_ mean that there is a problem
          with the BIOS, though, since the freeze doesn't happen with this
          patch:

                lib/Kconfig.debug |    2 +-
                1 file changed, 1 insertion(+), 1 deletion(-)

                diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
                index f80b67e..4c98f1a 100644
                --- a/lib/Kconfig.debug
                +++ b/lib/Kconfig.debug
                @@ -173,7 +173,7 @@ config DETECT_SOFTLOCKUP
                config NMI_WATCHDOG
                        bool "Detect Hard Lockups with an NMI Watchdog"
                        depends on DEBUG_KERNEL && PERF_EVENTS
                -       default y
                +       default n
                        help
                        Say Y here to enable the kernel to use the NMI as a 
                        watchdog to detect hard lockups.  This is useful when a
                        cpu hangs for no


So, to relief users from having to resort to obscure workarounds such as
acpi_backlight=vendor, which might not even be available in case of no
vendor-supplied backlight driver, I think we need to revert back to the old
nmi_watchdog code until the new implementation gets fixed.
Comment 25 Lan Tianyu 2013-05-17 05:59:54 UTC
(In reply to comment #24)
> Finally... I've just finished bisecting between 3.2 (since this is the kernel
> version used in the Debian bug report which has been forwarded to this one)
> and
Thanks for bisect. Great job.

> 2.6.32, and it turns out that:
> 
>         - This regression is caused by commit
>           84e478c6f1eb9c4bfa1fff2f8108e9a061b46428 which enables, _by
>           default_,
>           a new implementation of an NMI-based watchdog via
>           CONFIG_NMI_WATCHDOG.
This is very strange. Currently, I have no idea about how NMI_WATCHDOG affect ACPI aml code to be executed.

> 
>         - There is nothing wrong with the acpi_video0 driver code excluding
>         the
>           BIOS calls. I've verified this using the patch in comment #22; No
>           freeze happens with CONFIG_NMI_WATCHDOG=y.
Yes.

> 
>         - The freeze happens as a result of the BIOS call while the new
>           nmi_watchdog is enabled (and AFAICT, it happens exactly the moment
>           the NMI is triggered). This _doesn't_ mean that there is a problem
>           with the BIOS, though, since the freeze doesn't happen with this
>           patch:
> 
>                 lib/Kconfig.debug |    2 +-
>                 1 file changed, 1 insertion(+), 1 deletion(-)
> 
>                 diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>                 index f80b67e..4c98f1a 100644
>                 --- a/lib/Kconfig.debug
>                 +++ b/lib/Kconfig.debug
>                 @@ -173,7 +173,7 @@ config DETECT_SOFTLOCKUP
>                 config NMI_WATCHDOG
>                         bool "Detect Hard Lockups with an NMI Watchdog"
>                         depends on DEBUG_KERNEL && PERF_EVENTS
>                 -       default y
>                 +       default n
>                         help
>                         Say Y here to enable the kernel to use the NMI as a 
>                         watchdog to detect hard lockups.  This is useful when
>                         a
>                         cpu hangs for no
> 
> 
> So, to relief users from having to resort to obscure workarounds such as
> acpi_backlight=vendor, which might not even be available in case of no
> vendor-supplied backlight driver, I think we need to revert back to the old
> nmi_watchdog code until the new implementation gets fixed.

Let's figure out which Bios code causes freeze when using new NMI_WATCHDOG. I will attach the customized ACPI table.
Comment 26 Lan Tianyu 2013-05-17 06:21:38 UTC
Created attachment 101821 [details]
DSDT.hex

Please override the DSDT table.
cd (kernel source)
cp DSDT.hex include/
make menuconfig

Make the following change.
CONFIG_ACPI_CUSTOM_DSDT_FILE="DSDT.hex"
CONFIG_ACPI_CUSTOM_DSDT=y

Compile and install kernel.
Comment 27 Marwan Tanager 2013-05-17 14:49:59 UTC
With your custom DSDT, no freeze happens with CONFIG_NMI_WATCHDOG=y, and 
changing the brightness level has no effect.
Comment 28 Lan Tianyu 2013-05-24 07:32:51 UTC
Created attachment 102381 [details]
debug.patch

Further look the DSDT. Find _BCM method (change backlight) would access SMI io port and trigger SMM mode to do further thing. What is done in SMM mode is that we can't reach. Current symptom seems to NMI conflict with SMM. You can close the NMI watchdog by "echo 0 > /proc/sys/kernel/nmi_watchdog" and test.

Please try the attach patch which make your machine use vendor. To avoid the problem.
Comment 29 Marwan Tanager 2013-05-24 16:10:12 UTC
(In reply to comment #28)
> Created an attachment (id=102381) [details]
> debug.patch
> 
> Further look the DSDT. Find _BCM method (change backlight) would access SMI
> io
> port and trigger SMM mode to do further thing. What is done in SMM mode is
> that
> we can't reach. Current symptom seems to NMI conflict with SMM. You can close
> the NMI watchdog by "echo 0 > /proc/sys/kernel/nmi_watchdog" and test.
> 
> Please try the attach patch which make your machine use vendor. To avoid the
> problem.

Hi, Lan.

Thanks for your efforts, but since we are dealing with a regression here, then
why would we use such ad-hoc "fixes", rather than fixing the issue that caused
it in commits 84e478c6f1eb9c4bfa1fff2f8108e9a061b46428 and
1fb9d6ad2766a1dd70d167552988375049a97f21 in the first place, or just reverting
them? 

Note that this regression also affects people with machines other than Inspiron
N5010. For example, have a look at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1007765

Both "echo 0 >/proc/sys/kernel/nmi_watchdog" and the patch at comment #24 
eliminates the problem entirely, so if we're going to patch the kernel to fix 
this issue for good, I think the focus should be on the new nmi_watchdog, 
because this is the real cause of the bug. Otherwise, the issue might appear
every once in a while with other machines, who knows?
Comment 30 Lan Tianyu 2013-05-27 13:58:36 UTC
(In reply to comment #29)
> Hi, Lan.
> Thanks for your efforts, but since we are dealing with a regression here,
> then
> why would we use such ad-hoc "fixes", rather than fixing the issue that
> caused
> it in commits 84e478c6f1eb9c4bfa1fff2f8108e9a061b46428 and
> 1fb9d6ad2766a1dd70d167552988375049a97f21 in the first place, or just
> reverting
> them? 
I check the current code, there is only one NMI_WATCHDOG version which cause  this issue. So reverting it seems not practicable. About how to fix this, seems to be hard. Because this invlove SMM. We don't know what have done by firmware in the SMM.
CC: The author of new NWI Watchdog, Don Zickus.
Comment 31 Lan Tianyu 2013-05-27 14:00:29 UTC
Created attachment 102661 [details]
DSDT.hex

Please test this DSDT again. Thanks. It comments all SMI's io port acces.
Comment 32 Lan Tianyu 2013-05-30 06:37:55 UTC
Created attachment 102971 [details]
debug.patch

Please try this patch. Thanks.
Comment 33 Marwan Tanager 2013-05-30 21:23:57 UTC
I get the same results with the DSDT at comment #31 as with the one from comment #26; changing the backlight level has no effect, and no freeze happens.

With the patch at comment #32, the backlight level changes, but the computer freezes as before.
Comment 34 Lan Tianyu 2013-06-03 13:34:59 UTC
Ok. Currently have no good way to fix the conflict between NMI_WATCHDOG and SMI. I think default to use vendor video driver maybe a better workaroud for this issue since reverting NMI_WATCHDOG is not practiable(Because other machines don't have such problem and NMI_WATCHDOG are very useful in some perspectives). Please test the patch in the comment #28. If it's ok, I will send it to ACPI maillist.
Comment 35 Lan Tianyu 2013-06-08 01:38:06 UTC
Any update?
Comment 36 Marwan Tanager 2013-06-08 05:41:30 UTC
The patch from comment #28 doesn't solve the issue. The computer still freezes when changing the backlight level using acpi_video0.

I don't know how could this be (assuming the information in the patch is correct).

I've tested the patch with kernel 3.9

If there is further testing you need, let me know.
Comment 37 Lan Tianyu 2013-06-08 08:06:11 UTC
Created attachment 103901 [details]
debug.patch

Sorry. Please this again. Thanks
Comment 38 Marwan Tanager 2013-06-08 17:55:42 UTC
Here is what happens with this one (without using acpi_backlight=vendor on 
boot!):

        - Instead of acpi_video0 under /sys/calss/backlight, there is now
          dell_backlight (intel_backlight is always present).

        - Changing the brightness using intel_backlight is fine, but changing
          it using dell_backlight causes freezing.

        - Changing the brightness using the keyboard hotkeys or using software 
          like gnome-control-center is fine too.

So, regardless of the error in the previous patch, I assumed that changing the
brightness using acpi_video0 (or dell_backlight with this corrected patch)
should be fine, but it turns out that this is not the case, and any driver 
other than intel_backlight will freeze the system if used explicitly via the
sysfs interface, regardless of whether or not the kernel forces using the 
vendor driver.

Having that said, your patch has the exact same effect as using the 
acpi_backlight=vendor kernel boot parameter, but I don't know if the freeze  
still happening with the sysfs interface after applying this patch, would mean 
anything in practice, assuming that dell_backlight is the interface for the 
vendor driver (because it only appears with this patch or with the
acpi_backlight=vendor boot parameter).
Comment 39 Lan Tianyu 2013-06-09 02:08:11 UTC
Thanks for test.
(In reply to comment #38)
> Here is what happens with this one (without using acpi_backlight=vendor on 
> boot!):
> 
>         - Instead of acpi_video0 under /sys/calss/backlight, there is now
>           dell_backlight (intel_backlight is always present).
> 
>         - Changing the brightness using intel_backlight is fine, but changing
>           it using dell_backlight causes freezing.
Just like acpi_video? 
I checked dell_backlight code and it also involves with SMI. So I think they are   same problem.
The code is located in the driver/platform/x86/dell-laptop.c and driver/firmware/dcdbas.c.

> 
>         - Changing the brightness using the keyboard hotkeys or using
>         software 
>           like gnome-control-center is fine too.
> 
> So, regardless of the error in the previous patch, I assumed that changing
> the
> brightness using acpi_video0 (or dell_backlight with this corrected patch)
> should be fine, but it turns out that this is not the case, and any driver 
> other than intel_backlight will freeze the system if used explicitly via the
> sysfs interface, regardless of whether or not the kernel forces using the 
> vendor driver.
> 
> Having that said, your patch has the exact same effect as using the 
> acpi_backlight=vendor kernel boot parameter, but I don't know if the freeze  
> still happening with the sysfs interface after applying this patch

According you test, dell_backlight also will cause freeze.
Is this also related with NMI watchdog?

>, would mean 
> anything in practice, assuming that dell_backlight is the interface for the 
> vendor driver (because it only appears with this patch or with the
> acpi_backlight=vendor boot parameter).
Comment 40 Marwan Tanager 2013-06-09 04:30:22 UTC
(In reply to comment #39)
> Thanks for test.
> (In reply to comment #38)

> Just like acpi_video?

Yeah.

> I checked dell_backlight code and it also involves with SMI. So I think they
> are   same problem.
> The code is located in the driver/platform/x86/dell-laptop.c and
> driver/firmware/dcdbas.c.

Yes, my guess, too, is that it's the exact same problem.

> According you test, dell_backlight also will cause freeze.
> Is this also related with NMI watchdog?

Yes it's directly related, because it doesn't cause freeze after disabling the NMI watchdog using:

echo 0 >/proc/sys/kernel/nmi_watchdog

So, it's indeed the same problem as with acpi_video0.

The thing that confuse me is that, with your patch applied, if I changed the brightness with the keyboard  keys or gnome while watching: 

watch -n 1 /sys/class/backlight/{intel,dell}_backlight/brightness

, I find that only the value of intel_backlight is the one that actually changes, despite the fact that dell_backlight is obviously the vendor driver that is forced, because it only appears either with acpi_backlight=vendor kernel parameter, or with your patch applied.
Comment 41 Lan Tianyu 2013-06-09 06:54:20 UTC
(In reply to comment #40)
> (In reply to comment #39)
> > Thanks for test.
> > (In reply to comment #38)
> 
> > Just like acpi_video?
> 
> Yeah.
> 
> > I checked dell_backlight code and it also involves with SMI. So I think
> they
> > are   same problem.
> > The code is located in the driver/platform/x86/dell-laptop.c and
> > driver/firmware/dcdbas.c.
> 
> Yes, my guess, too, is that it's the exact same problem.
How about not load dell-laptop and dcdbas driver? Just use acpi driver to deal with SMI. I am suspicious that the dell-laptop driver has conflict with Bios ACPI table on using SMI.
 
> 
> > According you test, dell_backlight also will cause freeze.
> > Is this also related with NMI watchdog?
> 
> Yes it's directly related, because it doesn't cause freeze after disabling
> the
> NMI watchdog using:
> 
> echo 0 >/proc/sys/kernel/nmi_watchdog
> 
> So, it's indeed the same problem as with acpi_video0.
> 
> The thing that confuse me is that, with your patch applied, if I changed the
> brightness with the keyboard  keys or gnome while watching: 
> 
> watch -n 1 /sys/class/backlight/{intel,dell}_backlight/brightness
> 
> , I find that only the value of intel_backlight is the one that actually
> changes, despite the fact that dell_backlight is obviously the vendor driver
> that is forced, because it only appears either with acpi_backlight=vendor
> kernel parameter, or with your patch applied.
Comment 42 Marwan Tanager 2013-06-09 18:09:28 UTC
(In reply to comment #41)

> How about not load dell-laptop and dcdbas driver? Just use acpi driver to
> deal
> with SMI. I am suspicious that the dell-laptop driver has conflict with Bios
> ACPI table on using SMI.

How could the dell driver possibly conflict with the acpi driver unless both are used at the same time (which obviously isn't the case)?

Anyway, I've tested without dcdbas and dell-laptop, and nothing surprising has happened; acpi_video0 still freezes the system unless nmi_watchdog is disabled, and intel_backlight is fine in all cases. The only difference now (which is expected) is that dell_backlight doesn't become available after forcing using the vendor driver, either by acpi_backlight=vendor or your patch.

Anyhow, we are now clear about the root cause of the problem: NMIs issued for the nmi_watchdog, while the system is operating in SMM as a result of an SMI triggered by any backlight driver on this machine that issues SMIs (the acpi BIOS driver in particular, since the dell driver doesn't seem to be ever used; the acpi driver is used by default, and the intel driver is used only with the vendor driver being forced). 

> > The thing that confuse me is that, with your patch applied, if I changed
> the
> > brightness with the keyboard  keys or gnome while watching: 
> > 
> > watch -n 1 /sys/class/backlight/{intel,dell}_backlight/brightness
> > 
> > , I find that only the value of intel_backlight is the one that actually
> > changes, despite the fact that dell_backlight is obviously the vendor
> driver
> > that is forced, because it only appears either with acpi_backlight=vendor
> > kernel parameter, or with your patch applied.

Any ideas about this behavior?

If dell_backlight is the vendor driver, then why intel_backlight is the one that is used when the kernel forces the usage of the vendor driver? Or does dell_backlight only become available as a side effect of forcing the vendor driver, and the vendor driver in our case is actually the underlying driver of intel_backlight?
Comment 43 Aaron Lu 2013-06-14 02:30:40 UTC
(In reply to comment #42)
> Anyhow, we are now clear about the root cause of the problem: NMIs issued for
> the nmi_watchdog, while the system is operating in SMM as a result of an SMI
> triggered by any backlight driver on this machine that issues SMIs (the acpi
> BIOS driver in particular, since the dell driver doesn't seem to be ever
> used;
> the acpi driver is used by default, and the intel driver is used only with
> the
> vendor driver being forced). 

Good to know this, and I don't see there is anything we can do in ACPI video driver to avoid this since this is a firmware behavior. The only way we can avoid the freeze is not to use ACPI video driver's backlight interface.

> 
> If dell_backlight is the vendor driver, then why intel_backlight is the one
> that is used when the kernel forces the usage of the vendor driver? Or does
> dell_backlight only become available as a side effect of forcing the vendor
> driver, and the vendor driver in our case is actually the underlying driver
> of
> intel_backlight?

The command line name is indeed misleading and needs changed. I don't know the exact history, but from the code, it was used to decide between ACPI video driver and platform vendor driver, who should control backlight. If acpi_backlight=video, then platform drivers like the dell-laptop will not create backlight interface; and if acpi_backlight=vendor, the ACPI video driver will not create backlight interface. That's it, the GPU driver's backlight interface(intel_backlight in this case) doesn't care about that parameter so should always be there.
Comment 44 Marwan Tanager 2013-06-14 04:15:05 UTC
(In reply to comment #43)

> Good to know this, and I don't see there is anything we can do in ACPI video
> driver to avoid this since this is a firmware behavior. The only way we can
> avoid the freeze is not to use ACPI video driver's backlight interface.

Yes, it indeed seems like it's most probably a firmware issue.

> The command line name is indeed misleading and needs changed. I don't know
> the
> exact history, but from the code, it was used to decide between ACPI video
> driver and platform vendor driver, who should control backlight. If
> acpi_backlight=video, then platform drivers like the dell-laptop will not
> create backlight interface; and if acpi_backlight=vendor, the ACPI video
> driver
> will not create backlight interface. That's it, the GPU driver's backlight
> interface(intel_backlight in this case) doesn't care about that parameter so
> should always be there.

Hmm... I see. Thanks for clarifying this. I haven't read the code, but my guess
according to this info, is that when using apci_backlight=vendor (or
equivalently, the patch from comment #37), three things happen:

        - No backlight interface for the ACPI video driver is created.

        - A backlight interface for the platform vendor driver is created.

        - The kernel checks whether there is an interface provided by the GPU
          driver for changing the backlight level, and if that's the case, it
          registers that driver as the one for changing the backlight. This
          explains why the interface for the platform vendor driver
          (dell_backlight) is created, but the one for the GPU driver 
          (intel_backlight) is what is actually used, when changing the  
          backlight.
          
If the above is true, I think then that the patch from comment #37 would be a 
reasonable way for solving this issue, since it will cause the kernel to always use the GPU driver, which keeps us away from SMM, and thus avoiding the conflict with nmi_watchdog. It's downside, though, is that it is machine-specific, but since the problem itself seems to be firmware-related, I think there would be no better way.
Comment 45 Aaron Lu 2013-06-14 05:33:01 UTC
(In reply to comment #44)
> Hmm... I see. Thanks for clarifying this. I haven't read the code, but my
> guess
> according to this info, is that when using apci_backlight=vendor (or
> equivalently, the patch from comment #37), three things happen:
> 
>         - No backlight interface for the ACPI video driver is created.
> 
>         - A backlight interface for the platform vendor driver is created.
> 
>         - The kernel checks whether there is an interface provided by the GPU
>           driver for changing the backlight level, and if that's the case, it
>           registers that driver as the one for changing the backlight. This
>           explains why the interface for the platform vendor driver
>           (dell_backlight) is created, but the one for the GPU driver 
>           (intel_backlight) is what is actually used, when changing the  
>           backlight.

The first two guess are correct, the last one is not. The kernel does not deal with this currently, all it does is to have individual driver to export their own interfaces. It's the Xorg graphics driver or some GUI helper function that makes the decision which interface to use. For example, the intel xorg driver has the following priority:

static const char *backlight_interfaces[] = {
	"gmux_backlight",
	"asus-laptop",
	"asus-nb-wmi",
	"eeepc",
	"thinkpad_screen",
	"mbp_backlight",
	"fujitsu-laptop",
	"sony",
	"samsung",
	"acpi_video1", /* finally fallback to the generic acpi drivers */
	"acpi_video0",
	"intel_backlight",
	NULL,
};

So the dell interface will never be picked up by it as it is not even listed here...
 
> It's downside, though, is that it is machine-specific, but since the problem
> itself seems to be firmware-related, I think there would be no better way.

Yes, we have seen a lot of broken firmware caused problems and it's driving us crazy, as we do not want to have a huge DMI table, in the other hand, there is no way to solve it either...
Comment 46 Marwan Tanager 2013-06-14 05:50:28 UTC
(In reply to comment #45)
  
> The first two guess are correct, the last one is not. The kernel does not
> deal
> with this currently, all it does is to have individual driver to export their
> own interfaces. It's the Xorg graphics driver or some GUI helper function
> that
> makes the decision which interface to use. For example, the intel xorg driver
> has the following priority:
> 
> static const char *backlight_interfaces[] = {
>     "gmux_backlight",
>     "asus-laptop",
>     "asus-nb-wmi",
>     "eeepc",
>     "thinkpad_screen",
>     "mbp_backlight",
>     "fujitsu-laptop",
>     "sony",
>     "samsung",
>     "acpi_video1", /* finally fallback to the generic acpi drivers */
>     "acpi_video0",
>     "intel_backlight",
>     NULL,
> };
> 
> So the dell interface will never be picked up by it as it is not even listed
> here...
  
So, this really means that all that could be done is to disable the interface  
of the ACPI video driver and hope for the best, where "best" in this case 
means that user space won't pick up the backlight interface of a vendor driver
that uses SMM... 

Thanks for pointing that out.
Comment 47 Aaron Lu 2013-06-17 02:58:07 UTC
You can specify which backlight interface you want to use in xorg.conf as the following example shows:

Section "Device"
Identifier  "Intel Graphics"
Driver      "intel"
Option      "AccelMethod"  "sna"
Option     "Backlight"          "intel_backlight"
Identifier  "Card0"
Driver      "intel"
BusID       "PCI:0:2:0"
EndSection

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=651741

Note You need to log in before you can comment on or make changes to this bug.