Bug 63931
Summary: | Changing brightness level is very slow on Dell E5530 | ||
---|---|---|---|
Product: | ACPI | Reporter: | vaclav |
Component: | EC | Assignee: | Lv Zheng (lv.zheng) |
Status: | CLOSED PATCH_ALREADY_AVAILABLE | ||
Severity: | normal | CC: | aaron.lu, jwrdegoede, lv.zheng, Robert.Moore, rui.zhang, simonandric5, tianyu.lan, tomasz.nowicki, xstej70, zanetu |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 3.10.17, 3.11.6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
acpidump
boot.log test.log |
Description
vaclav
2013-10-28 13:29:42 UTC
acpidump please: # acpidump > acpidump.txt Created attachment 113011 [details]
acpidump
So v3.8 is OK, does v3.7 have the problem? Possible to do a bisect using git? Everything OK up to linux-3.9.11. Problem begins with linux-3.10-rc1. Can you please bisect the offending commit? With a Linus' git tree, please bisect between v3.9 and v3.10-rc1, thanks. I could not find bad commit because some kernels between 3.9 and 3.10-rc1 will not even boot - system just freezes with black screen while loading kernel files. By bisect I found that commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66 causes this freeze. But I can say that all kernels from 3.9 to commit before c0f4dfd4f90f1667d234d21f15153ea09a2eaa66 works well - no brightness issue. So this bug must be somewhere between the commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66 and 3.10-rc1. So: c0f4dfd4f90f1667d234d21f15153ea09a2eaa66 < ISSUE LOCATION < 3.10.rc1 There are too many commits between commit c0f4dfd4f90f1667d234d21f15153ea09a2eaa66 and v3.10-rc1, I'm afraid I can't figure it out. The below command can filter out acpi related commits: $ git log --no-merges c0f4dfd4f90f1667d234d21f15153ea09a2eaa66..v3.10-rc1 drivers/acpi |grep ^commit You can try to find the offending commit from them. Alternatively, you can use xorg.conf to specify intel_backlight for X to use: Section "Device" Option "Backlight" "intel_backlight" Identifier "Card0" Driver "intel" BusID "PCI:0:2:0" EndSection BTW, does the laptop ship with win8? I bought this laptop with Win7 preinstalled. By git log --no-merges c0f4dfd4f90f1667d234d21f15153ea09a2eaa66..v3.10-rc1 I found the latest commit 04a29a15520463e072b1a98d8989f04233ee4a7e, then I did git reset --hard 04a29a15520463e072b1a98d8989f04233ee4a7e and compiled kernel (kernel name was vmlinuz-3.9.0-rc8+). This kernel works without problem so the bug is probably not related to drivers/acpi itself. Also interesting is that vmlinuz-3.10.0-rc1 compiled from downloaded tar archive works but vmlinuz-3.10.0-rc1 which was compiled from git repository does not work. From what repository are tarballs created? BTW by works I mean "it boots". I think the v3.10-rc1 tar release should be the same with a git tree once you do git reset --hard v3.10-rc1. I tried to create tar using git archive and yes, it is the same. So I found the problem. I used distribution kernel config file and then "make menuconfig", not oldconfig. It worked on 3.10 rc1 tarball but produced different configuration in git. So I used working .config from "tarball compiled linux" and finally found problematic git commit: Bisecting: 5546 revisions left to test after this (roughly 13 steps) [73287a43cc79ca06629a88d1a199cd283f42456a] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Bisecting: 2793 revisions left to test after this (roughly 12 steps) [e3d98847ded1d183111ff7c4d1ef56b161c7f13e] Merge tag 'firmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Bisecting: 1320 revisions left to test after this (roughly 11 steps) [38f56f33ca381751f9b8910f67e7a805ec0b68cb] Merge tag 'dt-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Bisecting: 667 revisions left to test after this (roughly 9 steps) [e15e6119062d20cc96f95c8b345e361589a90244] Merge tag 'pci-v3.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Bisecting: 360 revisions left to test after this (roughly 8 steps) [6019958d146a4f127dae727a930f902c92531e6e] Merge tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Bisecting: 166 revisions left to test after this (roughly 7 steps) [b32729b1eeae7ef8f5709923b36b5a0906d213df] Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Bisecting: 72 revisions left to test after this (roughly 6 steps) [fc72053bb497e3054d6e4d13ab425fc2311442d4] Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Bisecting: 37 revisions left to test after this (roughly 5 steps) [70eba4226d9718946941c7be0c8cb66d431e7686] Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Bisecting: 17 revisions left to test after this (roughly 4 steps) [68b8eca5f882230e4991df0db0a3094f49420a90] NVMe: Fix error clean-up on nvme_alloc_queue Bisecting: 8 revisions left to test after this (roughly 3 steps) [ab3ea5bf37e7189e843e19e500e7af50e802b5f6] NVMe: Simplify Firmware Activate code slightly Bisecting: 4 revisions left to test after this (roughly 2 steps) [2e99f3a12b20ab3afad0e042cc0bdd0ee855dca0] Merge tag 'acpi-fixes-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Bisecting: 1 revision left to test after this (roughly 1 step) [3aa2eeacb559e9901153d3b084ade81515b0967e] ACPICA: _OSI support: Fix possible memory leak Bisecting: 0 revisions left to test after this (roughly 0 steps) [04a29a15520463e072b1a98d8989f04233ee4a7e] ACPICA: ACPICA: Fix for _INI regression 2e99f3a12b20ab3afad0e042cc0bdd0ee855dca0 is the first bad commit Commit 2e99f3a12b20ab3afad0e042cc0bdd0ee855dca0 is a merge, according to the log, it contains three commits: commit 04a29a15520463e072b1a98d8989f04233ee4a7e Author: Tomasz Nowicki <tomasz.nowicki@linaro.org> Date: Wed May 8 04:02:06 2013 +0000 ACPICA: ACPICA: Fix for _INI regression commit 3aa2eeacb559e9901153d3b084ade81515b0967e Author: Jung-uk Kim <jkim@FreeBSD.org> Date: Wed May 8 04:01:49 2013 +0000 ACPICA: _OSI support: Fix possible memory leak commit 61388f9e5d93053cf399a356414f31f9b4814c6d Author: Bob Moore <robert.moore@intel.com> Date: Wed May 8 04:01:15 2013 +0000 ACPICA: Fix possible buffer overflow during a field unit read operation I think you can: 1 set local tree to commit 2e99f3a12b20ab3afad0e042cc0bdd0ee855dca0 $ git reset --hard 2e99f3a12b20ab3afad0e042cc0bdd0ee855dca0 build kernel, kernel should have problem; 2 revert 04a29a15520463e072b1a98d8989f04233ee4a7e build kernel and test 3 revert 3aa2eeacb559e9901153d3b084ade81515b0967e build kernel and test 4 revert 61388f9e5d93053cf399a356414f31f9b4814c6d build kernel and test No problem after the first revert (git revert 04a29a15520463e072b1a98d8989f04233ee4a7e), so problem is commit 04a29a15520463e072b1a98d8989f04233ee4a7e. Add relevant people for: commit 04a29a15520463e072b1a98d8989f04233ee4a7e Author: Tomasz Nowicki <tomasz.nowicki@linaro.org> Date: Wed May 8 04:02:06 2013 +0000 ACPICA: ACPICA: Fix for _INI regression Hello everybody, It turned out the above commit affects _BCM's execution speed(see comment #1), please take a look, thanks. Unfortunately, the commit author is not an user of bugzilla so can not be added. Is it this change: Fixed a regression introduced in version 20130328 for _INI methods. This change fixes a problem introduced in 20130328 where _INI methods are no longer executed properly because of a memory block that was not initialized correctly. ACPICA BZ 1016. Tomasz Nowicki <tomasz.nowicki@linaro.org>. Any idea what the underlying issue is? No idea what's the real problem. From the bug description, all I can get is, after that commit, the evaluation of control method _BCM became slower. Hi vaclav, How slower it is getting after that commit? echo 15 > /sys/class/backlight/dell_backlight/brightness will result in the same brightness as echo 976 > /sys/class/backlight/intel_backlight/actual_brightness, so: Before the commit: echo 15 > /sys/class/backlight/dell_backlight/brightness && time while [ "`cat /sys/class/backlight/intel_backlight/actual_brightness`" -ne 976 ]; do :; done real 0m0.045s user 0m0.004s sys 0m0.004s After the commit: echo 15 > /sys/class/backlight/dell_backlight/brightness && time while [ "`cat /sys/class/backlight/intel_backlight/actual_brightness`" -ne 976 ]; do :; done real 0m0.515s user 0m0.231s sys 0m0.292s (same problem with /sys/class/backlight/acpi_video0 when not using "acpi_backlight=vendor" boot parameter) This commit is definitely needed and it is response to bug in 20130328, but it may expose the issue that has already been there before. Same or similar problem on earlier kernels too. On 3.8 kernel after a longer time and some wake ups from suspend to ram - sometimes same problem appears (even worse - it takes more than 4 seconds to switch the level). So the problem appears before the commit - sometimes, after the commit - always. Only workaround is to enable "Option "Backlight" "intel_backlight"" in xorg.conf which does not help for a long time because after reaching some brightness level, the system begins to use dell_brightness (I dont know why). So with the backlight directive in xorg.conf it is neccessary to boot with acpi_backlight=vendor and then blacklist the vendor module (dell_laptop in my case) so only intel brightness control appears. Is there a better alternative than completely disable dell_laptop? Something like acpi_backlight=disable? On v3.12 kernel, you can use video.use_native_backlight=1. Not in v3.12, but in v3.13-rc2 works. There is another problem. After some Fn-UP, Fn-Down, brightness changing becomes very slow. "echo SOMELEVEL > /sys/class/backlight/intel_backlight/brightness" works great (also xbacklight command reaction is ok) but Fn-Up and Fn-Down keys reaction is very slow. Is there any chance to find out where is the problem? The problem is the same as in 3.8 kernel I used so (btw acpi_video0 is working in v3.12 kernel - with limited brightness range but working - but same problem with fn keys). "On 3.8 kernel after a longer time and some wake ups from suspend to ram - sometimes same problem appears (even worse - it takes more than 4 seconds to switch the level)." from my previous comment seems to be the same problem with Fn-Up and Fn-Down keys but in 3.8 kernel it will happen after a long time or immediately after boot (sometimes, very rare). In 3.13-rc2 kernel it happens always after the N fn-up/down strokes where N is a very small number. Tested with the xev command and it sees "KeymapNotify event" for FN-up/down keys after +- 3 seconds. Correction: acpi_video0 is SOMETIMES working well in v3.12 kernel. So always only with intel_backlight I get working backlight control by "echo" or by "xbacklight" command ..., but the remaining problem is Fn-up/down response time. Do you always have the problem of "Fn-Up and Fn-Down keys reaction is very slow" or from some kernel version it starts to slow? It seems to be always here in every kernel version. Now I am running 3.13-rc2 kernel and FN-down/up keys reaction slew down after some FN-down/up keystrokes, then I suspended my laptop to ram and after restore from suspend to ram FN-down/up keys are working normally. It is quite strange behaviour because other FN keys are working always well. The bug affects Fn-Up (LCD brightness up), Fn-Down (LCD brightness down) and Fn-Right (keyboard illumination - not controlled by the OS, so working good, but OS does not know about keystroke some time) keys. Disable in kernel brightness control may be helpful: # echo 0 > /sys/module/video/parameters/brightness_switch_enabled "echo 0 > /sys/module/video/parameters/brightness_switch_enabled" does not make any difference. Is the slow Fn key also due to the bisected commit? Probably yes, hard to say. It seems that the commit uncovered the problem which was already been here before. It is not possible to test it because it can take a long time before the slowdown on older kernels. My whole story: I bought the laptop, removed preinstalled Windows 7 and installed Ubuntu 12.04 x86_64. I discovered that after each resume from suspend to ram, brightness control was slower and slower. Here I was using distribution 3.5 kernel and found this problem in ubuntu bugzilla and followed the workaround to use the acpi_backlight=vendor. This partially fixed my problem (it was slower but not so much). Then I started using distribution 3.8 kernel where it worked without the problem with acpi_backlight=vendor. I thought it is ubuntu specific problem so I did not solve it anymore. But than I needed to test some application in newer kernel, so I downloaded newest vanilla kernel and ... reported this bug. I did not notice this problem on 3.8 kernel before because it occurs after several days on this kernel - I did not use my system so long without the reboot before. One interesting thing: after reboot to 3.8 kernel from the newest kernel the problem often occurs immediately. The real problem is, for some unknown reason, the EC becomes a lot slower to emit notification for backlight related hotkey, so even we have a working backlight control interface, the response to backlight hotkey is still slow. Aaron, so this has nothing to do with commit 04a29a15520463e072b1a98d8989f04233ee4a7e (In reply to Zhang Rui from comment #34) > Aaron, so this has nothing to do with commit > 04a29a15520463e072b1a98d8989f04233ee4a7e Yeah, seems to be the case. It's still un-clear what's the cause for the slow fn key. FWIW I see the same problem (brightness hotkey events sometimes reported after a long delay) on my latitude E6430. The problem goes away after a suspend/resume. I've not bothered reporting it since it is quite hard to reproduce / pin down. When I first got this laptop and installed Linux about a year ago I also needed to add acpi.ec_delay=2000 to get battery status reporting to work. That seems to be better now (don't know if it was fixed by a BIOS update or by a kernel update though). In general the ACPI EC controller on these laptops seems flaky. I've send a mail to Dell about this but got no response. I'm under the impression that the 3.15 kernel may have fixed the brightness up/down delay, but I'm not 100% sure. It seems a bit of a miss and hit wether I see the problem or not. Could you please do an experiment: In drivers/acpi/ec.c, uncomment the /* #define DEBUG */, then perform the whole test and upload the dmesg. Let's see if there is "controller reset" log in the dmesg. This is caused by EC driver's state machine bugs. It could cause significant EC delays. We've recently fixed all of them. So if you are the same case, I have patches for you to improve. Hi, As requested I've built v3.15 with the /* #define DEBUG */ uncommented. With 3,15 it is very hard to trigger the bug, which I guess is good. Still eventually I managed to hit it once. I'm going to attach 2 log files: boot.log: dmesg directly after boot test.log: dmesg after triggering the bug. If you log in test.log you will see 2 transactions with 8 seconds between them, but I was hitting the brightness up key-combo as fast as I could at that point, after I stopped thinking I again failed to reproduce the issue, I noticed the brightness change notification coming up one more time many seconds later, iow my last keypress got reported / handled in a significantly delayed manner. Looking at the log it seems that the second transaction with the ec actually also starts much later, so likely this is not caused by the acpi/ec.c code, but for some reason the acpi notifcation of the keypress gets either reported to the kernel delayed, or the kernel waits with processing it for a long time. Regards, Hans Created attachment 140231 [details]
boot.log
Created attachment 140241 [details]
test.log
Do you mean the following 2 delays? [ 144.281787] ACPI : EC: transaction end [ 163.957214] ACPI : EC: ---> status = 0x20 [ 166.012581] ACPI : EC: transaction end [ 174.193777] ACPI : EC: ---> status = 0x20 It seems EC hardware doesn't trigger correct event in a bounded time. Maybe there are some commands not correctly executed by the EC hardware. This kind of bug can be triggered by a bug I was fixed in the EC fix series. There is a bug in the current code that the next command can be completed unexpectedly by the EC driver without issuing it to the EC firmware. I suggest you to give the patchset a try. It's posted here: https://patchwork.kernel.org/project/linux-acpi/list/ [v2,7/7] ACPI/EC: Add detailed fields debugging support of EC_SC(R). [v2,6/7] ACPICA: Events: Fix edge-triggered GPE by disabling before acknowledging it. [v2,5/7] ACPI/EC: Update revision due to full asynchrnous command support. [v2,4/7] ACPI/EC: Fix a race condition in ec_transaction_completed(). [v2,3/7] ACPI/EC: Remove duplicated ec_wait_ibf0() waiter. [v2,2/7] ACPI/EC: Add asynchronous command byte write support. [v2,1/7] ACPI/EC: Fix an issue that advance_transaction() processes stale hardware status. You can download the patches from the web site. Clicking into the patch content page, you will the following line: Download mbox | patch You can use the mbox link to download the original patch. Hi, It looks like the patch set from comment 41 fixes things completely for me. Note it was already much harder to reproduce with 3.15 then a while ago, som I'm not 100% certain. But I've done 100s of brightness key presses now and I cannot reproduce, so it seems like it is completely fixes with that patch set. Regards, Hans (In reply to Hans de Goede from comment #42) > Hi, > > It looks like the patch set from comment 41 fixes things completely for me. > Note it was already much harder to reproduce with 3.15 then a while ago, som > I'm not 100% certain. But I've done 100s of brightness key presses now and I > cannot reproduce, so it seems like it is completely fixes with that patch > set. > > Regards, > > Hans Thanks for the reporting and the testing. I'm going to mark this bug as resolved. This is a good case for my patchset. I'll add your test result in the updated patchset. This patchset fixes 2 race issues in the EC driver. The bug of 70891 can be explained using the first race condition. This bug (63931) appears to be a good sample of the second race condition. I'll call a triage process for all "Mr. E" EC bugs. :-) Thanks Closing due to the merging of this patcheset. It is now in linux-pm.git/bleeding-edge branch, and will appear in linux.git v3.16-rc5. (In reply to Lv Zheng from comment #44) > Closing due to the merging of this patcheset. > It is now in linux-pm.git/bleeding-edge branch, and will appear in linux.git > v3.16-rc5. Well, we can not close it until it has been in upstream. Once the patches are in upstream, please attach the commit ids and close it then. commit c0d653412fc8450370167a3268b78fc772ff9c87 Upstreamed in linux.git/master branch. Closing... |