Latest working kernel version: c21d1e7f53ffd9c0f162c42e7fde07d1c45fa127
i.e. one commit before:
Earliest failing kernel version: 1b7fc5aae8867046f8d3d45808309d5b7f2e036a
ACPI: EC: Use msleep instead of udelay while waiting for event.
Distribution: Debian GNU/Linux 4.0r4
Hardware Environment: IBM Thinkpad T21
Problem Description:
Commit 1b7fc5aae8867046f8d3d45808309d5b7f2e036a causes my Thinkpad T21 to stop generating hotkey events (e.g. suspend Fn-F4, switch-output Fn-F7, etc.) after resume. It is a regression from 2.6.25: 2.6.26 and 2.6.27-rc7 are broken, and reverting that commit on 2.6.27-rc7 makes the keys work again.

Steps to reproduce:
1. boot [hotkeys work here]
2. suspend to RAM and resume
3. use special keys Fn-FX (X: 3, 4, 7, 12) [hotkeys no longer work]

Thanks.
Created attachment 18110 [details] Broken (upstream) kernel 2.6.27-rc7 with DEBUG defined in drivers/acpi/ec.c
Created attachment 18111 [details] Broken (upstream) kernel 2.6.27-rc7 with DEBUG defined in drivers/acpi/ec.c and 1b7fc5aae8867046f8d3d45808309d5b7f2e036a reverted
Created attachment 18112 [details] GOOD kernel 2.6.27-rc7 with DEBUG defined in drivers/acpi/ec.c and 1b7fc5aae8867046f8d3d45808309d5b7f2e036a reverted
This is the dmesg from the working kernel.
Created attachment 18113 [details] clear QUERY_PENDING flag on resume Please check if this patch helps.
That patch didn't help.
Could you please attach dmesg from this run?
Created attachment 18121 [details] Broken kernel 2.6.27-rc7 with CLEAR_PENDING patch
Created attachment 18122 [details] add ec->flags printout
Please send the dmesg from a run with this patch.
Created attachment 18123 [details] dmesg from broken kernel 2.6.27-rc7 with CLEAR_PENDING and ec->status printout
The last attachment's description was wrong; it is the ec->flags printout, of course. Sorry for the confusion.
Ok, so it is the QUERY_PENDING bit being stuck... 0x2 in flags.
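For readers following along, the 0x2 value can be read as a bitmask. The following is a minimal sketch, assuming ec->flags is a set of bit flags in which bit 1 (value 0x2) is QUERY_PENDING, as stated above. The helper name `set_bits` and the label table are hypothetical, and all other bit positions are deliberately left unnamed because the thread does not identify them.

```python
# Hypothetical helper: list which bits are set in an ec->flags value.
# Only bit 1 (0x2) is labelled, because that is the only mapping stated
# in this thread; other bits are reported by position.
KNOWN_BITS = {1: "QUERY_PENDING"}


def set_bits(flags):
    """Return labels for every bit set in flags, lowest bit first."""
    labels = []
    bit = 0
    while flags >> bit:
        if flags & (1 << bit):
            labels.append(KNOWN_BITS.get(bit, "bit%d" % bit))
        bit += 1
    return labels
```

Calling `set_bits(0x2)` reports only the QUERY_PENDING label, which matches the stuck-bit reading of the printout.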
Could you please check the state of the kacpid thread? You should probably have a /proc/sysrq-trigger file. Try "echo 8 > /proc/sysrq-trigger", then do the same with "echo d", "echo w" and "echo t"; they should drop some debug info into dmesg...
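The sysrq sequence requested above can be scripted. This is a minimal sketch, assuming root privileges and a kernel built with magic SysRq support; the function name `trigger_sysrq` and its `path` parameter are hypothetical conveniences. As far as the sysrq keys are documented, 8 sets the console log level, d dumps held locks (only when lock debugging is compiled in), w dumps blocked tasks, and t dumps all task states.

```python
def trigger_sysrq(keys, path="/proc/sysrq-trigger"):
    """Write each sysrq key to the trigger file, one write per key."""
    for key in keys:
        # Each key is sent with its own open/write, mirroring a series
        # of separate "echo X > /proc/sysrq-trigger" commands.
        with open(path, "w") as trigger:
            trigger.write(key)


# Usage (as root), matching the request in this comment:
# trigger_sysrq(["8", "d", "w", "t"])
```

The output lands in the kernel log, so it would be collected with dmesg afterwards.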
Before suspend:

kacpid        S cf89e300     0    65      2
       cf89e300 00000046 cf838494 cf89e300 cf89e558 c03cd160 cf89e760 cf89e300
       0001018d cf80b280 c0124e6d cf80b288 00000000 c0124eec 00000000 cf89e300
       c0126e90 cf80b288 cf80b288 cf80b280 00000000 c0126d40 c0126d0a 00000000
Call Trace:
 [<c0124e6d>] worker_thread+0x0/0xb5
 [<c0124eec>] worker_thread+0x7f/0xb5
 [<c0126e90>] autoremove_wake_function+0x0/0x2d
 [<c0126d40>] kthread+0x36/0x5a
 [<c0126d0a>] kthread+0x0/0x5a
 [<c010409b>] kernel_thread_helper+0x7/0x10
 =======================

After resume:

kacpid        S 003cd2e0     0    65      2
       cf89e300 00000046 cf89455b 003cd2e0 cf89e558 30f26be8 0000001e ceea20a8
       c01dccf2 cf80b280 c0124e6d cf80b288 00000000 c0124eec 00000000 cf89e300
       c0126e90 cf80b288 cf80b288 cf80b280 00000000 c0126d40 c0126d0a 00000000
Call Trace:
 [<c01dccf2>] acpi_os_execute_deferred+0x0/0x25
 [<c0124e6d>] worker_thread+0x0/0xb5
 [<c0124eec>] worker_thread+0x7f/0xb5
 [<c0126e90>] autoremove_wake_function+0x0/0x2d
 [<c0126d40>] kthread+0x36/0x5a
 [<c0126d0a>] kthread+0x0/0x5a
 [<c010409b>] kernel_thread_helper+0x7/0x10
 =======================

Thanks.
Ah, there isn't any output from sysrq-d or sysrq-w; I'm possibly missing some required CONFIG_* options.
This might be OK. Could you please check if changing msleep(1) to, say, msleep(5) changes anything?
No, msleep(5) doesn't work. But udelay(x) does work for x=500-5000, and x=0 works too.

I'm wondering whether this might be some jiffies (clocksource) related problem. My T21 has an 800MHz CPU, but I see something like:

Detected 335.365 MHz processor.
...
Calibrating delay loop (skipped), value calculated using timer frequency.. 670.73 BogoMIPS (lpj=3353650)

Maybe cpufreq is to blame? I've tried forcing different clocksources, but that didn't help.

Thanks.
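A quick calculation shows that the numbers in that boot log at least agree with each other. This is a sketch under stated assumptions: HZ=100 is not given in the thread (it is simply the value that makes the figures line up), and the formulas assume lpj is derived from the detected frequency as frequency/HZ and that BogoMIPS is printed as lpj*HZ/500000. The function names are hypothetical.

```python
HZ = 100  # assumption: timer ticks per second, not stated in the thread


def lpj_from_khz(cpu_khz, hz=HZ):
    """Loops per jiffy when derived from the detected CPU frequency."""
    return cpu_khz * 1000 // hz


def bogomips(lpj, hz=HZ):
    """BogoMIPS as a string with an integer part and two decimals."""
    whole = lpj // (500000 // hz)
    frac = (lpj // (5000 // hz)) % 100
    return "%d.%02d" % (whole, frac)
```

Under these assumptions, `lpj_from_khz(335365)` gives 3353650 and `bogomips(3353650)` gives "670.73", so the delay calibration follows directly from the detected 335.365 MHz figure. The same arithmetic for an 800 MHz clock would give "1600.00", which suggests that any discrepancy would lie in the frequency detection step and not in the BogoMIPS computation.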
I have reproduced this regression on a T20 using 2.6.27-rc8, i.e. after killing acpid, cat /proc/acpi/event shows the following for each Fn key:

Fn-F3: ibm/hotkey HKEY 00000080 00001003
Fn-F4: ibm/hotkey HKEY 00000080 00001004
Fn-F7: switches the LCD/VGA
Fn-F12: ibm/hotkey HKEY 00000080 0000100c
Fn-End: dims display
Fn-Home: ibm/hotkey HKEY 00000080 00001010
Fn-PgUp: turns light on/off

But after suspend, the Fn keys do nothing, with the exception of Fn-End and Fn-Home, which continue to control brightness (without anything in /proc/acpi/event), and Fn-PgUp, which continues to control the light (also without anything in /proc/acpi/event).

If 1b7fc5aae8867046f8d3d45808309d5b7f2e036a (udelay->msleep) is reverted, then the keys all work properly before and after suspend.
After resume, the power button still works, but the lid button has stopped working along with the Fn keys.
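The ibm/hotkey event lines quoted in this comment share a fixed four-field layout, which makes a before/after-resume comparison easy to script. A minimal sketch follows; the function names and the key table are hypothetical, and the table contains only the four codes reported here.

```python
# Hotkey codes observed in this report (last field of each event line).
HOTKEYS = {
    0x1003: "Fn-F3",
    0x1004: "Fn-F4",
    0x100C: "Fn-F12",
    0x1010: "Fn-Home",
}


def parse_event(line):
    """Split one /proc/acpi/event line into (class, bus id, type, data)."""
    device_class, bus_id, event_type, data = line.split()
    return device_class, bus_id, int(event_type, 16), int(data, 16)


def hotkey_name(line):
    """Return the key label for an ibm/hotkey event, or None if unknown."""
    device_class, _, _, data = parse_event(line)
    if device_class != "ibm/hotkey":
        return None
    return HOTKEYS.get(data)
```

Feeding each line read from /proc/acpi/event through `hotkey_name` before suspend and again after resume would show which keys stop producing events.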
Created attachment 18143 [details] dmesg after resume and sysrq commands per comment #12
Hmm, I looked at the dmesg, and both kacpid and kacpid_notify are sleeping and waiting for work. I have no real idea yet how to debug this, but the patch which replaced udelay with msleep changes one detail: udelay does not schedule (it might be preempted), but msleep definitely schedules the task away. So the question is whether scheduling the task away is causing the problem. What happens if you put a yield() there instead of the msleep()?
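The distinction drawn above, spinning versus giving up the CPU while polling, can be illustrated in user space. This is only an analogy in Python, not kernel code: `poll_busy` stands in for a udelay-style loop and `poll_sleeping` for an msleep-style loop. Both names and the `ready` callback are hypothetical.

```python
import time


def poll_busy(ready, timeout_s):
    """Spin until ready() is true or the timeout expires (never sleeps)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if ready():
            return True
    return ready()


def poll_sleeping(ready, timeout_s, interval_s=0.001):
    """Check ready(), then give up the CPU for interval_s between checks."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if ready():
            return True
        time.sleep(interval_s)  # the scheduler may run other work here
    return ready()
```

Both loops return the same answer for the same `ready` callback. What differs is whether other tasks get a chance to run between checks, which is the property the yield() experiment is meant to isolate.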
Keys work after resume with yield(). (Keys work even if there is no msleep/udelay/yield/etc. at all.)

BTW, according to http://www.thinkwiki.org/wiki/Category:T20 there were no 550MHz T20s -- Len's CPU clock wasn't properly calculated (just like mine). IMVHO the clocksource might be the culprit, as udelay* doesn't use jiffies and schedule_timeout* does[x].

[x] According to my _very_ limited knowledge^grepping.
Ah, miscalculated jiffies should break keys before suspend to ram... Never mind.
I can't reproduce this problem with 2.6.28-rc2, all special keys seem to work after resume.
Closing this bug as it's not reproducible. Karol, please re-open it if the problem still exists in the latest upstream kernel.
This bug is present with the latest kernel version, 2.6.31-rc9. 2.6.30 is broken too. I'll try to track this down further. Thanks.
Linux 2.6.29(.6) isn't broken. I thus assume it's a regression from 2.6.29. (I would happily reopen this bug, but it seems I have no permission to do so.)
Commit 34ff4dbccccce54c83b1234d39b7ad9e548a75dd is to blame; reverting it makes the bug go away.