Bug 9226
Description
Mikhail Malygin
2007-10-25 10:54:02 UTC
[The fact that XP resumes correctly doesn't imply that the BIOS is not buggy.]

Can you please try the 2.6.23 kernel (that contains many suspend-related fixes)?

Just tried 2.6.23.1 - unfortunately it has the same effect: the device is simply not found on resume, and the log does not even mention synaptics...

(In reply to comment #1)
> [The fact that XP resumes correctly doesn't imply that the BIOS is not
> buggy.]
>
> Can you please try the 2.6.23 kernel (that contains many suspend-related
> fixes)?

Please apply the patch from:
http://bugzilla.kernel.org/attachment.cgi?id=12991&action=view
on top of 2.6.23.1. Then, please follow the instructions at:
http://bugzilla.kernel.org/show_bug.cgi?id=7499#c44
and see if you are able to reproduce the problem and in which step.

Unfortunately it does not apply:

patch -p1 < suspend-debug-facility.patch
patching file kernel/power/main.c
Hunk #1 succeeded at 27 with fuzz 2 (offset -4 lines).
Hunk #2 FAILED at 172.
Hunk #3 FAILED at 206.
Hunk #4 succeeded at 292 with fuzz 1 (offset -3 lines).
Hunk #5 succeeded at 420 (offset -3 lines).
2 out of 5 hunks FAILED -- saving rejects to file kernel/power/main.c.rej
patching file kernel/power/power.h
Hunk #1 succeeded at 194 (offset 12 lines).

(In reply to comment #3)
> Please apply the patch from:

I'm sorry, some additional patches are necessary for that to work against 2.6.23. I'll attach a version that applies on top of 2.6.24-rc1.

Created attachment 13284 [details]
Suspend debug patch

This is the version of the patch from
http://bugzilla.kernel.org/attachment.cgi?id=12991&action=view
that applies on top of 2.6.24-rc1.

Please apply it on top of 2.6.24-rc1 and follow the instructions given at:
http://bugzilla.kernel.org/show_bug.cgi?id=7499#c44

Unfortunately it does not compile for whatever reason:

LD .tmp_vmlinux1
arch/x86/kernel/built-in.o: In function `smp_send_nmi_allbutself':
/usr/src/linux-2.6.23/arch/x86/kernel/crash.c:85: undefined reference to `genapic'
make[1]: *** [.tmp_vmlinux1] Error 1

(In reply to comment #6)
> Created an attachment (id=13284) [details]
> Suspend debug patch
>
> This is the version of the patch from
> http://bugzilla.kernel.org/attachment.cgi?id=12991&action=view
> that applies on top of 2.6.24-rc1 .
>
> Please apply it on top of 2.6.24-rc1 and follow the instructions given at:
> http://bugzilla.kernel.org/show_bug.cgi?id=7499#c44

Created attachment 13286 [details]
dmesg - pm tests (echo 5..1, then echo 0 > /sys/power/pm_test_level)
Finally I fixed it and compiled it successfully... looks like the 2.6.24-rc1 patch is a bit buggy. After echoing 1-5 to pm_test_level the touchpad works perfectly!!! It even works after I put 0 and then, in the next cycle, 1 into pm_test_level. After echoing 0 to pm_test_level the touchpad does not work. But if I suspend the box with "echo 1" afterwards, my touchpad comes back perfectly. I have attached the dmesg dump for the echo 1-5 and echo 0 cycles:
http://bugzilla.kernel.org/attachment.cgi?id=13286

(In reply to comment #6)
> Created an attachment (id=13284) [details]
> Suspend debug patch

Is there anything I could do to help with further investigation?

What happens if you normally suspend to RAM and then test with pm_test_level=2?

Created attachment 13304 [details]
Test suspend - echo 0, echo 2
Normal suspend kills the touchpad. Suspend with pm_test_level=2 BRINGS IT BACK TO LIFE :) ! I have attached the corresponding dmesg:
http://bugzilla.kernel.org/attachment.cgi?id=13304&action=view

(In reply to comment #11)
> What happens if you normally suspend to RAM and then test with
> pm_test_level=2?

Well, it looks like the touchpad is not reinitialized appropriately during resume.

I'll try to reassign the bug to Drivers->Input.

(In reply to comment #14)
> Well, it looks like the touchpad is not reinitialized appropriately during
> resume.
>
> I'll try to reassign the bug to Drivers->Input.

Done. The touchpad doesn't work after a resume from RAM, but it's sufficient to suspend and resume devices once again to make it work (that's what pm_test_level=2 does).

Could you please try doing "echo 1 > /sys/module/i8042/parameters/debug" before suspending/resuming and send me the full dmesg? Please boot with log_buf_len=262144 to make sure we won't lose any messages.

Created attachment 13308 [details]
Suspend/Resume with i8042 debug=1
Created attachment 13311 [details]
Reset mouse at resume
Could you please try this patch? Thanks!
Tried to patch ps_mouse in 2.6.24rc1 - the same result: normal suspend/resume kills the touchpad, suspending with pm_test_level=2 brings it back.

Can I get another i8042 debug dmesg with the patch applied, please?

Created attachment 13318 [details]
dmesg / patched ps_mouse + i8042 debug enabled
Hmm, the log is from the same kernel as before. Are you sure you are running the new kernel?

Maybe I did something wrong, but I didn't recompile the whole kernel - just the ps_mouse module. Should I do make clean?

(In reply to comment #22)
> Hmm, the log is from teh same kernel as before. Are you sure you a running
> with
> new kernel?

Just make sure that you are loading the updated module - I do not see the new commands I expected to be sent to i8042 in the 2nd log, that's why I am asking.

OK, I'd better recompile it from scratch... BTW, should I use this "2.6.24rc1" patched version or can I apply your patch to the 2.6.23 tree?

(In reply to comment #24)
> Just make sure that you are loading the updated module - I do not see the new
> commands I expected to be sent to i8042 in the 2nd log, that's why I am
> asking.

I think it can be applied to 2.6.23...

Created attachment 13327 [details]
dmesg - kernel 2.6.23 patched with psmouse-base.c + i8042 debug enabled
Did complete recompile - behavior has not been changed.
Created attachment 13346 [details]
Reset mouse at resume
Ok, the mouse still does not want to talk to us after resume... Let's try resetting a little harder... Could you please try this patch instead?
Created attachment 13353 [details]
dmesg - kernel 2.6.23 patch2 - psmouse-base.c + i8042 debug
Tested - still the same
Anything else I can help with?

(In reply to comment #28)
> Ok, the mouse still does not want to talk to us after resume... Let's try
> resetting a little harder... Could you please try this patch instead?

Is there any way to go further, or should I consider the problem to be unresolvable? If that is so, could you give me a hint how to hack the whole suspend/resume process in order to reinitialize it twice - a workaround, but still something I can use in my daily work?

(In reply to comment #31)
> It is any way to go further or should I consider the problem to be
> unresolvable?

Well, it looks resolvable, but we need to know what to fix. For now, we only know that suspending and resuming _all_ devices brings your touchpad back to life, which doesn't necessarily mean that the touchpad itself needs that (it may be a device the touchpad depends on somehow).

Please compile the kernel with CONFIG_PM_VERBOSE set, suspend to RAM, then run the level 2 test (ie. "suspend" with pm_test_level=2) and attach the dmesg output from after that (you may need to increase the dmesg buffer size to get the entire log).

> If that is so, could you give me a hint how to hack the whole
> suspend/resume process in order to reinitialize it twice -
> workaround but still something i can use in my dayly work?

On openSUSE 10.2 you can modify /etc/pm/functions by adding your workaround commands to do_suspend().

Created attachment 13378 [details]
Allow invoking i8042 resume manually
OK, let's try to get some more data. Please apply the patch below. You should see a new attribute - /sys/bus/platform/devices/i8042/force_resume. Writing 1 there forces i8042's resume method to be called, writing 2 causes it to go through suspend first, then resume. Again, I'd like to see debug dmesgs. Thanks!
Created attachment 13382 [details]
Trace: suspend/resume; echo 2 > pm_test_level; suspend/resume
(In reply to comment #33)

Do I understand correctly that you want Mikhail to suspend to RAM with the patch applied, then try to invoke the i8042's .resume() manually (or .suspend()/.resume(), depending on what's written into /sys/bus/platform/devices/i8042/force_resume) and see if that revives the touchpad?

Right. This way we either isolate the breakage to the input layer or see that we need to look elsewhere, probably into ACPI/PNP code...

Created attachment 13384 [details]
dmesg - i8042 force_resume patch + debug enabled
Sequence:
echo 1 > force_resume;
echo mem > power state;
resume
echo 2 > force_resume;
echo mem > power state;
resume
Created attachment 13389 [details]
dmesg - force_resume=1, force_resume=2
Sorry, apparently I did not understand correctly what to do - now I did the following:
1. suspend/resume - touchpad dead;
2. echo 1 > /sys/bus/platform/devices/i8042/force_resume - still dead;
3. echo 2 > /sys/bus/platform/devices/i8042/force_resume - still dead...
I have attached the dmesg extract with i8042 debug enabled - maybe it can tell you something...
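
For reference, steps 2 and 3 above could be scripted roughly as follows. This is only a sketch: the force_resume attribute exists only with the debug patch from attachment 13378 applied, and the function name and the SYSFS_ROOT variable are made up here so that the helper can be dry-run against a scratch directory instead of the real /sys.

```shell
# Hypothetical helper: write 1 (resume only) or 2 (suspend, then resume)
# to the i8042 force_resume attribute added by the debug patch.
# SYSFS_ROOT is an assumption of this sketch; it defaults to /sys.
force_i8042_resume() {
    mode=$1
    attr="${SYSFS_ROOT:-/sys}/bus/platform/devices/i8042/force_resume"

    case "$mode" in
        1|2) ;;
        *)
            echo "usage: force_i8042_resume 1|2" >&2
            return 2
            ;;
    esac

    # The attribute is missing on kernels without the debug patch.
    if [ ! -w "$attr" ]; then
        echo "$attr not writable (debug patch applied? running as root?)" >&2
        return 1
    fi

    echo "$mode" > "$attr"
}
```

After a normal suspend/resume, "force_i8042_resume 1" followed by "force_i8042_resume 2" repeats the two writes from the list, and dmesg can then be saved as before.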
Hmm, that means that the problem is wider than the input core alone; something else is not resumed properly... PNP? ACPI?

Well, both seem to be good candidates ...

But how did the suspend/resume without the actual S3 call work? Why do we get the touchpad properly resumed there?

(In reply to comment #40)
> Well, both seem to be good candidates ...

If we actually go to S3, the platform firmware takes over control and it may do some things that will have to be undone after the resume. The problem is we don't know what it does and therefore we don't know how to undo that.

That is really sad. Does all that mean it is not resolvable with remote reproduction?

(In reply to comment #42)
> The problem is we don't know what it does and therefore we don't know how to
> undo that.

(In reply to comment #43)
> That is really sad. All that means it is not resolvable with remote
> reproduction?

We need to find a way to narrow it down somehow.

Should I do any other tests to find the problematic place?

Please install the current mainline kernel on this box and check if the issue is present in it. In case it is, please report back and I'll tell you what to do next.

I have checked it with (2.6.23 + 2.6.24.rc5 patch) - still the same. What should I try next?

Please apply patches 01-09 from
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc4-git6/patches/
on top of 2.6.24-rc5. If you boot the kernel with these patches applied, there should be a /sys/power/pm_test file. If it's present, please do:

# echo 8 > /proc/sys/kernel/printk
# echo core > /sys/power/pm_test
# echo mem > /sys/power/state

and see if your touchpad works after that.

I have applied the patches to 2.6.24-rc4 (some patches were rejected on 2.6.24-rc5). After doing the given command sequence the box did not completely suspend - it stayed for 5 seconds with some console messages (freezing consoles ...) and then came back to life TOGETHER WITH TOUCHPAD! Do you need a dmesg?
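
The pm_test sequence above can be wrapped in a small helper. This is a sketch under stated assumptions, not part of the patch series: it assumes /sys/power/pm_test lists the supported levels on one line with the current one in brackets (as the later mainline interface does, e.g. "[none] core processors platform devices freezer"), and the function name and the SYSFS_ROOT override are invented here so the helper can be tried against a scratch directory.

```shell
# Hypothetical helper: select a pm_test level, then trigger a suspend to RAM.
# With a level other than "none" the kernel stops at that stage and resumes
# without entering S3.
pm_test_run() {
    level=$1
    root=${SYSFS_ROOT:-/sys}

    # Strip the brackets around the current level and look for an exact match.
    if ! tr -d '[]' < "$root/power/pm_test" | tr ' ' '\n' | grep -qxF -- "$level"; then
        echo "pm_test level '$level' is not offered by this kernel" >&2
        return 1
    fi

    echo "$level" > "$root/power/pm_test"
    echo mem > "$root/power/state"
}
```

For example, "pm_test_run core" corresponds to the last two commands of the sequence; raising the console log level via /proc/sys/kernel/printk is left out of the sketch.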
(In reply to comment #49)
> I have applied the patches to 2.6.24-rc4 (some patches were rejected on
> 2.6.24-rc5) . After doing the given command sequence the box did not
> completely suspend - it stayed for 5 seconds with some console messages
> (freezing consoles ...)

It's supposed to do that. It just didn't go to the BIOS.

> and then came back to life TOGETHER WITH TOUCHPAD!

Well, this means exactly that the BIOS does something we're not expecting. This may be related to the embedded controller or something like this.

> Do you need a dmesg?

No, thanks.

[I guess it's time to get some ACPI people on board.]

Can you check if doing:

# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state

after a "regular" resume from RAM (ie. when the touchpad is dead) brings the touchpad back to life?

boot 2.6.24.rc4
echo mem > /sys/power/state
echo devices > /sys/power/pm_test
echo mem > /sys/power/state

That sequence brings the touchpad back to life!

Well, clearly, we have to reset something during a resume so that things work again, but it's not i8042. The question is what.

I'll try to write a piece of debug code for you, but that'll take some time (please "ping" me in a couple of days).

Is it too early to "ping" :) ? Can I run any new tests?

Created attachment 14085 [details]
Debug patch to try
For starters, can you check if this patch has any effect on the regular suspend?
I have tested it - it does not have any effect (echo mem > /sys/power/state).

Another "ping" message :) ... Would it be possible to run any new tests?

Created attachment 14164 [details]
Patch to debug suspending of devices

Yes, sorry for the delay.

Please revert the patch from Comment #55 and apply the appended patch (please let me know in case it doesn't apply). Next, please compile the kernel with CONFIG_PM_VERBOSE set.

Then, if everything goes well, there should be a /sys/power/pm_drivers attribute available. There should be two numbers in it, 0 and the number of devices registered in your system. If you echo a number (say N) between these two into /sys/power/pm_drivers, the suspend will be artificially failed after suspending N devices (ie. the suspend of device (N+1) will fail with ECANCELED). By echoing 0 to /sys/power/pm_drivers you go back to the normal behavior.

Now, you can do the following:
(1) suspend normally
(2) after the resume, echo n/2 to /sys/power/pm_drivers, where n is the second number printed by "cat /sys/power/pm_drivers" and try to suspend (the suspend will fail)
(3) see if your touchpad works
(4) if it does, echo 0 to /sys/power/pm_drivers and repeat from (1), but take n/4 instead of n/2 in (2)
(5) otherwise, echo 0 to /sys/power/pm_drivers and repeat from (1), but take 3n/4 instead of n/2 in (2)

Repeat the procedure in the binary search fashion until you find the exact number that has to be echoed into /sys/power/pm_drivers before the second fake suspend so that the touchpad works. Next, attach the output of dmesg taken exactly at this point.

Unfortunately it does not apply on 2.6.23 - 24.rc4,5,6 - here is the output for 2.6.24.rc6:

patch -p1 < suspend-drivers-debug.patch
patching file drivers/base/power/main.c
patching file kernel/power/main.c
Hunk #1 FAILED at 126.
Hunk #2 FAILED at 508.
2 out of 2 hunks FAILED -- saving rejects to file kernel/power/main.c.rej

Well, sorry.

Please apply the series of patches from:
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/
on top of 2.6.24-rc6 (should apply to the current -git too) and follow the instructions in Comment #58.

Sorry, maybe I am doing something wrong, but that series does not apply either. Here is what I did (I have attached the patch output as well):

root@amilo:/usr/src $ rm -rf linux-2.6.23
root@amilo:/usr/src $ tar jxf ../patches/linux-2.6.23.tar.bz2
root@amilo:/usr/src $ cd linux
root@amilo:/usr/src/linux $ bzcat -dc ../patches/patch-2.6.24-rc6.bz2 | patch -p1 > history
patching file .gitignore
....
patching file sound/usb/usbquirks.h
root@amilo:/usr/src/linux $ for i in `cat ../patches/p/series`; do echo "<<<<<<<<<<<<<<< --- APPLYING $i ---"; patch --dry-run -p1 < ../patches/p/$i; done > hist

Created attachment 14173 [details]
Patch history
Well, please download http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/snapshot-071224.tgz, unpack it in the unmodified 2.6.24-rc6 source directory, rename "hibernation_and_suspend" to "patches" and run "quilt push -a". Works for me.

Thanks, now it works. I have run some tests and found one interesting thing after the first suspend/resume cycle - the number of registered devices dropped from 352 to 349 (see below):

$ cat /sys/power/pm_drivers
0 352
$ echo mem > /sys/power/state -------------------------------------> first suspend/resume
$ echo 176 > /sys/power/pm_drivers
$ cat /sys/power/pm_drivers
176 349
$ echo 0 > /sys/power/pm_drivers
$ cat /sys/power/pm_drivers
0 349

I have used the first number (352) and tried to echo n/2=176 (TP works), n/4=88 (TP works) and (3*n)/4=264 (TP does not work) but then stopped - could you please describe the algorithm I should use there?

Sorry, that was a stupid question - looks like I found that "magic" number - it is 235: after echoing 235 the TP works, increasing the number to 236 stops it. Attached you can find the dmesg for both fake suspend cycles.

Created attachment 14179 [details]
TP works !!! - "echo 235 > /sys/power/pm_drivers; echo mem > /sys/power/state"
Created attachment 14180 [details]
TP does not work - "echo 236 > /sys/power/pm_drivers; echo mem > /sys/power/state"
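
The search for the "magic" number can be expressed as an ordinary bisection between a count known to revive the touchpad (176) and a larger one known not to (264). The following is only a sketch: both function names are made up, /sys/power/pm_drivers exists only with the debug patch series applied, and the second function simply asks the tester, since only a human can tell whether the touchpad moves.

```shell
# bisect_boundary WORKS FAILS CHECK
# WORKS: a count for which CHECK succeeds; FAILS: a larger count for which
# it fails. Prints the largest count for which CHECK still succeeds.
bisect_boundary() {
    lo=$1
    hi=$2
    check=$3
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "$mid"; then
            lo=$mid     # touchpad still revived: boundary is at or above mid
        else
            hi=$mid     # touchpad stays dead: boundary is below mid
        fi
    done
    echo "$lo"
}

# Hypothetical interactive check following steps (1)-(3) of the procedure:
# real suspend, then a fake suspend that is failed after $1 devices.
touchpad_works_after() {
    echo mem > /sys/power/state         # (1) normal suspend/resume
    echo "$1" > /sys/power/pm_drivers   # (2) fail the next suspend after $1 devices
    echo mem > /sys/power/state         #     fake suspend (expected to fail)
    echo 0 > /sys/power/pm_drivers      #     back to normal behaviour
    printf 'Does the touchpad work with N=%s? [y/n] ' "$1" >&2
    read -r answer                      # (3) the tester decides
    [ "$answer" = y ]
}
```

Run as root, "bisect_boundary 176 264 touchpad_works_after" needs at most seven suspend cycles to narrow the range down to a single number.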
It looks like the second suspend of serio:serio0 makes your touchpad work after resuming. I'll try to prepare another debug patch, but that'll take some time again.

In the meantime, there's an important ACPI fix in the new patch series at:
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/
(the latest snapshot at: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/snapshot-080101.tgz). Can you test it on top of 2.6.24-rc6, please?

Sorry, I was on vacation... I tried snapshot-080101.tgz today. It does not bring the touchpad back. Anything else I could do to help in investigating the problem?

I'll let you know when I have the next debug patch ready.

Just wanted to add a "me too" here. I have a Gateway 8510GZ Centrino 1.7 GHz machine on which the touchpad does not wake up after sleep. I have tested some of the debug patches that you posted and they give the same results as Mikhail reported. If there is any testing that you need done, I can also help.

Thanks,
Zak

Hi, is something happening? This is a really annoying bug. Is there anything I can do to help?

Thanks,
Ondra

Can Rafael please tell us whether this bug must be reassigned or set to NEW?

Sorry, I lost this report from my radar. I understand that this issue is still present in 2.6.28. Is that correct?

I can confirm that this bug is still present in 2.6.27, and other people have confirmed it as well [1]. I have not yet tested the newly released 2.6.28. Has anybody in CC already tested 2.6.28?

[1] https://bugs.launchpad.net/linux/+bug/52967

Rafael, I have tested this issue with 2.6.28 and it is still present.

Unfortunately, I'm not aware of any patches that can help fix this problem and I'm not an input devices expert myself.

Rafael,

Is there someone else that this can be assigned to? I would love to have STR working.

-Zak

No idea. Everyone relevant seems to be in the CC list ...

Rafael, Zak and others in CC, what do you think about adding drivers_input-devices@kernel-bugs.osdl.org to CC or assigning this bug to it?

Well, you can try, but that need not have any effect.

OK, drivers_input-devices@kernel-bugs.osdl.org is now in CC. I think that only you could assign this bug to this address.

With the release of 2.6.29-rc3 (http://lkml.org/lkml/2009/1/28/277) there are fixes for PCI suspend/resume handling. I will try to give 2.6.29-rc3 a shot some time next week.

-Zak

Looks like 2.6.29-rc3 does not help.

I have a similar problem: after a suspend-to-RAM, the Synaptics touchpad driver stops working correctly (the scroll features do not work, but I can move and click the mouse). There is nothing suspicious in either the kernel or the X logs.

This problem appeared only when I installed 2.6.28 (any minor version, currently I have .8). This problem was not seen in either 2.6.25 or .27.

My laptop is a Benq Joybook S32B.

Now I've skimmed through the replies and tried the thing with echo ... > /sys/bus/..., and it didn't work.

(I must mention that if I restart the X server the problem is fixed until the next suspend. This is not an X problem, as with another kernel the same "experiment" works.)

If anyone wants me to try some patch, I'm up for it!

Thanks,
Ciprian Craciun.

Can you test the 2.6.37 kernel and see if the issue is still present in it, please?

(In reply to comment #88)
> Can you test the 2.6.37 kernel and see if the issue is still present in it,
> please?

Compiled 2.6.37 on Fedora 14 and I still have the same issue: the trackpad does not wake up after resuming from STR.

Hmm, btw, does it help if you boot with i8042.reset on the kernel command line?

Adding the atkbd.reset option finally works fine, apparently since 2.6.35. Is there something that could be done inside the kernel directly?

(In reply to comment #90)
> Hmm, btw, does it help if you boot with i8042.reset on the kernel command
> line?

Like ste said, adding atkbd.reset works on 2.6.35 and on 2.6.37.

At this point the laptop that has this issue is no longer my daily driver, so this "hack" is good enough.

Dmitry: the only question on this is, should we quirk i8042_reset for this system?

We are now trying to automatically reset i8042 during suspend/resume on all systems without quirks. In this bug Zak and Ste mention that they need atkbd.reset, not i8042.reset, and we currently do not have automatic reset quirks in the atkbd driver. It would be nice to see if they really need atkbd.reset with 3.x kernels.

*** Bug 11656 has been marked as a duplicate of this bug. ***
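
When retesting the atkbd.reset workaround on a newer kernel, it helps to confirm which of the two options the running kernel was actually booted with. A minimal sketch, assuming the usual space-separated /proc/cmdline; the function name and the optional second argument (a file standing in for /proc/cmdline) are made up for illustration.

```shell
# Hypothetical check: succeeds if the named parameter appears on the kernel
# command line, either bare ("atkbd.reset") or with a value ("atkbd.reset=1").
cmdline_has() {
    awk -v p="$1" '
        { for (i = 1; i <= NF; i++) { split($i, kv, "="); if (kv[1] == p) found = 1 } }
        END { exit !found }
    ' "${2:-/proc/cmdline}"
}
```

For example, "cmdline_has atkbd.reset && echo booted with atkbd.reset" distinguishes a boot with atkbd.reset from one with i8042.reset.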