|Summary:||HP Compaq nx7400 "bad state"|
|Product:||Drivers||Reporter:||Calorì Alessandro (axelgenus)|
|Component:||Input Devices||Assignee:||Dmitry Torokhov (dmitry.torokhov)|
|Severity:||high||CC:||acpi-bugzilla, dmitry.torokhov, rui.zhang, trenn, tuukka.tolvanen|
My kernel configuration
My kernel log
Good state dmesg
Bad state dmesg
Good state acpidump
Bad state acpidump
created a dedicated workqueue for notify() execution
Good state dmesg
Bad state dmesg
Unregister serio drivers on shutdown
The first one I got from dmitry (psmouse-fiddle-with-reset)...
... and the second one (serio-cleanup-to-bus)
Properly reset psmouse at suspend
Let serio bus handle suspending/resuming of i8042 ports
Properly reset psmouse at suspend
Let serio bus handle suspending/resuming of i8042 ports
Description Calorì Alessandro 2006-12-16 02:32:57 UTC
Comment 1 Calorì Alessandro 2006-12-18 01:13:29 UTC
I tried compiling kernel on my own with Gentoo 2006.1 (latest stable kernel is 2.6.18) and still no luck. I'm going to try other kernels (even with some patchset) with Gentoo to make some tests.
Comment 2 Vladimir Lebedev 2006-12-18 01:25:47 UTC
Please try the latest mm version - 2.6.20-rc1-mm1 is available now.
Comment 3 Calorì Alessandro 2006-12-18 10:02:10 UTC
Created attachment 9865 [details] My kernel configuration Here's the configuration which I used to test the new kernel.
Comment 4 Calorì Alessandro 2006-12-18 10:03:15 UTC
Created attachment 9866 [details] My kernel log Here is the kernel log.
Comment 5 Calorì Alessandro 2006-12-18 10:05:15 UTC
I tried the new kernel as you suggested but still no luck! After first shutdown/reboot the system falls again in "bad state".
Comment 6 Vladimir Lebedev 2006-12-18 12:28:53 UTC
Please attach acpidump, the full dmesg of "bad state" and the content of the battery files (/proc/acpi/battery/...) in "bad state". Also, please describe the "bad state" on your laptop more carefully. What is the "battery status locked" in terms of battery files fields?
Comment 7 Vladimir Lebedev 2006-12-18 12:49:40 UTC
Please attach dmesg for both cases - "bad state" and normal.
Comment 8 Alexey Starikovskiy 2006-12-18 13:01:20 UTC
please try patch from #5534.
Comment 9 Calorì Alessandro 2006-12-19 01:41:09 UTC
I've already read that bug report but it doesn't seem to apply to me, because my laptop's thermal zones and fans always work as they should even if the system is in "bad state". I'm going to send the information Vladimir requested as soon as possible (I must recompile the kernel with psmouse as a module to send it).
Comment 10 Calorì Alessandro 2006-12-19 02:18:09 UTC
Created attachment 9878 [details] Good state dmesg This is the dmesg when the system starts in good state.
Comment 11 Calorì Alessandro 2006-12-19 02:19:48 UTC
Created attachment 9879 [details] Bad state dmesg This is the dmesg when the system starts in bad state.
Comment 12 Calorì Alessandro 2006-12-19 02:23:07 UTC
Created attachment 9880 [details] Good state acpidump This is the result of acpidump when the system is in good state.
Comment 13 Calorì Alessandro 2006-12-19 02:27:37 UTC
Created attachment 9881 [details] Bad state acpidump This is the result of acpidump when the system is in bad state.
Comment 14 Calorì Alessandro 2006-12-19 02:43:42 UTC
The symptoms of "bad state":
- Boot slowdown (the BIOS takes up to 20 seconds to complete POST);
- Kernel slowdown during boot (you can compare the dmesg logs I attached);
- Battery and AC adapter status (read from /proc/acpi/battery) are not updated after boot.
The biggest problem is that when the system falls into "bad state", the only way to recover is to unplug the cord and disconnect the battery for a few seconds.
Comment 15 Calorì Alessandro 2006-12-19 02:55:46 UTC
> What is the "battery status locked" in terms of battery files fields?
The fields don't change at all. The kernel reads the status of the battery and of the AC adapter during boot and then never again. If I unplug the cord while the system is in bad state, the battery files in /proc/acpi/battery aren't updated and the status remains: AC Adapter: on - Battery status: Fully charged (or charging). If I run the system on battery and plug in the cord, the status remains: AC Adapter: off - Battery status: discharging. The percentage of charge doesn't change. If I reboot the system, the kernel updates the status during boot and then doesn't update it again.
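The "status never changes" symptom described above can be checked with a small script. This is only a sketch: the function name acpi_snapshot is made up for illustration, the default directory is the /proc/acpi/battery path mentioned in this report, and any other directory can be passed as the first argument.

```shell
#!/bin/sh
# Sketch: print every line of every file below an ACPI status directory,
# each line prefixed with the name of the file it came from.
# acpi_snapshot is a hypothetical helper name, not an existing tool.
acpi_snapshot() {
    dir=${1:-/proc/acpi/battery}
    find "$dir" -type f | sort | while read -r f; do
        # -H forces the "file:" prefix even for a single file.
        grep -H '' "$f"
    done
}

# Usage idea (run by hand): snapshot, unplug the AC cord, wait, snapshot again.
#   before=$(acpi_snapshot); sleep 60; after=$(acpi_snapshot)
#   [ "$before" = "$after" ] && echo "status unchanged (possibly bad state)"
```

Identical snapshots after unplugging the cord would match the behaviour reported here; on a healthy system at least the charging state line should differ.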
Comment 16 fiodor.f.suietov 2006-12-19 07:16:22 UTC
Alexey Starikovskiy wrote:
Lebedev, Vladimir P wrote:
> Alexey, ...
> Please look in "My kernel log" attachment: there are ERROR messages
> here:
> Is it criminal?
>
> Dec 18 18:52:10 coreblack [ 35.963000] ACPI Exception (exoparg2-0442):
> AE_AML_BUFFER_LIMIT, Index (000000100) is beyond end of object
>
> Dec 18 18:52:10 coreblack [ 35.964000] ACPI Error (psparse-0537):
> Method parse/execution failed [\_SB_.C002.C341.C0F3._SDD] (Node
> df790540), AE_AML_BUFFER_LIMIT

From: Fiodor Suietov <email@example.com>
Subject: libata: wrong sizeof for BUFFER

I have reproduced the AE_AML_BUFFER_LIMIT exception based on the SSDT ASL code and the libata ata_acpi_push_id() code. There is an oversight in ata_acpi_push_id() causing the exception. The following update fixes it:

Signed-off-by: Fiodor Suietov <firstname.lastname@example.org>
---
--- linux-2.6.20-rc1-mm1/drivers/ata/libata-acpi.c.orig 2006-12-19 11:51:19.809222900 +0300
+++ linux-2.6.20-rc1-mm1/drivers/ata/libata-acpi.c 2006-12-19 17:36:05.128443900 +0300
@@ -672,7 +672,7 @@ int ata_acpi_push_id(struct ata_port *ap
 	input.count = 1;
 	input.pointer = in_params;
 	in_params.type = ACPI_TYPE_BUFFER;
-	in_params.buffer.length = sizeof(atadev->id * ATA_ID_WORDS);
+	in_params.buffer.length = sizeof(atadev->id) * ATA_ID_WORDS;
 	in_params.buffer.pointer = (u8 *)atadev->id;
 
 	/* Output buffer: _SDD has no output */
Comment 17 Vladimir Lebedev 2006-12-19 09:45:39 UTC
Created attachment 9885 [details] created a dedicated workqueue for notify() execution Please try this patch and patch from comment #16, post dmesg and system state. In any case these patches should be useful.
Comment 18 Calorì Alessandro 2006-12-19 11:44:22 UTC
Sorry, it's the first time I help debugging the kernel... How can I apply those patches? I mean: what command-line should I use?
Comment 19 Vladimir Lebedev 2006-12-19 12:21:41 UTC
Created attachment 9889 [details] D:\bug_7689\libata-acpi.patch
Comment 20 Vladimir Lebedev 2006-12-19 12:34:20 UTC
> Sorry, it's the first time I help debugging the kernel... How can I apply
> those patches? I mean: what command-line should I use?
For example:
1) Copy the patches from comment #17 and #19 to your hard disk; for example, the pathnames are file_1 and file_2.
2) Change the current directory to linux-2.6.20-rc1-mm1/
3) Run the following; both commands should succeed:
   cat file_1 | patch -p1
   cat file_2 | patch -p1
4) Build the kernel, etc.
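The steps above can also be wrapped in a small shell function. This is a sketch under assumptions: apply_patches is a hypothetical helper name, the tree and patch paths in the usage comment are placeholders, and --dry-run is the GNU patch option that checks whether a patch applies without modifying any file.

```shell
#!/bin/sh
# Sketch: apply one or more patch files to a kernel source tree.
# apply_patches is a hypothetical helper; all paths passed to it are placeholders.
apply_patches() {
    tree=$1
    shift
    for p in "$@"; do
        # First check that the patch applies cleanly, without touching any file.
        patch -d "$tree" -p1 --dry-run < "$p" >/dev/null || {
            echo "patch does not apply: $p" >&2
            return 1
        }
        # -p1 strips the leading path component (e.g. "linux-2.6.20-rc1-mm1/"),
        # -d changes into the source tree before applying.
        patch -d "$tree" -p1 < "$p" || return 1
    done
}

# Example (hypothetical paths):
#   apply_patches /usr/src/linux-2.6.20-rc1-mm1 file_1 file_2
```

Running the dry run first means a patch that does not fit the tree is reported before any file is changed, so the tree is not left half-patched by that file.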
Comment 21 Calorì Alessandro 2006-12-20 02:18:29 UTC
Created attachment 9896 [details] Good state dmesg Dmesg when system is in good state.
Comment 22 Calorì Alessandro 2006-12-20 02:19:25 UTC
Created attachment 9897 [details] Bad state dmesg Dmesg when system is in bad state.
Comment 23 Calorì Alessandro 2006-12-20 02:22:35 UTC
Well, it seems that there are no more "error" messages in dmesg, but the problem of the "bad state" still remains. I also have the acpidump of the good state and the bad state, if you want to take a look at them, but there are no differences between them with those patches applied.
Comment 24 Vladimir Lebedev 2006-12-21 21:07:29 UTC
> Workarounds: > ... > 2) If PS/2 mouse support is compiled as a module (psmouse), remove that > module before shutdown/reboot. What will be the system state if psmouse is module but is not removed? What will be the system state if psmouse is not part of kernel at all? I am interested in cases - shutdown/reboot and 'suspend to disk'. Thanks.
Comment 25 Calorì Alessandro 2006-12-22 00:52:37 UTC
> What will be the system state if psmouse is module but is not removed?
The same as if it were built into the kernel.
> What will be the system state if psmouse is not part of kernel at all?
If you mean not loading psmouse at startup, the system will not fall into bad state when I reboot/shut down the system. But that way the touchpad won't work, obviously.
> I am interested in cases - shutdown/reboot and 'suspend to disk'.
I've never tried to "suspend" my laptop since I've never felt the need; I'm going to try this one too. If I don't remove that module before either rebooting or shutting down the system.
> Thanks.
Actually, I should thank you for your patience... ;)
Comment 26 Calorì Alessandro 2006-12-22 01:31:06 UTC
I've done some tests:
- If I don't compile PS/2 mouse support at all, the kernel works (even old versions; I'm now using the gentoo-kernel 2.6.18 as my stable kernel);
- If I compile PS/2 mouse support in kernel, the kernel breaks as soon as I reboot or shutdown my laptop;
- If I compile PS/2 mouse support as a module, there are three sub-cases:
  * If I load it and unload it before shutdown/reboot, everything works;
  * If I load it but don't unload it before shutdown/reboot, the system falls into bad state;
  * If I don't load it at all, everything works even after reboot/shutdown, except the touchpad.
I suppose it's a problem with the touchpad: it seems the BIOS requires something to be done before shutdown/reboot, and unloading the psmouse module does it. Maybe the BIOS only needs the kernel to free the resources used by the touchpad module before shutdown/reboot. Suspend is still to be tested, but I've read there are some problems there too, so I would like to focus on this problem for now.
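The module workaround from this report (unload psmouse before shutdown/reboot) could be scripted roughly as follows. This is a sketch, not a tested fix: module_loaded and unload_psmouse are hypothetical helper names, and where to hook the call into the shutdown sequence depends on the distribution.

```shell
#!/bin/sh
# Sketch: unload psmouse before shutdown/reboot if it is loaded as a module.

# Succeeds if module $1 is listed in the modules file $2 (default: /proc/modules).
# Each line of /proc/modules starts with the module name followed by a space.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Remove psmouse only when it is currently loaded; needs root privileges.
# A built-in psmouse is not listed in /proc/modules, so nothing happens then.
unload_psmouse() {
    if module_loaded psmouse; then
        modprobe -r psmouse
    fi
}

# Call unload_psmouse from a shutdown hook (placement is an assumption and
# differs between distributions).
```

With PS/2 mouse support built into the kernel there is nothing to unload, so this only covers the "compiled as a module" case described above.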
Comment 27 Calorì Alessandro 2006-12-22 01:32:48 UTC
> If I compile PS/2 mouse support in kernel, the kernel breaks as soon as I reboot or shutdown my laptop; Sorry I meant to say that the system falls in bad state and not that the kernel breaks...
Comment 28 Vladimir Lebedev 2006-12-22 04:20:17 UTC
> I suppose it's a problem with the touchpad: it seems like the BIOS
> require ...
I guess that this BIOS requires to do something at the beginning of any type of boot (boot, reboot, resume). In any case we should find the cause of problem - hw/sw conflicts, etc....
Also,
> Workarounds:
> 1) When the laptop is off, detach AC adapter and battery for a while (some
> models require at least 2 minutes), the next boot will be normal;
Is it true that it is happening with your laptop too?
> * If I don't load it at all everything works even after reboot/shutdown,
> except touchpad.
So, the problem is absent if we do NOT load psmouse in all the case - built-in, module, removed from .config, Isn't it?
Comment 29 Calorì Alessandro 2006-12-22 05:11:20 UTC
> I guess that this BIOS requires to do something at the beginning
> of any type of boot (boot, reboot, resume).
I don't think so. If you read the symptoms you can see that bad state affects not only the operating system but the BIOS itself. The POST lasts 10/20 seconds longer than normal.
> In any case we should find the cause of problem - hw/sw conflicts, etc....
I guess it is a bug in the BIOS, and the psmouse module seems to activate/deactivate it.
>> 1) When the laptop is off, detach AC adapter and battery for a while (some
>> models require at least 2 minutes), the next boot will be normal;
> Is it true that it is happening with your laptop too?
Yes, of course. It's the only way to get the laptop back to its normal state.
> So, the problem is absent if we do NOT load psmouse in all the case - built-in,
> module, removed from .config, Isn't it?
Yes, it is.
Comment 30 Vladimir Lebedev 2006-12-22 05:46:39 UTC
> I guess it is a bug in the BIOS, and the psmouse module seems to activate/deactivate it.
Yes, but I guess that the Windows providers know how to do it. Do you have any experience of dealing with our problem on Windows on your computer? So, in any case, we need some time for investigation.
Comment 31 Calorì Alessandro 2006-12-22 06:45:41 UTC
> Do you have any experience of dealing with our problem on
> Windows on your computer?
There is no problem with Windows. Even with no drivers it works and the laptop doesn't fall into bad state. I installed Windows XP Home Edition SP2 from an original CD I own, using the product key bundled with the laptop. There were no drivers (I downloaded them from the HP website) and rebooting/shutting down the laptop didn't cause the bad state. Is there no way of debugging the kernel?
Comment 32 Vladimir Lebedev 2006-12-22 19:29:31 UTC
Thanks for the information; we need some time for the investigation.
Comment 33 Calorì Alessandro 2006-12-23 10:25:38 UTC
Update: I tried Damn Small Linux which ships with kernel 2.4: NO BAD STATE! The touchpad works and, after a reboot or a shutdown, the system keeps working! This should prove that this problem is related to kernel 2.6.
Comment 34 Vladimir Lebedev 2006-12-23 20:31:52 UTC
Surely, we work on it now.
Comment 35 Thomas Renninger 2007-01-10 14:37:01 UTC
Created attachment 10054 [details] Unregister serio drivers on shutdown This one should fix it?
Comment 36 Vladimir Lebedev 2007-01-18 06:01:24 UTC
Please try 'irqpoll' boot flag.
Comment 37 Vladimir Lebedev 2007-02-01 07:29:18 UTC
No response from bug submitter, please reopen if problem persists.
Comment 38 Calorì Alessandro 2007-02-09 08:22:51 UTC
Trenn did the magic! The kernel available here (ftp://ftp.suse.com/pub/people/trenn/hp_fixes_final/) doesn't put my laptop in "bad state" and after reboot/shutdown the system fully works! Thanks dude! PS: Sorry for not responding all this time but I had some troubles with my internet connection.
Comment 39 Thomas Renninger 2007-02-09 08:55:13 UTC
Let's bring this together here...: Dmitry (not sure whether he maintains the stuff, but he submitted a lot of patches to the serio subsystem in the recent past and he probably should ack/push the patch in the end) asked me to try these two patches. As I am very busy trying to fix some more things on these HP beasts for our SLE10 kernel, I'd be really happy if Alessandro could help out a bit with testing... The patches apply fine (with a small offset in one of them) to the recent 2.6.20 kernel.
Comment 40 Thomas Renninger 2007-02-09 09:01:33 UTC
Created attachment 10368 [details] The first one I got from dmitry (psmouse-fiddle-with-reset)...
Comment 41 Thomas Renninger 2007-02-09 09:02:28 UTC
Created attachment 10369 [details] ... and the second one (serio-cleanup-to-bus)
Comment 42 Thomas Renninger 2007-02-09 09:12:02 UTC
Be careful: you should boot into the kernel twice, or suspend the kernel twice. This is because the breakage happens on shutdown when the psmouse/serio subsystem is not cleaned up, and it survives the reboot (I expect the Embedded Controller, which is also accessing serio/i8042 very late on shutdown, gets confused and is not rebooted or correctly initialised after reboot (only if power and battery are unplugged for some time)). The bad state gets fixed by booting into a working kernel (still in bad state) and then shutting down/rebooting.
Comment 43 Mario Pascucci 2007-02-09 14:11:32 UTC
I can confirm the "bad-state" problem in the HP Compaq nx7300 too. I'll try kernel 2.6.20 with two patches from Thomas Renninger.
Comment 44 Calorì Alessandro 2007-02-10 00:16:08 UTC
Ok Thomas... I will help as much as I can. What version of the kernel should I use to apply those patches? 2.6.20?
Comment 45 Mario Pascucci 2007-02-10 02:39:54 UTC
I think you got it. Fedora Core 6, updated to Feb 9, 2007. Vanilla kernel 2.6.20 with Thomas Renninger's patches from comment #40 and #41 compiled and installed:
1 - Boot in good state with original kernel
2 - Reboot -> going in bad state with original kernel
3 - Reboot with 2.6.20 patched kernel -> still in bad state
4 - Reboot with 2.6.20 patched kernel -> going in good state: battery reporting ok, cpufreq working and going to full speed
5 - Reboot with original Fedora kernel -> still in bad state
6 - Reboot -> going in bad state
Let me know if I can do more specific testing. Thank you!
Comment 46 Mario Pascucci 2007-02-10 02:45:58 UTC
OPS! Errata I mean: 5-reboot with original Fedora kernel->still in *good* state Sorry for typo.
Comment 47 Calorì Alessandro 2007-02-10 06:59:13 UTC
Same here. It works!
Comment 48 Dmitry Torokhov 2007-02-10 08:06:45 UTC
Hm, any of you guys using suspend-to-ram? Would be nice to test the patches with suspend to ram as well...
Comment 49 Calorì Alessandro 2007-02-11 03:12:15 UTC
In openSuSE 10.2 (updated) and with the custom kernel made by Thomas, suspend to RAM only works if I add the "-f" flag to S2RAM_OPTS in /etc/pm/config. When I turn on my laptop again everything works except the keyboard! I haven't tested with kernel 2.6.20 yet.
Comment 50 Dmitry Torokhov 2007-02-11 08:14:30 UTC
Alessandro, is there a kernel version that has working s2ram on your box?
Comment 51 Calorì Alessandro 2007-02-12 03:30:09 UTC
No luck... keyboard doesn't work on resume either with kernel 2.6.8 or kernel 2.6.20... any suggestion?
Comment 52 Calorì Alessandro 2007-02-12 03:37:20 UTC
I'm going to install Gentoo again and I'm gonna try other kernel versions. Thank you for solving the "bad state" trouble but suspend to RAM is important too. If we manage to solve this problem, this box (and also others from HP) will be fully linux compatible! Should I close this bug and open a new one?
Comment 53 Thomas Renninger 2007-02-12 05:37:58 UTC
Maybe it's worth splitting up the patches into the shutdown cleanup, which should be rather unrisky and shouldn't be a problem to push into stable kernels (back to, don't know, 2.6.16.X), and the suspend/resume fix, which could be added to 2.6.21-rcX as soon as it works halfway stable (but still might break other mice/machines on suspend/resume?).
Comment 54 Thomas Renninger 2007-02-12 07:25:33 UTC
I added the patches to the latest SUSE CVS kernel head. This should make things easier to test. You should see a kernel popping up with Alexey's "execute notify handlers in own thread" (to not get confused by another problem) here: ftp.suse.com/pub/projects/kernel/kotd/i386/HEAD/kernel-default.i586.rpm
Do a
   rpm -qp --changelog |less
to check whether the patches are included. Hmm, this works for i386, but something seems to be broken with x86_64 and possibly other branches/archs. I'll try to get this fixed. It may take some hours/a day until submitted patches arrive compiled and packaged. Hmm, pre-testing by someone who compiles the stuff on his own should still be faster..., however I can add things as soon as it looks stable for broader and easier testing.
Comment 55 Dmitry Torokhov 2007-02-12 09:24:02 UTC
Created attachment 10392 [details] Properly reset psmouse at suspend Guys, I would appreciate it if you tested the updated versions of the patches. The first one (psmouse-fiddle-with-reset.patch) is the one that really fixes the problem; the second one is a generic improvement of the i8042 suspend process and should speed it up a bit, as we don't try to reset the mouse several times during suspend/resume. If these patches work for you I will commit them in my tree and, once they survive a -mm release, will push them to Linus. Thanks! P.S. Let's move the s2ram discussion to bug 7977
Comment 56 Dmitry Torokhov 2007-02-12 09:25:13 UTC
Created attachment 10393 [details] Let serio bus handle suspending/resuming of i8042 ports
Comment 57 Calorì Alessandro 2007-02-12 09:38:31 UTC
Should I apply all four patches or only the last two?
Comment 58 Dmitry Torokhov 2007-02-12 10:00:19 UTC
Just the last two. I could not mark the first 2 obsolete (i guess because i wasn't the one who put them in).
Comment 59 Calorì Alessandro 2007-02-12 10:55:22 UTC
The last two patches applied to vanilla kernel 2.6.20 make my laptop fall into bad state again. I'm trying again with the previous two patches.
Comment 60 Calorì Alessandro 2007-02-12 11:28:08 UTC
I confirm that the first two patches work and the last two don't. What did you modify, Dmitry?
Comment 61 Dmitry Torokhov 2007-02-12 11:59:29 UTC
I added disabling of pass-through port at shutdown (which should be a noop) and removed enabling of the mouse after resetting it. I guess that your laptops really like mouse to be fully enabled before rebooting. Could you please try applying patches 1 and 4 (patch id #10368 and patch id 10393, IOW old psmouse patch and the new serio bus patch) to make sure that my theory is correct?
Comment 62 Calorì Alessandro 2007-02-13 04:41:27 UTC
Yep, you're right! Vanilla kernel 2.6.20 patched with the first psmouse patch and the last serio patch still works and doesn't leave my laptop in bad state. Now I'm going to try suspending.
Comment 63 Thomas Renninger 2007-02-14 07:59:00 UTC
I tried to fix the broken away fan issue on nx6325. After suspend, fans that should be on are in off state, and I added a little line to override the fan and switch it on even if the thermal module thinks it's off. This did not work and I was quite confused why... I tried patch 1+4, rebooted twice and tested suspend twice. The fan state now got corrected as expected after the next thermal polling. So I expect that this model (every HP has something else that gets fixed with mouse removal/cleanup) shows strange fan behaviour. I can also confirm that 1+4 works with suspend (the first time the machine froze, but I had huge acpi debug output switched on...) and, as said, it even seems to fix things up. Dmitry, would you mind adding me to Signed-off, CC me or send me the mainline commit no., so that I can track this?
Comment 64 Dmitry Torokhov 2007-02-17 23:12:41 UTC
Created attachment 10446 [details] Properly reset psmouse at suspend Hm, it turns out my patches completely broke suspend-to-ram, which is not good, so here are the updated versions. I have them committed to my tree and I believe the commit id stays the same when Linus pulls from other trees, so here they are: a1cec06177386ecc320af643de11cfa77e8945bd 82dd9eff4bf3b17f5f511ae931a1f350c36ca9eb
Comment 65 Dmitry Torokhov 2007-02-17 23:13:33 UTC
Created attachment 10447 [details] Let serio bus handle suspending/resuming of i8042 ports
Comment 66 Dmitry Torokhov 2007-03-09 06:20:46 UTC
Hm, as far as I know the above patches fix the issue with shutdown and suspend to disk, and they are in mainline, so I am closing this.
Comment 67 Calorì Alessandro 2007-03-09 08:31:19 UTC
In mainline? From what version?
Comment 68 Dmitry Torokhov 2007-03-09 08:42:36 UTC
2.6.21-rc2 should have it.