Kernel Bug Tracker – Bug 7689
HP Compaq nx7400 "bad state"
Last modified: 2007-03-09 08:42:36 UTC
Distribution: openSuSE, Ubuntu, Fedora Core (maybe others)
Hardware Environment: HP Compaq nx7400
Software Environment: X.org 7.1.99 - KDE 3.5.5
This laptop, like many others from HP, has this problem: when the computer is
booted into a Linux environment and then shut down or rebooted, the system falls
into a "bad state" which causes a boot procedure 10 to 20 seconds longer and a
locked battery status (it doesn't update). The "bad state" is persistent!
This had already been reported in bug #6455 but it's not solved yet...
Steps to reproduce:
Turn on and then shut down or reboot.
1) When the laptop is off, detach AC adapter and battery for a while (some
models require at least 2 minutes), the next boot will be normal;
2) If PS/2 mouse support is compiled as a module (psmouse), remove that module
before shutdown/reboot.
More info at: http://emisca.altervista.org/nx7400/ (it's not made by me but I
have the same computer).
I tried compiling kernel on my own with Gentoo 2006.1 (latest stable kernel is
2.6.18) and still no luck. I'm going to try other kernels (even with some
patchset) with Gentoo to make some tests.
Please try the latest mm version - 2.6.20-rc1-mm1 is available now.
Created attachment 9865 [details]
My kernel configuration
Here's the configuration which I used to test the new kernel.
Created attachment 9866 [details]
My kernel log
Here is the kernel log.
I tried the new kernel as you suggested but still no luck! After first
shutdown/reboot the system falls again in "bad state".
Please attach acpidump, full dmesg of "bad state" and content of battery files
(/proc/acpi/battery/...) in "bad state". Also, please describe more carefully
the "bad state" on your laptop. What is the "battery status locked" in terms
of battery files fields?
Please attach dmesg for both cases - "bad state" and normal.
please try patch from #5534.
I've already read that bug report but it doesn't seem to apply to me because my
laptop's thermal zones and fans always work as they should even if the system is
in "bad state".
I'm going to send the information Vladimir required as soon as possible (I must
recompile the kernel with psmouse as a module to send them).
Created attachment 9878 [details]
Good state dmesg
This is the dmesg when the system starts in good state.
Created attachment 9879 [details]
Bad state dmesg
This is the dmesg when the system starts in bad state.
Created attachment 9880 [details]
Good state acpidump
This is the result of acpidump when the system is in good state.
Created attachment 9881 [details]
Bad state acpidump
This is the result of acpidump when the system is in bad state.
The symptoms of "bad state":
- Boot slowdown (the BIOS takes up to 20 seconds to make POST);
- Kernel slowdown during boot (you can compare dmesg I attached);
- Battery and AC Adapter status (read from /proc/acpi/battery) are not updated.
The biggest problem is that when the system falls into "bad state" the only way to
recover is to unplug the cord and disconnect the battery for a few seconds.
> What is the "battery status locked" in terms of battery files fields?
The fields don't change at all. The kernel reads the status of the battery and
of the AC adapter during boot and then never again.
If I unplug the cord and the system is in bad state the battery files in
/proc/acpi/battery aren't updated and status remains: AC Adapter: on - Battery
status: Fully charged (or charging). If I run the system on battery and I plug
the cord the status remains: AC Adapter: off - Battery status: discharging.
Percentage of charge doesn't change. If I reboot the system the kernel updates
the status during boot and then it doesn't update them again.
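
For anyone who wants to confirm the locked battery status described above, here is a minimal sketch. It assumes the battery files are readable as /proc/acpi/battery/*/state (the directory is the one named in this report; the per-battery layout and the function names battery_snapshot and battery_looks_stuck are illustrative assumptions, not part of any existing tool). It reads the files twice and compares the two readings:

```shell
#!/bin/sh
# Print the contents of every "state" file found one level below the
# given directory (default: /proc/acpi/battery, as named in this report).
battery_snapshot() {
    dir="${1:-/proc/acpi/battery}"
    for f in "$dir"/*/state; do
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Take two snapshots a given number of seconds apart (default: 60).
# Exit status 0 means both snapshots are identical (status looks stuck),
# exit status 1 means something changed between the two reads.
battery_looks_stuck() {
    before=$(battery_snapshot "$1")
    sleep "${2:-60}"
    after=$(battery_snapshot "$1")
    [ "$before" = "$after" ]
}
```

To use it, run `battery_looks_stuck /proc/acpi/battery 60` and plug or unplug the AC cord during the wait. In good state the two snapshots should differ; in bad state they should stay identical. A fully charged battery left on AC power legitimately reports the same values, so the cord change is what makes the check meaningful.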
Alexey Starikovskiy wrote:
Lebedev, Vladimir P wrote:
> Please look in "My kernel log" attachment: there are ERROR messages
> Is it criminal?
> Dec 18 18:52:10 coreblack [ 35.963000] ACPI Exception (exoparg2-0442):
> AE_AML_BUFFER_LIMIT, Index (000000100) is beyond end of object
> Dec 18 18:52:10 coreblack [ 35.964000] ACPI Error (psparse-0537):
> Method parse/execution failed [\_SB_.C002.C341.C0F3._SDD] (Node
> df790540), AE_AML_BUFFER_LIMIT
From: Fiodor Suietov <firstname.lastname@example.org>
Subject: libata: wrong sizeof for BUFFER
I have reproduced the AE_AML_BUFFER_LIMIT exception
based on the SSDT ASL code and libata ata_acpi_push_id()
code. There is an oversight in ata_acpi_push_id() causing
the exception. The following update fixes it:
Signed-off-by: Fiodor Suietov <email@example.com>
--- linux-2.6.20-rc1-mm1/drivers/ata/libata-acpi.c.orig 2006-12-19
+++ linux-2.6.20-rc1-mm1/drivers/ata/libata-acpi.c 2006-12-19
@@ -672,7 +672,7 @@ int ata_acpi_push_id(struct ata_port *ap
 	input.count = 1;
 	input.pointer = in_params;
 	in_params.type = ACPI_TYPE_BUFFER;
-	in_params.buffer.length = sizeof(atadev->id * ATA_ID_WORDS);
+	in_params.buffer.length = sizeof(atadev->id) * ATA_ID_WORDS;
 	in_params.buffer.pointer = (u8 *)atadev->id;
 	/* Output buffer: _SDD has no output */
Created attachment 9885 [details]
created a dedicated workqueue for notify() execution
Please try this patch and patch from comment #16, post dmesg and system state.
In any case these patches should be useful.
Sorry, it's the first time I help debugging the kernel... How can I apply
those patches? I mean: what command-line should I use?
Created attachment 9889 [details]
>Sorry, it's the first time I help debugging the kernel... How can I apply
>those patches? I mean: what command-line should I use?
1) copy the patches from comment #17 and #19 to your hard disk : for example
pathnames are file_1 and file_2
2) change current directory to linux-2.6.20-rc1-mm1/
3) cat file_1 | patch -p1
cat file_2 | patch -p1
4) build the kernel, etc ....
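
The steps above can be sketched as a small shell function. The name apply_patches is made up for illustration; file_1 and file_2 are the example pathnames from the steps. The only addition is a --dry-run pass (a GNU patch option), so that a patch which does not apply is reported before it modifies the tree:

```shell
#!/bin/sh
# Apply each patch file, in order, to a source tree.
# Usage: apply_patches TREE PATCH...   (give the patch paths as absolute paths)
# Every patch is first tried with --dry-run; if that fails, the function
# stops with status 1 without applying that patch. Patches applied earlier
# in the same call stay applied.
apply_patches() {
    tree="$1"
    shift
    for p in "$@"; do
        # -p1 strips the first path component (e.g. "linux-2.6.20-rc1-mm1/")
        ( cd "$tree" && patch -p1 --dry-run < "$p" > /dev/null ) || {
            echo "patch $p does not apply cleanly to $tree" >&2
            return 1
        }
        ( cd "$tree" && patch -p1 < "$p" ) || return 1
    done
}
```

For example: `apply_patches /path/to/linux-2.6.20-rc1-mm1 /path/to/file_1 /path/to/file_2`, then build the kernel as usual.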
Created attachment 9896 [details]
Good state dmesg
Dmesg when system is in good state.
Created attachment 9897 [details]
Bad state dmesg
Dmesg when system is in bad state.
Well, it seems that there are no more "error" messages in dmesg but the problem
of the "bad state" still remains.
I also have the acpidump of the good-state and the bad-state, if you want to
take a look at them, but there are no differences between them with those patches.
> 2) If PS/2 mouse support is compiled as a module (psmouse), remove that
> module before shutdown/reboot.
What will be the system state if psmouse is module but is not removed?
What will be the system state if psmouse is not part of kernel at all?
I am interested in cases - shutdown/reboot and 'suspend to disk'.
> What will be the system state if psmouse is module but is not removed?
As if it's built-in the kernel. Same thing...
> What will be the system state if psmouse is not part of kernel at all?
If you mean not to load psmouse at startup the system will not fall in bad state
if I reboot/shutdown the system. But, in such a way the touchpad won't work,
> I am interested in cases - shutdown/reboot and 'suspend to disk'.
I've never tried to "suspend" my laptop since I've never felt the need; I'm
going to try this one too. If I don't remove that module before either rebooting
or shutting down the system.
Actually, I should thank you for your patience... ;)
I've done some tests:
- If I don't compile PS/2 mouse support at all, the kernel works (even old
versions, now I'm using the gentoo-kernel 2.6.18 as my stable kernel);
- If I compile PS/2 mouse support in kernel, the kernel breaks as soon as I
reboot or shutdown my laptop;
- If I compile PS/2 mouse support as a module there are three sub-cases:
* If I load it and I unload it before shutdown/reboot everything works
* If I load it but I don't unload it before shutdown/reboot the system falls
in bad state
* If I don't load it at all everything works even after reboot/shutdown,
except touchpad.
I suppose it's a problem with the touchpad: it seems like the BIOS requires
something to be done before shutdown/reboot, and unloading the psmouse module does it.
Maybe the BIOS only needs the kernel to free the resources used by the
touchpad module before shutdown/reboot.
Suspend is still to test but I've read there are some problems here too so I
would like to focus on this problem for now.
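
The "load it, then unload it before shutdown/reboot" case from the tests above can be scripted as a stop-gap. This is only a sketch: the function names are invented for illustration, it must run as root right before shutdown or reboot (how to hook it into the shutdown sequence depends on the distribution and is not shown), and it only helps when psmouse is built as a module:

```shell
#!/bin/sh
# Succeed if the named module appears in the loaded-module list.
# The optional second argument (default: /proc/modules) exists only so
# the check can be tried against a sample file.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Unload psmouse if it is currently loaded. Does nothing when psmouse is
# built into the kernel or was never loaded.
unload_psmouse() {
    if module_loaded psmouse; then
        modprobe -r psmouse
    fi
}
```

Calling `unload_psmouse` before shutdown reproduces the manual workaround; it does not address the underlying problem.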
> If I compile PS/2 mouse support in kernel, the kernel breaks as soon as I
reboot or shutdown my laptop;
Sorry I meant to say that the system falls in bad state and not that the kernel
breaks.
> I suppose it's a problem with the touchpad: it seems like the BIOS
> require ...
I guess that this BIOS requires to do something at the beginning of any type
of boot (boot, reboot, resume).
In any case we should find the cause of problem - hw/sw conflicts, etc....
> 1) When the laptop is off, detach AC adapter and battery for a while (some
> models require at least 2 minutes), the next boot will be normal;
Is it true that it is happening with your laptop too?
> * If I don't load it at all everything works even after reboot/shutdown,
> except touchpad.
So, the problem is absent if we do NOT load psmouse in all the cases - built-in,
module, removed from .config, isn't it?
> I guess that this BIOS requires to do something at the beginning
> of any type of boot (boot, reboot, resume).
I don't think so. If you read the symptoms you can see that bad state doesn't
affect only the operating system but the BIOS itself. The POST lasts 10 to 20
seconds longer than normal.
> In any case we should find the cause of problem - hw/sw conflicts, etc....
I guess it's a bug in the BIOS and the psmouse module seems to activate/deactivate it.
>> 1) When the laptop is off, detach AC adapter and battery for a while (some
>> models require at least 2 minutes), the next boot will be normal;
> Is it true that it is happening with your laptop too?
Yes of course. It's the only way to get laptop back to normal state.
> So, the problem is absent if we do NOT load psmouse in all the cases - built-in,
> module, removed from .config, isn't it?
Yes, it is.
> I guess is a bug in the BIOS and the psmouse module seems to
Yes, but I guess that the Windows providers know how to do it.
Do you have any experience of dealing with our problem on Windows on your
computer?
So, in any case, we need some time for investigation.
> Do you have any experience of dealing with our problem on
> Windows on your computer?
There is no problem with Windows. Even with no drivers it works and the laptop
doesn't fall in bad state.
I installed Windows XP Home Edition SP2 from an original CD I own, using the
product key bundled with the laptop. There were no drivers (I downloaded them
from the HP website) and rebooting/shutting down the laptop didn't cause the bad
state.
There's no way of debugging the kernel?
Thanks for the information; we need some time for the investigation.
I tried Damn Small Linux which ships with kernel 2.4: NO BAD STATE!
The touchpad works and, after a reboot or a shutdown, the system keeps working!
This should prove that this problem is related to kernel 2.6.
Surely, we work on it now.
Created attachment 10054 [details]
Unregister serio drivers on shutdown
This one should fix it?
Please try 'irqpoll' boot flag.
No response from bug submitter, please reopen if problem persists.
Trenn did the magic! The kernel available here
(ftp://ftp.suse.com/pub/people/trenn/hp_fixes_final/) doesn't put my laptop
in "bad state" and after reboot/shutdown the system fully works!
PS: Sorry for not responding all this time but I had some troubles with my
Let's bring this together here...:
Dmitry (not sure whether he maintains the stuff, but he submitted a lot of patches to
the serio subsystem in the recent past and he probably should ack/push the patch in
the end) asked me to try these two patches.
As I am very busy trying to fix some more things on these HP beasts for our
SLE10 kernel, I'd be really happy if Alessandro could help out a bit with
The patches apply fine (with small offset in the one) in recent 2.6.20 kernel.
Created attachment 10368 [details]
The first one I got from Dmitry (psmouse-fiddle-with-reset)...
Created attachment 10369 [details]
... and the second one (serio-cleanup-to-bus)
Note that you need to boot into the kernel twice,
or suspend the kernel twice.
This is because the breakage happens on shutdown, when the psmouse/serio
subsystem is not cleaned up, and survives the reboot. (I expect the Embedded
Controller, which is also accessing serio/i8042 very late on shutdown, gets confused
and is not rebooted or correctly initialised after reboot, unless power and
battery are unplugged for some time.)
The bad state gets fixed when booted into a working kernel (still in bad state)
and then shutting down/rebooting.
I can confirm the "bad-state" problem in the HP Compaq nx7300 too.
I'll try kernel 2.6.20 with two patches from Thomas Renninger.
Ok Thomas... I will help as much as I can.
What version of the kernel should I use to apply those patches? 2.6.20?
I think you got it.
Fedora Core 6, updated to Feb 9, 2007
Vanilla kernel 2.6.20 with Thomas Renninger's patches from comment #40 and #41
compiled and installed:
1-Boot in good state with original kernel
2-Reboot->going in bad state with original kernel
3-Reboot with 2.6.20 patched kernel->still in bad state
4-Reboot with 2.6.20 patched kernel->going in good state: battery reporting ok,
cpufreq working and go to full speed.
5-reboot with original Fedora kernel->still in bad state
6-reboot->going in bad state
Let me know if I can do more specific testing.
5-reboot with original Fedora kernel->still in *good* state
Sorry for typo.
Same here. It works!
Hm, any of you guys using suspend-to-ram? Would be nice to test the patches
with suspend to ram as well...
In openSuSE 10.2 (updated) and with the custom kernel made by Thomas, suspend
to RAM only works if I add the "-f" flag to S2RAM_OPTS in /etc/pm/config. When
I turn on my laptop again everything works except the keyboard!
I haven't tested with kernel 2.6.20 yet.
Alessandro, is there a kernel version that has working s2ram on your box?
No luck... the keyboard doesn't work on resume with either kernel 2.6.8 or kernel
2.6.20... any suggestions?
I'm going to install Gentoo again and I'm gonna try other kernel versions.
Thank you for solving the "bad state" trouble but suspend to RAM is important
too. If we manage to solve this problem, this box (and also others from HP)
will be fully Linux compatible!
Should I close this bug and open a new one?
Maybe it's worth splitting up the patches into the shutdown cleanup, which should
be low-risk and shouldn't be a problem to push into stable
kernels (back to, don't know, 2.6.16.X),
and the suspend/resume part, which could be added to 2.6.21-rcX as soon as it
works half way stable (but still might break other mice/machines on suspend/
I added the patches to latest SUSE CVS kernel head. This should make things
easier to test. You should see a kernel popping up with Alexey's "execute
notify handlers in own thread" (to not get confused by another problem) here:
Do a rpm -qp --changelog |less to check whether the patches are included.
Hmm, this works for i386, but something seems to be broken with x86_64 and
possibly other branches/archs. I'll try to get this fixed. It may take some hours/
a day until the submitted patches arrive compiled and packaged.
Hmm, pre-testing by someone who compiles the stuff on his own should still be
faster..., however I can add things as soon as it looks stable for broader and
Created attachment 10392 [details]
Properly reset psmouse at suspend
I would appreciate it if you tested the updated versions of the patches. The first one
(psmouse-fiddle-with-reset.patch) is the one that really fixes the problem; the
second one is a generic improvement of the i8042 suspend process and should speed it
up a bit as we don't try to reset the mouse several times during suspend/resume.
If these patches work for you I will commit them in my tree and once they
survive a -mm release I will push them to Linus.
P.S. Let's move s2ram discussion to bug 7977
Created attachment 10393 [details]
Let serio bus handle suspending/resuming of i8042 ports
Should I apply all four patches or only the last two?
Just the last two. I could not mark the first 2 obsolete (I guess because I
wasn't the one who put them in).
The last two patches applied to vanilla kernel 2.6.20 make my laptop fall
into bad state again. I'm trying again with the previous two patches.
I confirm that the first two patches work and the last two don't.
What did you modify, Dmitry?
I added disabling of pass-through port at shutdown (which should be a noop) and
removed enabling of the mouse after resetting it. I guess that your laptops
really like the mouse to be fully enabled before rebooting. Could you please try
applying patches 1 and 4 (patch id #10368 and patch id #10393, IOW the old psmouse
patch and the new serio bus patch) to make sure that my theory is correct?
Yep, you're right!
Vanilla kernel 2.6.20 patched with the first psmouse patch and the last serio
patch still works and doesn't put my laptop in bad state.
Now I'm going to try suspending.
I tried to fix the broken away fan issue on nx6325.
After suspend, fans that should be on are in off state, so I added a little
line to override the fan and switch it on even if the thermal module thinks it's off.
This did not work and I was quite confused why...
I tried patch 1+4, rebooted it twice and tested suspend twice.
Fan state got corrected now as expected after next thermal polling.
So I expect that this model (every HP has something else that gets fixed with
mouse removal/cleanup) shows strange fan behaviour. I can also confirm that 1+4
works with suspend (the first time the machine froze, but I had huge acpi debug output
switched on...) and as said, it even seems to fix things up.
Dmitry, would you mind adding me to Signed-off, CC'ing me, or sending me the mainline
commit no., so that I can track this?
Created attachment 10446 [details]
Properly reset psmouse at suspend
Hm, it turns out my patches completely broke suspend-to-ram, which is not good,
so here are the updated versions. I have them committed to my tree and I believe
the commit id stays the same when Linus pulls from other trees, so here they are:
Created attachment 10447 [details]
Let serio bus handle suspending/resuming of i8042 ports
Hm, as far as I know the above patches fix the issue with shutdown and suspend
to disk and they are in mainline so I am closing this.
In mainline? From what version?
2.6.21-rc2 should have it.