Bug 7062 - nanosleep blocks indefinitely when CONFIG_HPET_TIMER is enabled
Status: CLOSED CODE_FIX
Alias: None
Product: Timers
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 normal
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-27 06:08 UTC by Michael Olbrich
Modified: 2006-09-06 16:45 UTC

See Also:
Kernel Version: 2.6.18-rc4
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmesg output (17.14 KB, text/plain)
2006-08-27 11:18 UTC, Michael Olbrich
kernel tasks (19.12 KB, text/plain)
2006-08-27 11:26 UTC, Michael Olbrich
patch to hopefully better check the hpet status (356 bytes, patch)
2006-08-28 12:10 UTC, john stultz

Description Michael Olbrich 2006-08-27 06:08:30 UTC
Most recent kernel where this bug did not occur:
2.6.17.3
I have not tested 2.6.17.11 yet, but I'm pretty sure the bug was introduced in
2.6.18-rc1.

Distribution:
Debian unstable

Hardware Environment:
Thinkpad Z61m
It has a Pentium M CoreDuo. /proc/cpuinfo says:
Genuine Intel(R) CPU           T2500  @ 2.00GHz

Problem Description:
When CONFIG_HPET_TIMER is enabled, any call to nanosleep blocks indefinitely.
For me that happens in an initramfs during udevsettle. Alt+PrintScreen+T says
something about nanosleep. Replacing udevsettle with "busybox sleep 1" blocks
as well, with similar results for Alt+PrintScreen+T. The waiting process can be
killed with Alt+PrintScreen+I, and the boot process then continues until the
next nanosleep.
When CONFIG_HPET_TIMER is disabled, the problem does not occur.

I have tested 2.6.18-rc1 - 2.6.18-rc4 and several 2.6.18-rc?-mm? kernels. The
problem occurs with all of them. I assume the bug was introduced with the big
i386 timer related patches.
"Indefinitely" is of course a guess. I did however wait several minutes
for a "busybox sleep 1" to complete.
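For anyone who wants to reproduce the hang outside of udevsettle or busybox, a small C helper along these lines should be enough. This is only a sketch: the function name timed_nanosleep_ms is made up for illustration, and it assumes clock_gettime(CLOCK_MONOTONIC) is available (older glibc versions may need -lrt when linking). On an affected kernel the nanosleep() call is expected to never return, so no elapsed time would be reported at all.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: sleep for `ms` milliseconds with nanosleep() and
 * return the elapsed time in milliseconds as measured by CLOCK_MONOTONIC,
 * or -1 on error. On a kernel showing this bug the nanosleep() call
 * would block and this function would not return. */
long timed_nanosleep_ms(long ms)
{
    struct timespec req;
    struct timespec start, end;
    long long elapsed_ns;

    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    if (nanosleep(&req, NULL) != 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    /* Compute the difference in nanoseconds first to avoid rounding
     * artifacts when tv_nsec wraps around. */
    elapsed_ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
               + (long long)(end.tv_nsec - start.tv_nsec);
    return (long)(elapsed_ns / 1000000LL);
}
```

Calling timed_nanosleep_ms(1000) from a small main() and printing the result would be the equivalent of "busybox sleep 1" with a timing report.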
Comment 1 Andrew Morton 2006-08-27 10:16:42 UTC
argh.

Are you able to get us a copy of the kernel boot log?  dmesg output,
or serial console?
Comment 2 Michael Olbrich 2006-08-27 11:18:46 UTC
Created attachment 8882 [details]
dmesg output

This is the dmesg output when booting with init=/bin/bash.
I'm using Debian's initramfs-tools here. After setting up a temporary rootfs
it starts udevd, udevtrigger and udevsettle. At this point I have to press
Alt+PrintScreen+I to continue.
Comment 3 Michael Olbrich 2006-08-27 11:26:07 UTC
Created attachment 8883 [details]
kernel tasks

This is the output from Alt+PrintScreen+T at the point where I pressed
Alt+PrintScreen+I in the dmesg output. This is from a different startup (with
the same configuration) because the dmesg buffer was too small.
Comment 4 john stultz 2006-08-28 10:20:09 UTC
Hmm.. Is time incrementing properly? If you wait 5 seconds (using your watch)
between calls to "date", does the output look correct?
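The same check can be done programmatically without relying on sleep, which is the thing that hangs here. The following is a rough sketch, not anything from the kernel tree: the names busy_wait and wall_clock_advance_us are invented for this example, and the iteration count needed for a visible delay depends on the machine.

```c
#include <stddef.h>
#include <sys/time.h>

/* Burn CPU for a while without sleeping, so the check does not depend
 * on the timer path that is suspected to be broken. */
static void busy_wait(unsigned long iterations)
{
    volatile unsigned long i;

    for (i = 0; i < iterations; i++)
        ;
}

/* Hypothetical helper: return how many microseconds the wall clock
 * advanced across a busy loop, or -1 if gettimeofday() failed.
 * A result of 0 after a long loop suggests time is not incrementing. */
long long wall_clock_advance_us(unsigned long iterations)
{
    struct timeval before, after;

    if (gettimeofday(&before, NULL) != 0)
        return -1;
    busy_wait(iterations);
    if (gettimeofday(&after, NULL) != 0)
        return -1;

    return (long long)(after.tv_sec - before.tv_sec) * 1000000LL
         + (long long)(after.tv_usec - before.tv_usec);
}
```

On a healthy system a call such as wall_clock_advance_us(50000000UL) should return a positive number; a frozen clock would be expected to yield 0 no matter how long the loop runs.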
Comment 5 john stultz 2006-08-28 10:21:35 UTC
Also, I assume if you boot w/ "clocksource=acpi_pm" the problem goes away as well?
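To confirm which clocksource the kernel actually selected after booting with that option, the sysfs clocksource file can be read. The snippet below is a sketch under the assumption that the running kernel exposes /sys/devices/system/clocksource/clocksource0/current_clocksource; the function name read_current_clocksource is made up for illustration, and the function simply reports failure if the file is missing.

```c
#include <stdio.h>
#include <string.h>

/* Assumed sysfs location of the active clocksource name. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/* Hypothetical helper: copy the name of the current clocksource into
 * `buf` (at most len - 1 characters, newline stripped).
 * Returns 0 on success, -1 if the file cannot be read. */
int read_current_clocksource(char *buf, size_t len)
{
    FILE *f;

    if (buf == NULL || len == 0)
        return -1;

    f = fopen(CLOCKSOURCE_PATH, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Strip the trailing newline that sysfs appends. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

With the workaround active, the buffer would be expected to contain "acpi_pm" rather than "hpet".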
Comment 6 john stultz 2006-08-28 12:09:05 UTC
Looking at this further, I'm guessing your BIOS exports an HPET address but for
some reason it's failing to initialize the HPET counter (note the lack of "Using
HPET for base-timer" in the dmesg) and the HPET clocksource isn't catching this.
Comment 7 john stultz 2006-08-28 12:10:53 UTC
Created attachment 8889 [details]
patch to hopefully better check the hpet status

I've only compile tested this, but it should check that there's more
than just a valid HPET address pointer before we register the HPET clocksource.
Could you test to see if it resolves the issue?
Comment 8 Michael Olbrich 2006-08-28 14:06:41 UTC
Ok, a few more facts:

With this bug the clock is frozen. The output of date doesn't change at all.

Yes, with "clocksource=acpi_pm" the problem goes away.
The same is true for the patch.

I also checked some older kernels. It seems the hpet timer never worked for me.
The kernels pre 2.6.18-rc1 just recognized that the hpet timer didn't work, so
I never noticed.
So the question would now be: What needs to be done to enable the timer
correctly?
Anyway, thanks for the help so far.
Comment 9 john stultz 2006-08-28 14:20:49 UTC
Thanks for testing the patch so quickly! I'll send it on to Andrew for more testing.

As for fixing the HPET so it is usable, we'd need to add some more debugging
info to find out which part of hpet_enable() is failing. It's likely a hardware
issue, so it might be that nothing can be done to make it work, and if something
can be done I suspect a BIOS update will be necessary.
Comment 10 Michael Olbrich 2006-08-28 16:09:51 UTC
Well, I'm already using the newest available BIOS version.
I also added some debug output:
Is 0xF0800000 a reasonable value for hpet_virt_address?
I very much doubt though that -1 is a reasonable value for "id" or
"hpet_period".
There is not much that can be done about it, is there?
Comment 11 john stultz 2006-08-28 16:17:00 UTC
Unfortunately I suspect not, unless the hardware vendor creates a BIOS fix (and
assuming that would be sufficient). The -1 value for hpet_period points to an
incorrect HPET table.
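To illustrate why those values are implausible, here is a standalone sketch of the kind of sanity check involved. It is not the attached patch and not the kernel's hpet_enable(); the function and macro names are invented for this example. The upper bound of 100000000 femtoseconds (100 ns) is the maximum counter period allowed by the HPET specification, while the lower bound is only an assumed plausibility floor. A register that reads back as all ones, which shows up as -1 when printed as a signed value, typically means nothing usable is mapped at that address.

```c
#include <stdint.h>

/* Counter period limits in femtoseconds. The maximum matches the HPET
 * specification (100 ns); the minimum is an assumption for this sketch. */
#define SKETCH_HPET_MAX_PERIOD_FS 100000000u
#define SKETCH_HPET_MIN_PERIOD_FS 100000u

/* Hypothetical check: return 1 if the raw ID and period values read
 * from the HPET registers look usable, 0 otherwise. */
int hpet_values_look_sane(uint32_t id, uint32_t period_fs)
{
    /* All bits set (printed as -1) means the read did not hit a
     * working HPET register block. */
    if (id == 0xFFFFFFFFu)
        return 0;

    if (period_fs < SKETCH_HPET_MIN_PERIOD_FS ||
        period_fs > SKETCH_HPET_MAX_PERIOD_FS)
        return 0;

    return 1;
}
```

With the values reported above, both "id" and "hpet_period" read as -1, so a check of this shape would reject the HPET before it gets registered as a clocksource.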
Comment 12 john stultz 2006-09-06 16:45:28 UTC
Fix included in 2.6.18-rc6. Closing.
