Most recent kernel where this bug did not occur: 2.6.17.3. I did not test 2.6.17.11 yet, but I'm pretty sure the bug was introduced in 2.6.18-rc1.
Distribution: Debian unstable
Hardware Environment: Thinkpad Z61m. It has a Pentium M CoreDuo. /proc/cpuinfo says: Genuine Intel(R) CPU T2500 @ 2.00GHz
Problem Description: When CONFIG_HPET_TIMER is enabled, any call to nanosleep blocks indefinitely. For me that happens in an initramfs during udevsettle. Alt+PrintScreen+T says something about nanosleep. Replacing udevsettle with "busybox sleep 1" blocks as well, with similar results for Alt+PrintScreen+T. The waiting process can be killed with Alt+PrintScreen+I, and the boot process then continues until the next nanosleep. When CONFIG_HPET_TIMER is disabled, the problem does not occur.
I have tested 2.6.18-rc1 through 2.6.18-rc4 and several 2.6.18-rc?-mm? kernels. The problem occurs with all of them. I assume the bug was introduced with the big i386 timer-related patches.
"Indefinitely" is of course a guess. I did, however, wait several minutes for a "busybox sleep 1" to complete.
Argh. Are you able to get us a copy of the kernel boot log? dmesg output, or serial console?
Created attachment 8882 [details] dmesg output This is the dmesg output when booting with init=/bin/bash. I'm using Debian's initramfs-tools here. After setting up a temporary rootfs it starts udevd, udevtrigger and udevsettle. At this point I have to press Alt+PrintScreen+I to continue.
Created attachment 8883 [details] kernel tasks This is the output from Alt+PrintScreen+T at the point where I pressed Alt+PrintScreen+I in the dmesg output. This is from a different startup (with the same configuration) because the dmesg buffer was too small.
Hmm... Is time incrementing properly? If you wait 5 seconds (using your watch) in between calls to "date", does the output look correct?
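The same check can be sketched in C for a system where sleeping is not an option. This is only an illustration: the helper name `realtime_delta_over_spin` and the iteration count are made up for this example, and it deliberately burns CPU in a loop instead of calling nanosleep(), since that is the call that hangs in this report. It reads CLOCK_REALTIME, which is assumed here to be the clock that "date" reports.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: returns how many nanoseconds CLOCK_REALTIME
 * advanced while the CPU spun through `iterations` loop passes,
 * or -1 if the clock could not be read.
 */
int64_t realtime_delta_over_spin(unsigned long iterations)
{
    struct timespec before, after;
    volatile unsigned long sink = 0; /* volatile keeps the loop from being optimized away */

    if (clock_gettime(CLOCK_REALTIME, &before) != 0)
        return -1;

    for (unsigned long i = 0; i < iterations; i++)
        sink += i;

    if (clock_gettime(CLOCK_REALTIME, &after) != 0)
        return -1;

    return (int64_t)(after.tv_sec - before.tv_sec) * 1000000000LL
         + (int64_t)(after.tv_nsec - before.tv_nsec);
}
```

A result of 0 after a long spin would match a clock that is not advancing. A positive value only shows that the clock moves, not that it moves at the right rate, so comparing against a watch is still the better test.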
Also I assume if you boot w/ "clocksource=acpi_pm" the problem goes away as well?
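To confirm which clocksource the kernel actually picked after booting with that option, the active one can be read back from sysfs on kernels that expose it. A minimal sketch, assuming the path below exists on the kernel in question; `read_first_line` is a hypothetical helper written for this example, not an existing API.

```c
#include <stdio.h>
#include <string.h>

/* Assumed sysfs location of the active clocksource; it may be absent
 * on kernels without the generic clocksource framework. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Hypothetical helper: copies the first line of `path` into `buf`
 * without the trailing newline. Returns 0 on success, -1 on failure.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0'; /* strip the newline, if any */
    return 0;
}
```

Calling `read_first_line(CLOCKSOURCE_PATH, buf, sizeof buf)` should leave a name such as "acpi_pm" or "hpet" in `buf` when the file is present.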
Looking at this further, I'm guessing your BIOS exports an HPET address, but for some reason it's failing to initialize the HPET counter (note the lack of "Using HPET for base-timer" in the dmesg), and the HPET clocksource isn't catching this.
Created attachment 8889 [details] patch to hopefully better check the hpet status I've only compile tested this, but it should make sure there's more than just a valid HPET address pointer before we register the HPET clocksource. Could you test to see if it resolves the issue?
Ok, a few more facts: With this bug the clock is frozen. The output of date doesn't change at all. Yes, with "clocksource=acpi_pm" the problem goes away. The same is true for the patch. I also checked some older kernels. It seems the hpet timer never worked for me. The kernels before 2.6.18-rc1 just recognized that the hpet timer didn't work, so I never noticed. So the question would now be: What needs to be done to enable the timer correctly? Anyway, thanks for the help so far.
Thanks for testing the patch so quickly! I'll send it on to Andrew for more testing. As for fixing the HPET so it is usable, we'd need to add some more debugging info to find out which part of hpet_enable() is failing. It's likely a hardware issue, so it may be that nothing can be done to make it work, and if something can be done, I suspect a BIOS update will be necessary.
Well, I'm already using the newest available BIOS version. I also added some debug output: Is 0xF0800000 a reasonable value for hpet_virt_address? I very much doubt, though, that -1 is a reasonable value for "id" or "hpet_period". There is not much that can be done about it, is there?
Unfortunately I suspect not, unless the hardware vendor creates a BIOS fix (and assuming that would be sufficient). The -1 value for hpet_period points to an incorrect HPET table.
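For illustration, the kind of sanity check that rejects these values can be sketched as below. This is not the attached patch or the kernel's code: the function name and constants are made up for this example. The upper bound reflects the HPET specification's limit of 100 ns on the counter period, expressed in femtoseconds; the lower bound is an assumption chosen for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on the counter period (100 ns in femtoseconds), as the
 * HPET specification requires. The lower bound is an assumption. */
#define SKETCH_HPET_MAX_PERIOD_FS 100000000u
#define SKETCH_HPET_MIN_PERIOD_FS 100000u

/*
 * Hypothetical check: returns true only if the ID and period values
 * read from the HPET registers look like they came from working hardware.
 */
bool hpet_regs_look_sane(uint32_t id, uint32_t period_fs)
{
    /* A read of -1 (all bits set) is what was observed in this report. */
    if (id == UINT32_MAX || period_fs == UINT32_MAX)
        return false;

    /* Reject periods outside the plausible range, including 0. */
    if (period_fs < SKETCH_HPET_MIN_PERIOD_FS ||
        period_fs > SKETCH_HPET_MAX_PERIOD_FS)
        return false;

    return true;
}
```

With the values from the debug output, `hpet_regs_look_sane(0xFFFFFFFF, 0xFFFFFFFF)` returns false, so a caller would skip registering the HPET clocksource. A period such as 69841279 fs (roughly 14.318 MHz) would pass.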
Fix included in 2.6.18-rc6. Closing.