Bug 14280
Summary: | slow clock on Amilo Pro 2030 with NO_HZ=y | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Bernhard M. Wiedemann (kernelbmw2017) |
Component: | i386 | Assignee: | platform_i386 |
Status: | RESOLVED DOCUMENTED | ||
Severity: | low | CC: | rjw, tglx |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.31 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg with NO_HZ active
Debug patch against 2.6.31 |
Description
Bernhard M. Wiedemann
2009-09-30 18:28:07 UTC
What was most recent kernel version that worked properly on your hardware? Also could you provide the output from: cat /sys/devices/system/clocksource/clocksource0/current_clocksource cat /sys/devices/system/clocksource/clocksource0/available_clocksource thanks -john > cat /sys/devices/system/clocksource/clocksource0/available_clocksource acpi_pm tsc > cat /sys/devices/system/clocksource/clocksource0/current_clocksource acpi_pm I ran a two-year-old openSUSE-10.3 with a 2.6.22 kernel before and never had clock issues, but this might well be, because NO_HZ was not in it. I could probably boot up the old system with kernels from in-between (new one is on ext4 so harder there) or I could use live systems to test. oh, I just found that this seems to amend the problem: echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource is there a boot-parameter for that? or some way the kernel will automatically figure out to use that? Huh. Can you provide the full dmesg? It *sounds* like NO_HZ is somehow causing the acpi_pm to slow down or halt occasionally. I'm not sure how that would be connected to NO_HZ. When you boot with nohz=off, do you see the same acpi_pm return for current_clocksource? it has same clocksource: cat /proc/cmdline /sys/devices/system/clocksource/clocksource0/current_clocksource root=/dev/sda3 resume=/dev/sda1 splash=silent quiet vga=0x317 nohz=off acpi_pm maybe the system could miss timer ticks with NO_HZ? I think the chipset does not support cpufreq, so that should not be an issue. Created attachment 23212 [details]
dmesg with NO_HZ active
Well, one of the expected side effects of NO_HZ is to miss timer ticks, however the timekeeping code is fine with that since it doesn't use ticks to account for time, instead we accumulate time directly from the clocksource, so as long as the clocksource doesn't wrap, it shouldn't be an issue. I guess it could be that you're going tickless for more then 2.5 seconds and the ACPI PM is wrapping and you lose a bit of time that way. Seems a bit unlikely. If you do something like the following: $ while true; do echo "" > /dev/null; done Do you see the time drift behind? Also, what are you using to measure the time drift? ntpdate? thanks -john I'm using the current equivalent of ntpdate:
> sntp -v -P no -r pool.ntp.orgs
sntp options: a=2 v=1 e=0.100 E=5.000 P=2147483647.000
d=15 c=5 x=0 op=1 l=/etc/sntp.pid f= pool.ntp.org
sntp: time changed by 2.901 secs to 2009 Oct 01 05:46:30.323 +/- 0.012+0.041
The very interesting part is that with running your busy-loop line as well as my favourite perl -e 'while(1){}' (which does not go to kernel) there is no more clock-drift. But as soon as I stop it, drift comes back.
Should help to narrow it down.
I also think, if some counter was wrapping, I would be seeing time jump in distinct 2.5sec-steps instead of slewing nearly continuously over time.
tests with
while sleep 10 ; do sntp -v -P no -r $server ; done
show some variation between 0.32 and 0.4 with at most 0.025sec steps.
Created attachment 23290 [details]
Debug patch against 2.6.31
Could you test this patch to see if it changes behavior? I'm trying to determine if there's a bug in the NOHZ code that is causing the clock adjustment code to go crazy, or if the clocksource is actually slowing down when we go into oneshot mode.
I built a 2.6.31 with the patch and drift is still the same. 4% when idle and 0% when busy. Bernhard: Sorry for the late response, I didn't see your last data point. So it seems your hardware is actually speeding up and slowing down. I suspect its some sort of chipset bug. You might try seeing if there are any BIOS power-management options that could be affecting this. Something like "Spread-spectrum timing" maybe? For now I guess nohz=off will be required for your box. Thomas, would you be opposed to some sort of pci quirk to force this on? > So it seems your hardware is actually speeding up and slowing down. I suspect
> its some sort of chipset bug.
>
> You might try seeing if there are any BIOS power-management options that
> could
> be affecting this. Something like "Spread-spectrum timing" maybe?
>
> For now I guess nohz=off will be required for your box. Thomas, would you be
> opposed to some sort of pci quirk to force this on?
Not convinced yet :)
Bernhard, can you please let the machine idle and just run:
# while true; do sleep 1; done
Does that still expose the problem ?
Thanks,
tglx
I'm back on opensuse-11.2-RC2's 2.6.31.5-0.1-pae kernel now, for other testing reasons. I went init 2 and killed even cron and cupsd and did while sleep 1; do true; done which is pretty much the same. bug was still there. acpi_pm clock is 4% slow. I also saw that TSC timer is ~1% fast, too. This machine comes with phoenix BIOS which has an amazingly user-friendly settings dialog with 8 (eight) settings, none of which comes close to "spread spectrum" (closest is NX bit enable) I'll go for nohz=off for the meanwhile. If the above debug-patch is important for this test, please tell, so that I can retry. What's the status here? Have you tried any newer kernels since then? Is the problem still present in there? tried today http://ftp.suse.com/pub/projects/kernel/kotd/master/i586/kernel-default-2.6.32.2-0.0.25.2acc4d7.i586.rpm It still shows same behaviour. Only acpi_pm clocksource available there and it is slow with NOHZ, but runs fine with nohz=off. update: installed 2.6.37-rc3-git6 on this machine and the bug is still there. Luckily nohz=off still helps. acpi_pm is not really a good clocksource. Have you tried to add tsc=reliable to the kernel command line? Also you may need to add hpet=force to the kernel command line. Can you please test that too? So I booted 2.6.37-22-default without nohz=off but with tsc=reliable hpet=force and my viafb framebuffer cursor still would not blink, unless I produced keyboard interrupts. /sys/devices/system/clocksource/clocksource0/available_clocksource hpet acpi_pm /sys/devices/system/clocksource/clocksource0/current_clocksource tsc I did echo hpet > current_clocksource and dmesg had "Switching to clocksource hpet" /proc/interrupts showed as few as 20 "Local timer interrupts" in 3 minutes. with i8042 being the only other interrupt echo tsc > current_clocksource caused dmesg to have "Override clocksource tsc is not HRT compatible. Cannot switch while in HRT/NOHZ mode Switching to clocksource tsc" It is hard to tell much about timing accuracy, because network does not work without interrupts. Comparing `date` between machines indicates, that it is not drifting too much with either tsc or hpet. Have you tried hpet=force without tsc=reliable ? both hpet=force alone and tsc=reliable alone help to get reliable timekeeping. echo acpi_pm > current_clocksource makes it return to the buggy behaviour, so this indicates that acpi_pm is bad on this machine with NOHZ. The other side effects described in comment 18 are probably from viafb. Yes, that most likely is the case. Please always run with hpet=force alone as a workaround, we can't do much about fixing your machine. |