Bug 11432 - tick clock too slow by 0.5%
Summary: tick clock too slow by 0.5%
Status: REJECTED WILL_NOT_FIX
Alias: None
Product: Timers
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-26 08:32 UTC by Oliver Hamann
Modified: 2009-03-24 10:53 UTC
0 users

See Also:
Kernel Version: 2.6.26.3
Subsystem:
Regression: ---
Bisected commit-id:


Description Oliver Hamann 2008-08-26 08:32:31 UTC
The clock whose values are returned by the C library function "times" runs too slow on my system, by about 0.5%. I think this must be a kernel bug, because glibc forwards the call directly to sys_times (right?).

I had the bug with SMP kernels 2.6.25.10 and 2.6.26.3 (unpatched from kernel.org) on an Intel Core2Duo machine, under OpenSuse 11.0 64-bit. The same system copied to an Athlon machine, with kernel 2.6.25.10 reconfigured as non-SMP, does not have the bug. I also have an older system on the Intel machine, Suse 9.3 with SMP kernel 2.6.23.9, and this one does not have the bug either. So the bug started somewhere between 2.6.23.10 and 2.6.25.10, it is still present in 2.6.26.3, and it seems to concern only SMP kernels or only Intel machines.

The bug can be seen in the stopwatch of Eagle Mode 0.71.0 (http://eaglemode.sf.net). That stopwatch uses both the "time" and "times" functions, in order to be persistent across reboots and to show centiseconds as long as the system is not rebooted. Because of the bug, the stopwatch stops showing centiseconds after about three to five minutes, once it detects too large a difference between the two clocks. More precisely: first, the stopwatch shows results from the times function with centiseconds, then it toggles between the two for a while (quite ugly), and then it shows results from the time function without centiseconds. Comparison with a real-world stopwatch shows that the first phase loses nearly one second in three minutes (~0.5%).
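As a rough check of the figure above, the relative drift can be computed from a measured interval and a reference interval. This is only a sketch: the helper name drift_percent is made up for illustration, and the 179 s / 180 s values are an approximation of "nearly one second in three minutes", not measured data.

```c
#include <assert.h>

/* Hypothetical helper: relative error of a measured interval against a
 * reference interval, in percent. A negative result means the measured
 * clock ran slow. */
double drift_percent(double measured_s, double reference_s)
{
    assert(reference_s > 0.0);   /* a zero reference interval is meaningless */
    return (measured_s - reference_s) / reference_s * 100.0;
}
```

With the approximate figures from the report, drift_percent(179.0, 180.0) gives roughly -0.56, which is consistent with the quoted ~0.5%.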

Eagle Mode 0.72.0 will be resistant to the bug, but I think fixing it is quite important, because bad clocks can bring satellites down. Therefore I have set a high severity.
Comment 1 john stultz 2008-08-26 12:09:36 UTC
I'm not sure I'm fully understanding how you're making use of the times() interface. Could you clarify this a bit?

time() returns the number of seconds since the Epoch.

times() returns a tms struct that provides the amount of cpu time the process has consumed. Thus if the process sleeps, blocks, or is preempted, it won't consume any cpu time. Further, on SMP systems, the cpu time for a process with threads can grow faster than time(), since two threads can consume cpu time on different processors in parallel. Thus the return values of the two functions are not strictly linked.

Also, if you want finer time resolution than just seconds, consider gettimeofday() or clock_gettime().
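For illustration, an interval measurement with clock_gettime() might look like the sketch below. The choice of CLOCK_MONOTONIC is an assumption on top of the suggestion above (it is the clock that is not affected by setting the system time), and the helper names monotonic_now and timespec_diff_s are made up for this example. On older glibc versions the program may need to be linked with -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose clock_gettime() under strict -std= modes */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: read the monotonic clock.
 * Returns 0 on success and -1 on error, like clock_gettime() itself. */
int monotonic_now(struct timespec *ts)
{
    assert(ts != NULL);
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Hypothetical helper: seconds elapsed between two readings.
 * The nanosecond difference may be negative; the sum is still correct. */
double timespec_diff_s(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

A caller would take one reading before and one after the interval of interest and pass both to timespec_diff_s().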
Comment 2 Oliver Hamann 2008-08-27 00:04:09 UTC
I use the return value of the times function, not the results in the tms struct. That return value is from a continuous clock which never stops or jumps, and which has no defined start point. It is measured in clock ticks. The number of ticks per second can be obtained with sysconf(_SC_CLK_TCK) (this usually returns 100).
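A minimal sketch of this usage, for readers unfamiliar with it: the return value of times() is read twice and the difference is scaled by sysconf(_SC_CLK_TCK). The helper names tick_clock_now, tick_rate and ticks_to_seconds are made up for illustration and are not taken from Eagle Mode.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper: current value of the tick clock. This is the
 * return value of times(), not the CPU-time fields in struct tms. */
clock_t tick_clock_now(void)
{
    struct tms unused;   /* filled in by the call, ignored here */
    return times(&unused);
}

/* Hypothetical helper: ticks per second, queried at run time. */
long tick_rate(void)
{
    return sysconf(_SC_CLK_TCK);
}

/* Hypothetical helper: convert a tick delta into seconds. */
double ticks_to_seconds(clock_t start, clock_t end, long ticks_per_sec)
{
    assert(ticks_per_sec > 0);
    return (double)(end - start) / (double)ticks_per_sec;
}
```

With 100 ticks per second the result has centisecond resolution, which matches what the stopwatch displays.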

That clock is widely used in many applications for all kinds of delta-time measurements, because unlike the time and gettimeofday functions, it does not jump when the system clock is adjusted. clock_gettime requires librt and is not so portable.

Yes, using gettimeofday in the stopwatch is the solution which I prepared for Eagle Mode 0.72.0.
Comment 3 Alan 2009-03-24 10:53:07 UTC
0.5% is unfortunately within the sort of timer clock error range found on some hardware. If your hardware can't count time accurately, you need to look at xntpd, either to lock time externally or to slew continually to compensate.
