Bug 11370

Summary: adjtimex() fails to set frequency correctly
Product: Timers Reporter: Martin Ziegler (ziegler)
Component: OtherAssignee: john stultz (john.stultz)
Status: CLOSED CODE_FIX    
Severity: low CC: akpm, john.stultz, zippel
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.26-1 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: C-program
userland demo of the bug
adjtimex freq fix

Description Martin Ziegler 2008-08-19 08:49:02 UTC
adjtimex() (in sys/timex.h) reduces odd positive frequencies by 1.
Negative frequencies are correctly set.

Note:
Kernels from 2.6.21 to 2.6.25 adjtimex() changed frequencies to the next frequency
divisible by 16.  In older kernels adjtimex() set frequencies correctly.

Note also that, if you use adjtimex() to change the frequency as non-root, and then use it to read the actual frequency, it reports the "expected" new frequency. While of course nothing has changed.
Comment 1 Andrew Morton 2008-08-19 12:52:22 UTC
(reassigned to Roman, marked as regression)

Would you be able to provide a simple testcase please?
Comment 2 Martin Ziegler 2008-08-19 17:00:30 UTC
Created attachment 17319 [details]
C-program

sets given frequencies with adjtimex() and prints the results
Comment 3 Martin Ziegler 2008-08-19 17:10:02 UTC
in 2.6.26 the command 

"adj_test 19 18 17 -17 -18 -19" 

prints 

"adjtimex returns 5, new frequency 18
adjtimex returns 5, new frequency 18
adjtimex returns 5, new frequency 16
adjtimex returns 5, new frequency -17
adjtimex returns 5, new frequency -18
adjtimex returns 5, new frequency -19"
---------------------------------------------------
With kernel 2.6.21-2.6.25 the output was
"adjtimex returns 5, new frequency 16
adjtimex returns 5, new frequency 16
adjtimex returns 5, new frequency 16
adjtimex returns 5, new frequency -16
adjtimex returns 5, new frequency -16
adjtimex returns 5, new frequency -16"
---------------------------------------------------
With kernels before (since April 2004) the output
had the correct new frequencies 
19 18 17 -17 -18 -19.

Regards,
Martin
Comment 4 john stultz 2008-08-19 18:37:02 UTC
Created attachment 17321 [details]
userland demo of the bug

Thanks for the bug report and test case!

Looks like there's an order of operations issue in converting the userland input into a kernel value and then back out. This test case illustrates the issue and how it might be fixed. I'll generate a patch with this fix and add it shortly.
Comment 5 john stultz 2008-08-19 18:56:33 UTC
Actually the fix in the demo will not work proper for large values. It seems do_div might be the simplest solution here.
Comment 6 john stultz 2008-08-19 19:25:30 UTC
Created attachment 17322 [details]
adjtimex freq fix 

Here's a fix that appears to resolve the issue for me.
Simply use div_s64() to convert back to userland, also removes the now unused PPM_SCALE_INV values.
Comment 7 john stultz 2008-08-19 19:29:45 UTC
I sent the patch to lkml for review and hopeful inclusion into -mm.
Comment 8 Roman Zippel 2008-08-20 04:43:48 UTC
(In reply to comment #0)
> adjtimex() (in sys/timex.h) reduces odd positive frequencies by 1.
> Negative frequencies are correctly set.

Why exactly is this a problem?
The error is so small (< .02nsec) it can't be of any practical significance. 
If you look at the BSD kernel (which use the reference implementation more or less directly) you'll see that they get even less resolution back.
Comment 9 Martin Ziegler 2008-08-20 06:26:15 UTC
In 2004 I wrote a program to change the clock parameters from time to time. It also checked wether the the clock parameters were still the same as set in the last run. This "naive" check stopped working with 2.6.21. 

To restore the behavior of the pre-2.6.21 kernels will not improve the clock accuracy - of course. But it may help programmers not to misinterpret the manpage of adjtimex(). 

Reagrds,
Martin
Comment 10 john stultz 2008-08-20 08:18:44 UTC
Martin: Could you be more specific about pre-2.6.21 kernels? Looking at the 2.6.20->2.6.21 diff, I only see whitespace changes with regards to the freq value.
Comment 11 Martin Ziegler 2008-08-20 08:52:55 UTC
One year ago I filed this as bug #43877 against linux-image-2.6.21-2-686.

From what I said there I gather now that the bug was not present in 2.6.18,
and I had no information then about 2.6.19 and 2.6.20)

(So "pre-2.6.21" was not accurate, sorry)

Regards,
Martin
Comment 12 Roman Zippel 2008-08-20 19:28:44 UTC
(In reply to comment #9)

> In 2004 I wrote a program to change the clock parameters from time to time.
> It
> also checked wether the the clock parameters were still the same as set in
> the
> last run. This "naive" check stopped working with 2.6.21. 

Your assumption that you can get the original value back is incorrect. It's quite valid for a NTP implementation to work with even less resolution. What you get is the internal frequency representation, it's a best effort.
Please fix your program, the real bug is in there and it makes your program nonportable.

That said, the rounding in the kernel can be improved by using 19 as PPM_SCALE_INV_SHIFT (which is the same amount as the factor 2 in PPM_SCALE, so the initial shift won't drop any input bits), the inverse divide will then produce a slightly larger value, which is then correctly rounded by the final shift.
Comment 13 john stultz 2008-12-04 13:38:46 UTC
Romans fix for this issue has been upstream for awhile.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d40e944c25fb4642adb2a4c580a48218a9f3f824

I'm going to mark this is as closed.