Bug 11027
Summary: | random forward time jumps | ||
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | Mario Frasca (mfrasca) |
Component: | PPC-32 | Assignee: | platform_ppc-32 |
Status: | RESOLVED OBSOLETE | ||
Severity: | high | CC: | alan, john.stultz |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32 | Subsystem: | |
Regression: | Yes | Bisected commit-id: |
Description
Mario Frasca
2008-07-02 10:49:58 UTC
I marked this as a regression. I believe this is due to the timebase frequency being incorrect. I see similar behavior on my mac mini. My understanding, last I looked into it, is we're pulling the frequency from the firmware tables, but it is somehow incorrect. Trying to use the calibrated method also failed to improve the freq. For instance: on my current 2.6.24-19-powerpc kernel, running on a 1.4 GHz mac mini, I get an uncorrectable ~2300ppm drift.

replying to John: to me it does not seem the same as a drift... what happens on your system if you disable ntp and run the same bash one-liner given here in the report? do you also see a "normal" behaviour and on top of it some random jumps?

Hmmm. Good point. So I see lots of jumps, but of a similar magnitude, since the drift is constant. Yours is much more erratic. Can you attach a dmesg log? I'm baffled as to what changes between 2.6.8 and 2.6.14 might have caused this. Is there any chance you can bisect the versions down a bit further?

I'm running Debian on a very slow system... recompiling a kernel costs quite some time and recompiling 7 will cost 7 times that... I can't remember from where I had retrieved those precompiled kernels and I haven't got them any more (needed some space). can you point me at some info about recompiling (in particular configuring) the kernel so that I get the same that is distributed by Debian?

Mario: Just wanted to check in and see if this issue still existed with more recent kernel versions?

yes, it's still there, in kernel 2.6.30.
2.6.30-1-powerpc 2010-01-01 06:00:01 up 15 days, 9:07, 1 user, load average: 1.45, 0.67, 0.32
2.6.30-1-powerpc 2010-01-02 06:00:01 up 20 days, 5:19, 0 users, load average: 0.40, 0.11, 0.03
2.6.30-1-powerpc 2010-01-03 06:00:02 up 21 days, 7:14, 0 users, load average: 0.06, 0.07, 0.02
2.6.30-1-powerpc 2010-01-04 06:00:02 up 22 days, 8:42, 0 users, load average: 0.04, 0.01, 0.00
2.6.30-1-powerpc 2010-01-05 06:00:01 up 23 days, 13:50, 0 users, load average: 0.07, 0.02, 0.00

the HUGE amount of time jumps on the 1st of January were possibly linked to me doing a safe-upgrade over an ssh link (most jumps occur while load is high).

update: an update in January made the situation temporarily very close to acceptable. the update was not to the kernel version, I would not know what was the part modified, maybe you can help me look into the logs, but anyway, on the 7th of January I rebooted the system into the same kernel as the last logs I gave here above and these are the first few lines:

2.6.30-1-powerpc 2010-01-07 08:23:22 REBOOT
2.6.30-1-powerpc 2010-01-08 06:00:01 up 21:54, 7 users, load average: 0.04, 0.29, 0.22
. . .
2.6.30-1-powerpc 2010-01-12 06:00:01 up 4 days, 21:54, 7 users, load average: 0.00, 0.05, 0.12
. . .
2.6.30-1-powerpc 2010-01-18 06:00:02 up 10 days, 21:58, 10 users, load average: 0.06, 0.33, 0.24
2.6.30-1-powerpc 2010-01-19 06:00:01 up 11 days, 22:03, 8 users, load average: 0.02, 0.24, 0.23

this would be good enough for me and did not change when I booted into kernel 2.6.32-trunk-powerpc the 20th of February.

I updated and rebooted the system the 3rd of March. the behaviour did not seem to change significantly, that is, the few values I collected remain within the limits of one minute or less per day. rebooted on the 7th of March and things got worse, with daily cumulative jumps up to 20 minutes.
after another update and reboot, I'm back to this:

2.6.32-trunk-powerpc 2010-03-14 10:02:39 REBOOT
2.6.32-trunk-powerpc 2010-03-15 06:00:01 up 20:00, 5 users, load average: 0.04, 0.37, 0.45
2.6.32-trunk-powerpc 2010-03-16 06:00:01 up 1 day, 20:09, 3 users, load average: 0.71, 0.74, 0.53
2.6.32-trunk-powerpc 2010-03-17 06:00:01 up 2 days, 20:24, 3 users, load average: 2.30, 0.96, 0.56
2.6.32-trunk-powerpc 2010-03-18 06:00:01 up 3 days, 20:35, 3 users, load average: 0.00, 0.07, 0.26
2.6.32-trunk-powerpc 2010-03-19 06:00:01 up 4 days, 21:35, 5 users, load average: 0.00, 1.09, 1.66
2.6.32-trunk-powerpc 2010-03-20 06:00:01 up 5 days, 22:53, 3 users, load average: 2.00, 2.29, 2.66
2.6.32-trunk-powerpc 2010-03-21 06:00:33 up 7 days, 28 min, 3 users, load average: 18.64, 10.34, 5.88

a bit more statistical info... in the period 7th January - 20th February I had at least 21 random forward time jumps of magnitude 1s or above (I don't keep track of individual jumps, I correct them once each minute and so two jumps of 4 seconds in the same minute are to me indistinguishable from one of 8 seconds). largest jump was 166s. in the period 20th February to now I observed 500 more jumps, longest 856s. there seem to be some preferred values, like around 170s... whatever the cause might be.

Mario: My apologies for the embarrassingly slow response here. I was just reminded of this issue by Alan's kernel version update. If you still are hitting this problem, could you please attach dmesg output?

at the moment my server is hard-disk-dead, so I'm not experiencing the problem any more :(
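
For anyone wanting to put numbers on the daily log lines quoted in this report, here is a rough Python sketch. It rests on an assumption that is not confirmed anywhere in the thread: that the timestamp column comes from a wall clock which is being corrected regularly (the reporter mentions correcting once each minute), while the "up ..." figure is never corrected, so the amount by which uptime grows faster than the wall clock approximates the cumulative forward jumps between two samples. The function names (`parse_line`, `excess_seconds`, `drift_seconds_per_day`) are made up for this illustration, and since uptime is only printed to the minute the result is minute-precision at best. The last helper converts a constant drift in ppm to seconds per day, for comparison with the ~2300ppm figure quoted earlier.

```python
import re
from datetime import datetime, timedelta

# Lines in this report look like:
#   <kernel> <YYYY-MM-DD HH:MM:SS> up <uptime>, <N> user(s), load average: ...
LINE_RE = re.compile(
    r"^\S+\s+(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"up\s+(?P<up>.+?),\s+\d+ users?,"
)
# Uptime is either "H:MM" or "N min", optionally preceded by "N day(s), ".
UPTIME_RE = re.compile(
    r"^(?:(?P<days>\d+) days?,\s*)?(?:(?P<h>\d+):(?P<m>\d+)|(?P<min>\d+) min)$"
)

def parse_line(line):
    """Return (wall_clock, uptime) for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # e.g. the "REBOOT" marker lines
    up = UPTIME_RE.match(match.group("up").strip())
    if up is None:
        return None
    days = int(up.group("days") or 0)
    if up.group("min") is not None:
        uptime = timedelta(days=days, minutes=int(up.group("min")))
    else:
        uptime = timedelta(days=days, hours=int(up.group("h")),
                           minutes=int(up.group("m")))
    wall = datetime.strptime(match.group("stamp"), "%Y-%m-%d %H:%M:%S")
    return wall, uptime

def excess_seconds(earlier, later):
    """Seconds by which uptime grew more than the wall clock between two lines."""
    wall_a, up_a = parse_line(earlier)
    wall_b, up_b = parse_line(later)
    return ((up_b - up_a) - (wall_b - wall_a)).total_seconds()

def drift_seconds_per_day(ppm):
    """Seconds gained or lost per day by a clock with a constant error in ppm."""
    return ppm / 1_000_000 * 24 * 60 * 60

# Two consecutive samples copied from the 2.6.32-trunk-powerpc log above.
a = "2.6.32-trunk-powerpc 2010-03-15 06:00:01 up 20:00, 5 users, load average: 0.04, 0.37, 0.45"
b = "2.6.32-trunk-powerpc 2010-03-16 06:00:01 up 1 day, 20:09, 3 users, load average: 0.71, 0.74, 0.53"
print(excess_seconds(a, b))          # 540.0, i.e. about 9 minutes in one day
print(drift_seconds_per_day(2300))   # roughly 198.72 s/day for a constant 2300ppm
```

Run over consecutive pairs, the excess varies widely from day to day in these logs, which fits the description of erratic jumps better than a constant drift; again, that reading depends on the assumption stated above.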