Bug 6748 - Clock drifts by 30% for SMP kernel w/APIC
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386
Hardware: i386 Linux
Importance: P2 high
Assignee: platform_i386
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-25 10:55 UTC by Joachim Frieben
Modified: 2007-06-12 09:38 UTC

See Also:
Kernel Version: 2.6.17-git5
Subsystem:
Regression: ---
Bisected commit-id:


Description Joachim Frieben 2006-06-25 10:55:20 UTC
Most recent kernel where this bug did not occur: n/a
Distribution: Fedora Core Development
Hardware Environment: Dual Pentium II Overdrive on PR440FX 
Software Environment: Fedora Core Development w/SMP enabled kernel
Problem Description: Back in October 2001 (!), I posted a bug report at Red Hat
bugzilla,

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=55223 ,

for kernel release 2.4.9 according to which switching on "APIC" made the system
clock run faster. I do not know of any earlier release that might have worked
better, because it was then that I set up the system. For recent kernels, the
drift is about 30% when the system is idle, thus w/o any network traffic. When
data was transferred over the network, the clock ran forward at up to multiples
of the nominal speed, although for recent kernels this has become less severe.

The issue only occurs for the "SMP" kernel when "APIC" is enabled. Booting the
"UP" kernel or adding "noapic" to the kernel options makes the clock behave
correctly. Consequently, this bug has nothing to do with recent bug reports
about observing clock drift on (single-core) AMD64 systems!

This bug has been roaming for almost 5 years without ever being looked at. I
think it is definitely worth being considered for the hopefully soon to come
kernel maintenance release which has been suggested by A. Morton.

Steps to reproduce:
Install 2.6.17-git5 on PR440FX based system and reboot without adding "noapic"
to the kernel options.
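
The drift figure quoted above can be checked by comparing the time elapsed on the system clock against an independent reference over the same interval. The following is a minimal sketch, assuming the reference readings come from some trusted source (an NTP server, a second machine, a wristwatch). The function name and the sample readings are illustrative and do not come from the report.

```python
def drift_percent(sys_start, sys_end, ref_start, ref_end):
    """Return how much faster (+) or slower (-) the system clock ran, in percent."""
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    sys_elapsed = sys_end - sys_start
    return (sys_elapsed - ref_elapsed) / ref_elapsed * 100.0


# Hypothetical readings in seconds: 600 s pass on the reference clock
# while the system clock advances by 780 s.
print(round(drift_percent(0.0, 780.0, 0.0, 600.0), 1))
```

A longer measurement interval makes the result less sensitive to the error in reading either clock.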
Comment 1 Andrew Morton 2006-06-25 11:02:37 UTC
On Sun, 25 Jun 2006 10:56:32 -0700
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=6748
> 
>            Summary: Clock drifts by 30% for SMP kernel w/APIC
>     Kernel Version: 2.6.17-git5
>             Status: NEW
>           Severity: high
>              Owner: platform_i386@kernel-bugs.osdl.org
>          Submitter: jfrieben@hotmail.com

Bizarre.  What could cause such large errors?

Comment 2 Joachim Frieben 2006-06-25 11:18:20 UTC
Please note that in my original report, I had mentioned the following bug report:

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=24680

which confirms my observation. I can thus rule out a hardware issue. Btw, it
seems that downgrading to a 2.2.x "SMP" kernel solves the issue ;)
Comment 3 Roman Zippel 2006-06-26 09:50:11 UTC
Hi,

On Sun, 25 Jun 2006, Andrew Morton wrote:

> Bizarre.  What could cause such large errors?

I have to agree with Alan, this basically can only happen if something is 
triggering spurious timer interrupts. This needs someone with more 
knowledge about APIC interrupt routing.
BTW the new clock code might make the problem "disappear", as the time is not 
always directly incremented with every timer interrupt anymore.

bye, Roman

Comment 4 Joachim Frieben 2006-07-04 04:28:11 UTC
Still broken in kernel version "2.6.17-git15".
Comment 5 john stultz 2006-07-10 11:40:48 UTC
Joachim: As Roman said, the new timekeeping code might resolve this issue
(spurious interrupts won't cause time to increase). Do you still see this
problem w/ 2.6.18-rc1+?
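
The difference between the two accounting schemes can be illustrated with a toy model. This is only a sketch of the idea, not kernel code: the tick rate, the counter frequency, and the 30% spurious-interrupt rate are all assumed values chosen to mirror the reported drift.

```python
HZ = 250  # assumed tick rate; the real value depends on the kernel config


def tick_based_seconds(interrupts_seen):
    # Tick accounting: every timer interrupt advances time by 1/HZ,
    # so spurious interrupts are counted as real elapsed time.
    return interrupts_seen / HZ


def counter_based_seconds(counter_start, counter_end, counter_hz):
    # Counter accounting: elapsed time is derived from a free-running
    # hardware counter, regardless of how many interrupts fired.
    return (counter_end - counter_start) / counter_hz


real_seconds = 10
genuine = real_seconds * HZ        # interrupts that should have fired
spurious = genuine * 30 // 100     # hypothetical 30% extra interrupts
counter_hz = 1_000_000             # hypothetical 1 MHz counter

print(tick_based_seconds(genuine + spurious))                           # 13.0
print(counter_based_seconds(0, real_seconds * counter_hz, counter_hz))  # 10.0
```

In this model the tick-based clock gains 3 seconds over 10 real seconds, while the counter-based clock is unaffected by the extra interrupts.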
Comment 6 Joachim Frieben 2006-07-15 05:26:52 UTC
No luck with kernel "2.6.18-rc1-git7" :(
Comment 7 john stultz 2006-09-21 13:00:57 UTC
Joachim, Thanks for your tenacity on this issue. :)
One possibility would be to add a dmi blacklist entry for your system, so the
apic will always be disabled. Want to add dmidecode output to this bug and we'll
see what we can do?
Comment 8 Adrian Bunk 2007-02-06 14:35:03 UTC
Please reopen this bug if:
- it is still present with kernel 2.6.20 and
- you can provide the requested information.
Comment 9 Joachim Frieben 2007-02-07 00:16:00 UTC
In reply to comment #8:
Hi Adrian, which is the "requested information" that I have allegedly failed to
produce? It is obvious to me that "blacklisting" the "PR440FX" according to
comment #7 is not a serious proposition. I am not aware of anything missing ..
Comment 10 Dan Carpenter 2007-02-07 00:36:35 UTC
Blacklist in comment #7 just means write a check to automatically disable the
apic so the users don't have to screw with grub.  You're doing it anyway, so we
might as well make it automatic until someone finds a better fix.

Could you type "dmidecode" and attach the output?
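
The kind of check described above can be sketched as a table lookup against the identification strings that dmidecode reports. This is a Python illustration of the matching logic only, not the kernel's implementation. The field names, the vendor and board strings, and the function name are hypothetical placeholders, and substring matching is an assumption of this sketch.

```python
# Hypothetical blacklist: each entry lists DMI fields that must all match.
# The strings are placeholders, not real values for any board.
NOAPIC_BLACKLIST = [
    {"board_vendor": "ExampleVendor", "board_name": "EXAMPLE-BOARD"},
]


def should_disable_apic(dmi, blacklist=NOAPIC_BLACKLIST):
    """Return True if the DMI strings of this machine match a blacklist entry."""
    for entry in blacklist:
        # Every listed field must appear as a substring of the reported value.
        if all(value in dmi.get(field, "") for field, value in entry.items()):
            return True
    return False


# Hypothetical dmidecode-style data for the machine being booted.
machine = {"board_vendor": "ExampleVendor Corp.", "board_name": "EXAMPLE-BOARD rev 2"}
print(should_disable_apic(machine))
```

Keying the entry on several fields keeps the match narrow, so unrelated boards from the same vendor are not affected.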

Comment 11 Joachim Frieben 2007-02-07 01:22:25 UTC
In reply to comment #10:
Hardwiring "noapic" for this particular piece of hardware implies that it would
not even be possible to test future kernels with respect to the current bug.
Comment 12 Thomas Gleixner 2007-02-07 07:18:30 UTC
Joachim,

can you boot a recent -mm kernel on it? Please disable CONFIG_NOHZ and
CONFIG_HIGH_RES_TIMERS. Add apic=verbose to the kernel commandline and report
the APIC timer init output.

Thanks

 tglx
Comment 13 Joachim Frieben 2007-06-12 09:38:27 UTC
After upgrading from "Fedora Core 6" to "Fedora 7", all reported problems have disappeared. The current kernel version is 2.6.21.2. Thanks!
