Bug 5545 - Timer drift too high on >=2.6.10
Summary: Timer drift too high on >=2.6.10
Status: CLOSED INVALID
Alias: None
Product: Timers
Classification: Unclassified
Component: Realtime Clock
Hardware: i386 Linux
Importance: P2 normal
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-03 14:29 UTC by Jean-Chrstian de Rivaz
Modified: 2005-11-04 11:47 UTC

See Also:
Kernel Version: 2.6.14
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
linux 2.6.9 kernel log (12.65 KB, text/plain)
2005-11-03 14:31 UTC, Jean-Chrstian de Rivaz
2.6.9 kernel log (28.11 KB, text/plain)
2005-11-03 14:35 UTC, Jean-Chrstian de Rivaz
2.6.10 kernel log (30.78 KB, text/plain)
2005-11-03 14:36 UTC, Jean-Chrstian de Rivaz

Description Jean-Chrstian de Rivaz 2005-11-03 14:29:31 UTC
Most recent kernel where this bug did not occur: 2.6.9
Distribution: Debian Sarge
Hardware Environment: MSI K7N2 Delta + Athlon XP 3200+ 2.0GHz (downclocked)
Software Environment: 
Problem Description:

ntpd fails to work starting with 2.6.10 on this hardware. Also, drift-test.py
from John Stultz shows a very high drift starting with 2.6.10.

I have tested 7 different vanilla kernels on the same suspect hardware:

               2.6.8  : ntpd working : drift from    -77ppm to   -144ppm
               2.6.9  : ntpd working : drift from    -99ppm to   -231ppm
               2.6.10 : ntpd failed  : drift from -37825ppm to -29912ppm
               2.6.12 : ntpd failed  : drift from -43429ppm to -45251ppm
CONFIG_HZ=100  2.6.14 : ntpd failed  : drift from  -7598ppm to  -4410ppm
CONFIG_HZ=250  2.6.14 : ntpd failed  : drift from -13519ppm to -12538ppm
CONFIG_HZ=1000 2.6.14 : ntpd failed  : drift from -14497ppm to -19543ppm 
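
To put the numbers above in perspective, here is a minimal sketch that converts a drift in ppm into seconds gained or lost per day and checks it against a 500 ppm limit. The limit is an assumption here (it is the frequency tolerance ntpd is commonly documented to correct); the function names are made up for illustration, and the values are copied from the table.

```python
SECONDS_PER_DAY = 86400
# Assumption: ntpd can discipline a clock only within about +/-500 ppm.
NTP_MAX_FREQ_PPM = 500


def ppm_to_seconds_per_day(ppm):
    """Convert a frequency error in parts per million to seconds per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def within_ntpd_range(ppm, limit_ppm=NTP_MAX_FREQ_PPM):
    """Return True if the drift magnitude is within the assumed limit."""
    return abs(ppm) <= limit_ppm


# (first, second) drift readings in ppm, taken from the table above.
measurements = {
    "2.6.8": (-77, -144),
    "2.6.9": (-99, -231),
    "2.6.10": (-37825, -29912),
    "2.6.12": (-43429, -45251),
    "2.6.14 HZ=100": (-7598, -4410),
    "2.6.14 HZ=250": (-13519, -12538),
    "2.6.14 HZ=1000": (-14497, -19543),
}

for kernel, readings in measurements.items():
    # Use the reading with the largest magnitude as the worst case.
    worst = max(readings, key=abs)
    status = "ok" if within_ntpd_range(worst) else "out of range"
    print(f"{kernel:15s} {worst:7d} ppm "
          f"{ppm_to_seconds_per_day(worst):10.2f} s/day  {status}")
```

Under that assumption, only the 2.6.8 and 2.6.9 readings fall inside the range ntpd can correct, which matches the "working" and "failed" columns.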

Steps to reproduce:

Run 'ntpdate <server>', then start ntpd. With ntpq, use the 'pe', 'assID' and
'rv <id>' commands and wait at least 5 polls; ntpd should then show the
'sys.peer' condition if it works, or 'rejected' if it fails.
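
The check in the last step could be scripted roughly as follows. This is only a sketch: it assumes the text comes from the ntpq 'pe' (peers) billboard, where a leading '*' on a peer line marks the system peer. The function name and both sample strings are made up for illustration and were not captured from the affected machine.

```python
def has_sys_peer(peers_output):
    """Return True if any peer line carries the '*' (sys.peer) tally mark."""
    for line in peers_output.splitlines():
        stripped = line.strip()
        # Skip blank lines, the column-title row, and the ==== rule.
        if not stripped or stripped.startswith("remote"):
            continue
        if set(stripped) == {"="}:
            continue
        if line[0] == "*":
            return True
    return False


# Hypothetical sample text in the general shape of the peers billboard.
SAMPLE_REJECTED = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp1.example    .GPS.            1 u   33   64  377   12.345  -250.1   80.2
"""

# Same made-up peer, now marked as the selected system peer.
SAMPLE_SYNCED = SAMPLE_REJECTED.replace(" ntp1.example", "*ntp1.example")

print("rejected sample:", has_sys_peer(SAMPLE_REJECTED))
print("synced sample:  ", has_sys_peer(SAMPLE_SYNCED))
```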

See the 'NTP broken with 2.6.14' thread on the LKML.
Comment 1 Jean-Chrstian de Rivaz 2005-11-03 14:31:34 UTC
Created attachment 6462
linux 2.6.9 kernel log
Comment 2 Jean-Chrstian de Rivaz 2005-11-03 14:35:56 UTC
Created attachment 6463
2.6.9 kernel log
Comment 3 Jean-Chrstian de Rivaz 2005-11-03 14:36:33 UTC
Created attachment 6464
2.6.10 kernel log
Comment 4 john stultz 2005-11-03 15:02:17 UTC
Looking at the dmesg differences, I suspect this is related to changes in the
ACPI layer.

Here are some snippets of the diff.
2.6.9 vs 2.6.10
-talla kernel: ACPI: BIOS IRQ0 pin2 override ignored.
 talla kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
 talla kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
 talla kernel: ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
+talla kernel: ACPI: IRQ0 used by override.
+talla kernel: ACPI: IRQ2 used by override.
 talla kernel: ACPI: IRQ9 used by override.
 talla kernel: ACPI: IRQ14 used by override.
 talla kernel: ACPI: IRQ15 used by override.
 talla kernel: Enabling APIC mode:  Flat.  Using 1 I/O APICs
 talla kernel: Using ACPI (MADT) for SMP configuration information
 talla kernel: Built 1 zonelists
-talla kernel: mapped APIC to ffffd000 (fee00000)
-talla kernel: mapped IOAPIC to ffffc000 (fec00000)
 talla kernel: ENABLING IO-APIC IRQs
-talla kernel: ..TIMER: vector=0x31 pin1=0 pin2=-1
+talla kernel: ..TIMER: vector=0x31 pin1=2 pin2=-1
+talla kernel: ..MP-BIOS bug: 8254 timer not connected to IO-APIC
+talla kernel: ...trying to set up timer (IRQ0) through the 8259A ...  failed.
+talla kernel: ...trying to set up timer as Virtual Wire IRQ... failed.
+talla kernel: ...trying to set up timer as ExtINT IRQ... works.
 talla kernel: NET: Registered protocol family 16
 talla kernel: PCI: PCI BIOS revision 2.10 entry at 0xfbbb0, last bus=3
 talla kernel: PCI: Using configuration type 1
 talla kernel: mtrr: v2.0 (20020519)
-talla kernel: ACPI: Subsystem revision 20041105
+talla kernel: ACPI: Subsystem revision 20040816
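
A diff like the one above can be generated with explicit labels so that its direction is visible in the output itself, which helps avoid the mix-up corrected in comment #6. This is a minimal sketch using Python's difflib. The helper name is hypothetical, and the two short logs reuse lines from the diff above with the "talla kernel:" prefix stripped, assigned to kernels according to the correction in comment #6.

```python
import difflib


def diff_kernel_logs(old_lines, new_lines, old_label, new_label):
    """Return a unified diff as a list of lines, labelled with kernel names."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=old_label, tofile=new_label,
        lineterm="", n=1))


# Per comment #6, pin1=2 belongs to the 2.6.9 log and pin1=0 to 2.6.10.
log_2_6_9 = [
    "ENABLING IO-APIC IRQs",
    "..TIMER: vector=0x31 pin1=2 pin2=-1",
]
log_2_6_10 = [
    "ENABLING IO-APIC IRQs",
    "..TIMER: vector=0x31 pin1=0 pin2=-1",
]

for line in diff_kernel_logs(log_2_6_9, log_2_6_10, "2.6.9", "2.6.10"):
    print(line)
```

The "---" and "+++" header lines then name the kernels, so lines prefixed with "-" are known to come from 2.6.9 and lines prefixed with "+" from 2.6.10.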
Comment 6 john stultz 2005-11-03 20:03:51 UTC
Ack. The diff above in comment #4 is backward. The -'s are 2.6.10 and the +'s
are 2.6.9. I mistakenly saved the files with the wrong names.

This aligns with Len Brown's note on lkml:
"NFORCE2 on an ACPI-enabled kernel should automatically invoke
the acpi_skip_timer_override BIOS workaround -- as
the NFORCE family of chip-sets have the timer interrupt
attached to pin-0, but some of them shipped with
a bogus BIOS over-ride telling Linux the timer is on pin-2."

Still not sure why the problem crops up after this fix has been included.

The reporter is having pretty severe BIOS issues, so until they are resolved, I'm
thinking we should mark this as INVALID.


Jean-Chrstian: Please reopen this if the problem persists after you get your
BIOS issues sorted. Thanks again for the great testing and feedback on lkml!
Comment 7 john stultz 2005-11-04 11:47:26 UTC
Just as a followup, Jean-Christian resolved his BIOS confusion and updated to
the current BIOS. Now this issue does not appear. I'm marking this as closed.
