Bug 6296 - Time stops on IBM Netvista 8317
Status: CLOSED CODE_FIX
Alias: None
Product: Timers
Classification: Unclassified
Component: gettimeofday
Hardware: i386 Linux
Importance: P2 normal
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-28 05:28 UTC by Andy Duplain
Modified: 2010-01-05 00:57 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.16
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kernel logging (27.36 KB, text/plain)
2006-03-29 01:01 UTC, Andy Duplain
Kernel log with "noapic" boot option (25.90 KB, text/plain)
2006-03-29 12:06 UTC, Andy Duplain
Patch to fix problem in 2.6.16 and 2.6.16.1 (847 bytes, patch)
2006-03-31 01:45 UTC, Andy Duplain
x86 ioapic timer ACK fix (3.22 KB, patch)
2007-11-18 07:37 UTC, Ingo Molnar

Description Andy Duplain 2006-03-28 05:28:36 UTC
Most recent kernel where this bug did not occur: 2.6.15 (with correcting 
patches)
Distribution: Debian Etch
Hardware Environment: IBM Netvista 8317
Software Environment: Linux 2.6.16
Problem Description: Time slows down and stops when using TSC timesource.

Steps to reproduce:
Boot vanilla kernel 2.6.16
Wait 12-24 hours.
Note time difference from actual time.
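
The last step can be made quantitative by reading the system clock and a trusted reference clock twice, some hours apart. The following is only a minimal sketch, not part of the original report: the function name and its four parameters are hypothetical, and all values are assumed to be timestamps in seconds.

```python
def drift_seconds_per_hour(local_start, ref_start, local_end, ref_end):
    """Return seconds gained (+) or lost (-) by the local clock per reference hour.

    All four arguments are timestamps in seconds (hypothetical inputs):
    two readings of the system clock and two readings of a trusted
    reference clock taken at the same moments.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    local_elapsed = local_end - local_start
    # Difference in elapsed time, scaled to one hour of reference time.
    return (local_elapsed - ref_elapsed) * 3600.0 / ref_elapsed


if __name__ == "__main__":
    # Illustrative numbers only: the local clock advanced 6 hours
    # while the reference advanced 12 hours.
    print(drift_seconds_per_hour(0, 0, 6 * 3600, 12 * 3600))
```

A clock that has stopped entirely for the whole interval gives -3600 seconds per hour; a healthy clock gives a value close to zero.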
Comment 1 Andy Duplain 2006-03-28 05:38:36 UTC
This bug is the same as 2544, but the patch from Maciej W. Rozycki no longer 
works under 2.6.16.  I have the latest BIOS installed - dated July 2004.
Comment 2 Andy Duplain 2006-03-29 01:01:51 UTC
Created attachment 7699
Kernel logging

This is my kernel log file during boot-up.  No errors are reported, however.
Comment 3 john stultz 2006-03-29 10:45:48 UTC
Hmm. This is a uniprocessor system without HT? That is different than other
similar reports. Does booting w/ noapic avoid the issue?
Comment 4 Andy Duplain 2006-03-29 11:48:16 UTC
I will try it and let you know John.
Comment 5 Andy Duplain 2006-03-29 12:06:29 UTC
Created attachment 7711
Kernel log with "noapic" boot option
Comment 6 Andy Duplain 2006-03-30 06:01:52 UTC
The system has been running for 15 hours now.  No errors are reported and the
time is keeping well (NTP corrects by 1.0 - 1.5 sec/hr; normally it's
0.15 sec/hr, so no big change there).  The system is very sluggish though, and
atsar shows absolutely zero CPU usage over that period, even though there are
several CPU-heavy processes running.  Therefore the system seems better with
the "noapic" kernel boot flag than without.
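
For comparison with the frequency errors that NTP tools usually report, the correction rates quoted above can be converted to parts per million. This is a small arithmetic sketch only; the helper name is hypothetical and not taken from any tool mentioned in this report.

```python
def sec_per_hour_to_ppm(seconds_per_hour):
    """Convert a clock correction rate in seconds per hour to parts per million."""
    # One hour is 3600 seconds, so the fractional error is rate / 3600.
    return seconds_per_hour / 3600.0 * 1_000_000


if __name__ == "__main__":
    # Rates quoted in this comment: 0.15 sec/hr normally, 1.0 - 1.5 sec/hr with noapic.
    for rate in (0.15, 1.0, 1.5):
        print(f"{rate} sec/hr is about {sec_per_hour_to_ppm(rate):.1f} ppm")
```

By this conversion the usual 0.15 sec/hr is roughly 42 ppm, and 1.0 - 1.5 sec/hr is roughly 278 - 417 ppm.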
Comment 7 Andy Duplain 2006-03-31 01:45:13 UTC
Created attachment 7729
Patch to fix problem in 2.6.16 and 2.6.16.1

This is a modified version of the original patch from Maciej W. Rozycki for
2.6.16 (and .1).  My system has been running without problem for 14 hours now
with it applied.
Comment 8 Andy Crook 2006-10-02 00:39:05 UTC
The same problem on NetVista 8309 (latest BIOS) with 2.6.17.*
Patch fixed the problem for me.
Comment 9 john stultz 2006-10-18 13:15:16 UTC
I suspect this issue still exists, but I'm curious if the behavior has changed
w/ 2.6.18 and greater?
Comment 10 Andy Crook 2007-01-11 01:31:53 UTC
With unpatched 2.6.18 and 2.6.19, time still stopped occasionally.
Hopefully the patch still helps fix the problem on my NetVista.
Comment 11 Natalie Protasevich 2007-07-07 12:55:04 UTC
Have you tried running with later kernels? Can you please confirm that the problem has been resolved?
Thanks.
Comment 12 Andy Crook 2007-07-17 01:06:46 UTC
Seems like 2.6.21.6 runs OK on my NetVista for 2 days now.

I've replaced my NetVista 8309 with an 8319 recently, but still had to apply the patch to the 2.6.20 kernel.

So I'll wait 2-3 days more and let you know if everything is good now.
Comment 13 Ingo Molnar 2007-11-18 07:35:51 UTC
This patch is still not upstream - but we are now defaulting to NMI watchdog disabled, which might hide the bug.

I've ported the patch to arch/x86 and we've added it to the x86 queue of patches.
Comment 14 Ingo Molnar 2007-11-18 07:37:08 UTC
Created attachment 13597
x86 ioapic timer ACK fix

Attached the 2.6.24-rc3 ported patch.
