Created attachment 23447 [details] lshw + kernel config file Hello, With 2.6.31.4, dmesg : [ 92.387360] CE: hpet increasing min_delta_ns to 15000 nsec [ 283.944283] CE: hpet increasing min_delta_ns to 22500 nsec [ 647.384454] CE: hpet increasing min_delta_ns to 33750 nsec [ 767.384264] CE: hpet increasing min_delta_ns to 50624 nsec [ 908.384290] CE: hpet increasing min_delta_ns to 75936 nsec [ 956.384337] CE: hpet increasing min_delta_ns to 113904 nsec [ 977.384378] CE: hpet increasing min_delta_ns to 170856 nsec [ 1004.384320] CE: hpet increasing min_delta_ns to 256284 nsec [ 1169.384432] CE: hpet increasing min_delta_ns to 384426 nsec [ 1265.387316] CE: hpet increasing min_delta_ns to 576638 nsec [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec with 2.6.31.3 and before messages don't go above 33750. My system is a laptop using x86_64 ubuntu karmic snapshot, custom kernel. lshw + config file attached. Regards, Thibault Mondary
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Sat, 17 Oct 2009 09:29:35 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=14426 > > Summary: CE: hpet increasing min_delta_ns flood > Product: Timers > Version: 2.5 > Kernel Version: 2.6.31.4 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Realtime Clock > AssignedTo: timers_realtime-clock@kernel-bugs.osdl.org > ReportedBy: thibm@free.fr > Regression: Yes > > > Created an attachment (id=23447) > --> (http://bugzilla.kernel.org/attachment.cgi?id=23447) > lshw + kernel config file > > Hello, > > With 2.6.31.4, dmesg : > > [ 92.387360] CE: hpet increasing min_delta_ns to 15000 nsec > [ 283.944283] CE: hpet increasing min_delta_ns to 22500 nsec > [ 647.384454] CE: hpet increasing min_delta_ns to 33750 nsec > [ 767.384264] CE: hpet increasing min_delta_ns to 50624 nsec > [ 908.384290] CE: hpet increasing min_delta_ns to 75936 nsec > [ 956.384337] CE: hpet increasing min_delta_ns to 113904 nsec > [ 977.384378] CE: hpet increasing min_delta_ns to 170856 nsec > [ 1004.384320] CE: hpet increasing min_delta_ns to 256284 nsec > [ 1169.384432] CE: hpet increasing min_delta_ns to 384426 nsec > [ 1265.387316] CE: hpet increasing min_delta_ns to 576638 nsec > [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec > > > with 2.6.31.3 and before messages don't go above 33750. > > > My system is a laptop using x86_64 ubuntu karmic snapshot, custom kernel. > lshw > + config file attached. > It's a regression (or at lesat a worrisome-looking change) within -stable, and is presumably also present in 2.6.32-rcX. ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.31.4 is pretty short. This has to be a prime suspect: commit 112a62ddb32952943517bd170f0eeb26a6f0738f Author: john stultz <johnstul@us.ibm.com> Date: Thu Oct 8 13:31:45 2009 -0700 PIT fixes to unbreak suspend/resume (bug #14222) And commit 94c517d6c84e819035f58a4748066d1929681d0e Author: Eero Nurkkala <ext-eero.nurkkala@nokia.com> Date: Wed Oct 7 11:54:26 2009 +0300 NOHZ: update idle state also when NOHZ is inactive
(arg, resend with reporter cc'ed) (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Sat, 17 Oct 2009 09:29:35 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=14426 > > Summary: CE: hpet increasing min_delta_ns flood > Product: Timers > Version: 2.5 > Kernel Version: 2.6.31.4 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Realtime Clock > AssignedTo: timers_realtime-clock@kernel-bugs.osdl.org > ReportedBy: thibm@free.fr > Regression: Yes > > > Created an attachment (id=23447) > --> (http://bugzilla.kernel.org/attachment.cgi?id=23447) > lshw + kernel config file > > Hello, > > With 2.6.31.4, dmesg : > > [ 92.387360] CE: hpet increasing min_delta_ns to 15000 nsec > [ 283.944283] CE: hpet increasing min_delta_ns to 22500 nsec > [ 647.384454] CE: hpet increasing min_delta_ns to 33750 nsec > [ 767.384264] CE: hpet increasing min_delta_ns to 50624 nsec > [ 908.384290] CE: hpet increasing min_delta_ns to 75936 nsec > [ 956.384337] CE: hpet increasing min_delta_ns to 113904 nsec > [ 977.384378] CE: hpet increasing min_delta_ns to 170856 nsec > [ 1004.384320] CE: hpet increasing min_delta_ns to 256284 nsec > [ 1169.384432] CE: hpet increasing min_delta_ns to 384426 nsec > [ 1265.387316] CE: hpet increasing min_delta_ns to 576638 nsec > [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec > > > with 2.6.31.3 and before messages don't go above 33750. > > > My system is a laptop using x86_64 ubuntu karmic snapshot, custom kernel. > lshw > + config file attached. > It's a regression (or at lesat a worrisome-looking change) within -stable, and is presumably also present in 2.6.32-rcX. ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.31.4 is pretty short. This has to be a prime suspect: commit 112a62ddb32952943517bd170f0eeb26a6f0738f Author: john stultz <johnstul@us.ibm.com> Date: Thu Oct 8 13:31:45 2009 -0700 PIT fixes to unbreak suspend/resume (bug #14222) And commit 94c517d6c84e819035f58a4748066d1929681d0e Author: Eero Nurkkala <ext-eero.nurkkala@nokia.com> Date: Wed Oct 7 11:54:26 2009 +0300 NOHZ: update idle state also when NOHZ is inactive
Created attachment 23642 [details] Debug patch Mind trying this patch out to see why clockevents_program_event is failing? If you could boot with it and then attach the full dmesg it would be great. thanks -john
Created attachment 23681 [details] Dmesg using debug patch
Created attachment 23683 [details] Another debug patch Thanks for the testing Thibault! Looking at the output, the hpet's set_next_event() function is failing, which I can't see as happening due to the commits listed above (thought I could be wrong). Could you run again with this debug patch, which should give us some more insight about what's actually happening with the hpet->set_next_event() code? thanks -john
I cannot boot using your lastest patch. Early in the boot process (before mounting root fs) I get an infinite loop of "hpet_next_event: now: %ld then: %ld delta: %ld" with increasing values ; I have to do an hard reset to shutdown the computer. It might be as cnt2 is always > cnt Thibault
Thibault, Did you happen to capture any of the output? I'd be interested in seeing how drastically it was off by. Don't have time today, but if you can't capture any output, I'll try to rework the debug patch so its rate limited and won't kill bootup. thanks -john
Created attachment 23752 [details] rate limited debug patch Here's the rate limited debug patch. I'd be interested to see what the values for cnt, cnt2 and diff are. thanks -john
Created attachment 23759 [details] Screenshot of boot sequence With your lastest debug patch (removing all previously applied patches) I cannot boot as I get flooded of "CE : increasing min_delta_ns to xxxxxxxxx" with xxxxxxxx a changing big number I haven't time to read. I modified kernel/time/tick-oneshot.c to suppress this message and reboot. It fails with a kernel segfault but I can see your debug messages, I took a "screenshot". Thibault
Created attachment 23761 [details] yet another debug patch Sigh. Forgive me, I got caught by a sign bug, and so the previous debug patch was outputing everything. I've reworked some of the code so its more clear whats going on, and have actually tested this in a kvm session. I'd be interested to see how it acts on your box. Sorry again for all the failed attempts so far. -john
Created attachment 23769 [details] Dmesg using lastest debug patch Now it works, so I attached full dmesg. Thibault
Hey Thibault, sorry for the slow response here. Looking at the log, I'm not seeing any sign issues or overflows. It seems instead that you're occasionally seeing ~23-30k cycle gaps. Not sure right off what might be causing it, but it could be hardware SMIs, or long running irq code preempting you. Running out of ideas, so a few things you might check: 1) Does 2.6.32-rc8 show this behavior? 2) Does running w/ nohz=off change the behavior? Another item I noticed: [ 4.325227] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 190.42 Tue Oct 20 20:25:42 PDT 2009 Can you reproduce this behavior without the nvidia binary module?
Hi John, I haven't yet tested 1 and 3 but running with nohz=off solve the problem, no more min_delta_ns messages. Thibault (In reply to comment #12) > Hey Thibault, sorry for the slow response here. > > Looking at the log, I'm not seeing any sign issues or overflows. It seems > instead that you're occasionally seeing ~23-30k cycle gaps. Not sure right > off > what might be causing it, but it could be hardware SMIs, or long running irq > code preempting you. > > Running out of ideas, so a few things you might check: > > 1) Does 2.6.32-rc8 show this behavior? > 2) Does running w/ nohz=off change the behavior? > > Another item I noticed: > [ 4.325227] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 190.42 Tue > Oct > 20 20:25:42 PDT 2009 > > Can you reproduce this behavior without the nvidia binary module?
On Mon, Nov 02, 2009 at 10:31:28PM -0800, Andrew Morton wrote: > > (arg, resend with reporter cc'ed) > > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > On Sat, 17 Oct 2009 09:29:35 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=14426 > > > > Summary: CE: hpet increasing min_delta_ns flood > > Product: Timers > > Version: 2.5 > > Kernel Version: 2.6.31.4 > > Platform: All > > OS/Version: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Realtime Clock > > AssignedTo: timers_realtime-clock@kernel-bugs.osdl.org > > ReportedBy: thibm@free.fr > > Regression: Yes > > > > > > Created an attachment (id=23447) > > --> (http://bugzilla.kernel.org/attachment.cgi?id=23447) > > lshw + kernel config file > > > > Hello, > > > > With 2.6.31.4, dmesg : > > > > [ 92.387360] CE: hpet increasing min_delta_ns to 15000 nsec > > [ 283.944283] CE: hpet increasing min_delta_ns to 22500 nsec > > [ 647.384454] CE: hpet increasing min_delta_ns to 33750 nsec > > [ 767.384264] CE: hpet increasing min_delta_ns to 50624 nsec > > [ 908.384290] CE: hpet increasing min_delta_ns to 75936 nsec > > [ 956.384337] CE: hpet increasing min_delta_ns to 113904 nsec > > [ 977.384378] CE: hpet increasing min_delta_ns to 170856 nsec > > [ 1004.384320] CE: hpet increasing min_delta_ns to 256284 nsec > > [ 1169.384432] CE: hpet increasing min_delta_ns to 384426 nsec > > [ 1265.387316] CE: hpet increasing min_delta_ns to 576638 nsec > > [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec > > > > > > with 2.6.31.3 and before messages don't go above 33750. > > > > > > My system is a laptop using x86_64 ubuntu karmic snapshot, custom kernel. > lshw > > + config file attached. > > > > It's a regression (or at lesat a worrisome-looking change) within > -stable, and is presumably also present in 2.6.32-rcX. > > ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.31.4 is pretty > short. > > This has to be a prime suspect: > > commit 112a62ddb32952943517bd170f0eeb26a6f0738f > Author: john stultz <johnstul@us.ibm.com> > Date: Thu Oct 8 13:31:45 2009 -0700 > > PIT fixes to unbreak suspend/resume (bug #14222) > > And > > commit 94c517d6c84e819035f58a4748066d1929681d0e > Author: Eero Nurkkala <ext-eero.nurkkala@nokia.com> > Date: Wed Oct 7 11:54:26 2009 +0300 > > NOHZ: update idle state also when NOHZ is inactive John, Eero, any ideas? thanks, greg k-h
Hi, This bug seems to be nvidia-related. I tested using another version of nvidia drivers (185.18.36), and the bug disappears (I got [ 1914.080316] CE: hpet increasing min_delta_ns to 33750 nsec instead of [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec ) It might be a 190 and 192 series related bug, perhaps due to "* Fixed a power management regression that prevented some notebooks and systems with integrated GeForce 8 and 9 series GPUs from suspending." in nvidia changelog. Thibault
Oops the previous comment was probably too ambitious... I just got : [ 2368.632441] CE: hpet increasing min_delta_ns to 170856 nsec [ 2485.632331] CE: hpet increasing min_delta_ns to 256284 nsec [ 2524.632348] CE: hpet increasing min_delta_ns to 384426 nsec [ 2551.632308] CE: hpet increasing min_delta_ns to 576638 nsec so the problem isn't fixed but it appears slowly than before when I use 185.18.36 nvidia driver. Thibault
Thibault, How about without the nvidia driver all together?
On Sat, 2009-12-05 at 11:50 -0800, Greg KH wrote: > On Mon, Nov 02, 2009 at 10:31:28PM -0800, Andrew Morton wrote: > > > > (arg, resend with reporter cc'ed) > > > > (switched to email. Please respond via emailed reply-to-all, not via the > > bugzilla web interface). > > > > On Sat, 17 Oct 2009 09:29:35 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > > > > > http://bugzilla.kernel.org/show_bug.cgi?id=14426 > > > > > > Summary: CE: hpet increasing min_delta_ns flood > > > Product: Timers > > > Version: 2.5 > > > Kernel Version: 2.6.31.4 > > > Platform: All > > > OS/Version: Linux > > > Tree: Mainline > > > Status: NEW > > > Severity: normal > > > Priority: P1 > > > Component: Realtime Clock > > > AssignedTo: timers_realtime-clock@kernel-bugs.osdl.org > > > ReportedBy: thibm@free.fr > > > Regression: Yes > > > > > > > > > Created an attachment (id=23447) > > > --> (http://bugzilla.kernel.org/attachment.cgi?id=23447) > > > lshw + kernel config file > > > > > > Hello, > > > > > > With 2.6.31.4, dmesg : > > > > > > [ 92.387360] CE: hpet increasing min_delta_ns to 15000 nsec > > > [ 283.944283] CE: hpet increasing min_delta_ns to 22500 nsec > > > [ 647.384454] CE: hpet increasing min_delta_ns to 33750 nsec > > > [ 767.384264] CE: hpet increasing min_delta_ns to 50624 nsec > > > [ 908.384290] CE: hpet increasing min_delta_ns to 75936 nsec > > > [ 956.384337] CE: hpet increasing min_delta_ns to 113904 nsec > > > [ 977.384378] CE: hpet increasing min_delta_ns to 170856 nsec > > > [ 1004.384320] CE: hpet increasing min_delta_ns to 256284 nsec > > > [ 1169.384432] CE: hpet increasing min_delta_ns to 384426 nsec > > > [ 1265.387316] CE: hpet increasing min_delta_ns to 576638 nsec > > > [ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec > > > > > > > > > with 2.6.31.3 and before messages don't go above 33750. > > > > > > > > > My system is a laptop using x86_64 ubuntu karmic snapshot, custom kernel. > lshw > > > + config file attached. > > > > > > > It's a regression (or at lesat a worrisome-looking change) within > > -stable, and is presumably also present in 2.6.32-rcX. > > > > ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.31.4 is pretty > short. > > > > This has to be a prime suspect: > > > > commit 112a62ddb32952943517bd170f0eeb26a6f0738f > > Author: john stultz <johnstul@us.ibm.com> > > Date: Thu Oct 8 13:31:45 2009 -0700 > > > > PIT fixes to unbreak suspend/resume (bug #14222) > > > > And > > > > commit 94c517d6c84e819035f58a4748066d1929681d0e > > Author: Eero Nurkkala <ext-eero.nurkkala@nokia.com> > > Date: Wed Oct 7 11:54:26 2009 +0300 > > > > NOHZ: update idle state also when NOHZ is inactive > > > John, Eero, any ideas? This may be nvidia binary driver related. -john
(In reply to comment #17) > Thibault, How about without the nvidia driver all together? This bug isn't nvidia related. I have tried without nvidia driver and module, nothing has changed. But I think it's cpufreq + nohz related. I have tried to boot normally (ie without nohz option), without nvidia module, but setting cpufreq governor to performance instead of ondemand, and messages go away. Thibault
Hi, I have similar issue. On both 2.6.31 and 2.6.32. # dmesg | grep min_delta_ns [ 432.445091] CE: hpet increasing min_delta_ns to 15000 nsec [ 702.977006] CE: hpet increasing min_delta_ns to 22500 nsec [ 702.977006] CE: hpet increasing min_delta_ns to 33750 nsec [53371.336086] CE: hpet increasing min_delta_ns to 50624 nsec I have HP EliteBook 8530p (with ATI Radeon HD 3650) running Gentoo with custom kernel config. I'm going to try your debug patch and will return with results.
Created attachment 24766 [details] dmesg output w/ hpet_next_event() warning Hello. I was forced to upgrade my system from a C2D (X48 chipset) to a i7 860 (P55) and ever since started seeing "CE: hpet increasing min_delta_ns to 15000 nsec". Same kernel, same software, just different dimensio... um... hardware. :-) I have attached my dmesg output which contains an interestng warning about hpet_next_event() which might be related? Is there anything I can do to help diagnose this problem? So long, matthew
Hellow. I had the same problems. I blamed nvidia in this problem. But I have found the solution: acpi_irq_nobalance kernel parameter. May be it will help somebody also!
Hi, This problem seems to be fixed on my system since 2.6.32.7, perhaps because of : - nohz: Prevent clocksource wrapping during idle or : - x86: Reenable TSC sync check at boot, even with NONSTOP_TSC I get only few min_delta_ns messages. Thibault
Hi, I also get these (2.6.32.6) [ 406.748873] CE: hpet increasing min_delta_ns to 15000 nsec [ 2782.241638] CE: hpet increasing min_delta_ns to 22500 nsec [ 5214.726746] CE: hpet increasing min_delta_ns to 33750 nsec [39907.113901] CE: hpet2 increasing min_delta_ns to 15000 nsec [291373.877748] CE: hpet3 increasing min_delta_ns to 15000 nsec [465800.113054] CE: hpet increasing min_delta_ns to 50624 nsec Will also move to 2.6.32.7 and see how it works. Thanks for the new report. Justin.
With 2.6.32.7: [ 406.748873] CE: hpet increasing min_delta_ns to 15000 nsec [ 2782.241638] CE: hpet increasing min_delta_ns to 22500 nsec [ 5214.726746] CE: hpet increasing min_delta_ns to 33750 nsec [39907.113901] CE: hpet2 increasing min_delta_ns to 15000 nsec [291373.877748] CE: hpet3 increasing min_delta_ns to 15000 nsec [465800.113054] CE: hpet increasing min_delta_ns to 50624 nsec
[465800.113054] CE: hpet increasing min_delta_ns to 50624 nsec [465800.113054] CE: hpet increasing min_delta_ns to 50624 nsec You got the same hpet events at exactly the same time ? Are you sure you're looking at the right "dmesg" ? On my board increasing events are random, and I think they are related to cpufreq "ondemand" frequency changes (ie setting performance governor solve the problem). But since 2.6.32.7 I'm (and I think you too) no flooded anymore (compare my "[ 1268.384294] CE: hpet increasing min_delta_ns to 864956 nsec" from 2.6.31.4) Thibault
Hi, Never had an issue with the flooding as you reported, just over time you will see quite a few of them. In reference to that one entry with the two different kernels it appears to run at exactly the same time for the 465800 entry. I was just curious and it seems the change in cpufreq frequency changes are what cause these entries. If 2.6.32.7 is working for you and continues to work OK for others it appears then the issues are resolved? Thanks. Justin.
No finally this issue isn't yet resolved for me. Today I got : [ 1240.241299] CE: hpet increasing min_delta_ns to 15000 nsec /*Browsing with firefox..*/ [ 2172.159156] CE: hpet increasing min_delta_ns to 22500 nsec /*Play quake3like game with wine*/ [ 3466.310718] hrtimer: interrupt too slow, forcing clock min delta to 4859694 ns [ 3844.311338] CE: hpet increasing min_delta_ns to 33750 nsec [ 4570.309851] CE: hpet increasing min_delta_ns to 50624 nsec /*Finished playing, now using firefox*/ [ 4948.306688] CE: hpet increasing min_delta_ns to 75936 nsec [ 5026.307100] CE: hpet increasing min_delta_ns to 113904 nsec [ 5152.307104] CE: hpet increasing min_delta_ns to 170856 nsec [ 5716.306215] CE: hpet increasing min_delta_ns to 256284 nsec Messages seems to appear later than before, but they appear. The problem isn't yet fixed in my opinion I'll try "acpi_irq_nobalance" tips. Thibault
Hi, With "acpi_irq_nobalance" the problem seems to be solved. I get early [ 13.100313] CE: hpet increasing min_delta_ns to 15000 nsec [ 46.060401] CE: hpet increasing min_delta_ns to 22500 nsec [ 311.963169] CE: hpet increasing min_delta_ns to 33750 nsec but nothing else after 4 hours of usage. So this problem is probably related to cpufreq and irq balancing. Yours, Thibault
I am also going to jump into this bug. I also see this and I have irqbalance process installed. Feb 9 15:07:02 segfault kernel: [16815.019410] CE: hpet increasing min_delta_ns to 50624 nsec Feb 9 15:07:09 segfault kernel: [16821.242430] CE: hpet increasing min_delta_ns to 75936 nsec Feb 9 15:07:16 segfault kernel: [16828.675520] CE: hpet increasing min_delta_ns to 113904 nsec Feb 9 15:07:45 segfault kernel: [16857.425391] CE: hpet increasing min_delta_ns to 170856 nsec I do not know which kernel this started happening at but it is noticed. It might be even more pronounced for me in cases kernel: [15983.822389] CE: hpet increasing min_delta_ns to 22500 nsec I have not seen it go higher lately though. cpu family : 6 model : 23 model name : Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz stepping : 10 cpu MHz : 1600.000 cache size : 6144 KB Today's Linus pull 2.6.33-rc7 + drm patches as well. I do not attribute this to radeon however.
Could someone who can reproduce this issue try bisecting the problem?
Created attachment 25068 [details] Proposed small patch This small patch for 2.6.32.8 is a workaround to the problem on my box. I saw the kernel already tries 2 times to program min_delta_ns. I increased the value to allow the kernel to try 4 times before increasing min_delta_ns. No more warnings, even without acpi_irq_nobalance. I think it's related to cpufreq step time. Please test on your boxes if it works and it has no unattended side effects. Yours, Thibault
John, The problem I was seeing appears to be exacerbated by many cpus all trying to access the tick_broadcast_lock in tick_broadcast_oneshot_control() while going into C3 state. A number of cpus can have their next timer expire while waiting for that spinlock.
sivanich: Hrm.. I'm not sure how the tick_broadcast_lock would interrupt tick_dev_program_event(), since the warning is caused when hpet_next_event() via clockevent_program_event() fails over and over inside that function. Looking at hpet_next_event: cnt = hpet_readl(HPET_COUNTER); cnt += (u32) delta; hpet_writel(cnt, HPET_Tn_CMP(timer)); return (s32)(hpet_readl(HPET_COUNTER) - cnt) >= 0 ? -ETIME : 0; So for this to fail, HPET_COUNTER value must have increased by more then delta in the time between the two reads. We're not hitting any locks. So either there is some sort of hardware issue where the hpet_read isn't always returning sane values (see bug #14996) or something is interrupting us between the two reads and taking a long time. This could be an SMI, or a really bad driver interrupt handler. Now, the side effect of something plugging this up would be tick_program_event taking a long time, which would cause more cpu contention on teh tick_broadcast_lock, but I don't see it as a cause in of itself. I guess one idea would be to disable interrupts in that time to see if it avoids the issue, that way we can tell if its a driver issue or a SMI.
Created attachment 25097 [details] Use spinlock_irqsave to disable irqs in hpet_next_event This patch adds locking around the hpet_next_event internals. Its really just to make sure we're not hitting any sort of driver that's taking a long time while we set the event to expire. I could have probably used local_irq_disable() but this should also remove the possibility that lots of cpus hammering on the hpet_write might cause the write to stall and take longer then expected. It would be interesting to see if the messages continue with this patch or go away. If they continue, its likely a hardware issue or an SMI issue, which we can't really control. If they go away, we'll have to look at what is going on in irq context that is preempting this function for so long. Oh, and this patch is totally un-tested, so my apologies if it doesn't boot.
@john I tested your hpet_lock patch and messages are still there : [ 174.407297] CE: hpet increasing min_delta_ns to 22500 nsec [ 336.409272] CE: hpet increasing min_delta_ns to 33750 nsec [ 621.407307] CE: hpet increasing min_delta_ns to 50624 nsec [ 636.409751] CE: hpet increasing min_delta_ns to 75936 nsec [ 972.408627] CE: hpet increasing min_delta_ns to 113904 nsec Increasing to 3 the number of try in tick-oneshot.c implies I only see 15000 nsec, setting to 4 => messages go away. But sometimes I get "hrtimer: interrupt too slow, forcing clock min delta to" ... warning anyway. On another computer using an atom processor I never see any hpet message, it might be an hardware problem. Thibault
I have the same problem. When using ubuntu 9.10 (2.6.31) and 10.04 (2.6.32) on an Acer Aspire 5920G I see the same hpet warnings (but no warnings using older versions). I never see any hpet messages when running a kernel >= 2.6.31 on an Asus EEE 901go (Atom).
Hi there, Same errors here: [ 934.631254] CE: hpet increased min_delta_ns to 11250 nsec [ 1990.060009] CE: hpet increased min_delta_ns to 16875 nsec Running 2.6.34-rc4-git6, compiled from kernel.org sources, using Debian Sid amd64, on a Vostro 1320 (Core 2 Duo P8700 and Nvidia 9300GS, with nvidia binary drivers).
Hi, also have the same problem here. Intel Core i7 with a Asus P7P55D-E Mainboard. Kernel is Vanilla 2.6.34-0.1. Cheers
This bug is still happening with a Core 2 Duo P8400, running vanilla 2.6.34.
I've been getting this bug on my server for the last month after: - Reconfiguring the bios. - Updating the kernel. My symptoms are (whether it's related to this bug or not i dont know): - This bug: http://lists.alioth.debian.org/pipermail/nut-upsuser/2010-June/006062.html (which i think may be related to timing issues), it would start appearing shortly after restarting the server and never stop with a line appearing every ~2s. - USB and onboard NIC drop outs, they would always come back after about 5 seconds. This would happen after about 2 minutes of maxing out the NIC bandwidth with a file copy. - Flooded logs with the min_delta message. My thoughts were - Faulty South Bridge (I believe NIC and USB are on there). - Disable HPET in bios or in the Kernel (as some have already suggested). - Disable CPU powersaving options (AMD: C1E, CnQ etc. Intel: SpeedStep etc.). - Use an older Kernel. After disabling C1E I have not had any of these issues. Everything is back to normal. I may be quick to jump the gun here but I would experience one of the above symptoms right after starting the server or soon after beginning a file copy. This guy was experiencing clock skewing: http://www.overclockers.com/forums/showthread.php?t=605348 "I was debugging my system why clock drifts occurs so badly. It has gotten to a point that I have to update every 30 minutes in order to update clock every 30 minutes to correct shift in clock that is more than 3 minutes delta." "CnQ works by lowering the CPU multiplier and vCore, so the whole CPU goes into a slowed-down mode, usually half the speed of normal. C1E works on individual cores. It flushes the core's L2 cache into L3 in case there's any data another core might need, then it puts that core into a deep sleep. The difference is the others cores are still running and computing normally. On a quad this means, at the far end, only one core is running while the others are virtually turned off - which is quite a power savings. There's also a lot less heat so a rig using SmartFan too would see a big drop in noise levels as well ..." Intel has a similar feature and probably results in the same effect. I hope this information may be of use, if i experience any of the symptoms ill post back.
Created attachment 32352 [details] dmesg output Noticed this issue on Kernel 2.6.35.7 (AMD Phenom(tm) II X4 940 Processor (4 cpu cores)), dmesg attached.
I have noticed that after a suspend/resume cycle I don't receive any warning of "increasing min_delta_ns" (dell vostro 1700 laptop, core2 T8300, 2.6.36, 64 bits)
Same here on custom compiled 2.6.36.2 kernel
Ok, I'm closing this then.
Does anyone see the problem with the current mainline kernel (2.6.39-rc5+) or the recently released 2.6.38.5 ?
On Mon, 2 May 2011, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=14426 > > > Rafael J. Wysocki <rjw@sisk.pl> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Status|REOPENED |NEEDINFO > > > > > --- Comment #46 from Rafael J. Wysocki <rjw@sisk.pl> 2011-05-02 20:41:20 --- > Does anyone see the problem with the current mainline kernel (2.6.39-rc5+) or > the recently released 2.6.38.5 ? > > -- > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are on the CC list for the bug. > Hi, $ dmesg | grep min_delta_ns [ 3.769620] CE: hpet5 increased min_delta_ns to 7500 nsec [ 3.786724] CE: hpet5 increased min_delta_ns to 11250 nsec [ 4.275465] CE: hpet2 increased min_delta_ns to 7500 nsec [ 4.296541] CE: hpet2 increased min_delta_ns to 11250 nsec [ 4.662089] CE: hpet6 increased min_delta_ns to 7500 nsec [ 4.846077] CE: hpet6 increased min_delta_ns to 11250 nsec [ 6.870928] CE: hpet4 increased min_delta_ns to 7500 nsec [ 6.891200] CE: hpet4 increased min_delta_ns to 11250 nsec [ 25.753579] CE: hpet3 increased min_delta_ns to 7500 nsec [ 25.773720] CE: hpet3 increased min_delta_ns to 11250 nsec [ 38.525926] CE: hpet increased min_delta_ns to 20113 nsec $ uname -a Linux p34 2.6.38.4 #1 SMP Thu Apr 28 07:25:35 EDT 2011 x86_64 GNU/Linux Not quite 2.6.38.5, but close enough? :) Justin.
Sure. So the problem most likely is still in the mainline.
Its interesting how I get almost the same result as Justin: /home/anders> dmesg | grep min_delta_ns [ 27.092437] CE: hpet4 increased min_delta_ns to 7500 nsec [ 27.092447] CE: hpet4 increased min_delta_ns to 11250 nsec [ 49.188645] CE: hpet3 increased min_delta_ns to 7500 nsec [ 49.188653] CE: hpet3 increased min_delta_ns to 11250 nsec [ 60.883167] CE: hpet6 increased min_delta_ns to 7500 nsec [ 60.883180] CE: hpet6 increased min_delta_ns to 11250 nsec [ 74.245493] CE: hpet5 increased min_delta_ns to 7500 nsec [ 74.245500] CE: hpet5 increased min_delta_ns to 11250 nsec [ 83.023448] CE: hpet2 increased min_delta_ns to 7500 nsec [ 83.023459] CE: hpet2 increased min_delta_ns to 11250 nsec [ 7387.703238] CE: hpet increased min_delta_ns to 20113 nsec /home/anders> uname -a Linux plutten 2.6.38-8-generic-pae #42-Ubuntu SMP Mon Apr 11 05:17:09 UTC 2011 i686 i686 i386 GNU/Linux /home/anders>
In addition, I have to report two hangs with no logfile entries in /var/log/kern.log. The first which flashed caps lock and scroll lock (and a blank screen). The second just the blank screen, no flashing. I also get a lot of entries in /var/log/kern.log that is about the network card. The log entries looks like this: May 15 15:25:56 plutten kernel: [ 7875.662147] r8169 0000:02:00.0: eth0: link up When such a spam flood with theese messages starts, I can get more than one message per second. They can continue for hours. One of the spam floods ended like this: May 15 15:39:02 plutten kernel: [ 8657.661391] r8169 0000:02:00.0: eth0: link up May 15 15:39:16 plutten kernel: [ 8671.768767] r8169 0000:02:00.0: eth0: link up May 15 15:39:17 plutten kernel: [ 8672.321775] r8169 0000:02:00.0: eth0: link up May 15 15:39:17 plutten kernel: [ 8672.616229] r8169 0000:02:00.0: eth0: link up May 15 15:39:17 plutten kernel: [ 8673.212944] r8169 0000:02:00.0: eth0: link up May 16 08:12:07 plutten kernel: [67922.084991] exe (6268): /proc/6268/oom_adj is deprecated, please use /proc/6268/oom_score_adj instead. May 16 16:13:25 plutten kernel: [96644.143048] CE: hpet increased min_delta_ns to 20113 nsec May 16 17:48:49 plutten kernel: [102337.652372] CE: hpet increased min_delta_ns to 30169 nsec May 16 20:18:06 plutten kernel: [111245.669320] CE: hpet increased min_delta_ns to 45253 nsec and then the log was silent until around 22 a clock. It continued: May 16 22:00:49 plutten kernel: [117375.965778] r8169 0000:02:00.0: eth0: link up May 16 22:00:50 plutten kernel: [117376.367557] r8169 0000:02:00.0: eth0: link up May 16 22:00:50 plutten kernel: [117376.407479] r8169 0000:02:00.0: eth0: link up May 16 22:00:52 plutten kernel: [117378.539852] r8169 0000:02:00.0: eth0: link up It continued in 6 minutes and then the system hanged. It was the second hang that I have mentioned, i.e. no flashing diods. I have also got this message: May 15 15:28:21 plutten kernel: [ 8019.916509] net_ratelimit: 4 callbacks suppressed and May 15 15:25:55 plutten kernel: [ 7874.969899] NOHZ: local_softirq_pending 08 ...both coming occasionally at the same time as the eth0 floods.
I am seeing this with kernel 3.0+. It triggers when running the 'xine-ui' multimedia player. With kernels before 3.0, I would always get one or two bumps of hpet's min_delta_ns, but with 3.0 it flooded out of control at least once: Aug 5 18:00:16 localhost kernel: [ 2198.027036] CE: hpet increased min_delta_ns to 20113 nsec Aug 5 18:00:43 localhost kernel: [ 2225.397281] CE: hpet increased min_delta_ns to 30169 nsec Aug 5 18:02:02 localhost kernel: [ 2304.169430] CE: hpet increased min_delta_ns to 45253 nsec Aug 5 18:14:52 localhost kernel: [ 3073.717007] CE: hpet increased min_delta_ns to 67879 nsec Aug 5 18:20:02 localhost kernel: [ 3383.781066] CE: hpet increased min_delta_ns to 101818 nsec Aug 5 18:20:03 localhost kernel: [ 3385.194004] CE: hpet increased min_delta_ns to 152727 nsec Aug 5 18:20:04 localhost kernel: [ 3385.958004] CE: hpet increased min_delta_ns to 229090 nsec Aug 5 18:20:05 localhost kernel: [ 3386.833015] CE: hpet increased min_delta_ns to 343635 nsec Aug 5 18:20:05 localhost kernel: [ 3387.133036] CE: hpet increased min_delta_ns to 515452 nsec Aug 5 18:20:10 localhost kernel: [ 3391.856021] CE: hpet increased min_delta_ns to 773178 nsec Aug 5 18:20:10 localhost kernel: [ 3391.951919] CE: hpet increased min_delta_ns to 1000000 nsec Aug 5 18:21:36 localhost kernel: [ 3478.493604] CE: Reprogramming failure. Giving up Aug 5 18:21:51 localhost kernel: [ 3493.095060] CE: Reprogramming failure. Giving up Aug 5 18:35:33 localhost kernel: [ 4315.255092] CE: Reprogramming failure. Giving up Aug 5 18:38:35 localhost kernel: [ 4497.464004] CE: Reprogramming failure. Giving up Aug 5 18:38:40 localhost kernel: [ 4501.792110] CE: Reprogramming failure. Giving up Aug 5 18:38:40 localhost kernel: [ 4502.216125] CE: Reprogramming failure. Giving up Aug 5 18:38:41 localhost kernel: [ 4502.624099] CE: Reprogramming failure. Giving up Aug 5 19:03:40 localhost kernel: [ 6001.878253] CE: Reprogramming failure. Giving up Aug 5 19:03:40 localhost kernel: [ 6002.075280] CE: Reprogramming failure. Giving up I quickly moved to 3.0.1 when it became available, and I have not seen this sort of total failure yet. I still get this kind of action, though: $ dmesg | grep 'CE:' [ 6572.534521] CE: hpet increased min_delta_ns to 20113 nsec [51451.966819] CE: hpet increased min_delta_ns to 30169 nsec [51451.967019] CE: hpet increased min_delta_ns to 45253 nsec [51451.967019] CE: hpet increased min_delta_ns to 67879 nsec That is slightly worse than what I was seeing pre-3.0; maybe the total failure above was a one-time freak occurrence, but I thought I should at least mention it in case a kernel dev wants to know more about my hardware and/or kernel config.
Still having the same problems. Any clue why so? 2011/8/12 <bugzilla-daemon@bugzilla.kernel.org>: > https://bugzilla.kernel.org/show_bug.cgi?id=14426 > > > Dave Witbrodt <dawitbro@sbcglobal.net> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |dawitbro@sbcglobal.net > > > > > --- Comment #51 from Dave Witbrodt <dawitbro@sbcglobal.net> 2011-08-12 > 17:08:33 --- > I am seeing this with kernel 3.0+. It triggers when running the 'xine-ui' > multimedia player. > > With kernels before 3.0, I would always get one or two bumps of hpet's > min_delta_ns, but with 3.0 it flooded out of control at least once: > > Aug 5 18:00:16 localhost kernel: [ 2198.027036] CE: hpet increased > min_delta_ns to 20113 nsec > Aug 5 18:00:43 localhost kernel: [ 2225.397281] CE: hpet increased > min_delta_ns to 30169 nsec > Aug 5 18:02:02 localhost kernel: [ 2304.169430] CE: hpet increased > min_delta_ns to 45253 nsec > Aug 5 18:14:52 localhost kernel: [ 3073.717007] CE: hpet increased > min_delta_ns to 67879 nsec > Aug 5 18:20:02 localhost kernel: [ 3383.781066] CE: hpet increased > min_delta_ns to 101818 nsec > Aug 5 18:20:03 localhost kernel: [ 3385.194004] CE: hpet increased > min_delta_ns to 152727 nsec > Aug 5 18:20:04 localhost kernel: [ 3385.958004] CE: hpet increased > min_delta_ns to 229090 nsec > Aug 5 18:20:05 localhost kernel: [ 3386.833015] CE: hpet increased > min_delta_ns to 343635 nsec > Aug 5 18:20:05 localhost kernel: [ 3387.133036] CE: hpet increased > min_delta_ns to 515452 nsec > Aug 5 18:20:10 localhost kernel: [ 3391.856021] CE: hpet increased > min_delta_ns to 773178 nsec > Aug 5 18:20:10 localhost kernel: [ 3391.951919] CE: hpet increased > min_delta_ns to 1000000 nsec > Aug 5 18:21:36 localhost kernel: [ 3478.493604] CE: Reprogramming failure. > Giving up > Aug 5 18:21:51 localhost kernel: [ 3493.095060] CE: Reprogramming failure. > Giving up > Aug 5 18:35:33 localhost kernel: [ 4315.255092] CE: Reprogramming failure. > Giving up > Aug 5 18:38:35 localhost kernel: [ 4497.464004] CE: Reprogramming failure. > Giving up > Aug 5 18:38:40 localhost kernel: [ 4501.792110] CE: Reprogramming failure. > Giving up > Aug 5 18:38:40 localhost kernel: [ 4502.216125] CE: Reprogramming failure. > Giving up > Aug 5 18:38:41 localhost kernel: [ 4502.624099] CE: Reprogramming failure. > Giving up > Aug 5 19:03:40 localhost kernel: [ 6001.878253] CE: Reprogramming failure. > Giving up > Aug 5 19:03:40 localhost kernel: [ 6002.075280] CE: Reprogramming failure. > Giving up > > I quickly moved to 3.0.1 when it became available, and I have not seen this > sort of total failure yet. I still get this kind of action, though: > > $ dmesg | grep 'CE:' > [ 6572.534521] CE: hpet increased min_delta_ns to 20113 nsec > [51451.966819] CE: hpet increased min_delta_ns to 30169 nsec > [51451.967019] CE: hpet increased min_delta_ns to 45253 nsec > [51451.967019] CE: hpet increased min_delta_ns to 67879 nsec > > That is slightly worse than what I was seeing pre-3.0; maybe the total > failure > above was a one-time freak occurrence, but I thought I should at least > mention > it in case a kernel dev wants to know more about my hardware and/or kernel > config. > > -- > Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are on the CC list for the bug. >
Cannot reproduce with linux-3.2. Wish I knew who to thank!
Hi all, May I know exact patch or commit id or version, to resolves this issue. From Comment 18, patch having commit id "112a62ddb32952943517bd170f0eeb26a6f0738f" is applied to file arch/x86/kernel/i8253.c, which is initial suspect. But arch/x86/kernel/i8253.c file is not changed in version v3.2 or v3.2.9, and last change is on date 2011-06-09. From comment 51, issue is present in version 3.0.1 and comment 53 is saying it is resolved in 3.2. Mentioned in one of comments, patch for the fix is applied to file "kernel/time/tick-oneshot.c" but mentioned that it is not exact fix in later comments. Which is exact fix/patch for this bug, please attach the same. Thanks & Regards Satish.G