Bug 9475

Summary: jiffies counter leaps on CPU frequency changes
Product: Power Management Reporter: Stefano Brivio (stefano.brivio)
Component: cpufreqAssignee: cpufreq (cpufreq)
Status: CLOSED CODE_FIX    
Severity: normal CC: acpi-bugzilla, davej, lenb, mingo, rjwysocki
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24-rc3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9243    

Description Stefano Brivio 2007-11-29 08:36:23 UTC
Most recent kernel where this bug did not occur: unknown. I'm sorry, I can't test 2.6.23 because of some dm-crypt issue.

Distribution: Gentoo.

Hardware Environment:
st3@morte ~ $ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 1.50GHz
stepping        : 6
cpu MHz         : 1500.000
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe bts est tm2
bogomips        : 2993.49
clflush size    : 64

Software environment: acpi-cpufreq module.

Problem Description:
when I switch the CPU frequency, the jiffies counter gets scaled by the ratio of <old_frequency/new_frequency>, e.g.:

morte ~ # cpuspeedy max
cpuspeedy: CPU speed is now 1.50 GHz (100%) temperature is 55 C
morte ~ # tail -n1 /var/log/messages
[10609.463888] wmaster0: STA 00:14:c1:35:8d:eb Average rate: 10 (50/5)
morte ~ # cpuspeedy med
cpuspeedy: CPU speed is now 1 GHz (66%) temperature is 72 C
[15996.701445] wmaster0: STA 00:14:c1:35:8d:eb Average rate: 10 (60/6)

This isn't just an aesthetic issue, as many device drivers rely on the jiffies counter.

Steps to reproduce:
see above.
Comment 1 Rafael J. Wysocki 2007-12-01 12:35:07 UTC
Please mark it as a regression.

Apart from this, I'm going to add it to the list of recent regressions.
Comment 2 Ingo Molnar 2007-12-04 13:14:49 UTC
Ugh, this seems to be a nasty regression.

Dave, is this some GTOD weirdness? I didnt know jiffies are fixed up during cpu-freq change - that would mess up all sorts of things like all timer_list timers.
Comment 3 Dave Jones 2007-12-04 13:28:24 UTC
we've been doing adjust_jiffies() on cpu speed transitions forever, so this isn't a regression in that sense.  Why it's causing a problem now is a mystery though.

Even more puzzling, on lkml, Stefano commented out the adjust_jiffies() code, and still saw leaping counters, so I'm really at a loss to explain what's going on.
Comment 4 Ingo Molnar 2007-12-04 13:47:06 UTC
> we've been doing adjust_jiffies() on cpu speed transitions forever, so 
> this isn't a regression in that sense.  Why it's causing a problem now 
> is a mystery though.
> 
> Even more puzzling, on lkml, Stefano commented out the 
> adjust_jiffies() code, and still saw leaping counters, so I'm really 
> at a loss to explain what's going on.

hm, maybe i'm missing something obvious, but adjust_jiffies() seems to 
fix up loops_per_jiffy, not the jiffies variable. The bugreport included 
a /var/log/messages snippets that showed printk timestamps - the printk 
timestamps are directly derived from the 'jiffies' variable. Those 
timestamps jumped back and forth according to CPU frequency. So 
_something_ does seem to affect the jiffies variable - and it's not 
adjust_jiffies(). Or at least so i think.
Comment 5 Stefano Brivio 2007-12-04 14:04:19 UTC
(In reply to comment #4)

> hm, maybe i'm missing something obvious, but adjust_jiffies() seems to 
> fix up loops_per_jiffy, not the jiffies variable. The bugreport included 
> a /var/log/messages snippets that showed printk timestamps - the printk 
> timestamps are directly derived from the 'jiffies' variable. Those 
> timestamps jumped back and forth according to CPU frequency. So 
> _something_ does seem to affect the jiffies variable - and it's not 
> adjust_jiffies(). Or at least so i think.

Yes, I'm wondering about this as well. But so far, I hadn't been able to find out where in the kernel a CPU frequency switch could trigger a sudden change to the jiffies counter. The only related thing I observed is that adjust_jiffies() is called, but that doesn't seem to be the culprit.

BTW, I just sorted out the dm-crypt issue which prevented me to test 2.6.23. Will let you know ASAP about further tests.
Comment 6 Ingo Molnar 2007-12-04 14:09:15 UTC
> Yes, I'm wondering about this as well. But so far, I hadn't been able 
> to find out where in the kernel a CPU frequency switch could trigger a 
> sudden change to the jiffies counter. The only related thing I 
> observed is that adjust_jiffies() is called, but that doesn't seem to 
> be the culprit.

one thing that changes jiffies is the NOHZ code. But that should really 
only be a forwards adjustment. NTP also changes it. Weird.
Comment 7 Dave Jones 2007-12-04 14:25:04 UTC
does booting with nohz=off change anything?
Comment 8 Stefano Brivio 2007-12-05 11:28:42 UTC
(In reply to comment #7)
> does booting with nohz=off change anything?

That was already off. I tried to turn it on, but nothing changed.
Comment 9 Stefano Brivio 2007-12-05 11:38:31 UTC
I just managed to boot 2.6.23. Nothing changes. Do you want me to bisect some specific tree?
Comment 10 Ingo Molnar 2007-12-05 12:43:59 UTC
> I just managed to boot 2.6.23. Nothing changes. Do you want me to 
> bisect some specific tree?

nothing changes == same problem?

If yes then please try to find the oldest kernel that still works fine. 
2.6.20 would be a good pick i think, 2.6.16 after that. Then if you 
could try to bisect between the youngest "good" and the oldest "bad" 
tree that would be fantastic ... Expect a compile-fest though.
Comment 11 Stefano Brivio 2007-12-06 17:26:16 UTC
Fixed. Please see:

http://lkml.org/lkml/2007/12/6/377

Thank you for your suggestions.
Comment 12 Len Brown 2007-12-06 20:32:13 UTC
Some Intel TSC facts, for the record:

if (family == 6 && model < 14) || (family == 15 && model < 3)
   TSC rate = Core MHz rate (ie. it changes with core/bus ratio shifts)
   WRMSR on the TSC writes bottom 32-bits, clears upper 32 bits.

For these families with models later than above
   TSC rate = constant (does not change with core/bus ratio shifts)
   WRMSR TSC writes all 64 bits.

The system here is a family 6, model 13, so it it will
indeed see variable speed.

But note that the bigger problem may be that on all Intel processors
with C3 that have shipped to date, the TSC COMPLETELY STOPS
for the duration of C3.

I believe that tsc_unstable is set upon either of these situations.

I don't follow all the twisty turny paths in the code here --
and it is wonderful that you bisected and found the cause;
but it isn't immediately obvious that this is due to variable
speed TSC, but it may be due to TSC stoppage.

if you disable cpufreq and it goes away, you know it was variable TSC.
if you boot with "idle=poll" and keep cpufreq enabled and it goes
away, then you know it was TSC stoppage.
Comment 13 Stefano Brivio 2007-12-06 21:16:09 UTC
(In reply to comment #12)
> Some Intel TSC facts, for the record:
> 
> if (family == 6 && model < 14) || (family == 15 && model < 3)
>    TSC rate = Core MHz rate (ie. it changes with core/bus ratio shifts)
>    WRMSR on the TSC writes bottom 32-bits, clears upper 32 bits.

Thank you for the info.

> But note that the bigger problem may be that on all Intel processors
> with C3 that have shipped to date, the TSC COMPLETELY STOPS
> for the duration of C3.

During C2 as well, maybe?

> I believe that tsc_unstable is set upon either of these situations.

Yes, no worries, I hit both :)

Nov 29 07:55:29 morte [    6.530752] Marking TSC unstable due to: possible TSC halt in C2.
Nov 29 07:55:29 morte [    6.530761] Time: acpi_pm clocksource has been installed.
Nov 29 07:55:29 morte [    6.620700] Clocksource tsc unstable (delta = -456217276 ns)

and

Nov 29 20:50:31 morte [  297.562722] Marking TSC unstable due to: cpufreq changes.

Anyway, I tried to follow your two suggestions and I must say that variable TSC causes issues, while TSC stoppage doesn't, in my case. In both cases, the timestamp isn't correct, but while TSC stoppage causes, well, a stop, the big trouble is when the timestamp leaps back because of cpufreq changes, and drivers get confused by this behaviour.
Comment 14 Rafael J. Wysocki 2007-12-07 17:40:55 UTC
*** Bug 9524 has been marked as a duplicate of this bug. ***
Comment 15 Stefano Brivio 2007-12-18 16:59:35 UTC
Fix: http://lkml.org/lkml/2007/12/7/100
Comment 16 Ingo Molnar 2007-12-19 05:46:56 UTC
> Fix: http://lkml.org/lkml/2007/12/7/100

note that due to intrusiveness this will likely only show up in v2.6.25.
Comment 17 Len Brown 2011-07-30 05:31:51 UTC
closed by:


commit b842271fbb9c8b5fd0e1c3e1895a3b67ba5bcc54
Author: Ingo Molnar <mingo@elte.hu>
Date:   Fri Jan 25 21:07:59 2008 +0100

    sched: remove printk_clock()
    
    printk_clock() is obsolete - it has been replaced with cpu_clock().
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>