Subject    : High [extra timer interrupt] count in powertop since 2.6.36
Submitter  : Ian Kumlien <pomac@demius.net>
Date       : 2010-10-30 23:52
Message-ID : alpine.LNX.2.00.1010310148450.24290@twilight.pomac.com
References : http://marc.info/?l=linux-kernel&m=128848330304431&w=2
This entry is being used for tracking a regression from 2.6.35. Please don't close it until the problem is fixed in the mainline.
Looks like there is a patch here for high "extra timer interrupt" values in 2.6.36: http://lkml.org/lkml/2010/9/28/115
I missed the patch and will try that next. I just wanted to add that the timer count seems to increase with uptime. I upgraded to 2.6.37-rc1-git8 to test and it was fine at first, but right now it peaks at 300 extra timer interrupts, and that's after ~25 hours of uptime.
2.6.37-rc1-git8 + the patch - no success
I think this is related to:
    Clocksource tsc unstable (delta = -25767719801 ns)
    Switching to clocksource hpet
On this AMD CPU, which has the constant_tsc flag, it seems to happen after 24 hours or so, and then the interrupts keep increasing. So it looks like it's related to this: http://www.gossamer-threads.com/lists/linux/kernel/1294035 (I don't know if there is a bug entry for it)
Without the patch, there is no loss of tsc clock (at least not yet) but the extra timer interrupt wakeups keep increasing...
It turns out that you shouldn't trust powertop. I'm gonna clean up and submit this patch:
diff --git a/powertop.c b/powertop.c
index 74eb328..9b2ada7 100644
--- a/powertop.c
+++ b/powertop.c
@@ -241,6 +241,7 @@ static void do_proc_irq(void)
                 return;
         while (!feof(file)) {
                 char *c;
+                char *start;
                 int nr = -1;
                 uint64_t count = 0;
                 int special = 0;
@@ -252,23 +253,17 @@ static void do_proc_irq(void)
                 if (!c)
                         continue;
                 /* deal with NMI and the like.. make up fake nrs */
-                if (line[0] != ' ' && (line[0] < '0' || line[0] > '9')) {
-                        if (strncmp(line,"NMI:", 4)==0)
-                                nr=20000;
-                        if (strncmp(line,"RES:", 4)==0)
-                                nr=20001;
-                        if (strncmp(line,"CAL:", 4)==0)
-                                nr=20002;
-                        if (strncmp(line,"TLB:", 4)==0)
-                                nr=20003;
-                        if (strncmp(line,"TRM:", 4)==0)
-                                nr=20004;
-                        if (strncmp(line,"THR:", 4)==0)
-                                nr=20005;
-                        if (strncmp(line,"SPU:", 4)==0)
-                                nr=20006;
+                start = line;
+                while (*start == ' ')
+                        start++;
+                if (isalpha(*start))
+                {
+#define MAKE4(ch0, ch1, ch2, ch3) (int)(ch0 | (ch1 << 8) | (ch2 << 16) | (ch3 << 24))
+                        nr = MAKE4(start[0],start[1],start[2],start[3]);
                         special = 1;
-                } else
+#undef MAKE4
+                }
+                else
                         nr = strtoull(line, NULL, 10);
                 if (nr==-1)
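For anyone who wants to try the idea outside of powertop, here is a minimal standalone sketch of the same approach: skip leading blanks, then turn an alphabetic label into a made-up id by packing its characters into an integer. The function name `irq_line_id` is hypothetical and not part of powertop. Unlike the patch, this sketch also insists on a trailing `:` so the `CPU0 CPU1` header row is rejected; that check is an assumption on my side, not something the patch does.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not powertop code): map one /proc/interrupts row
 * to an id. Numbered rows return the IRQ number. Label rows such as
 * "NMI:" or " IWI:" return the label bytes packed little-endian, which
 * is far larger than any realistic IRQ number. Returns -1 for rows that
 * are neither (for example the CPU header row).
 */
int64_t irq_line_id(const char *line, int *special)
{
    const unsigned char *p = (const unsigned char *)line;
    size_t len = 0;
    size_t i;
    int64_t id = 0;

    *special = 0;

    /* Some kernels/machines pad the label column, so skip blanks first. */
    while (*p == ' ' || *p == '\t')
        p++;

    /* Ordinary numbered IRQ row, e.g. " 16:  100  200 ..." */
    if (isdigit(*p))
        return (int64_t)strtoull((const char *)p, NULL, 10);

    /* Label row: 1 to 4 alphanumeric characters followed by ':' */
    while (len < 5 && isalnum(p[len]))
        len++;
    if (len < 1 || len > 4 || p[len] != ':' || !isalpha(p[0]))
        return -1;

    /* Pack the label bytes; any new label gets its own id for free. */
    for (i = 0; i < len; i++)
        id |= (int64_t)p[i] << (8 * i);

    *special = 1;
    return id;
}
```

With this, `"NMI: ..."` and `" NMI: ..."` map to the same id, and a label the tool has never heard of (such as `IWI`) still gets a stable id instead of falling through to the numeric path.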
If you want to track it: http://www.bughost.org/pipermail/power/2010-November/002029.html
I think that this could be set to closed. I doubt it's an actual kernel issue beyond changing the format of /proc/interrupts.
Thx for following up on this.
Uhm.. btw, did you find out when this change to /proc/interrupts was introduced? Chances are that this was an accidental change in the kernel ABI, which should be reversed.
One machine:
cat /proc/interrupts
           CPU0       CPU1
...
NMI:          0          0   Non-maskable interrupts
LOC:   23508960   26173864   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:   39035305   37826125   Rescheduling interrupts
CAL:      76547      75363   Function call interrupts
TLB:     105737     107007   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        384        384   Machine check polls
ERR:          1
MIS:          0
Other machine:
cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
....
 NMI:          0          0          0          0          0          0   Non-maskable interrupts
 LOC:   17167860   16612451    7879817    6802772    5827157    2049716   Local timer interrupts
 SPU:          0          0          0          0          0          0   Spurious interrupts
 PMI:          0          0          0          0          0          0   Performance monitoring interrupts
 IWI:          0          0          0          0          0          0   IRQ work interrupts
 RES:    7663728    5396717    3521158    2472727    1903596    3524534   Rescheduling interrupts
 CAL:      80233      71037      88573      95769     109727      71864   Function call interrupts
 TLB:     144910     149537     103547     102747     103824     105271   TLB shootdowns
 TRM:          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0          0          0   Machine check exceptions
 MCP:       1068       1068       1068       1068       1068       1068   Machine check polls
 ERR:          0
 MIS:          0
Same kernel. But it's not only that: the check for the "labels" is outdated as well.
Can you also provide /proc/interrupts from a good kernel?
           CPU0       CPU1
NMI:          0          0   Non-maskable interrupts
LOC:   83064988   44758271   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
PND:          0          0   Performance pending work
RES:     534211     599941   Rescheduling interrupts
CAL:     201314    1250582   Function call interrupts
TLB:    1519795    1547577   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:       4654       4654   Machine check polls
ERR:          1
MIS:          0
This one would fare better. MCP and ERR might still be counted as something unexpected, but since that's over 16 days it won't affect powertop in the same way. The space at the beginning of each line on my six-core machine is odd, but beyond that it's a matter of adding support for the new labels in powertop.
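To make the "what gets counted" part concrete, here is a rough sketch of how a tool could total the per-CPU counters on one row regardless of how many CPU columns there are. The helper name `irq_row_total` is hypothetical, and the stopping rule (stop at the first token that is not a plain number) is an assumption of this sketch, not how powertop does it. A more careful parser would count the CPU columns from the header row instead.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not powertop code): sum the per-CPU counters on
 * one /proc/interrupts row. Everything after the first ':' is read as
 * whitespace-separated numbers until a non-numeric token (the start of
 * the description) is reached. Rows without ':' yield 0.
 */
uint64_t irq_row_total(const char *line)
{
    const char *p = strchr(line, ':');
    uint64_t total = 0;

    if (!p)
        return 0;
    p++;

    for (;;) {
        char *end;
        unsigned long long v;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;

        v = strtoull(p, &end, 10);
        /* A counter must end at whitespace or end of line; something
         * like "16-fasteoi" belongs to the description, so stop there. */
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;

        total += v;
        p = end;
    }
    return total;
}
```

Rows like `ERR: 1`, which carry a single value and no description, fall out of the same loop. Sampling such totals twice and dividing the difference by the interval gives a per-second rate, which is why a handful of MCP or ERR events spread over 16 days would barely register.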
Btw, the commit that changed "PND" to "IWI" is
commit e360adbe29241a0194e10e20595360dd7b98a2b3
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date:   Thu Oct 14 14:01:34 2010 +0800
    irq_work: Add generic hardirq context callbacks
Also, after getting closely acquainted with this bug, I agree with your conclusion that this is just a case of buggy userspace. I'm closing this as invalid.