Bug 22332

Summary: 10Ks of wakeups in tickless kernel
Product: ACPI
Reporter: Vitus Jensen (vjensen)
Component: Power-Processor
Assignee: acpi_power-processor
Status: CLOSED INSUFFICIENT_DATA
Severity: normal
CC: acpi-bugzilla, alan, lenb, linville, me, mickflemm, rui.zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.2
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: powertop -d output of tickless kernel
dmesg output of nohz kernel
2 contents of /proc/interrupts, 10 seconds delay in between
Output of cat /sys/firmware/acpi/interrupts/*
75.000 wakeups before starting X11
/proc/config.gz of running kernel
67364 wakeups while wgetting at 10K/s via cable
34700 wakeups while running mpg123 (47/s for audio)
45000 wakeup/s while doing nothing (no ath5k, no USB)
finnix-104 (v3.2.0-1) after connecting wlan

Description Vitus Jensen 2010-11-07 11:49:22 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.30-r5 (perhaps earlier)
Distribution: gentoo Stable
Hardware Environment: Thinkpad R51e w/ Intel Pentium M 1.73 GHz

Even without starting X11, powertop reports a very high number of wakeups with no visible cause.  The wakeup count for any single source maxes out around 50/s, while the total wakeup count reaches values from 10.000 to 80.000.  /proc/interrupts reflects the wakeups displayed by powertop, while /sys/firmware/acpi/interrupts/* shows 0-2 interrupts.  As a result the CPU seldom reaches C3 and power consumption is high.

In 2.6.27 the wakeup count of the tickless kernel stays around 100 when idle, and around 50 without X11.  After learning about https://bugzilla.kernel.org/show_bug.cgi?id=12788 I disabled NOHZ and got a system with around 200 wakeups.

Steps to reproduce: just boot any tickless kernel
Comment 1 Vitus Jensen 2010-11-07 11:51:22 UTC
Created attachment 36432 [details]
powertop -d output of tickless kernel

Showing 27000 wakeups when idle in a Gnome session.
Comment 2 Vitus Jensen 2010-11-07 12:18:51 UTC
Created attachment 36442 [details]
dmesg output of nohz kernel

The kernel is patched because of this bug: https://bugzilla.kernel.org/show_bug.cgi?id=13155.  But the wakeups are independent of the version of the EC driver code.
Comment 3 Vitus Jensen 2010-11-07 12:20:00 UTC
Created attachment 36452 [details]
2 contents of /proc/interrupts, 10 seconds delay in between
Comment 4 Vitus Jensen 2010-11-07 12:20:57 UTC
Created attachment 36462 [details]
Output of cat /sys/firmware/acpi/interrupts/*
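
For anyone who wants to compare against these counters on another machine, here is a minimal Python sketch. It assumes the layout described in the kernel's sysfs-firmware-acpi ABI notes, where each file under /sys/firmware/acpi/interrupts/ starts with the event count followed by status flags. The function name `read_acpi_counters` and the decision to return only non-zero entries are illustrative choices, not something taken from the attachment.

```python
from pathlib import Path


def read_acpi_counters(directory="/sys/firmware/acpi/interrupts"):
    """Return {file name: count} for every counter file with a non-zero count."""
    counters = {}
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue
        fields = entry.read_text().split()
        # Assumption: the first whitespace-separated field is the event count;
        # anything after it (e.g. "enabled", "invalid") is a status flag.
        if fields and fields[0].isdigit():
            count = int(fields[0])
            if count:
                counters[entry.name] = count
    return counters
```

Note that files such as `sci` and `gpe_all` are aggregates, so the returned values should be read individually rather than summed.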
Comment 5 Vitus Jensen 2010-11-07 12:22:26 UTC
Created attachment 36472 [details]
75.000 wakeups before starting X11
Comment 6 Zhang Rui 2010-11-08 02:17:04 UTC
It seems it's USB that keeps waking up the system, and this should have been improved a lot by runtime PM in the latest kernel.

Vitus,
can you please try a 2.6.36 kernel with CONFIG_PM_RUNTIME set and see if the problem still exists?
Comment 7 Vitus Jensen 2010-11-08 16:23:51 UTC
The kernel already had CONFIG_PM_RUNTIME enabled and when powertop prompted to press 'P' to  enable runtime power management I did so.

When booting to console and unloading USB drivers (usbcore remains in use, *hcd is unloaded) nothing changed.  After unloading ath5k powertop reports ca. 40 wakeups/s, but occasionally this skyrockets to 6.000 or more.  There is no pattern: 1 min, 2 min, no high wakeup count for 5 minutes.
Comment 8 Vitus Jensen 2010-11-08 16:25:31 UTC
Created attachment 36742 [details]
/proc/config.gz of running kernel

This is the configuration used in all tickless 2.6.36 kernels.
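
To check the options discussed in this thread on a running kernel, a small Python sketch like the following could be used. It assumes /proc/config.gz is available (which requires CONFIG_IKCONFIG_PROC) and uses the usual kernel config text format. The function name `read_tick_options` and the particular list of options are my own selection for illustration.

```python
import gzip

# Options mentioned in this report, plus CONFIG_HIGH_RES_TIMERS as an
# assumed related setting worth checking alongside them.
TICK_OPTIONS = (
    "CONFIG_NO_HZ",
    "CONFIG_HZ",
    "CONFIG_HZ_100",
    "CONFIG_HIGH_RES_TIMERS",
    "CONFIG_PM_RUNTIME",
)


def read_tick_options(path="/proc/config.gz", wanted=TICK_OPTIONS):
    """Return {option: value}; "n" if explicitly not set, None if absent."""
    found = {name: None for name in wanted}
    suffix = " is not set"
    with gzip.open(path, "rt") as fh:
        for line in fh:
            line = line.strip()
            if line.startswith("# ") and line.endswith(suffix):
                # Lines of the form "# CONFIG_FOO is not set"
                name = line[2:-len(suffix)]
                if name in found:
                    found[name] = "n"
            elif "=" in line and not line.startswith("#"):
                name, value = line.split("=", 1)
                if name in found:
                    found[name] = value
    return found
```

A tickless build would be expected to report `CONFIG_NO_HZ` as `"y"`; the CONFIG_HZ_100 build mentioned later in this report can be checked the same way.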
Comment 9 Zhang Rui 2010-11-09 00:38:25 UTC
(In reply to comment #7)
> When booting to console and unloading USB drivers (usbcore remains in use,
> *hcd is unloaded) nothing changed.  After unloading ath5k powertop reports
> ca. 40 wakeups/s, but occasionally this skyrockets to 6.000 or more.  There
> is no pattern: 1 min, 2 min, no high wakeup count for 5 minutes.

This sounds like an ath5k driver problem, as its interrupt count is growing fast as well.
Comment 10 Vitus Jensen 2010-11-09 06:00:13 UTC
(In reply to comment #9)
> (In reply to comment #7)
> > When booting to console and unloading USB drivers (usbcore remains in use,
> > *hcd is unloaded) nothing changed.  After unloading ath5k powertop reports
> > ca. 40 wakeups/s, but occasionally this skyrockets to 6.000 or more.  There
> > is no pattern: 1 min, 2 min, no high wakeup count for 5 minutes.
> 
> This sounds like an ath5k driver problem, as its interrupt count is growing
> fast as well.

Perhaps.  But it's only growing by 40-50 interrupts/s and not 50.000.  And the wakeups still occur after ath5k is unloaded, just very seldom.

I will switch over to cable ethernet and see how this reacts after really using the machine.
Comment 11 John W. Linville 2010-11-09 14:00:30 UTC
I'm not convinced that this is ath5k either, but let's Cc them just in case...
Comment 12 Bob Copeland 2010-11-10 05:24:39 UTC
I'm a bit confused, can you characterize the # of wakeups with/without ath5k?  You seem to say unloading it drops the number of wakeups, but that it doesn't have a lot of interrupts?  Or is the issue that #interrupts doesn't seem to correlate with wakeups as reported by powertop?

/proc/interrupts output seems to show about 400 interrupts on that shared line for 10 seconds, so 40/s which is probably not too unusual if the card is actively receiving beacons with a few APs around.
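
The arithmetic above (count difference between two snapshots divided by the interval) can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the parser assumes the usual /proc/interrupts layout (a header row of CPU columns, then one `label: counts... description` row per line), and any snapshot text passed in is up to the caller.

```python
def parse_interrupts(text):
    """Return {irq label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        total = 0
        # Only the first ncpus fields are counters; stop at the first
        # non-numeric field (controller name or description).
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            total += int(field)
        counts[label.strip()] = total
    return counts


def interrupt_rates(before, after, seconds):
    """Return {irq label: interrupts per second} between two snapshots."""
    first = parse_interrupts(before)
    second = parse_interrupts(after)
    return {
        irq: (second[irq] - first[irq]) / seconds
        for irq in second
        if irq in first
    }
```

Feeding it the two snapshots from attachment 36452 with `seconds=10` should reproduce the roughly 40/s figure for the shared line, assuming the snapshots parse as described.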
Comment 13 Vitus Jensen 2010-11-10 06:13:31 UTC
As ath5k doesn't currently support powersave, 40/s is indeed the value always shown for ath.  The output of powertop and /proc/interrupts match in this respect.

I've unloaded ath5k and its support modules and used the cable.  The next powertop outputs show that the high wakeup count isn't related to a specific interrupt line; it even happens when there is no noticeable interrupt activity.  But in that case I was only able to catch an 846/s case.
Comment 14 Vitus Jensen 2010-11-10 06:17:20 UTC
Created attachment 37002 [details]
67364 wakeups while wgetting at 10K/s via cable

ath5k unloaded, USB *hcd unloaded
Comment 15 Vitus Jensen 2010-11-10 06:28:19 UTC
Created attachment 37012 [details]
34700 wakeups while running mpg123 (47/s for audio)
Comment 16 Vitus Jensen 2010-11-10 22:24:05 UTC
Created attachment 37082 [details]
45000 wakeup/s while doing nothing (no ath5k, no USB)

This is only a screenshot of powertop, as it's not predictable when this happens without any noticeable interrupt activity.  I was watching powertop running in a screen session resumed from an ssh connection.  The usual wakeup rate in this constellation is around 40/s.
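
Since the spikes are unpredictable, logging per-second wakeup samples and flagging outliers afterwards may be easier than watching powertop. A minimal Python sketch of the flagging step, assuming the samples were already collected by some other means; the function name `find_spikes` and the factor of 10 over the median are arbitrary choices for illustration:

```python
from statistics import median


def find_spikes(samples, factor=10.0):
    """Return (index, value) pairs whose value exceeds factor * median(samples)."""
    if not samples:
        return []
    baseline = median(samples)
    return [(i, v) for i, v in enumerate(samples) if v > factor * baseline]
```

With a baseline around 40/s, both an 846/s and a 45000/s sample would be flagged at the default factor.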
Comment 17 John W. Linville 2011-01-25 20:16:10 UTC
Ok, not sure where we left this...is this issue still valid?
Comment 18 Vitus Jensen 2011-01-26 22:03:13 UTC
(In reply to comment #17)
> Ok, not sure where we left this...is this issue still valid?

It is.  Why shouldn't it be?

I have been running kernel 2.6.36-00001-gf6257ae very successfully since the 30th of October... with CONFIG_HZ_100=y.  The one commit is a port of the Embedded Controller driver v2.0 from 2.6.27 with ec_int=0.  EC interrupts kill the Thinkpad R51e.
Comment 19 John W. Linville 2011-01-26 22:15:08 UTC
Maybe I'm missing it, but it seems like you are just showing some numbers with no clear indication of either what you think the numbers should be or why you think ath5k is at fault?
Comment 20 Vitus Jensen 2011-01-27 01:23:12 UTC
John, perhaps this wasn't too clear, but in the very first comment I stated that the wakeup count I expect from a tickless kernel is 50-100 per second, depending on load.

As for ath5k, that was your idea, see comment #11
Comment 21 John W. Linville 2011-01-27 15:28:58 UTC
Ah, well thanks for sorting that out. :-)  FWIW, it looks more like it was Zhang in comment 9 -- Zhang, can you elaborate on what makes you suspect ath5k?
Comment 22 John W. Linville 2011-04-27 19:30:54 UTC
Well, I can't see anything I can do here -- sending back to ACPI... :-(
Comment 23 Zhang Rui 2012-01-18 02:23:18 UTC
It's great that kernel bugzilla is back.

Can you please verify whether the problem still exists in the latest upstream kernel?
Comment 24 Zhang Rui 2012-05-24 07:40:13 UTC
bug closed as there is no response from the bug reporter.
please feel free to reopen it if the problem still exists in the latest upstream kernel.
Comment 25 Vitus Jensen 2012-05-27 19:42:29 UTC
Sorry, I hadn't noticed that bugzilla is online again and mails were caught in the spam filter.

Yes, current kernels show a high wakeup rate too.  Tested with v3.4 from kernel.org on gentoo.  But as gentoo makes recreating it a little harder I tried some live CDs, too.
Comment 26 Vitus Jensen 2012-05-27 19:44:57 UTC
Created attachment 73428 [details]
finnix-104 (v3.2.0-1) after connecting wlan

powertop -d, /proc/stat, /proc/interrupts and dmesg output.
Comment 27 Vitus Jensen 2012-05-28 09:41:12 UTC
knoppix v6.7.1 shows a high wakeup rate, too.  But that is kernel 3.0; finnix was the newest I could find.

Meanwhile I experimented with kernel v3.4 and idle=xxx on the kernel command line, with a fully booted system (gnome, wlan, nfsclient):

idle=mwait    85 wakeups
idle=halt     85 wakeups
idle=poll     110 wakeups
idle=nomwait  low wakeups, system freeze
(no idle)     around 50.000 wakeups, system freeze

The freezes might be related to https://bugzilla.kernel.org/show_bug.cgi?id=13155, I haven't backported the old EC driver to 3.4.

While powertop no longer reports the CPU entering any C-states, the system has been running stably with idle=mwait since yesterday and suspends/resumes fine.  Does this tell you something?  Is this a workaround or the recommended way to use a tickless kernel in case of problems?
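
To cross-check what powertop shows for the different idle= settings, the cpuidle counters in sysfs can be read directly. The Python sketch below assumes the standard cpuidle layout (`stateN/name`, `stateN/usage` as entry count, `stateN/time` in microseconds) under /sys/devices/system/cpu/cpu0/cpuidle. The function names are hypothetical, and with an idle= override the directory may well contain no states at all, which would match powertop showing no C-states.

```python
from pathlib import Path


def read_cpuidle(cpu_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return {state name: {"usage": entries, "time_us": residency in us}}."""
    states = {}
    for state in sorted(Path(cpu_dir).glob("state*")):
        name = (state / "name").read_text().strip()
        states[name] = {
            "usage": int((state / "usage").read_text()),
            "time_us": int((state / "time").read_text()),
        }
    return states


def cstate_delta(before, after, seconds):
    """Return per-state entries per second and residency percentage."""
    report = {}
    for name, now in after.items():
        prev = before.get(name)
        if prev is None:
            continue
        entries = now["usage"] - prev["usage"]
        resident_us = now["time_us"] - prev["time_us"]
        report[name] = {
            "entries_per_s": entries / seconds,
            "residency_pct": 100.0 * resident_us / (seconds * 1_000_000),
        }
    return report
```

Taking two `read_cpuidle()` snapshots some seconds apart and passing them to `cstate_delta` gives a rough view of how often each C-state is entered; treating the sum of entries per second as a proxy for wakeups is an assumption on my part, not a statement about how powertop computes its figure.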
Comment 28 Len Brown 2012-10-30 02:11:01 UTC
is this still an issue with the current upstream kernel?
Comment 29 Vitus Jensen 2012-11-11 01:28:31 UTC
I sold that laptop last week.  Besides this wakeup problem, it had kernel panics on half of the boots after changing the wlan AP from d-link to zyxel.  The last kernel I used was 3.4 with the idle=mwait parameter.

If someone still uses the Thinkpad R51e I recommend kernel 2.6.23 (no EC problems) and the separate madwifi package (which supports wlan powersave mode).
Comment 30 Zhang Rui 2012-11-28 03:02:45 UTC
(In reply to comment #29)
> I sold that laptop last week,

Well, sorry about that. :(

I'll close this bug as we cannot debug any further without reproducing it.
Anyone who can reproduce the problem on the same laptop model can drop a note here.