Bug 8100

Summary: dynticks makes ksoftirqd1 use unreasonable amount of cpu time
Product: Timers
Reporter: Emil Karlson (jekarlson)
Component: Other
Assignee: john stultz (john.stultz)
Status: CLOSED CODE_FIX
Severity: low
CC: bunk, tglx
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.21-rc2
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  lspci -vvv on my macbook
  dmesg on 2.6.21-rc2 with dynticks
  a working config
  the dysfunctional config
  trace-it output requested
  Patch to fix the problem

Description Emil Karlson 2007-02-28 09:34:07 UTC
Most recent kernel where this bug did *NOT* occur:
any kernel without dynticks

Distribution:
Debian etch with linux-2.6.21-rc{2,1}

Hardware Environment: 
Macbook core2 with bios emulation

Software Environment:
The problem is obvious when listening to a shoutcast stream with kmplayer and 
artsd over wi-fi with WPA (wpa_supplicant)

Problem Description:
ksoftirqd1 uses ~30% CPU time (according to top) with no other symptoms, while
without dynticks the CPU load in similar circumstances is negligible.
This might be a dynticks feature rather than a bug.

Steps to reproduce:
Just watch top; if the bug is reproducible, booting alone should probably 
suffice.
Comment 1 Emil Karlson 2007-02-28 09:49:32 UTC
Created attachment 10553 [details]
lspci -vvv on my macbook
Comment 2 Emil Karlson 2007-02-28 09:51:49 UTC
Created attachment 10554 [details]
dmesg on 2.6.21-rc2 with dynticks
Comment 3 Anonymous Emailer 2007-02-28 14:00:27 UTC
Reply-To: akpm@linux-foundation.org


Comment 4 Thomas Gleixner 2007-02-28 14:40:13 UTC
Is there a way to reproduce this with a simpler software environment ?

	tglx


Comment 5 Emil Karlson 2007-03-01 01:38:51 UTC
Mocp gives much milder results, about 2% of CPU time; just idling gives ~1%, 
which is still higher than on unaffected kernels.

Seems that just mplayer on local files suffices to give ~30%.

Just noticed that the wi-fi driver was partially binary (ndiswrapper), sorry 
about that. The problem is still reproducible without it (module not loaded).

MPlayer dev-SVN-rUNKNOWN-4.1.2 (C) 2000-2007 MPlayer Team
CPU: Intel(R) Core(TM)2 CPU         T7200  @ 2.00GHz (Family: 6, Model: 15, Stepping: 6)
CPUflags:  MMX: 1 MMX2: 1 3DNow: 0 3DNow2: 0 SSE: 1 SSE2: 1
Compiled with runtime CPU detection.
Can't open joystick device /dev/input/js0: No such file or directory
Can't init input joystick
mplayer: could not connect to socket
mplayer: No such file or directory
Failed to open LIRC support. You will not be able to use your remote control.

Playing kompressormusic_-_VITAMINS_ARE_GOOD.mp3.
Audio file file format detected.
==========================================================================
Opening audio decoder: [mp3lib] MPEG layer-2, layer-3
AUDIO: 44100 Hz, 2 ch, s16le, 128.0 kbit/9.07% (ratio: 16000->176400)
Selected audio codec: [mp3] afm: mp3lib (mp3lib MPEG layer-2, layer-3)
==========================================================================
AO: [alsa] 48000Hz 2ch s16le (2 bytes per sample)
Video: no video
Comment 6 Emil Karlson 2007-03-08 01:43:19 UTC
Created attachment 10653 [details]
a working config

This seems to be at least partially a configuration issue: I added some timer
options to the config on 2.6.21-rc2, which pacified ksoftirqd. With this config
the kernel no longer shows the bug.
Comment 7 Emil Karlson 2007-03-08 01:48:58 UTC
Created attachment 10654 [details]
the dysfunctional config

This config does not work: the bug appears with it.
Comment 8 Anonymous Emailer 2007-03-09 13:23:50 UTC
Reply-To: pavel@ucw.cz

Hi!

> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > Problem Description:
> > ksoftirqd1 uses ~30% cpu-time (by top) no other symptoms, while
> > without dynticks cpu-load in similar circumstances is negligible.
> > This might be a dynticks feature rather than bug.

top lies. top has always lied, now it lies more. Do not trust top.
RESOLVED/INVALID?

							Pavel
Comment 9 Emil Karlson 2007-03-09 20:32:33 UTC
If top lies, in this case, it also does a good job in convincing the cpu fan to 
add cycles. I can get ps readings, if you want.
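
One way to take such readings without top is to read the cumulative CPU ticks that the kernel reports for the thread in /proc/<pid>/stat (fields 14 and 15, utime and stime) and compare two samples taken some seconds apart. The following is a minimal sketch under that assumption; the function names parse_stat_ticks and read_cpu_ticks are hypothetical and are not taken from any tool mentioned in this report. Note that these figures come from the kernel's own accounting, so they would not settle whether the accounting itself is skewed under dynticks.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the utime and stime fields (fields 14 and 15, in clock ticks)
 * out of one line in /proc/<pid>/stat format.
 * Returns 0 on success, -1 if the line cannot be parsed.
 */
int parse_stat_ticks(const char *line,
                     unsigned long *utime, unsigned long *stime)
{
    /* The command name (field 2) is wrapped in parentheses and may
     * itself contain spaces, so start parsing after the last ')'. */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;

    /* Skip field 3 (state) and fields 4-13, then read fields 14 and 15. */
    int n = sscanf(p + 1,
                   " %*c %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %lu %lu",
                   utime, stime);
    return n == 2 ? 0 : -1;
}

/* Read the cumulative CPU ticks for a given pid; returns 0 on success. */
int read_cpu_ticks(int pid, unsigned long *utime, unsigned long *stime)
{
    char path[64], line[1024];
    snprintf(path, sizeof(path), "/proc/%d/stat", pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof(line), f);
    fclose(f);
    return ok ? parse_stat_ticks(line, utime, stime) : -1;
}
```

Calling read_cpu_ticks twice for the ksoftirqd pid and dividing the tick difference by the elapsed time times sysconf(_SC_CLK_TCK) gives the fraction of one CPU used over that interval.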
Comment 10 Thomas Gleixner 2007-03-10 00:29:38 UTC
> If top lies, in this case, it also does a good job in convincing the cpu fan
> to add cycles. I can get ps readings, if you want.

I diffed your good and bad configs. It changes way too many options at
once, so it's hard to tell what might be the cause of the problem.

Can you work from the good one and identify the option, which makes
things actually bad ?

Thanks

	tglx
Comment 11 Emil Karlson 2007-03-10 03:08:19 UTC
CONFIG_HIGH_RES_TIMERS=y

Unsetting this alone will make the problem reappear.
Comment 12 Thomas Gleixner 2007-03-10 03:28:18 UTC
> CONFIG_HIGH_RES_TIMERS=y
> 
> Unsetting this alone will make the problem reappear.

What happens, if you switch off CONFIG_NO_HZ too ?

	tglx


Comment 13 Emil Karlson 2007-03-10 04:07:08 UTC
Unsetting CONFIG_NO_HZ fixes the problem but also deselects 
CONFIG_TICK_ONESHOT.
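
The findings so far suggest that the affected combination is CONFIG_NO_HZ=y with CONFIG_HIGH_RES_TIMERS unset. A minimal sketch of checking a kernel .config for that combination follows. It assumes the usual .config layout, where an enabled option appears as "CONFIG_X=y" and a disabled one as "# CONFIG_X is not set"; the function names config_enabled and is_affected_combination are hypothetical and only illustrate the check.

```c
#include <string.h>

/*
 * Return 1 if the config text contains a line that is exactly
 * "<option>=y", 0 otherwise. Unset options ("# <option> is not set")
 * and options that merely share a prefix do not match.
 */
int config_enabled(const char *config_text, const char *option)
{
    size_t optlen = strlen(option);
    const char *line = config_text;

    while (line != NULL && *line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len == optlen + 2 &&
            strncmp(line, option, optlen) == 0 &&
            line[optlen] == '=' && line[optlen + 1] == 'y')
            return 1;

        /* Advance to the next line, or stop at the end of the text. */
        line = eol ? eol + 1 : NULL;
    }
    return 0;
}

/* The combination reported as affected in this bug:
 * dynticks enabled, high-resolution timers disabled. */
int is_affected_combination(const char *config_text)
{
    return config_enabled(config_text, "CONFIG_NO_HZ") &&
           !config_enabled(config_text, "CONFIG_HIGH_RES_TIMERS");
}
```

The exact-length comparison is deliberate: it keeps a longer option name that merely starts with the same prefix from being counted as a match.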
Comment 14 Thomas Gleixner 2007-03-10 04:55:58 UTC
> Unsetting CONFIG_NO_HZ fixes the problem but also deselects 
> CONFIG_TICK_ONESHOT.

Right, this depends on each other.

Is the problem still there with -rc3 ?

	tglx


Comment 15 Emil Karlson 2007-03-10 05:18:04 UTC
yes, the problem is still present in 2.6.21-rc3
Comment 16 Emil Karlson 2007-03-24 13:17:29 UTC
Created attachment 10934 [details]
trace-it output requested

bugzilla does not accept files bigger than 10000kiB, sorry.
Comment 17 Thomas Gleixner 2007-03-24 23:54:29 UTC
> Created an attachment (id=10934)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=10934&action=view)
> trace-it output requested
> 
> bugzilla does not accept files bigger than 10000kiB, sorry.

No problem, bz2 is fine. Something went wrong with the tracing. It's only
an event trace.

Bah. I missed the line:

	system("echo 1 > /proc/sys/kernel/mcount_enabled");

New version at:
http://tglx.de/private/tglx/2.6.21-rc4-trace/trace-it.c

Sorry. Can you please retry ?

Thanks,

	tglx
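
The missing line quoted above switches on mcount_enabled by running a shell command. A minimal sketch of doing the same write directly from C is shown below. It assumes a kernel with the tracing patch that provides /proc/sys/kernel/mcount_enabled; the helper name write_value is hypothetical and is not taken from trace-it.c.

```c
#include <stdio.h>

/*
 * Write a short string value to a file such as a /proc/sys entry.
 * Returns 0 on success, -1 on failure.
 */
int write_value(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(value, f) >= 0;
    /* fclose flushes the buffer; a failed flush means the write
     * did not reach the file. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With this helper the step would read write_value("/proc/sys/kernel/mcount_enabled", "1\n"), and a failure (for example, the file not existing on an unpatched kernel) shows up in the return value instead of being lost in the shell.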



Comment 18 Emil Karlson 2007-03-25 02:24:07 UTC
The new trace can be found at 
http://users.tkk.fi/~jkarlson/tavaraa/trace-with-mplayer.gz

4MiB unpacked...
Comment 19 Thomas Gleixner 2007-03-25 03:33:12 UTC
> The new trace can be found at 
> http://users.tkk.fi/~jkarlson/tavaraa/trace-with-mplayer.gz
> 
> 4MiB unpacked...

Good. This gives more info. Please try the attached patch.

Thanks

	tglx

Comment 20 Thomas Gleixner 2007-03-25 03:35:11 UTC
Created attachment 10935 [details]
Patch to fix the problem

Gack. It discards the patch in the mail.
Comment 21 Emil Karlson 2007-03-25 04:35:04 UTC
Tried the patch; it works on 2.6.21-rc4 with dynticks and without hrtimers. As 
far as I can tell, it fixes the problem.
Comment 22 Adrian Bunk 2007-03-25 20:52:36 UTC
The patch fixing this bug is included in 2.6.21-rc5.