Bug 10616

Summary: Horrendous Audio Stutter - current git
Product: Process Management
Reporter: Rafael J. Wysocki (rjw)
Component: Scheduler
Assignee: Ingo Molnar (mingo)
Status: CLOSED CODE_FIX    
Severity: normal
CC: a.p.zijlstra, dhaval, elendil, seanlkml
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.5.25-git
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 10492    

Description Rafael J. Wysocki 2008-05-07 15:26:18 UTC
Subject    : Horrendous Audio Stutter - current git
Submitter  : "Parag Warudkar" <parag.warudkar@gmail.com>
Date       : 2008-05-02 20:14
References : http://lkml.org/lkml/2008/5/1/440
Handled-By : Peter Zijlstra <peterz@infradead.org>
Patch      : http://lkml.org/lkml/2008/5/2/126

This entry is being used for tracking a regression from 2.6.25.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-05-12 12:26:29 UTC
*** Bug 10640 has been marked as a duplicate of this bug. ***
Comment 2 Rafael J. Wysocki 2008-05-15 14:40:37 UTC
Regressions list annotation:
References : http://lkml.org/lkml/2008/5/11/230
Comment 3 Rafael J. Wysocki 2008-05-20 15:28:11 UTC
Regressions list annotation:
References : http://lkml.org/lkml/2008/5/18/178
Comment 4 Rafael J. Wysocki 2008-06-08 09:46:41 UTC
Frans Pop said:

"This issue was traced to FAIR_GROUP_SCHED, which is no longer present 
in -rc5 as the patches that introduced it have been reverted."

Closing.
Comment 5 Andrew Clayton 2008-06-12 06:15:11 UTC
I hate to be a party pooper, but last night I was running 2.6.26-rc5-git2 and compiling 2.6.26-rc5-git5 with music playing in Rhythmbox, and I was getting audio skips all over the place.

Things have never been quite right since early 2.6.25-rcX. By the time 2.6.25 was out things had improved a lot but there was still the odd bit of stuttering (audio and desktop/mouse wise).

Early 2.6.26-rcX was seemingly looking pretty good. But -rc5-git is looking almost as bad as 2.6.25-rc was to me.

I've never had any of the fair/group scheduling stuff enabled. I will double check again tonight in case it got enabled somehow.

System is an AMD Athlon 1.5GHz (UP) with 768MB RAM running Fedora 8. 
Comment 6 Ingo Molnar 2008-06-12 06:32:07 UTC
> Early 2.6.26-rcX was seemingly looking pretty good. But -rc5-git is 
> looking almost as bad as 2.6.25-rc was to me.

in case you'd like to try another kernel, it might be worth trying 
latest tip/master:

  http://people.redhat.com/mingo/tip.git/README

just to see whether this problem got fixed via tip/sched-devel ...
Comment 7 Andrew Clayton 2008-06-12 16:55:54 UTC
Thanks, I'll keep that in mind.

Interestingly I've just built 2.6.26-rc6 under 2.6.26-rc5-git5 without a glitch.
Comment 8 Ingo Molnar 2008-06-12 21:43:31 UTC
> Thanks, I'll keep that in mind.
> 
> Interestingly I've just built 2.6.26-rc6 under 2.6.26-rc5-git5 without 
> a glitch.

another thing to keep in mind is latencytop: that tool can help bring 
actual hard numbers about the type of delays that happen on your system, 
and their sources within the kernel. See http://latencytop.org - you 
need that tool plus a CONFIG_LATENCYTOP=y kernel.

	Ingo
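
Both CONFIG_LATENCYTOP (needed for the tool mentioned above) and CONFIG_FAIR_GROUP_SCHED (the option this regression was traced to) can be checked in the kernel's build configuration. A minimal sketch, assuming the config text has already been read from somewhere like /boot/config-<release> or a decompressed /proc/config.gz; which of those exists depends on the distribution and on how the kernel was built. The helper name config_value and the sample text are made up for illustration:

```python
import re

def config_value(config_text, option):
    """Return the value of CONFIG_<option> (e.g. 'y' or 'm'), or None if it is not set."""
    # Enabled options appear as "CONFIG_FOO=y"; disabled ones appear only
    # as a comment ("# CONFIG_FOO is not set"), which this pattern skips.
    pattern = re.compile(r"^CONFIG_%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None

# Hypothetical config excerpt, not taken from any system in this report.
sample = """\
CONFIG_LATENCYTOP=y
# CONFIG_FAIR_GROUP_SCHED is not set
CONFIG_HZ=1000
"""

for opt in ("LATENCYTOP", "FAIR_GROUP_SCHED"):
    print(opt, config_value(sample, opt))
```

On a real system, config_text would come from reading one of the files named above instead of the inline sample.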
Comment 9 Sean Estabrooks 2008-06-15 19:30:02 UTC
Getting severe interactivity problems here too with 2.6.26-rc6, here are a couple samples from latency top in case it helps:

Cause                                                Maximum     Percentage
Scheduler: waiting for cpu                        106.0 msec         35.3 %
blk_execute_rq scsi_execute scsi_execute_req scsi_  7.0 msec          5.0 %
do_sys_poll sys_poll system_call_after_swapgs       4.9 msec         27.5 %
do_select core_sys_select sys_select system_call_a  4.9 msec         18.7 %
blk_execute_rq scsi_execute scsi_execute_req sd_re  4.9 msec          6.7 %
futex_wait do_futex sys_futex system_call_after_sw  4.9 msec          5.6 %
blk_execute_rq sg_io scsi_cmd_ioctl cdrom_ioctl id  1.9 msec          0.5 %
log_wait_commit journal_stop journal_force_commit   1.6 msec          0.1 %
ide_do_drive_cmd ide_cd_queue_pc cdrom_check_statu  1.0 msec          0.3 %
cd_open do_open blkdev_open


Cause                                                 Maximum     Percentage
sync_buffer __wait_on_buffer sync_dirty_buffer jou 642.4 msec          7.0 %
sync_page sync_page_killable __lock_page_killable  252.9 msec         19.6 %
sync_buffer __wait_on_buffer __bread ext3_free_bra 175.8 msec          2.0 %
get_request_wait __make_request generic_make_reque 150.4 msec          1.4 %
sync_buffer __wait_on_buffer __ext3_get_inode_loc  146.9 msec          0.7 %
sync_buffer __wait_on_buffer __bread ext3_free_bra 141.6 msec          0.8 %
sync_buffer __wait_on_buffer bh_submit_read read_b 112.2 msec          0.3 %
sync_page __lock_page find_lock_page filemap_fault  96.1 msec          0.5 %
sync_buffer __wait_on_buffer __bread ext3_free_bra  87.5 msec          2.9 %


Happy to provide additional info or tests as requested...
Cheers
Comment 10 Peter Zijlstra 2008-06-17 02:10:41 UTC
On Sun, 2008-06-15 at 19:30 -0700, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=10616

> ------- Comment #9 from seanlkml@sympatico.ca  2008-06-15 19:30 -------
> Getting severe interactivity problems here too with 2.6.26-rc6, here are a
> couple samples from latency top in case it helps:
> 
> Cause                                                Maximum     Percentage
> Scheduler: waiting for cpu                        106.0 msec         35.3 %

106ms isn't too bad - depending on how busy the system is.

> Cause                                                 Maximum     Percentage
> sync_buffer __wait_on_buffer sync_dirty_buffer jou 642.4 msec          7.0 %
> sync_page sync_page_killable __lock_page_killable  252.9 msec         19.6 %
> sync_buffer __wait_on_buffer __bread ext3_free_bra 175.8 msec          2.0 %
> get_request_wait __make_request generic_make_reque 150.4 msec          1.4 %
> sync_buffer __wait_on_buffer __ext3_get_inode_loc  146.9 msec          0.7 %
> sync_buffer __wait_on_buffer __bread ext3_free_bra 141.6 msec          0.8 %
> sync_buffer __wait_on_buffer bh_submit_read read_b 112.2 msec          0.3 %
> sync_page __lock_page find_lock_page filemap_fault  96.1 msec          0.5 %
> sync_buffer __wait_on_buffer __bread ext3_free_bra  87.5 msec          2.9 %

This looks like ext3 borkage and, while annoying, isn't something the
scheduler can do anything about.
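
The distinction drawn above, scheduler wait time versus time spent blocked in filesystem and block-layer paths, can be pulled out of latencytop samples mechanically. A rough sketch, assuming the row layout seen in this thread (a truncated cause string, then "N.N msec", then "N.N %"); the three sample rows are copied from the two samples in comment 9, and parse_rows is a hypothetical helper, not part of latencytop. The pattern matches from the end of the line, so a cause truncated right on a digit would be misread:

```python
import re

# Cause text, then the maximum latency in msec, then the percentage.
# The cause is matched lazily because the tool's fixed-width column can
# leave no space before the number (e.g. "jou642.4 msec").
ROW = re.compile(
    r"^(?P<cause>.*?)\s*(?P<max_ms>\d+\.\d) msec\s+(?P<pct>\d+\.\d) %\s*$"
)

def parse_rows(text):
    """Return (cause, max_ms, pct) tuples for every data row in text."""
    rows = []
    for line in text.splitlines():
        line = line.lstrip("> ")  # tolerate mail-style quoting
        m = ROW.match(line)
        if m:  # header and blank lines do not match and are skipped
            rows.append(
                (m.group("cause"), float(m.group("max_ms")), float(m.group("pct")))
            )
    return rows

sample = """\
Cause                                                Maximum     Percentage
Scheduler: waiting for cpu                        106.0 msec         35.3 %
sync_buffer __wait_on_buffer sync_dirty_buffer jou642.4 msec          7.0 %
sync_page sync_page_killable __lock_page_killable 252.9 msec         19.6 %
"""

rows = parse_rows(sample)
worst = max(rows, key=lambda r: r[1])

# List the rows worst-first, marking which ones are scheduler waits.
for cause, max_ms, pct in sorted(rows, key=lambda r: r[1], reverse=True):
    kind = "scheduler" if cause.startswith("Scheduler:") else "other"
    print(f"{max_ms:7.1f} ms  {pct:5.1f} %  [{kind}]  {cause}")
```

Sorting by the maximum column puts the sync_buffer and sync_page waits ahead of the scheduler row, which is the reading given above.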