http://i.imgur.com/s3Asc.png I'm seeing this issue on multiple servers. When I change the kernel from 3.2.23 to 3.6.3, 3.4.15, 3.2.32 or 3.2.27, I get loadavg spikes every 1-3 hours (see the PNG). Something must have gone wrong somewhere between 3.2.24 and 3.2.27. During a spike the CPU shows about 40% sys and 40% usr load; loadavg jumps from 0.3 (a very idle system) to 35 for about a second and then falls quickly.
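For anyone who wants to catch the spikes with timestamps rather than from a graph, here is a minimal sketch of a logger that polls /proc/loadavg once per second. The function names, the 5.0 threshold and the 1-second interval are my own assumptions for illustration, not part of any existing tool.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def is_spike(load1, threshold=5.0):
    """True when the 1-minute load average reaches the (assumed) threshold."""
    return load1 >= threshold


def watch(path="/proc/loadavg", interval=1.0, threshold=5.0):
    """Poll the load average forever and print a timestamped line on each spike."""
    while True:
        with open(path) as f:
            load1, _, _ = parse_loadavg(f.read())
        if is_spike(load1, threshold):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, "loadavg spike:", load1, flush=True)
        time.sleep(interval)
```

Calling `watch()` and leaving it running in a terminal or under nohup should give a list of spike times that can be compared across kernel versions.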
Through a long and painful process I've determined that 3.2.23 works fine and 3.2.24 has this bug. Can you check which changes in 3.2.24 could be responsible for such loadavg spikes?
I've made a diff and I see that 3.2.24 is the version where leap seconds were fixed. Perhaps that fix has some unintended consequences that cause this problem? Something that does not play well with the core distribution files (I use an old CentOS 5.2 and only update services that are accessible from the web).
I see that 3.2.24 made some changes to NO_HZ. When I recompiled with CONFIG_NO_HZ=n the problem went away. Why was it there in the first place?
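To confirm which setting a given kernel was actually built with, something like the following sketch can read the option from a kernel config file. It assumes the config is installed at /boot/config-<release>, which is a common distribution convention but may not hold for a hand-built kernel; the helper names are hypothetical.

```python
import os


def config_value(config_text, option):
    """Return the value of a kernel config option, "n" if unset, None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as a comment: "# CONFIG_FOO is not set"
        if line == "# %s is not set" % option:
            return "n"
        # Enabled options appear as "CONFIG_FOO=value"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def running_kernel_config_path():
    """Assumed location of the running kernel's config (not guaranteed to exist)."""
    return "/boot/config-" + os.uname()[2]
```

For example, `config_value(open(running_kernel_config_path()).read(), "CONFIG_NO_HZ")` should return "y" on a tickless kernel and "n" on one built with CONFIG_NO_HZ disabled, provided the file exists at that path.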