Subject : [2.6.35.x regression] rcu_preempt_state stall warning and machine slow-downs
Submitter : Matthias Dahl <ml_kernel@mortal-soul.de>
Date : 2010-09-01 6:47
Message-ID : 201009010847.20236.ml_kernel@mortal-soul.de
References : http://marc.info/?l=linux-kernel&m=128332418629305&w=2

This entry is being used for tracking a regression from 2.6.34. Please don't close it until the problem is fixed in the mainline.
Created attachment 29122: kernel stall warning

This just happened to me again while shutting down. It took several minutes for the shutdown to complete for no apparent reason.
Created attachment 29242: another rcu stall warning

Sorry for posting again, but this bug is really nerve-racking and my knowledge of the kernel internals is not complete enough for me to trace this myself. Attached is another instance of an rcu stall warning. This almost exclusively happens while the machine is under load. All kernels prior to 2.6.35 are fine. What shows up in all backtraces is that one core is in delay_tsc(). I don't know if this is relevant.

Yesterday I saw the following effect, which falls under the slowdown category: while watching a video w/ vlc, the video and audio started stuttering and mostly recovered whenever I moved the mouse around. As soon as I stopped, the stuttering was back. This has happened before, too, and exclusively w/ 2.6.35.x. :-(
Forgot the following important bit: restarting X or anything similar did not help. I had to restart the machine, which took longer than usual. There were no processes stuck or consuming any relevant amount of CPU time.
Created attachment 29922: rcu stall w/ nasty slowdown

This is a backtrace from an rcu stall warning that happened w/ a nasty slowdown.
I have the exact same problem; you described it very well. I use a Lenovo G550 laptop. During video playback, /var/log/messages shows:

Sep 15 12:15:51 mobil pulseaudio[1456]: ratelimit.c: 19 events suppressed
Sep 15 12:15:57 mobil pulseaudio[1456]: ratelimit.c: 25 events suppressed
Sep 15 12:16:02 mobil pulseaudio[1456]: ratelimit.c: 6 events suppressed

These warnings stop when I stop the video. The system, however, stays unusably slow / "sticky". Reboot takes too long at that point, so I mostly interrupt it. Could you tell me where I can check for these rcu stall warnings as well? Thanks!
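On where to look: kernel warnings such as rcu stall reports go to the kernel log, which can usually be read with dmesg and is often also copied to /var/log/messages, depending on the syslog setup. A minimal sketch for filtering such lines is below. It assumes that a stall warning has "rcu" and "stall" on the same line; the exact wording varies between kernel versions, and the function name find_rcu_stalls is made up for this example.

```python
import re

# Assumption: a stall warning mentions "rcu..." and "stall" on the same line.
# The exact message text differs between kernel versions, so the pattern is
# kept deliberately loose.
STALL_RE = re.compile(r"rcu\w*.*\bstall", re.IGNORECASE)


def find_rcu_stalls(lines):
    """Return the log lines that look like RCU stall warnings."""
    return [line.rstrip("\n") for line in lines if STALL_RE.search(line)]
```

To use it, pass the lines of the kernel log, for example the captured output of dmesg or an open /var/log/messages file, and print whatever comes back. An empty result only means nothing matched the loose pattern above, not that no stall occurred.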
Created attachment 30142: Fix by Thomas Gleixner

This fixes the problem for me on 2.6.35. I'll test it against a current tree as well. Author: Thomas Gleixner
> --- Comment #6 from Martin Kepplinger <martinkepplinger@eml.cc> 2010-09-15 17:14:07 ---
> Created an attachment (id=30142)
> --> (https://bugzilla.kernel.org/attachment.cgi?id=30142)
> Fix by Thomas Gleixner
>
> This fixes the problem for me on 2.6.35. I'll test it against a current tree
> as well. Author: Thomas Gleixner

That fix is in 2.6.35.5 now. Can the other reporters please re-test?

Thanks,
tglx
I'm running 2.6.35.5 without problems. I can really call it stable (and actually use it) now. Thanks for the effort, martin
Hi. Sorry for my late response. I've been running 2.6.35.4 w/ both patches posted to the kernel list (by Thomas Gleixner) for several days now and have seen no more rcu stall warnings or slowdowns. Thanks _a lot_ for the fix. So long, matthias.