Bug 37322
Summary: | high load, unresponsive system spinning in rcu_kthread | ||
---|---|---|---|
Product: | Process Management | Reporter: | Ferenc Wágner (wferi) |
Component: | Scheduler | Assignee: | Ingo Molnar (mingo) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | alan, florian, maciej.rutecki, paulmck, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.0.0-rc6 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 32012 | ||
Attachments: |
somewhat pared down syslog from bootup to shutdown
config of the affected kernel (2.6.38.3 with an identical config never produced this issue) Diagnostic patch for TINY_RCU's presumed wakeup storms. screenshot of 'top' and syslog shortly before the system recovered RCU trace output around system breakdown (gzip compressed) |
Description
Ferenc Wágner
2011-06-12 14:53:34 UTC
Created attachment 61682 [details]
config of the affected kernel (2.6.38.3 with an identical config never produced this issue)
Created attachment 61882 [details]
Diagnostic patch for TINY_RCU's presumed wakeup storms.
Hmmm... Not much change in TINY_RCU between 2.6.38 and 2.6.39.
Nevertheless, if there is a problem, the most likely cause is some sort of wakeup storm. Could you please try out this diagnostic patch? It should do a once-per-10-seconds printk() to check up on the wakeup-storm assumption.
Thanks for looking into this! I backported your patch in attachment 61882 [details] to 2.6.39.1 like:
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 0c343b9..ebfa4bf 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -213,11 +213,20 @@ static int rcu_kthread(void *arg)
*/
static void invoke_rcu_kthread(void)
{
+ static unsigned long oldjiffies;
+ static unsigned long invokecount;
unsigned long flags;
local_irq_save(flags);
have_rcu_kthread_work = 1;
wake_up(&rcu_kthread_wq);
+ invokecount++;
+ if (UINT_CMP_GE(jiffies, oldjiffies + 10 * HZ)) {
+ printk(KERN_ALERT "invoke_rcu_kthread(): %lu invocations/s",
+ invokecount / 10);
+ oldjiffies = jiffies;
+ invokecount = 0;
+ }
local_irq_restore(flags);
}
Hope it's acceptable. Booting with it shows the following:
wferi@szonett:~$ dmesg|fgrep rcu
[ 310.523674] invoke_rcu_kthread(): 959 invocations/s
[ 321.496407] invoke_rcu_kthread(): 7 invocations/s
[ 333.496689] invoke_rcu_kthread(): 1 invocations/s
[ 343.508377] invoke_rcu_kthread(): 1 invocations/s
[ 353.912314] invoke_rcu_kthread(): 2 invocations/s
[ 363.943577] invoke_rcu_kthread(): 2 invocations/s
[ 373.944206] invoke_rcu_kthread(): 1 invocations/s
[ 383.944860] invoke_rcu_kthread(): 2 invocations/s
[ 393.945506] invoke_rcu_kthread(): 1 invocations/s
[ 403.946143] invoke_rcu_kthread(): 1 invocations/s
[ 413.946796] invoke_rcu_kthread(): 1 invocations/s
[ 423.947459] invoke_rcu_kthread(): 2 invocations/s
[ 433.948121] invoke_rcu_kthread(): 1 invocations/s
[ 443.948690] invoke_rcu_kthread(): 14 invocations/s
[ 453.949253] invoke_rcu_kthread(): 36 invocations/s
[ 463.949801] invoke_rcu_kthread(): 23 invocations/s
[ 473.950289] invoke_rcu_kthread(): 63 invocations/s
[ 483.950788] invoke_rcu_kthread(): 49 invocations/s
[ 493.951303] invoke_rcu_kthread(): 23 invocations/s
[ 503.952004] invoke_rcu_kthread(): 12 invocations/s
[ 513.952603] invoke_rcu_kthread(): 10 invocations/s
[ 523.953099] invoke_rcu_kthread(): 20 invocations/s
[ 533.953716] invoke_rcu_kthread(): 17 invocations/s
[ 543.954314] invoke_rcu_kthread(): 21 invocations/s
[ 553.954789] invoke_rcu_kthread(): 15 invocations/s
[ 563.955380] invoke_rcu_kthread(): 3 invocations/s
[ 573.955967] invoke_rcu_kthread(): 10 invocations/s
[ 583.955423] invoke_rcu_kthread(): 23 invocations/s
[ 593.956965] invoke_rcu_kthread(): 10 invocations/s
[ 603.957667] invoke_rcu_kthread(): 2 invocations/s
[ 613.958251] invoke_rcu_kthread(): 2 invocations/s
[ 623.958932] invoke_rcu_kthread(): 2 invocations/s
[ 633.959554] invoke_rcu_kthread(): 2 invocations/s
[ 643.960229] invoke_rcu_kthread(): 1 invocations/s
[ 653.960806] invoke_rcu_kthread(): 2 invocations/s
[ 663.961514] invoke_rcu_kthread(): 2 invocations/s
[ 673.962195] invoke_rcu_kthread(): 1 invocations/s
[ 683.962732] invoke_rcu_kthread(): 26 invocations/s
[ 693.963290] invoke_rcu_kthread(): 9 invocations/s
[ 703.963999] invoke_rcu_kthread(): 2 invocations/s
[ 713.964699] invoke_rcu_kthread(): 1 invocations/s
[ 723.965283] invoke_rcu_kthread(): 1 invocations/s
[ 733.965950] invoke_rcu_kthread(): 1 invocations/s
I wonder why it took 5 minutes for the first message to appear...
Anyway, the problem isn't manifest yet and I haven't got a reliable way to reproduce it. But hopefully it will reproduce itself quickly.
Okay, I hit the issue again. No idea what provoked it, I was just chatting away in Pidgin for quite some time already (but the system previously survived much longer Pidgin sessions after the last reboot). Interestingly, the system also recovered this time after 20 minutes of struggling, shortly after I managed to quit Pidgin... This didn't happen when I first encountered the issue. Anyway, under normal system operation I mostly get 2 or 3 invoke_rcu_kthread() invocations per second. While the system was unusable due to rcu_kthread spinning the CPU, I got only 0/1 invocations per second. This doesn't look like a wakeup storm for me (or I made an error adapting the diagnostic patch to 2.6.39.1). Created attachment 62562 [details]
screenshot of 'top' and syslog shortly before the system recovered
I went to 3.0.0-rc5 and compiled in the RCU tourture test alongside with your original diagnostic patch. After an "echo 1 >rcutorture_runnable" I started to get around 200 invoke_rcu_kthread() invocations per second instead of the usual 2-3 and a load slightly above 3. However, after "echo 0 >rcutorture_runnable", no torture summary appeared in dmesg, contrary to the torture.txt documentation: $ dmesg|grep rcu [ 0.217627] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 [ 311.588621] invoke_rcu_kthread(): 1256 invocations/s [ 321.838029] invoke_rcu_kthread(): 1 invocations/s [ 331.928567] invoke_rcu_kthread(): 1 invocations/s [ 342.989542] invoke_rcu_kthread(): 64 invocations/s [ 352.990217] invoke_rcu_kthread(): 2 invocations/s [ 362.990812] invoke_rcu_kthread(): 1 invocations/s [ 372.991529] invoke_rcu_kthread(): 1 invocations/s [...] [ 1953.094323] invoke_rcu_kthread(): 2 invocations/s [ 1963.095013] invoke_rcu_kthread(): 2 invocations/s [ 1973.095697] invoke_rcu_kthread(): 1 invocations/s [ 1983.096405] invoke_rcu_kthread(): 2 invocations/s [ 1993.097116] invoke_rcu_kthread(): 2 invocations/s [ 2003.097696] invoke_rcu_kthread(): 3 invocations/s [ 2013.098269] invoke_rcu_kthread(): 2 invocations/s [ 2023.098926] invoke_rcu_kthread(): 10 invocations/s [ 2033.099521] invoke_rcu_kthread(): 28 invocations/s [ 2043.100221] invoke_rcu_kthread(): 4 invocations/s [ 2053.100014] invoke_rcu_kthread(): 86 invocations/s [ 2063.100141] invoke_rcu_kthread(): 212 invocations/s [ 2073.100028] invoke_rcu_kthread(): 207 invocations/s [ 2083.100018] invoke_rcu_kthread(): 213 invocations/s [ 2093.100029] invoke_rcu_kthread(): 205 invocations/s [ 2103.115541] invoke_rcu_kthread(): 217 invocations/s [...] [ 4104.208014] invoke_rcu_kthread(): 197 invocations/s [ 4114.256282] invoke_rcu_kthread(): 206 invocations/s [ 4124.301588] invoke_rcu_kthread(): 208 invocations/s [ 4134.315642] invoke_rcu_kthread(): 211 invocations/s [ 4144.317513] invoke_rcu_kthread(): 88 invocations/s [ 4154.318120] invoke_rcu_kthread(): 2 invocations/s [ 4164.318804] invoke_rcu_kthread(): 3 invocations/s [ 4174.319487] invoke_rcu_kthread(): 3 invocations/s [ 4184.320177] invoke_rcu_kthread(): 2 invocations/s [ 4194.320857] invoke_rcu_kthread(): 6 invocations/s [ 4204.321531] invoke_rcu_kthread(): 21 invocations/s [ 4214.322115] invoke_rcu_kthread(): 6 invocations/s I also checked the full dmesg without grepping for anything. On the other hand, I haven't experienced this issue under 3.0.0-rc5 yet, but it never was reproducible anyway. 3.0.0-rc5 definitely exhibits the original issue. I got a hunch that it may be wireless related: the load seems to rise after the access point I'm associated to weakens, and the system recovers seconds after my radio finally disassociates from the unreachable access point. Unfortunately, this may take half an hour or so. I'm using ipw2200. But this is just a feeling, I've got no hard data and can't easily test this theory either. Is there a way to log what the rcu_kthread is actually doing? Or in whose name? Commit 6db4df143319 from linux-rcu should reduce the overhead. In addition, there are trace-events in git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git rcu/next that show how many callbacks are invoked at a time. This would give greater accuracy than the patch I sent you and that you backported (which does look good, by the way). The 1256 callbacks/second is high, but should not result in problems. (In reply to comment #8) > Commit 6db4df143319 from linux-rcu should reduce the overhead. Sorry, I'm afraid I can't find that commit (maybe because I've got no idea what linux-rcu is.) > In addition, there are trace-events in > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git > rcu/next that show how many callbacks are invoked at a time. This would give > greater accuracy than the patch I sent you and that you backported (which > does > look good, by the way). I've never used tracers yet, but I'll give them a try when I get around. > The 1256 callbacks/second is high, but should not result in problems. No, that isn't the problem indeed. The first readout is 1256, which happens after 311 seconds of uptime, so it's actually a cumulated value. I'd be interested why it takes more than 5 minutes to get here, though. As I described in comment #4, when rcu_kthread is spinning the CPU, your debug patch reports only 0/1 invocations per second, ie. LESS than under normal operation (that is, 2/3 -- not much of a difference, but clearly visible in the logs). Ah, sorry! -rcu is a git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu.git Please let me know if you need me to post a patch, but probably best if you can pull it from the git tree. Then I guessed right, but that tree doesn't seem to contain the commit 6db4df143319 you referenced, didn't you mean 53ef564d? Anyway, I pulled it and will give trace-events a shot. Created attachment 66312 [details]
RCU trace output around system breakdown (gzip compressed)
I directed trace_pipe into a file and enabled the RCU tracers with the following command:
szonett:/sys/kernel/debug/tracing/events/rcu# for i in rcu_*/enable; do echo 1 >$i; done
Eventually, I managed to reproduce the issue by clicking Disconnect in nm-applet (Network Manager got stuck on an unreachable access point). Part of syslog (starting before the disconnect, to sample normal behaviour):
Jul 18 00:14:36 szonett kernel: [14412.145524] rcu_process_callbacks(): 23 invocations in 10 seconds
Jul 18 00:14:46 szonett kernel: [14422.146135] rcu_process_callbacks(): 20 invocations in 10 seconds
Jul 18 00:14:56 szonett kernel: [14432.146833] rcu_process_callbacks(): 31 invocations in 10 seconds
Jul 18 00:15:06 szonett kernel: [14442.147536] rcu_process_callbacks(): 26 invocations in 10 seconds
Jul 18 00:15:06 szonett NetworkManager[1158]: <info> (wlan): device state change: 8 -> 3 (reason 39)
Jul 18 00:15:06 szonett NetworkManager[1158]: <info> (wlan): deactivating device (reason: 39).
Jul 18 00:15:06 szonett NetworkManager[1158]: <info> (wlan): canceled DHCP transaction, DHCP client pid 5787
Jul 18 00:15:06 szonett dnsmasq[1149]: reading /etc/resolv.conf
Jul 18 00:15:14 szonett NetworkManager[1158]: <error> [1310940914.813483] [nm-system.c:1229] check_one_route(): (wlan): error -34 returned from rtnl_route_del(): Netlink Error (errno = Numerical result out of range)
Jul 18 00:15:14 szonett dnsmasq[1149]: using nameserver 213.46.246.54#53
Jul 18 00:15:14 szonett dnsmasq[1149]: using nameserver 213.46.246.53#53
Jul 18 00:15:14 szonett dnsmasq[1149]: no servers found in /etc/resolv.conf, will retry
Jul 18 00:15:19 szonett kernel: [14452.148249] rcu_process_callbacks(): 40 invocations in 10 seconds
Jul 18 00:15:33 szonett kernel: [14463.874158] rcu_process_callbacks(): 44 invocations in 11 seconds
Jul 18 00:16:26 szonett kernel: [14475.777381] rcu_process_callbacks(): 17 invocations in 11 seconds
Jul 18 00:16:26 szonett kernel: [14490.786453] rcu_process_callbacks(): 3 invocations in 15 seconds
Jul 18 00:16:26 szonett kernel: [14501.856736] rcu_process_callbacks(): 2 invocations in 11 seconds
Jul 18 00:16:26 szonett kernel: [14512.768836] rcu_process_callbacks(): 3 invocations in 10 seconds
Jul 18 00:16:33 szonett kernel: [14524.000400] rcu_process_callbacks(): 2 invocations in 11 seconds
I slightly modified your diagnostic patch to print the number of callbacks and the length of the actual interval instead of the callback rate. This shows how the timing gets erratic and the callback rate diminishes when the load jumps to 20. The same can be seen on the trace output, for example if you plot it in Gnuplot with a command like:
plot "<zcat ./rcu_trace_knee.txt.gz | cut -c 29-41" u 0:1 w d
Unfortunately I can't precisely match the trace timestamps with the kernel timestamps in the syslog (what's their relation?), but I thinks its something like the above, ie. both syslog and the trace output include the breakdown point. I'll try enabling some WiFi driver traces as well, but that's really pure heuristics, I can't really interpret this output.
I'd much welcome further debugging ideas.
|