Bug 196583

Summary: RCU hang
Product: File System Reporter: Steinar H. Gunderson (steinar+kernel)
Component: NFSAssignee: bfields
Status: RESOLVED WILL_NOT_FIX    
Severity: high CC: bjoern, mail, trondmy
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.11.8 Subsystem:
Regression: No Bisected commit-id:

Description Steinar H. Gunderson 2017-08-04 14:10:35 UTC
Hi,

One of our NFS servers yesterday gave the following:



Aug  4 01:46:35 cassarossa kernel: [3048800.249336] INFO: rcu_sched self-detected stall on CPU
Aug  4 01:46:35 cassarossa kernel: [3048800.254788]     2-...: (5998 ticks this GP) idle=38f/140000000000001/0 softirq=33698432/33698432 fqs=2461 
Aug  4 01:46:35 cassarossa kernel: [3048800.264592]      (t=6001 jiffies g=108048207 c=108048206 q=66783)
Aug  4 01:46:35 cassarossa kernel: [3048800.269340] INFO: rcu_sched detected stalls on CPUs/tasks:
Aug  4 01:46:35 cassarossa kernel: [3048800.269358]     2-...: (5998 ticks this GP) idle=38f/140000000000001/0 softirq=33698432/33698432 fqs=2462 
Aug  4 01:46:35 cassarossa kernel: [3048800.269359]     (detected by 12, t=6002 jiffies, g=108048207, c=108048206, q=66783)
Aug  4 01:46:35 cassarossa kernel: [3048800.269364] Sending NMI from CPU 12 to CPUs 2:
Aug  4 01:46:35 cassarossa kernel: [3048800.270374] NMI backtrace for cpu 2
Aug  4 01:46:35 cassarossa kernel: [3048800.270375] CPU: 2 PID: 3959 Comm: rpc.svcgssd Not tainted 4.11.8 #1
Aug  4 01:46:35 cassarossa kernel: [3048800.270383] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 2.0a 03/27/2013
Aug  4 01:46:35 cassarossa kernel: [3048800.270383] task: ffff88105b6d54c0 task.stack: ffffc900085e0000
Aug  4 01:46:35 cassarossa kernel: [3048800.270384] RIP: 0010:io_serial_in+0x13/0x18
Aug  4 01:46:35 cassarossa kernel: [3048800.270384] RSP: 0018:ffff88085fa83c38 EFLAGS: 00000002
Aug  4 01:46:35 cassarossa kernel: [3048800.270385] RAX: 00f1c4dab623bb00 RBX: ffffffff81c5e620 RCX: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270386] RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff81c5e620
Aug  4 01:46:35 cassarossa kernel: [3048800.270386] RBP: ffff88085fa83c38 R08: 0000000000000002 R09: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270387] R10: 000000000000046d R11: ffffffff81bf0009 R12: 00000000000026f1
Aug  4 01:46:35 cassarossa kernel: [3048800.270388] R13: 0000000000000020 R14: ffffffff81317cd9 R15: 0000000000000044
Aug  4 01:46:35 cassarossa kernel: [3048800.270388] FS:  00007f0024bd6b80(0000) GS:ffff88085fa80000(0000) knlGS:0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270389] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug  4 01:46:35 cassarossa kernel: [3048800.270390] CR2: 000055c268563598 CR3: 0000001058775000 CR4: 00000000000406e0
Aug  4 01:46:35 cassarossa kernel: [3048800.270390] Call Trace:
Aug  4 01:46:35 cassarossa kernel: [3048800.270390]  <IRQ>
Aug  4 01:46:35 cassarossa kernel: [3048800.270391]  serial_in+0x7/0x9
Aug  4 01:46:35 cassarossa kernel: [3048800.270391]  wait_for_xmitr+0x22/0x87
Aug  4 01:46:35 cassarossa kernel: [3048800.270392]  serial8250_console_putchar+0x17/0x29
Aug  4 01:46:35 cassarossa kernel: [3048800.270392]  uart_console_write+0x3b/0x46
Aug  4 01:46:35 cassarossa kernel: [3048800.270393]  serial8250_console_write+0x16f/0x1d0
Aug  4 01:46:35 cassarossa kernel: [3048800.270393]  ? msg_print_text+0x9e/0xf2
Aug  4 01:46:35 cassarossa kernel: [3048800.270393]  univ8250_console_write+0x1c/0x1e
Aug  4 01:46:35 cassarossa kernel: [3048800.270394]  console_unlock+0x36d/0x3db
Aug  4 01:46:35 cassarossa kernel: [3048800.270394]  vprintk_emit+0x1f9/0x20b
Aug  4 01:46:35 cassarossa kernel: [3048800.270394]  vprintk_default+0x18/0x1a
Aug  4 01:46:35 cassarossa kernel: [3048800.270395]  vprintk_func+0x45/0x47
Aug  4 01:46:35 cassarossa kernel: [3048800.270395]  printk+0x3e/0x46
Aug  4 01:46:35 cassarossa kernel: [3048800.270395]  rcu_check_callbacks+0x224/0x5e8
Aug  4 01:46:35 cassarossa kernel: [3048800.270396]  update_process_times+0x2a/0x4f
Aug  4 01:46:35 cassarossa kernel: [3048800.270396]  tick_sched_handle+0x2f/0x3b
Aug  4 01:46:35 cassarossa kernel: [3048800.270397]  tick_sched_timer+0x34/0x5a
Aug  4 01:46:35 cassarossa kernel: [3048800.270397]  __hrtimer_run_queues+0xb0/0x183
Aug  4 01:46:35 cassarossa kernel: [3048800.270397]  hrtimer_interrupt+0x5f/0xff
Aug  4 01:46:35 cassarossa kernel: [3048800.270398]  local_apic_timer_interrupt+0x46/0x49
Aug  4 01:46:35 cassarossa kernel: [3048800.270398]  smp_apic_timer_interrupt+0x24/0x34
Aug  4 01:46:35 cassarossa kernel: [3048800.270399]  apic_timer_interrupt+0x86/0x90
Aug  4 01:46:35 cassarossa kernel: [3048800.270399] RIP: 0010:qword_addhex+0x57/0x7e [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.270399] RSP: 0018:ffffc900085e3d60 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff10
Aug  4 01:46:35 cassarossa kernel: [3048800.270400] RAX: 00000000000004c5 RBX: ffff8802bc366180 RCX: 000000000000071b
Aug  4 01:46:35 cassarossa kernel: [3048800.270401] RDX: ffff8802ac05559b RSI: ffffc900085e3db4 RDI: ffffc900085e3db8
Aug  4 01:46:35 cassarossa kernel: [3048800.270401] RBP: ffffc900085e3d60 R08: ffff8802ac053b3b R09: 0000000000000036
Aug  4 01:46:35 cassarossa kernel: [3048800.270402] R10: 0000000000000039 R11: ffff88105b6d54c0 R12: ffffc900085e3db8
Aug  4 01:46:35 cassarossa kernel: [3048800.270402] R13: ffffc900085e3db4 R14: ffff88105aea3900 R15: ffff88085acc7080
Aug  4 01:46:35 cassarossa kernel: [3048800.270403]  </IRQ>
Aug  4 01:46:35 cassarossa kernel: [3048800.270403]  rsi_request+0x36/0x45 [auth_rpcgss]
Aug  4 01:46:35 cassarossa kernel: [3048800.270404]  cache_read.isra.24+0x104/0x2a5 [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.270404]  cache_read_procfs+0x30/0x35 [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.270404]  proc_reg_read+0x4e/0x63
Aug  4 01:46:35 cassarossa kernel: [3048800.270405]  __vfs_read+0x21/0x98
Aug  4 01:46:35 cassarossa kernel: [3048800.270405]  ? fsnotify_perm+0x60/0x6c
Aug  4 01:46:35 cassarossa kernel: [3048800.270405]  ? security_file_permission+0x36/0x3d
Aug  4 01:46:35 cassarossa kernel: [3048800.270406]  ? rw_verify_area+0x71/0xaa
Aug  4 01:46:35 cassarossa kernel: [3048800.270406]  vfs_read+0x9f/0xd7
Aug  4 01:46:35 cassarossa kernel: [3048800.270406]  SyS_read+0x48/0x73
Aug  4 01:46:35 cassarossa kernel: [3048800.270407]  entry_SYSCALL_64_fastpath+0x13/0x94
Aug  4 01:46:35 cassarossa kernel: [3048800.270407] RIP: 0033:0x7f0023988700
Aug  4 01:46:35 cassarossa kernel: [3048800.270408] RSP: 002b:00007ffdfeba1ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270409] RAX: ffffffffffffffda RBX: 000055686af24270 RCX: 00007f0023988700
Aug  4 01:46:35 cassarossa kernel: [3048800.270409] RDX: 0000000000008000 RSI: 00007ffdfeba4120 RDI: 0000000000000003
Aug  4 01:46:35 cassarossa kernel: [3048800.270410] RBP: 00007f0023c43440 R08: 000055686af245d0 R09: 00007f00236a9f00
Aug  4 01:46:35 cassarossa kernel: [3048800.270410] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270411] R13: 00007ffdfeba1dd0 R14: 0000000002695c6a R15: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.270411] Code: 0f b6 8f a1 00 00 00 55 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 0f b6 8f a1 00 00 00 55 8b 57 08 48 89 e5 d3 e6 01 f2 ec <0f> b6 c0 5d c3 0f b6 8f a1 00 00 00 89 d0 55 8b 57 08 48 89 e5 
Aug  4 01:46:35 cassarossa kernel: [3048800.658185] NMI backtrace for cpu 2
Aug  4 01:46:35 cassarossa kernel: [3048800.661978] CPU: 2 PID: 3959 Comm: rpc.svcgssd Not tainted 4.11.8 #1
Aug  4 01:46:35 cassarossa kernel: [3048800.668639] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 2.0a 03/27/2013
Aug  4 01:46:35 cassarossa kernel: [3048800.675905] Call Trace:
Aug  4 01:46:35 cassarossa kernel: [3048800.678663]  <IRQ>
Aug  4 01:46:35 cassarossa kernel: [3048800.680987]  dump_stack+0x4d/0x63
Aug  4 01:46:35 cassarossa kernel: [3048800.684614]  nmi_cpu_backtrace+0x75/0x87
Aug  4 01:46:35 cassarossa kernel: [3048800.688837]  ? irq_force_complete_move+0xd5/0xd5
Aug  4 01:46:35 cassarossa kernel: [3048800.693769]  nmi_trigger_cpumask_backtrace+0x4a/0xa7
Aug  4 01:46:35 cassarossa kernel: [3048800.699032]  arch_trigger_cpumask_backtrace+0x14/0x16
Aug  4 01:46:35 cassarossa kernel: [3048800.704388]  rcu_dump_cpu_stacks+0x92/0xbd
Aug  4 01:46:35 cassarossa kernel: [3048800.708786]  rcu_check_callbacks+0x234/0x5e8
Aug  4 01:46:35 cassarossa kernel: [3048800.713367]  update_process_times+0x2a/0x4f
Aug  4 01:46:35 cassarossa kernel: [3048800.717853]  tick_sched_handle+0x2f/0x3b
Aug  4 01:46:35 cassarossa kernel: [3048800.722082]  tick_sched_timer+0x34/0x5a
Aug  4 01:46:35 cassarossa kernel: [3048800.726231]  __hrtimer_run_queues+0xb0/0x183
Aug  4 01:46:35 cassarossa kernel: [3048800.730796]  hrtimer_interrupt+0x5f/0xff
Aug  4 01:46:35 cassarossa kernel: [3048800.735022]  local_apic_timer_interrupt+0x46/0x49
Aug  4 01:46:35 cassarossa kernel: [3048800.740043]  smp_apic_timer_interrupt+0x24/0x34
Aug  4 01:46:35 cassarossa kernel: [3048800.744869]  apic_timer_interrupt+0x86/0x90
Aug  4 01:46:35 cassarossa kernel: [3048800.749361] RIP: 0010:qword_addhex+0x57/0x7e [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.754712] RSP: 0018:ffffc900085e3d60 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff10
Aug  4 01:46:35 cassarossa kernel: [3048800.762691] RAX: 00000000000004c5 RBX: ffff8802bc366180 RCX: 000000000000071b
Aug  4 01:46:35 cassarossa kernel: [3048800.770232] RDX: ffff8802ac05559b RSI: ffffc900085e3db4 RDI: ffffc900085e3db8
Aug  4 01:46:35 cassarossa kernel: [3048800.777792] RBP: ffffc900085e3d60 R08: ffff8802ac053b3b R09: 0000000000000036
Aug  4 01:46:35 cassarossa kernel: [3048800.785351] R10: 0000000000000039 R11: ffff88105b6d54c0 R12: ffffc900085e3db8
Aug  4 01:46:35 cassarossa kernel: [3048800.792898] R13: ffffc900085e3db4 R14: ffff88105aea3900 R15: ffff88085acc7080
Aug  4 01:46:35 cassarossa kernel: [3048800.800449]  </IRQ>
Aug  4 01:46:35 cassarossa kernel: [3048800.802861]  rsi_request+0x36/0x45 [auth_rpcgss]
Aug  4 01:46:35 cassarossa kernel: [3048800.807783]  cache_read.isra.24+0x104/0x2a5 [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.813064]  cache_read_procfs+0x30/0x35 [sunrpc]
Aug  4 01:46:35 cassarossa kernel: [3048800.818062]  proc_reg_read+0x4e/0x63
Aug  4 01:46:35 cassarossa kernel: [3048800.821944]  __vfs_read+0x21/0x98
Aug  4 01:46:35 cassarossa kernel: [3048800.825564]  ? fsnotify_perm+0x60/0x6c
Aug  4 01:46:35 cassarossa kernel: [3048800.829612]  ? security_file_permission+0x36/0x3d
Aug  4 01:46:35 cassarossa kernel: [3048800.834612]  ? rw_verify_area+0x71/0xaa
Aug  4 01:46:35 cassarossa kernel: [3048800.838759]  vfs_read+0x9f/0xd7
Aug  4 01:46:35 cassarossa kernel: [3048800.842199]  SyS_read+0x48/0x73
Aug  4 01:46:35 cassarossa kernel: [3048800.845655]  entry_SYSCALL_64_fastpath+0x13/0x94
Aug  4 01:46:35 cassarossa kernel: [3048800.850576] RIP: 0033:0x7f0023988700
Aug  4 01:46:35 cassarossa kernel: [3048800.854456] RSP: 002b:00007ffdfeba1ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.862451] RAX: ffffffffffffffda RBX: 000055686af24270 RCX: 00007f0023988700
Aug  4 01:46:35 cassarossa kernel: [3048800.869998] RDX: 0000000000008000 RSI: 00007ffdfeba4120 RDI: 0000000000000003
Aug  4 01:46:35 cassarossa kernel: [3048800.877554] RBP: 00007f0023c43440 R08: 000055686af245d0 R09: 00007f00236a9f00
Aug  4 01:46:35 cassarossa kernel: [3048800.885105] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Aug  4 01:46:35 cassarossa kernel: [3048800.892659] R13: 00007ffdfeba1dd0 R14: 0000000002695c6a R15: 0000000000000000

As time went by, it kept on coming more and more often, and eventually, tasks such as khugepaged started hanging. After a few hours, the system could no longer respond to forks, was effectively broken, and had to be rebooted.
Comment 1 bfields 2017-08-05 00:48:03 UTC
I don't understand what's going on, but I'm suspicious of the serial console and printk stuff in there.  Do you have some kind of extra debug logging turn on?  It seems to me I've gotten myself into this kind of problem before when frequent printk's were sent to a serial console.
Comment 2 Steinar H. Gunderson 2017-08-05 09:09:01 UTC
There's a serial console, but no extra debug logging, and the rest of the syslog indicates the serial console should have been entirely quiet at the time (ie., just a getty standing still).
Comment 3 Steinar H. Gunderson 2017-08-05 09:14:05 UTC
Looking at it, the timing doesn't indicate the serial console is the root issue either:

3048800.249336: First RCU hang is detected
3048800.269340 (20 ms later): Second RCU hang is detected
3048800.270374 (21 ms later): Backtrace starts

It's not so strange that the backtrace would be hitting a CPU that was still busy printing out the “RCU hang” message, is it? 20 ms at 115200 bps is only 288 bytes, after all, which is roughly the length of one of the RCU hang messages.
Comment 4 bfields 2017-08-06 02:28:14 UTC
(In reply to Steinar H. Gunderson from comment #3)
> It's not so strange that the backtrace would be hitting a CPU that was still
> busy printing out the “RCU hang” message, is it? 20 ms at 115200 bps is only
> 288 bytes, after all, which is roughly the length of one of the RCU hang
> messages.

Yes, makes sense.  Hm.

This looks like some nice documentation on diagnosing that message: https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt

Nothing in the cache_read code looks obviously suspicious to me, but I haven't looked very carefully yet.

I take it this has only happened once so far?  How recently did you upgrade to v4.11.8?
Comment 5 Steinar H. Gunderson 2017-08-06 09:51:00 UTC
Yes, only once. The upgrade to 4.11.8 was June 29, and it's been running since.
Comment 6 Steinar H. Gunderson 2017-08-19 16:41:45 UTC
It hung again today (with 4.11.8). We upgraded to 4.12.8 and got pretty much immediately the same bug. Back to 4.11.8, same bug quickly again.

We've gone down to 4.9.44, and it seems to be stable for now, but it's of course hard to say.
Comment 7 Steinar H. Gunderson 2017-08-19 16:57:26 UTC
Nope, 4.9.44 also died after half an hour:

[ 1413.576799] INFO: rcu_sched self-detected stall on CPU
[ 1413.582039]  12-...: (6000 ticks this GP) idle=09d/140000000000001/0 softirq=17200/17200 fqs=2222 
[ 1413.586806] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1413.586812]  12-...: (6000 ticks this GP) idle=09d/140000000000001/0 softirq=17200/17200 fqs=2222 
[ 1413.586815]  (detected by 18, t=6002 jiffies, g=37798, c=37797, q=75916)
[ 1413.586817] Task dump for CPU 12:
[ 1413.586820] rpc.svcgssd     R  running task        0  3181      1 0x00000008
[ 1413.586825]  ffff88085fb97100 ffff88105b59d4c0 ffff88085c2f8000 ffff881057176cc0
[ 1413.586827]  0000000000017100 ffffc9000c60ba38 ffffffff814b3f64 00ffffff81070295
[ 1413.586828]  ffff88085fb97100 0000000100000000 ffff88105b59d4c0 0000000000000000
[ 1413.586829] Call Trace:
[ 1413.586839]  [<ffffffff814b3f64>] ? __schedule+0x3a7/0x4a9
[ 1413.586841]  [<ffffffff814b40e3>] schedule+0x7d/0x87
[ 1413.586844]  [<ffffffff814b5cb5>] schedule_hrtimeout_range_clock+0x38/0xc1
[ 1413.586847]  [<ffffffff81150613>] ? __pollwait+0xbe/0xc7
[ 1413.586850]  [<ffffffff81142aaa>] ? fput+0x11/0x83
[ 1413.586852]  [<ffffffff81150883>] ? poll_freewait+0x43/0x94
[ 1413.586853]  [<ffffffff81151897>] ? do_sys_poll+0x3ec/0x46c
[ 1413.586857]  [<ffffffff8106756f>] ? try_to_wake_up+0x1e1/0x1f3
[ 1413.586859]  [<ffffffff810675e7>] ? default_wake_function+0xd/0xf
[ 1413.586861]  [<ffffffff81154188>] ? __d_lookup_rcu+0xbd/0xea
[ 1413.586862]  [<ffffffff81154188>] ? __d_lookup_rcu+0xbd/0xea
[ 1413.586867]  [<ffffffff8118a80e>] ? dquot_file_open+0x14/0x32
[ 1413.586871]  [<ffffffff811a3707>] ? ext4_file_open+0x1dc/0x213
[ 1413.586874]  [<ffffffff8120af98>] ? fsnotify_perm+0x60/0x6c
[ 1413.586876]  [<ffffffff811507f1>] ? poll_select_copy_remaining+0xfe/0xfe
[ 1413.586898]  [<ffffffffa02d2347>] ? qword_addhex+0x5b/0x7e [sunrpc]
[ 1413.586903]  [<ffffffffa0343fcc>] ? rsi_request+0x36/0x45 [auth_rpcgss]
[ 1413.586917]  [<ffffffffa02d3eb0>] ? cache_read.isra.21+0x104/0x2a5 [sunrpc]
[ 1413.586929]  [<ffffffffa02d4081>] ? cache_read_procfs+0x30/0x35 [sunrpc]
[ 1413.586931]  [<ffffffff81190751>] ? proc_reg_read+0x4e/0x63
[ 1413.586933]  [<ffffffff811407cf>] ? __vfs_read+0x21/0x98
[ 1413.586934]  [<ffffffff8120af98>] ? fsnotify_perm+0x60/0x6c
[ 1413.586936]  [<ffffffff8120bd3b>] ? security_file_permission+0x36/0x3d
[ 1413.586938]  [<ffffffff811412ec>] ? rw_verify_area+0x71/0xaa
[ 1413.586940]  [<ffffffff811413c4>] ? vfs_read+0x9f/0xd7
[ 1413.586942]  [<ffffffff8114215f>] ? SyS_read+0x48/0x73
[ 1413.586945]  [<ffffffff814b6860>] ? entry_SYSCALL_64_fastpath+0x13/0x94
[ 1413.808856]   (t=6024 jiffies g=37798 c=37797 q=75967)
[ 1413.814207] Task dump for CPU 12:
[ 1413.817671] rpc.svcgssd     R  running task        0  3181      1 0x00000008
[ 1413.824977]  ffff88085fb83e38 ffffffff8106947a 000000000000000c ffffffff81a44e80
[ 1413.832853]  ffff88085fb83e50 ffffffff8106b3c8 ffffffff81a44e80 ffff88085fb83e88
[ 1413.840719]  ffffffff810ed797 0000000c00000001 ffffffff81a44e80 ffff88085fb97e40
[ 1413.848590] Call Trace:
[ 1413.851184]  <IRQ> 
[ 1413.853159]  [<ffffffff8106947a>] sched_show_task+0xca/0xd7
[ 1413.859036]  [<ffffffff8106b3c8>] dump_cpu_task+0x32/0x35
[ 1413.864587]  [<ffffffff810ed797>] rcu_dump_cpu_stacks+0x8d/0xa1
[ 1413.870668]  [<ffffffff8108bcd5>] rcu_check_callbacks+0x234/0x5f7
[ 1413.876919]  [<ffffffff8108f5fc>] update_process_times+0x2a/0x4f
[ 1413.883081]  [<ffffffff8109af8e>] tick_sched_handle+0x32/0x3e
[ 1413.888983]  [<ffffffff8109b4b3>] tick_sched_timer+0x34/0x5a
[ 1413.894797]  [<ffffffff8108fdba>] __hrtimer_run_queues+0xb0/0x183
[ 1413.901045]  [<ffffffff8109021c>] hrtimer_interrupt+0x5f/0xff
[ 1413.906941]  [<ffffffff8102e8c4>] local_apic_timer_interrupt+0x46/0x49
[ 1413.913622]  [<ffffffff8102ed9f>] smp_apic_timer_interrupt+0x29/0x39
[ 1413.920133]  [<ffffffff814b742f>] apic_timer_interrupt+0x7f/0x90
[ 1413.926290]  <EOI> 
[ 1413.928278]  [<ffffffffa02d2347>] ? qword_addhex+0x5b/0x7e [sunrpc]
[ 1413.934853]  [<ffffffffa0343fcc>] rsi_request+0x36/0x45 [auth_rpcgss]
[ 1413.941457]  [<ffffffffa02d3eb0>] cache_read.isra.21+0x104/0x2a5 [sunrpc]
[ 1413.948407]  [<ffffffffa02d4081>] cache_read_procfs+0x30/0x35 [sunrpc]
[ 1413.955085]  [<ffffffff81190751>] proc_reg_read+0x4e/0x63
[ 1413.960640]  [<ffffffff811407cf>] __vfs_read+0x21/0x98
[ 1413.965934]  [<ffffffff8120af98>] ? fsnotify_perm+0x60/0x6c
[ 1413.971664]  [<ffffffff8120bd3b>] ? security_file_permission+0x36/0x3d
[ 1413.978344]  [<ffffffff811412ec>] ? rw_verify_area+0x71/0xaa
[ 1413.984152]  [<ffffffff811413c4>] vfs_read+0x9f/0xd7
[ 1413.989276]  [<ffffffff8114215f>] SyS_read+0x48/0x73
[ 1413.994396]  [<ffffffff814b6860>] entry_SYSCALL_64_fastpath+0x13/0x94
Comment 8 bfields 2017-08-24 16:00:25 UTC
So do we have any known-good version at this point, or does it look possible that this problem could have always been there?

I don't know why rsi_request() keeps showing up.  It doesn't do much, and isn't called in any interesting context.

That very first while loop in cache_read() is under a spinlock.  In theory some bug might cause us to spend too long in that loop.  stallwarn.txt says we can get these warnings if preemption is disabled too long (and I think an ordinary spinlock does disable preemption).  I'm not sure how to test that hypothesis....  Maybe patch in a counter incremented on each pass through that loop, and WARN_ON_ONCE() if it gets too large.

Might also be worth looking through stallwarn.txt to make sure none of the other cases apply there.  (Does your .config have CONFIG_PREEMPT turned off?  Are you doing any real-time scheduling?)
Comment 9 Steinar H. Gunderson 2017-08-24 16:03:22 UTC
Well, 4.4 hasn't crashed yet, so perhaps it's a good version? Hard to say.

CONFIG_PREEMPT is indeed not set. There is no real-time scheduling.
Comment 10 bfields 2017-08-24 16:20:45 UTC
(In reply to Steinar H. Gunderson from comment #9)
> Well, 4.4 hasn't crashed yet, so perhaps it's a good version? Hard to say.
> 
> CONFIG_PREEMPT is indeed not set.

So, any excessive looping could trigger this.

rsi_request/qword_addhex still seems like an odd place for that.  In theory corrupted rsi_in{handle,token}.len could cause it, but then it'd also be writing the whole time and I'd think we'd start seeing further oopses resulting from the random memory scribbles.

Maybe we're hitting the "goto again" case in cache_read(), and cache_read(), that also covers rsi_request calls.

So adding a counter and WARN might again be a way to check for that.  Or just adding a schedule() somewhere in that loop and seeing the problem still reproduces after that.
Comment 11 Steinar H. Gunderson 2017-08-24 16:29:08 UTC
> So, any excessive looping could trigger this.

Well, if so, essentially an infinite loop. The problem isn't the warnings per se; it is that all 12 cores (24 hyperthreads) of the machine start to hang, and eventually the machine and all I/O becomes unusable. My understand is that if you just had excessive and not infinite looping, you'd get the warnings but that the system would eventually recover?

It's possible that CONFIG_PREEMPT could make it jump out of the loop somehow; we should probably enable it. But it sounds more like a workaround than a true bug fix.

For the record, the machine isn't heavily loaded at all. Typical load average is 1–2.
Comment 12 bfields 2017-08-24 17:12:56 UTC
(In reply to Steinar H. Gunderson from comment #11)
> Well, if so, essentially an infinite loop. The problem isn't the warnings
> per se; it is that all 12 cores (24 hyperthreads) of the machine start to
> hang, and eventually the machine and all I/O becomes unusable. My understand
> is that if you just had excessive and not infinite looping, you'd get the
> warnings but that the system would eventually recover?

I'm not sure how many loops are required before it becomes a problem.

I'm not sure why it's eventually hanging.  There is probably important RCU job that are never getting done.  It's also possible that a slow serial console, though not the original cause, is adding to the problem somehow.

> It's possible that CONFIG_PREEMPT could make it jump out of the loop
> somehow; we should probably enable it.

It won't affect any of those loops directly but it will allow the looping task to be temporarily preempted for other work.

> But it sounds more like a workaround than a true bug fix.

Yes, just looking for ways to figure out where the bug is.

Could you try applying this?  Should be harmless at worst:

--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -767,6 +767,7 @@ static ssize_t cache_read(struct file *filp, char __user *buf, size_t count,
        struct cache_reader *rp = filp->private_data;
        struct cache_request *rq;
        struct inode *inode = file_inode(filp);
+       int loopcount = 0;
        int err;
 
        if (count == 0)
@@ -837,8 +838,11 @@ static ssize_t cache_read(struct file *filp, char __user *buf, size_t count,
                } else
                        spin_unlock(&queue_lock);
        }
-       if (err == -EAGAIN)
+       if (err == -EAGAIN) {
+               loopcount++;
+               WARN_ON_ONCE(loopcount > 100);
                goto again;
+       }
        inode_unlock(inode);
        return err ? err :  count;
Comment 13 Steinar H. Gunderson 2017-08-24 19:15:36 UTC
Unfortunately, it's a server with ~1000 users; I'm not at a liberty to reboot it at will.
Comment 14 Steinar H. Gunderson 2017-10-09 15:49:40 UTC
Hi,

After ~50 days on 4.4.83 with no problems, it triggered today:

[Mon Oct  9 15:49:28 2017] INFO: rcu_sched self-detected stall on CPU
[Mon Oct  9 15:49:28 2017] INFO: rcu_sched detected stalls on CPUs/tasks:

[Mon Oct  9 15:49:28 2017]      3-...: (6000 ticks this GP) idle=809/140000000000001/0 softirq=197166633/197166633 fqs=5370 
[Mon Oct  9 15:49:28 2017]      
[Mon Oct  9 15:49:28 2017] (detected by 17, t=6002 jiffies, g=165187832, c=165187831, q=901055)
[Mon Oct  9 15:49:28 2017] Task dump for CPU 3:
[Mon Oct  9 15:49:28 2017] rpc.svcgssd     R
[Mon Oct  9 15:49:28 2017]   running task    
[Mon Oct  9 15:49:28 2017]     0 28629      1 0x00000008
[Mon Oct  9 15:49:28 2017]  ffffffff8147f803
[Mon Oct  9 15:49:28 2017]  0000001000000010
[Mon Oct  9 15:49:28 2017]  00ffffff8106854d
[Mon Oct  9 15:49:28 2017]  ffff880800000003

[Mon Oct  9 15:49:28 2017]  ffff8807a2981c00
[Mon Oct  9 15:49:28 2017]  ffff881036f20000
[Mon Oct  9 15:49:28 2017]  0000000000000000
[Mon Oct  9 15:49:28 2017]  ffff881036f1fb74

[Mon Oct  9 15:49:28 2017]  0000000000000000
[Mon Oct  9 15:49:28 2017]  0000000000000000
[Mon Oct  9 15:49:28 2017]  ffff881036f1fa50
[Mon Oct  9 15:49:28 2017]  ffffffff8147f932

[Mon Oct  9 15:49:28 2017] Call Trace:
[Mon Oct  9 15:49:28 2017]  [<ffffffff8147f803>] ? __schedule+0x4ae/0x56a
[Mon Oct  9 15:49:28 2017]  [<ffffffff8147f932>] ? schedule+0x73/0x81
[Mon Oct  9 15:49:28 2017]  [<ffffffff8148166f>] ? schedule_hrtimeout_range_clock+0x37/0xbf
[Mon Oct  9 15:49:28 2017]  [<ffffffff811399e6>] ? __pollwait+0xbe/0xc7
[Mon Oct  9 15:49:28 2017]  [<ffffffff8112c5b8>] ? fput+0x11/0x83
[Mon Oct  9 15:49:28 2017]  [<ffffffff81139c92>] ? poll_freewait+0x43/0x94
[Mon Oct  9 15:49:28 2017]  [<ffffffff8113ac96>] ? do_sys_poll+0x3f6/0x476
[Mon Oct  9 15:49:28 2017]  [<ffffffff8113d375>] ? __d_lookup_rcu+0xae/0xd0
[Mon Oct  9 15:49:28 2017]  [<ffffffff8113d375>] ? __d_lookup_rcu+0xae/0xd0
[Mon Oct  9 15:49:28 2017]  [<ffffffff81171a01>] ? dquot_file_open+0x14/0x32
[Mon Oct  9 15:49:28 2017]  [<ffffffff8106804b>] ? __enqueue_entity+0x67/0x69
[Mon Oct  9 15:49:28 2017]  [<ffffffff8106ac25>] ? enqueue_entity+0x55d/0x6c6
[Mon Oct  9 15:49:28 2017]  [<ffffffff81139bb9>] ? poll_select_copy_remaining+0xf6/0xf6
[Mon Oct  9 15:49:28 2017]  [<ffffffffa02d19d1>] ? qword_addhex+0x2b/0x7e [sunrpc]
[Mon Oct  9 15:49:28 2017]  [<ffffffffa0320f87>] ? rsi_request+0x36/0x45 [auth_rpcgss]
[Mon Oct  9 15:49:28 2017]  [<ffffffffa02d3550>] ? cache_read.isra.17+0x104/0x2a5 [sunrpc]
[Mon Oct  9 15:49:28 2017]  [<ffffffffa02d3721>] ? cache_read_procfs+0x30/0x35 [sunrpc]
[Mon Oct  9 15:49:28 2017]  [<ffffffff8117733a>] ? proc_reg_read+0x4e/0x63
[Mon Oct  9 15:49:28 2017]  [<ffffffff8112ab9f>] ? __vfs_read+0x21/0x98
[Mon Oct  9 15:49:28 2017]  [<ffffffff811ef15c>] ? fsnotify_perm+0x60/0x6c
[Mon Oct  9 15:49:28 2017]  [<ffffffff811efef7>] ? security_file_permission+0x36/0x3d
[Mon Oct  9 15:49:28 2017]  [<ffffffff8112b2b8>] ? rw_verify_area+0x7c/0xd1
[Mon Oct  9 15:49:28 2017]  [<ffffffff8112b393>] ? vfs_read+0x86/0xc2
[Mon Oct  9 15:49:28 2017]  [<ffffffff8112bb7a>] ? SyS_read+0x48/0x73
[Mon Oct  9 15:49:28 2017]  [<ffffffff81482017>] ? entry_SYSCALL_64_fastpath+0x12/0x66

[Mon Oct  9 15:49:31 2017]      3-...: (6000 ticks this GP) idle=809/140000000000001/0 softirq=197166633/197166633 fqs=5602 
[Mon Oct  9 15:49:31 2017]       (t=6275 jiffies g=165187832 c=165187831 q=947944)
[Mon Oct  9 15:49:31 2017] Task dump for CPU 3:
[Mon Oct  9 15:49:31 2017] rpc.svcgssd     R  running task        0 28629      1 0x00000008
[Mon Oct  9 15:49:31 2017]  0000000000000000 ffff88085fc63de8 ffffffff810648cb 0000000000000003
[Mon Oct  9 15:49:31 2017]  ffffffff81a3f500 ffff88085fc63e00 ffffffff8106643b ffffffff81a3f500
[Mon Oct  9 15:49:31 2017]  ffff88085fc63e30 ffffffff810816ad ffffffff81a3f500 ffff88085fc759c0
[Mon Oct  9 15:49:31 2017] Call Trace:
[Mon Oct  9 15:49:31 2017]  <IRQ>  [<ffffffff810648cb>] sched_show_task+0xb9/0xbe
[Mon Oct  9 15:49:31 2017]  [<ffffffff8106643b>] dump_cpu_task+0x32/0x35
[Mon Oct  9 15:49:31 2017]  [<ffffffff810816ad>] rcu_dump_cpu_stacks+0x71/0x8c
[Mon Oct  9 15:49:32 2017]  [<ffffffff810844d6>] rcu_check_callbacks+0x212/0x5a7
[Mon Oct  9 15:49:32 2017]  [<ffffffff81066f65>] ? irqtime_account_process_tick.isra.3+0x100/0x10f
[Mon Oct  9 15:49:32 2017]  [<ffffffff810879d4>] update_process_times+0x2a/0x4f
[Mon Oct  9 15:49:32 2017]  [<ffffffff81092dae>] tick_sched_handle+0x32/0x3e
[Mon Oct  9 15:49:32 2017]  [<ffffffff810932de>] tick_sched_timer+0x34/0x5a
[Mon Oct  9 15:49:32 2017]  [<ffffffff810880f6>] __hrtimer_run_queues+0xbe/0x19d
[Mon Oct  9 15:49:32 2017]  [<ffffffff81088564>] hrtimer_interrupt+0x5f/0xff
[Mon Oct  9 15:49:32 2017]  [<ffffffff8102cafd>] local_apic_timer_interrupt+0x46/0x49
[Mon Oct  9 15:49:32 2017]  [<ffffffff8102cf89>] smp_apic_timer_interrupt+0x29/0x39
[Mon Oct  9 15:49:32 2017]  [<ffffffff81482d2f>] apic_timer_interrupt+0x7f/0x90
[Mon Oct  9 15:49:32 2017]  <EOI>  [<ffffffffa02d19d1>] ? qword_addhex+0x2b/0x7e [sunrpc]
[Mon Oct  9 15:49:32 2017]  [<ffffffffa0320f87>] rsi_request+0x36/0x45 [auth_rpcgss]
[Mon Oct  9 15:49:32 2017]  [<ffffffffa02d3550>] cache_read.isra.17+0x104/0x2a5 [sunrpc]
[Mon Oct  9 15:49:33 2017]  [<ffffffffa02d3721>] cache_read_procfs+0x30/0x35 [sunrpc]
[Mon Oct  9 15:49:33 2017]  [<ffffffff8117733a>] proc_reg_read+0x4e/0x63
[Mon Oct  9 15:49:33 2017]  [<ffffffff8112ab9f>] __vfs_read+0x21/0x98
[Mon Oct  9 15:49:33 2017]  [<ffffffff811ef15c>] ? fsnotify_perm+0x60/0x6c
[Mon Oct  9 15:49:33 2017]  [<ffffffff811efef7>] ? security_file_permission+0x36/0x3d
[Mon Oct  9 15:49:33 2017]  [<ffffffff8112b2b8>] ? rw_verify_area+0x7c/0xd1
[Mon Oct  9 15:49:33 2017]  [<ffffffff8112b393>] vfs_read+0x86/0xc2
[Mon Oct  9 15:49:33 2017]  [<ffffffff8112bb7a>] SyS_read+0x48/0x73
[Mon Oct  9 15:49:33 2017]  [<ffffffff81482017>] entry_SYSCALL_64_fastpath+0x12/0x66
[Mon Oct  9 15:52:15 2017] INFO: rcu_sched self-detected stall on CPU
[Mon Oct  9 15:52:15 2017] INFO: rcu_sched detected stalls on CPUs/tasks:

[Mon Oct  9 15:52:15 2017]      6-...: (6000 ticks this GP) idle=64d/140000000000001/0 softirq=203638013/203638013 fqs=5905 
[Mon Oct  9 15:52:15 2017]      
[Mon Oct  9 15:52:15 2017] (detected by 1, t=6002 jiffies, g=165187841, c=165187840, q=932166)
[Mon Oct  9 15:52:15 2017] Task dump for CPU 6:
[Mon Oct  9 15:52:15 2017] rpc.svcgssd     R
[Mon Oct  9 15:52:15 2017]   running task    
[Mon Oct  9 15:52:15 2017]     0 28629      1 0x00000008
[Mon Oct  9 15:52:15 2017]  ffff88085f8001c0
[Mon Oct  9 15:52:15 2017]  000000018010000c
[Mon Oct  9 15:52:15 2017]  ffff880d00000003
[Mon Oct  9 15:52:15 2017]  ffff880f9deb3200

[Mon Oct  9 15:52:15 2017]  0000000100010000
[Mon Oct  9 15:52:15 2017]  0000000180190011
[Mon Oct  9 15:52:15 2017]  0011000100000001
[Mon Oct  9 15:52:15 2017]  0000000180190012

[Mon Oct  9 15:52:15 2017]  0000000a0000000f
[Mon Oct  9 15:52:15 2017]  0000000000000246
[Mon Oct  9 15:52:15 2017]  0000000000000246
[Mon Oct  9 15:52:15 2017]  ffff881036f1fa60

[Mon Oct  9 15:52:15 2017] Call Trace:
[Mon Oct  9 15:52:15 2017]  [<ffffffff81070c2e>] ? add_wait_queue+0x3f/0x44
[Mon Oct  9 15:52:15 2017]  [<ffffffff811399e6>] ? __pollwait+0xbe/0xc7
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112c5b8>] ? fput+0x11/0x83
[Mon Oct  9 15:52:15 2017]  [<ffffffff81139c92>] ? poll_freewait+0x43/0x94
[Mon Oct  9 15:52:15 2017]  [<ffffffff8113ac96>] ? do_sys_poll+0x3f6/0x476
[Mon Oct  9 15:52:15 2017]  [<ffffffff810e557b>] ? free_pages_prepare+0x81/0xdf
[Mon Oct  9 15:52:15 2017]  [<ffffffff810e76ee>] ? __free_pages+0x16/0x1f
[Mon Oct  9 15:52:15 2017]  [<ffffffff810e7813>] ? __free_kmem_pages+0x9/0xb
[Mon Oct  9 15:52:15 2017]  [<ffffffff81120036>] ? __free_slab+0x120/0x12c
[Mon Oct  9 15:52:15 2017]  [<ffffffff81120087>] ? discard_slab+0x45/0x47
[Mon Oct  9 15:52:15 2017]  [<ffffffff811203fb>] ? unfreeze_partials.isra.65+0x10a/0x124
[Mon Oct  9 15:52:15 2017]  [<ffffffff8106804b>] ? __enqueue_entity+0x67/0x69
[Mon Oct  9 15:52:15 2017]  [<ffffffff8106ac25>] ? enqueue_entity+0x55d/0x6c6
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112054e>] ? put_cpu_partial+0x53/0xce
[Mon Oct  9 15:52:15 2017]  [<ffffffff81121a90>] ? __slab_free+0x1cf/0x2a3
[Mon Oct  9 15:52:15 2017]  [<ffffffffa02d19fd>] ? qword_addhex+0x57/0x7e [sunrpc]
[Mon Oct  9 15:52:15 2017]  [<ffffffffa0320f87>] ? rsi_request+0x36/0x45 [auth_rpcgss]
[Mon Oct  9 15:52:15 2017]  [<ffffffffa02d3550>] ? cache_read.isra.17+0x104/0x2a5 [sunrpc]
[Mon Oct  9 15:52:15 2017]  [<ffffffffa02d3721>] ? cache_read_procfs+0x30/0x35 [sunrpc]
[Mon Oct  9 15:52:15 2017]  [<ffffffff8117733a>] ? proc_reg_read+0x4e/0x63
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112ab9f>] ? __vfs_read+0x21/0x98
[Mon Oct  9 15:52:15 2017]  [<ffffffff811ef15c>] ? fsnotify_perm+0x60/0x6c
[Mon Oct  9 15:52:15 2017]  [<ffffffff811efef7>] ? security_file_permission+0x36/0x3d
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112b2b8>] ? rw_verify_area+0x7c/0xd1
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112b393>] ? vfs_read+0x86/0xc2
[Mon Oct  9 15:52:15 2017]  [<ffffffff8112bb7a>] ? SyS_read+0x48/0x73
[Mon Oct  9 15:52:15 2017]  [<ffffffff81482017>] ? entry_SYSCALL_64_fastpath+0x12/0x66

[Mon Oct  9 15:52:18 2017]      6-...: (6000 ticks this GP) idle=64d/140000000000001/0 softirq=203638013/203638013 fqs=6172 
[Mon Oct  9 15:52:18 2017]       (t=6287 jiffies g=165187841 c=165187840 q=971357)
[Mon Oct  9 15:52:18 2017] Task dump for CPU 6:
[Mon Oct  9 15:52:18 2017] rpc.svcgssd     R  running task        0 28629      1 0x00000008
[Mon Oct  9 15:52:18 2017]  0000000000000000 ffff88107fc03de8 ffffffff810648cb 0000000000000006
[Mon Oct  9 15:52:18 2017]  ffffffff81a3f500 ffff88107fc03e00 ffffffff8106643b ffffffff81a3f500
[Mon Oct  9 15:52:18 2017]  ffff88107fc03e30 ffffffff810816ad ffffffff81a3f500 ffff88107fc159c0
[Mon Oct  9 15:52:18 2017] Call Trace:
[Mon Oct  9 15:52:18 2017]  <IRQ>  [<ffffffff810648cb>] sched_show_task+0xb9/0xbe
[Mon Oct  9 15:52:18 2017]  [<ffffffff8106643b>] dump_cpu_task+0x32/0x35
[Mon Oct  9 15:52:18 2017]  [<ffffffff810816ad>] rcu_dump_cpu_stacks+0x71/0x8c
[Mon Oct  9 15:52:19 2017]  [<ffffffff810844d6>] rcu_check_callbacks+0x212/0x5a7
[Mon Oct  9 15:52:19 2017]  [<ffffffff81066f65>] ? irqtime_account_process_tick.isra.3+0x100/0x10f
[Mon Oct  9 15:52:19 2017]  [<ffffffff810879d4>] update_process_times+0x2a/0x4f
[Mon Oct  9 15:52:19 2017]  [<ffffffff81092dae>] tick_sched_handle+0x32/0x3e
[Mon Oct  9 15:52:19 2017]  [<ffffffff810932de>] tick_sched_timer+0x34/0x5a
[Mon Oct  9 15:52:19 2017]  [<ffffffff810880f6>] __hrtimer_run_queues+0xbe/0x19d
[Mon Oct  9 15:52:19 2017]  [<ffffffff81088564>] hrtimer_interrupt+0x5f/0xff
[Mon Oct  9 15:52:19 2017]  [<ffffffff8102cafd>] local_apic_timer_interrupt+0x46/0x49
[Mon Oct  9 15:52:19 2017]  [<ffffffff8102cf89>] smp_apic_timer_interrupt+0x29/0x39
[Mon Oct  9 15:52:19 2017]  [<ffffffff81482d2f>] apic_timer_interrupt+0x7f/0x90
[Mon Oct  9 15:52:19 2017]  <EOI>  [<ffffffffa02d19fd>] ? qword_addhex+0x57/0x7e [sunrpc]
[Mon Oct  9 15:52:19 2017]  [<ffffffffa0320f87>] rsi_request+0x36/0x45 [auth_rpcgss]
[Mon Oct  9 15:52:19 2017]  [<ffffffffa02d3550>] cache_read.isra.17+0x104/0x2a5 [sunrpc]
[Mon Oct  9 15:52:20 2017]  [<ffffffffa02d3721>] cache_read_procfs+0x30/0x35 [sunrpc]
[Mon Oct  9 15:52:20 2017]  [<ffffffff8117733a>] proc_reg_read+0x4e/0x63
[Mon Oct  9 15:52:20 2017]  [<ffffffff8112ab9f>] __vfs_read+0x21/0x98
[Mon Oct  9 15:52:20 2017]  [<ffffffff811ef15c>] ? fsnotify_perm+0x60/0x6c
[Mon Oct  9 15:52:20 2017]  [<ffffffff811efef7>] ? security_file_permission+0x36/0x3d
[Mon Oct  9 15:52:20 2017]  [<ffffffff8112b2b8>] ? rw_verify_area+0x7c/0xd1
[Mon Oct  9 15:52:20 2017]  [<ffffffff8112b393>] vfs_read+0x86/0xc2
[Mon Oct  9 15:52:20 2017]  [<ffffffff8112bb7a>] SyS_read+0x48/0x73
[Mon Oct  9 15:52:20 2017]  [<ffffffff81482017>] entry_SYSCALL_64_fastpath+0x12/0x66
[Mon Oct  9 15:55:15 2017] INFO: rcu_sched self-detected stall on CPU
[Mon Oct  9 15:55:15 2017] INFO: rcu_sched detected stalls on CPUs/tasks:

[Mon Oct  9 15:55:15 2017]      6-...: (23485 ticks this GP) idle=64d/140000000000001/0 softirq=203638013/203638013 fqs=19517 
[Mon Oct  9 15:55:15 2017]      
[Mon Oct  9 15:55:15 2017] (detected by 1, t=24007 jiffies, g=165187841, c=165187840, q=3544185)
[Mon Oct  9 15:55:15 2017] Task dump for CPU 6:
[Mon Oct  9 15:55:15 2017] rpc.svcgssd     R
[Mon Oct  9 15:55:15 2017]   running task    
[Mon Oct  9 15:55:15 2017]     0 28629      1 0x00000008
[Mon Oct  9 15:55:15 2017]  ffff88085f8001c0
[Mon Oct  9 15:55:15 2017]  000000018010000c
[Mon Oct  9 15:55:15 2017]  ffff880d00000003
[Mon Oct  9 15:55:15 2017]  ffff880f9deb3200

[Mon Oct  9 15:55:15 2017]  0000000100010000
[Mon Oct  9 15:55:15 2017]  0000000180190011
[Mon Oct  9 15:55:15 2017]  0011000100000001
[Mon Oct  9 15:55:15 2017]  0000000180190012

[Mon Oct  9 15:55:15 2017]  0000000a0000000f
[Mon Oct  9 15:55:15 2017]  0000000000000246
[Mon Oct  9 15:55:15 2017]  0000000000000246
[Mon Oct  9 15:55:15 2017]  ffff881036f1fa60

[Mon Oct  9 15:55:15 2017] Call Trace:
[Mon Oct  9 15:55:15 2017]  [<ffffffff81070c2e>] ? add_wait_queue+0x3f/0x44
[Mon Oct  9 15:55:15 2017]  [<ffffffff811399e6>] ? __pollwait+0xbe/0xc7
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112c5b8>] ? fput+0x11/0x83
[Mon Oct  9 15:55:15 2017]  [<ffffffff81139c92>] ? poll_freewait+0x43/0x94
[Mon Oct  9 15:55:15 2017]  [<ffffffff8113ac96>] ? do_sys_poll+0x3f6/0x476
[Mon Oct  9 15:55:15 2017]  [<ffffffff810e557b>] ? free_pages_prepare+0x81/0xdf
[Mon Oct  9 15:55:15 2017]  [<ffffffff810e76ee>] ? __free_pages+0x16/0x1f
[Mon Oct  9 15:55:15 2017]  [<ffffffff810e7813>] ? __free_kmem_pages+0x9/0xb
[Mon Oct  9 15:55:15 2017]  [<ffffffff81120036>] ? __free_slab+0x120/0x12c
[Mon Oct  9 15:55:15 2017]  [<ffffffff81120087>] ? discard_slab+0x45/0x47
[Mon Oct  9 15:55:15 2017]  [<ffffffff811203fb>] ? unfreeze_partials.isra.65+0x10a/0x124
[Mon Oct  9 15:55:15 2017]  [<ffffffff8106804b>] ? __enqueue_entity+0x67/0x69
[Mon Oct  9 15:55:15 2017]  [<ffffffff8106ac25>] ? enqueue_entity+0x55d/0x6c6
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112054e>] ? put_cpu_partial+0x53/0xce
[Mon Oct  9 15:55:15 2017]  [<ffffffff81121a90>] ? __slab_free+0x1cf/0x2a3
[Mon Oct  9 15:55:15 2017]  [<ffffffffa02d19f2>] ? qword_addhex+0x4c/0x7e [sunrpc]
[Mon Oct  9 15:55:15 2017]  [<ffffffffa0320f87>] ? rsi_request+0x36/0x45 [auth_rpcgss]
[Mon Oct  9 15:55:15 2017]  [<ffffffffa02d3550>] ? cache_read.isra.17+0x104/0x2a5 [sunrpc]
[Mon Oct  9 15:55:15 2017]  [<ffffffffa02d3721>] ? cache_read_procfs+0x30/0x35 [sunrpc]
[Mon Oct  9 15:55:15 2017]  [<ffffffff8117733a>] ? proc_reg_read+0x4e/0x63
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112ab9f>] ? __vfs_read+0x21/0x98
[Mon Oct  9 15:55:15 2017]  [<ffffffff811ef15c>] ? fsnotify_perm+0x60/0x6c
[Mon Oct  9 15:55:15 2017]  [<ffffffff811efef7>] ? security_file_permission+0x36/0x3d
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112b2b8>] ? rw_verify_area+0x7c/0xd1
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112b393>] ? vfs_read+0x86/0xc2
[Mon Oct  9 15:55:15 2017]  [<ffffffff8112bb7a>] ? SyS_read+0x48/0x73
[Mon Oct  9 15:55:15 2017]  [<ffffffff81482017>] ? entry_SYSCALL_64_fastpath+0x12/0x66

and then many more of the same.

So this is a long-standing bug, not a regression.
Comment 15 Steinar H. Gunderson 2017-10-11 07:24:23 UTC
It also died with 4.13.5; I thought I'd done CONFIG_PREEMPT=y, but evidently it hadn't gone through to the .config.

An interesting data point: Our Kerberos setup is Active Directory with Samba 4. We have a long-standing issue where suddenly some of the service principals are getting renewed even though they shouldn't, and yesterday, this hang coincided with such an incident. Could it be related?
Comment 16 Steinar H. Gunderson 2017-10-13 07:30:46 UTC
Hi;

We're now running 4.13.5 with CONFIG_PREEMPT=y. Today, the WARN_ONCE you asked me to add triggered, and rpc.svcgssd is using 100% CPU. Please advise. :-)

[157277.972046] ------------[ cut here ]------------
[157277.976866] WARNING: CPU: 0 PID: 3193 at net/sunrpc/cache.c:854 cache_read.isra.19+0x27e/0x2b2 [sunrpc]
[157277.991237] Modules linked in: cts rpcsec_gss_krb5 binfmt_misc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 af_packet ip6_tables xt_REDIRECT nf_nat_redirect iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack sb_edac x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd iTCO_wdt glue_helper iTCO_vendor_support tpm_tis tpm_tis_core ipmi_si ipmi_devintf tpm ipmi_msghandler button nfsd nfs_acl lockd grace auth_rpcgss oid_registry loop sunrpc ip_tables x_tables autofs4 raid10 raid0 multipath linear mpt3sas raid_class raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c raid1 sd_mod hid_generic igb i2c_i801 i2c_algo_bit ehci_pci crc32c_intel ptp i2c_core ehci_hcd
[157278.062193]  isci pps_core
[157278.066409] CPU: 0 PID: 3193 Comm: rpc.svcgssd Not tainted 4.13.5-itk02 #4
[157278.073968] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 2.0a 03/27/2013
[157278.081168] task: ffff8810596f1d00 task.stack: ffffc90008494000
[157278.095199] RIP: 0010:cache_read.isra.19+0x27e/0x2b2 [sunrpc]
[157278.101165] RSP: 0018:ffffc90008497d90 EFLAGS: 00010202
[157278.107223] RAX: 0000000080000000 RBX: 00000000fffffff5 RCX: 0000000000000001
[157278.115162] RDX: 0000000000000001 RSI: ffffc90008497db4 RDI: 00000000ffffffff
[157278.122753] RBP: ffffc90008497de8 R08: ffff880242af9fff R09: 0000000000000039
[157278.130828] R10: 0000000000000039 R11: ffff8810596f1d00 R12: 0000000000008000
[157278.138857] R13: ffff88085bbd6100 R14: ffff8801072f7f00 R15: ffff88085a85bae0
[157278.146876] FS:  00007f63e774cb80(0000) GS:ffff88085fa00000(0000) knlGS:0000000000000000
[157278.158522] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[157278.165037] CR2: 00007fbc6498b000 CR3: 000000105a757000 CR4: 00000000000406f0
[157278.172836] Call Trace:
[157278.176126]  cache_read_procfs+0x30/0x35 [sunrpc]
[157278.181469]  proc_reg_read+0x4e/0x63
[157278.190331]  __vfs_read+0x21/0x98
[157278.194563]  ? fsnotify_perm+0x60/0x6c
[157278.199088]  ? security_file_permission+0x36/0x3d
[157278.204951]  ? rw_verify_area+0x71/0xaa
[157278.209691]  vfs_read+0x9f/0xd7
[157278.213918]  SyS_read+0x48/0x73
[157278.217626]  entry_SYSCALL_64_fastpath+0x13/0x94
[157278.222501] RIP: 0033:0x7f63e64fe700
[157278.226987] RSP: 002b:00007fff08c6f2e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[157278.236390] RAX: ffffffffffffffda RBX: 0000557566a2d620 RCX: 00007f63e64fe700
[157278.249255] RDX: 0000000000008000 RSI: 00007fff08c71410 RDI: 0000000000000003
[157278.257809] RBP: 00007f63e67b9440 R08: 0000557566a1bdd0 R09: 00007f63e621ff00
[157278.266484] R10: 0000000000000020 R11: 0000000000000246 R12: 0000000000000000
[157278.275193] R13: 00007fff08c6f0c0 R14: 000000001443c51d R15: 0000000000000000
[157278.283142] Code: 85 fb e0 4c 89 f7 e8 63 85 fb e0 eb 0c 48 c7 c7 a0 25 1a a0 e8 23 f8 36 e1 83 fb f5 75 14 ff 45 bc 83 7d bc 64 0f 8e d5 fd ff ff <0f> ff e9 ce fd ff ff 48 8b 7d c0 e8 36 27 ef e0 48 63 c3 85 db 
[157278.303181] ---[ end trace 70b707cc726a612d ]---
Comment 17 Steinar H. Gunderson 2017-10-13 07:41:28 UTC
This is the file it's stuck on, I suppose (according to lsof):

rpc.svcgs 3193 root    3u   REG                0,4        0 4026532487 /proc/3193/net/rpc/auth.rpcsec.init/channel

I suppose the backtrace from perf isn't so interesting (I can't attach with gdb, so I thought it was an OK way to get out a stack trace), but just in case:

-  100,00%     0,00%  rpc.svcgssd  rpc.svcgssd        [.] handle_nullreq        ▒
     handle_nullreq                                                             ▒
     __GI___libc_read                                                           ▒
     entry_SYSCALL_64_fastpath                                                  ▒
     sys_read                                                                   ▒
     vfs_read                                                                   ▒
     __vfs_read                                                                 ▒
     proc_reg_read                                                              ▒
   - cache_read_procfs                                                          ▒
      - 99,87% cache_read.isra.19                                               ▒
         - 98,92% rsi_request                                                   ▒
              98,81% qword_addhex
Comment 18 bfields 2017-10-13 20:17:11 UTC
(In reply to Steinar H. Gunderson from comment #16)
> We're now running 4.13.5 with CONFIG_PREEMPT=y. Today, the WARN_ONCE you
> asked me to add triggered, and rpc.svcgssd is using 100% CPU. Please advise.
> :-)

So you were running with the patch from Comment 12?
Comment 19 Steinar H. Gunderson 2017-10-13 20:24:24 UTC
> So you were running with the patch from Comment 12?

Yes.

We eventually booted back into 4.4 today, since the looping kernel thread was making I/O unreasonably slow; at least it was more stable last time.
Comment 20 Steinar H. Gunderson 2018-01-28 00:37:23 UTC
It happened again today (rpc.svcgssd going unkillable to 100% CPU, server had to be rebooted)—we've had to upgrade to 4.15.0-rc9 to get Meltdown/Spectre patches, so we know that the bug is still there.

Interestingly enough, it appears to be correlated with a specific user logging in on a workstation (triggering automount mounting a given partition over Kerberized NFSv4). This user belongs to a lot of groups (214 of them!) -- could that be a clue?
Comment 21 bfields 2018-12-14 16:40:14 UTC
Sorry, just noticed this was left dangling.  Is this still happening?

(In reply to Steinar H. Gunderson from comment #20)
> Interestingly enough, it appears to be correlated with a specific user
> logging in on a workstation (triggering automount mounting a given partition
> over Kerberized NFSv4). This user belongs to a lot of groups (214 of them!)
> -- could that be a clue?

That could be making the init_sec_context tokens larger than expected, I guess, I seem to remember that could be the case depending on the kdc.  (What is your kdc?)
Comment 22 Steinar H. Gunderson 2018-12-14 16:44:57 UTC
It stopped happening after we switched to gssproxy. (It still happens if we don't use it.)

Like I said in one of the first emails, the KDC is Samba 4.
Comment 23 Trond Myklebust 2018-12-14 16:51:12 UTC
Both Active Directory and FreeIPA use PACs to pass group membership information in the kerberos ticket. That is known tto blow up the size of the init_sec_context...

Samba would, of course, be providing Active Directory.
Comment 24 bfields 2018-12-14 17:25:35 UTC
Yes, and the switch to gss-proxy was made in part to fix our inability to handle the large init_sec_context tokens generated by Active Directory for users in lots of groups.

Still, this RCU warning shouldn't be the result....

Looks like a too-large token should hit the SVC_DENIED case in gss_read_verf and prevent further processing, I wonder why that's not happening.
Comment 25 bfields 2022-01-21 18:35:26 UTC
There may still be some bug here for all I know, but perhaps it's just not a priority after the switch to gss-proxy.

Reopen if it becomes an issue again.