Bug 208685

Summary: kswapd warning at kernel/rcu/tree.c:1358 rcu_advance_cbs_nowake causing processes to enter d state
Product: Memory Management Reporter: Joe (joseph.feather)
Component: Slab AllocatorAssignee: Andrew Morton (akpm)
Status: NEW ---    
Severity: high CC: guillaume, matthew.fillo, timothy.heilman
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 5.4.51 Subsystem:
Regression: No Bisected commit-id:

Description Joe 2020-07-24 14:57:52 UTC
After upgrading our systems from 5.3.x to 5.4.51 we started experiencing problems where some processes would enter an uninterruptible sleep state. 

We do not have a swap enabled or a swap mount point.

We have tried updating to 5.7.10 and still experience the same issue. Downgrading to 5.3.x fixes the problem.

Looking through dmesg we found the following error message:
[819876.687024] ------------[ cut here ]------------
[819876.687038] WARNING: CPU: 22 PID: 434 at kernel/rcu/tree.c:1358 rcu_advance_cbs_nowake+0x53/0x60
[819876.687038] Modules linked in: binfmt_misc(E) intel_rapl_msr(E) dell_smbios(E) wmi_bmof(E) dell_wmi_descriptor(E) sr_mod(E) cdrom(E) dcdbas(E) xt_comment(E) xt_multiport(E) ip6t_rpfilter(E) ip6t_RE
JECT(E) nf_reject_ipv6(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_conntrack(E) ebtable_nat(E) ebtable_broute(E) ip6table_nat(E) ip6table_mangle(E) ip6table_security(E) ip6table_raw(E) iptable_nat(E) nf_nat(
E) iptable_mangle(E) iptable_security(E) iptable_raw(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) ip_set(E) nfnetlink(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable
_filter(E) intel_rapl_common(E) vfat(E) fat(E) skx_edac(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clm
ulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) glue_helper(E) wdat_wdt(E) pcspkr(E) sg(E) i2c_i801(E) lpc_ich(E) mei_me(E) mei(E) ioatdma(E) wmi(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E)
[819876.687069]  acpi_power_meter(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) crc32c_intel(E) megaraid_sas(E) mgag200(E) i40e(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fo
ps(E) igb(E) ptp(E) pps_core(E) drm_vram_helper(E) dca(E) ttm(E) ahci(E) libahci(E) drm(E) i2c_algo_bit(E) libata(E) uas(E) usb_storage(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
[819876.687087] CPU: 22 PID: 434 Comm: kswapd0 Tainted: G            E     5.4.51-1.el7.x86_64 #1
[819876.687087] Hardware name: Dell Inc. PowerEdge R740xd/0YNX56, BIOS 2.2.11 06/13/2019
[819876.687090] RIP: 0010:rcu_advance_cbs_nowake+0x53/0x60
[819876.687092] Code: 85 c0 74 f1 48 89 ee 48 89 df e8 68 ff ff ff 84 c0 75 17 48 89 df c6 07 00 0f 1f 40 00 5b 5d c3 48 89 f7 e8 1f df ff ff eb c3 <0f> 0b eb e5 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 55 53 48 89
[819876.687093] RSP: 0018:ffffaaf58d713a40 EFLAGS: 00010002
[819876.687094] RAX: 0000000000000001 RBX: ffffffff9a2655c0 RCX: 0000000008326518
[819876.687095] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffff9a265140
[819876.687096] RBP: ffff9547a0eeb2c0 R08: ffff95317eefb920 R09: 0000000130dae188
[819876.687096] R10: 0000000000000000 R11: 0000000000000001 R12: ffff953c2a0c1da8
[819876.687097] R13: 000000000002b2c0 R14: ffffaaf58d713b48 R15: ffff95317eefb6d8
[819876.687099] FS:  0000000000000000(0000) GS:ffff9547a0ec0000(0000) knlGS:0000000000000000
[819876.687100] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[819876.687101] CR2: 00007fc07fe2e300 CR3: 00000005ffe0a001 CR4: 00000000007606e0
[819876.687101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[819876.687102] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[819876.687103] PKRU: 55555554
[819876.687103] Call Trace:
[819876.687114]  ? radix_tree_node_ctor+0x90/0x90
[819876.687115]  __call_rcu+0x2aa/0x4e0
[819876.687120]  xas_free_nodes+0x8b/0xb0
[819876.687122]  xas_store+0x1a4/0x550
[819876.687130]  shadow_lru_isolate+0xed/0x170
[819876.687132]  ? scan_shadow_nodes+0x30/0x30
[819876.687134]  ? workingset_update_node+0x70/0x70
[819876.687136]  __list_lru_walk_one+0x6d/0x170
[819876.687138]  ? workingset_update_node+0x70/0x70
[819876.687140]  list_lru_walk_one_irq+0x4c/0x70
[819876.687145]  do_shrink_slab+0x150/0x310
[819876.687148]  shrink_slab+0x239/0x2d0
[819876.687151]  shrink_node+0xd5/0x450
[819876.687154]  balance_pgdat+0x275/0x560
[819876.687157]  kswapd+0x15a/0x3d0
[819876.687160]  ? remove_wait_queue+0x60/0x60
[819876.687166]  kthread+0xf8/0x130
[819876.687169]  ? balance_pgdat+0x560/0x560
[819876.687170]  ? kthread_bind+0x10/0x10
[819876.687173]  ret_from_fork+0x35/0x40
[819876.687175] ---[ end trace ff791da2812c6a3e ]---

After that message we see our process that is in D state:
[820150.702270] INFO: task python:11885 blocked for more than 122 seconds.
[820150.702315]       Tainted: G        W   E     5.4.51-1.el7.x86_64 #1
[820150.702348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[820150.702387] python          D    0 11885  11872 0x00000080
[820150.702390] Call Trace:
[820150.702401]  __schedule+0x2d1/0x6c0
[820150.702404]  schedule+0x39/0xa0
[820150.702406]  schedule_timeout+0x1c8/0x290
[820150.702413]  ? rcu_accelerate_cbs+0x61/0x180
[820150.702415]  wait_for_completion+0x123/0x190
[820150.702420]  ? wake_up_q+0x70/0x70
[820150.702422]  __wait_rcu_gp+0xfd/0x130
[820150.702425]  synchronize_rcu+0x4e/0x80
[820150.702427]  ? __call_rcu+0x4e0/0x4e0
[820150.702428]  ? __bpf_trace_rcu_utilization+0x10/0x10
[820150.702432]  packet_release+0x247/0x380
[820150.702437]  __sock_release+0x3d/0xc0
[820150.702439]  sock_close+0x11/0x20
[820150.702442]  __fput+0xbe/0x250
[820150.702446]  task_work_run+0x88/0xa0
[820150.702452]  exit_to_usermode_loop+0x77/0xc7
[820150.702454]  do_syscall_64+0x182/0x1b0
[820150.702457]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[820150.702459] RIP: 0033:0x7fd85ca297a0
[820150.702464] Code: Bad RIP value.
[820150.702465] RSP: 002b:00007ffd05cc66b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[820150.702466] RAX: 0000000000000000 RBX: 00007fd85d0cba50 RCX: 00007fd85ca297a0
[820150.702467] RDX: 0000000000000000 RSI: 00007fd85d06a210 RDI: 0000000000000003
[820150.702468] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007fd85d0cc1e7
[820150.702469] R10: 0000000000000083 R11: 0000000000000246 R12: 0000000000000007
[820150.702470] R13: 00007fd85d06a210 R14: 00007fd85ccd4590 R15: 00000000022750a0
Comment 1 guillaume 2021-10-05 17:18:42 UTC
https://lkml.org/lkml/2021/9/17/1048