A bug was introduced with commit fc800ec491c39e42b65df72dc9ede3bb2d4a3755: my NFS server (which supports Kerberos) locks up after a client has mounted volumes exported by the NFS server. Unsetting CONFIG_CGROUP_NET_PRIO or reverting fc800ec491c39e42b65df72dc9ede3bb2d4a3755 resolves the issue.

I get nothing on the console or in the logs once this occurs. The system stays locked until it finally restarts on its own or through user intervention.

The client machine I used for testing has CONFIG_CGROUP_NET_PRIO enabled in the kernel, but it causes no issues there. I suspect the Kerberos functionality has no impact on this issue, but I'm presently unable to disable it. I'm not actually using the functionality provided by CONFIG_CGROUP_NET_PRIO, so I've disabled it for now.

All kernels since v5.4.42 are affected. Disabling this feature or reverting fc800ec491c39e42b65df72dc9ede3bb2d4a3755 allows 5.4.42+ to work. I'm currently running stable with v5.4.45. The 5.5 and 5.6 series are unsurprisingly affected, and I suspect both series would work with either of the fixes above.

Some info about the NFS server:

Debian 10 Buster
AMD EPYC 3251 8-core processor
128GB memory
Intel X540 10GB PCIe NIC (only 1 port is used)
2x Intel Corporation I350 Gigabit onboard NICs (both are used)

All NICs are used in individual bridge interfaces for a total of 3 interfaces (br0, br1, br2) to facilitate virtualization.

More info furnished upon request.
Do you have LOCKDEP enabled, that is CONFIG_LOCKDEP=y in your kernel config? It helps a lot to debug deadlocks. If not, can you enable it and test again? How reproducible is this? If you can find a minimum reproducer, that would help a lot to narrow down the problem. Thanks.
And make sure you have lockup detectors enabled:

CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
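A quick way to confirm that the options requested in this thread are really set is to match the exact symbol names against the kernel config file. The following is only a minimal sketch: the helper names `parse_kconfig` and `missing_options` are made up for illustration, and it assumes the usual `.config` text format (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`). Matching on the full name matters because similarly named symbols such as CONFIG_LOCKDEP_SUPPORT are different options from CONFIG_LOCKDEP.

```python
import re

# Options requested in this thread. Names are matched exactly, so
# CONFIG_LOCKDEP_SUPPORT does not satisfy a check for CONFIG_LOCKDEP.
WANTED = (
    "CONFIG_LOCKDEP",
    "CONFIG_LOCKUP_DETECTOR",
    "CONFIG_SOFTLOCKUP_DETECTOR",
    "CONFIG_HARDLOCKUP_DETECTOR",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map option name -> value string, or None for '# ... is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = None
    return options


def missing_options(text, wanted=WANTED):
    """Return the wanted options that are not set to 'y', in order."""
    options = parse_kconfig(text)
    return [name for name in wanted if options.get(name) != "y"]
```

Calling `missing_options(open(path).read())` on a copy of the kernel config (the path is whatever applies on your system) returns an empty list when all four options are built in.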
LOCKDEP support is definitely already enabled:

CONFIG_LOCKDEP_SUPPORT=y

It's very reproducible, but slightly intermittent. Sometimes the box will stay up for as long as 30 minutes, but usually I'm able to get it to lock up in under a minute.

I am mounting the volumes via automount, and I'm using NFSv4. The problem seems to occur no matter if it's 4.0, 4.1, or 4.2. I haven't tried NFSv3, as it doesn't appear to be an issue caused by NFS at this point.

How I reproduce: I simply rm the kerberos file for my user from /tmp, mount the volumes by trying to access them as a regular user (I'm using automounter), and if it hasn't locked up within around 10 seconds after mounting the volumes, I umount them all and repeat the process. 95% of the time, I'm able to reproduce the crash after repeating this process 1-3 times. Removing the kerberos file is probably unnecessary.

If I get a crash, clearly the kernel is bad. If I don't, I try to leave the box up for at least an hour. My initial bisection attempt failed because I wasn't able to reproduce the issue within the first 10 minutes for what turned out to be a bad revision. Even though it doesn't lock up nearly as fast ~5% of the time, the box will still eventually lock up given enough time.

Let me know if there are any other kernel options you'd like me to check for.

Thanks!

On 6/9/2020 10:49 AM, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=208107
>
> Cong Wang (xiyou.wangcong@gmail.com) changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |xiyou.wangcong@gmail.com
>
> --- Comment #1 from Cong Wang (xiyou.wangcong@gmail.com) ---
> Do you have LOCKDEP enabled, that is CONFIG_LOCKDEP=y in your kernel config? It
> helps a lot to debug deadlocks. If not, can you enable it and test again?
>
> How reproducible is this? If you can find a minimum reproducer, that would help
> a lot to narrow down the problem.
>
> Thanks.
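The manual reproduction steps described in the reply above (access the automounted paths, wait about 10 seconds, unmount, repeat) could be scripted on the client roughly as follows. This is a sketch under stated assumptions, not the reporter's actual procedure: the function name `reproduce`, the use of `ls` to trigger the automount, and the mount point paths are all hypothetical, and the Kerberos file removal is left out since it was described as probably unnecessary. The command runner and sleep function are injectable so the loop can be dry-run without touching any mounts.

```python
import subprocess
import time


def run_cmd(argv):
    """Default runner: execute a command and return its exit status."""
    return subprocess.run(argv, check=False).returncode


def reproduce(mount_points, cycles=3, settle_seconds=10,
              run=run_cmd, sleep=time.sleep):
    """Access each automounted path, wait, unmount everything, repeat.

    Returns the number of completed cycles. If the server hangs, the
    commands issued here may block, so the return value is mostly
    meaningful on a kernel that survives the test.
    """
    completed = 0
    for _ in range(cycles):
        for path in mount_points:
            run(["ls", path])      # assumption: listing the path triggers automount
        sleep(settle_seconds)      # lockups were usually seen within ~10 s
        for path in mount_points:
            run(["umount", path])  # requires sufficient privileges
        completed += 1
    return completed
```

For example, `reproduce(["/mnt/vol1", "/mnt/vol2"])` would run three cycles against two hypothetical mount points. Because a bad kernel was reported to survive for 30 minutes or more in some runs, a clean pass of this loop alone does not prove a kernel is good.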
I enabled those options, but I still get no output on the console or in the logs. I also sent you my kernel config directly in case it would be faster to figure out what, if anything, is still missing.
I set up kernel crash dumps. Here's the relevant output of the dmesg captured from the crash:

[ 457.038422] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 457.038479] #PF: supervisor read access in kernel mode
[ 457.038513] #PF: error_code(0x0000) - not-present page
[ 457.038547] PGD 0 P4D 0
[ 457.038568] Oops: 0000 [#1] SMP
[ 457.038592] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.4.46-broken #2
[ 457.038640] Hardware name: Supermicro Super Server/M11SDV-8C+-LN4F, BIOS 1.0 01/30/2019
[ 457.038696] RIP: 0010:__cgroup_bpf_run_filter_skb+0xe7/0x3e0
[ 457.038735] Code: 4e 70 41 2b 4e 74 48 89 5c 24 20 48 01 c8 41 83 fd 01 49 89 46 50 0f 84 96 01 00 00 44 89 ea 48 8d 84 d6 18 06 00 00 48 8b 00 <4c> 8b 78 10 4c 8d 68 10 4d 85 ff 0f 84 c8 02 00 00 49 8d 46 30 bb
[ 457.038845] RSP: 0018:ffff99b39ecc5780 EFLAGS: 00010297
[ 457.038878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000013c
[ 457.038922] RDX: 0000000000000000 RSI: ffff99b36aebc000 RDI: ffff99b33fad3400
[ 457.038966] RBP: ffff99b33fad3400 R08: 0000000000000001 R09: ffff99b24684e500
[ 457.039009] R10: 0000000000000000 R11: ffff99b397d800a0 R12: ffff99b33fad3400
[ 457.039053] R13: 0000000000000000 R14: ffff99b24684e500 R15: ffff99b35e4f60e2
[ 457.039097] FS: 0000000000000000(0000) GS:ffff99b39ecc0000(0000) knlGS:0000000000000000
[ 457.039146] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 457.039182] CR2: 0000000000000010 CR3: 0000001d9e26b000 CR4: 00000000003406e0
[ 457.039225] Call Trace:
[ 457.039245] <IRQ>
[ 457.039265] ? ixgbe_xmit_frame_ring+0x509/0xea0
[ 457.039299] sk_filter_trim_cap+0x10c/0x230
[ 457.039331] ? tcp_v4_inbound_md5_hash+0x58/0x190
[ 457.039362] tcp_v4_rcv+0xa66/0xc50
[ 457.039391] ip_protocol_deliver_rcu+0x2c/0x1c0
[ 457.039423] ip_local_deliver_finish+0x44/0x50
[ 457.039453] ip_local_deliver+0xe0/0xf0
[ 457.039481] ? ip_protocol_deliver_rcu+0x1c0/0x1c0
[ 457.039516] ip_sabotage_in+0x55/0x60
[ 457.041021] nf_hook_slow+0x52/0xd0
[ 457.042513] ip_rcv+0x9c/0xe0
[ 457.043994] ? ip_rcv_finish_core.isra.22+0x3b0/0x3b0
[ 457.045474] __netif_receive_skb_one_core+0x85/0xa0
[ 457.046939] netif_receive_skb_internal+0x2f/0xa0
[ 457.048401] netif_receive_skb+0x1b/0xb0
[ 457.049816] br_pass_frame_up+0x104/0x110
[ 457.051188] ? br_handle_local_finish+0x20/0x20
[ 457.052544] br_handle_frame_finish+0x2b3/0x420
[ 457.053896] ? br_nf_forward_finish+0x129/0x1b0
[ 457.055240] ? br_dev_queue_push_xmit+0x150/0x150
[ 457.056546] ? br_pass_frame_up+0x110/0x110
[ 457.057808] br_nf_hook_thresh+0xda/0xf0
[ 457.059049] ? br_pass_frame_up+0x110/0x110
[ 457.060286] br_nf_pre_routing_finish+0x142/0x340
[ 457.061524] ? br_pass_frame_up+0x110/0x110
[ 457.062725] ? nf_nat_ipv4_in+0x2d/0x80 [nf_nat]
[ 457.063885] br_nf_pre_routing+0x224/0x4e8
[ 457.065023] ? br_nf_forward_ip+0x480/0x480
[ 457.066146] br_handle_frame+0x1d4/0x370
[ 457.067266] ? br_pass_frame_up+0x110/0x110
[ 457.068381] __netif_receive_skb_core+0x283/0xc50
[ 457.069507] __netif_receive_skb_one_core+0x3c/0xa0
[ 457.070610] netif_receive_skb_internal+0x2f/0xa0
[ 457.071696] napi_gro_receive+0xed/0x150
[ 457.072761] ixgbe_poll+0x6f1/0x1280
[ 457.073791] ? enqueue_entity+0x410/0x8f0
[ 457.074786] ? check_preempt_curr+0x7a/0x90
[ 457.075757] net_rx_action+0x136/0x370
[ 457.076716] __do_softirq+0xda/0x2d1
[ 457.077667] irq_exit+0xa5/0xb0
[ 457.078605] do_IRQ+0x59/0xf0
[ 457.079529] common_interrupt+0xf/0xf
[ 457.080455] </IRQ>
[ 457.081360] RIP: 0010:cpuidle_enter_state+0xb4/0x440
[ 457.082269] Code: 24 0f 1f 44 00 00 31 ff e8 69 7c 90 ff 80 7c 24 13 00 74 12 9c 58 f6 c4 02 0f 85 5c 03 00 00 31 ff e8 30 90 96 ff fb 45 85 e4 <0f> 88 8f 02 00 00 49 63 cc 48 8b 34 24 48 2b 74 24 08 48 8d 04 49
[ 457.084188] RSP: 0018:ffff99b398aabe78 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffda
[ 457.085182] RAX: ffff99b39ece8f00 RBX: ffffffffb52c0b80 RCX: 000000000000001f
[ 457.086190] RDX: 0000006a699c1a5b RSI: 000000003333348b RDI: 0000000000000000
[ 457.087203] RBP: ffff99b393419400 R08: 0000000000000002 R09: 0000000000028780
[ 457.088217] R10: 00000159023d5e8a R11: ffff99b39ece7fc0 R12: 0000000000000002
[ 457.089237] R13: ffffffffb52c0c58 R14: 0000000000000002 R15: 0000000000000000
[ 457.090265] ? cpuidle_enter_state+0x97/0x440
[ 457.091294] cpuidle_enter+0x35/0x50
[ 457.092328] do_idle+0x1f8/0x230
[ 457.093353] cpu_startup_entry+0x20/0x30
[ 457.094381] start_secondary+0x143/0x170
[ 457.095407] secondary_startup_64+0xa4/0xb0
[ 457.096436] Modules linked in: vhost_net vhost tap xt_conntrack nft_counter nft_chain_nat xt_MASQUERADE nf_nat nft_compat nf_tables nfnetlink ipmi_si ipmi_devintf ipmi_msghandler btrfs zstd_decompress zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear raid1 md_mod
[ 457.099853] CR2: 0000000000000010

Here's the result of the crash command:

crash /usr/lib/debug/lib/modules/5.4.46-broken/vmlinux /var/crash/202006121154/dump.202006121154

crash 7.2.5
Copyright (C) 2002-2019 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [814MB]: patching 103638 gdb minimal_symbol values

WARNING: could not find MAGIC_START!
crash: page excluded: kernel virtual address: ffffffffb40f8560 type: "framepointer check"
crash: page excluded: kernel virtual address: ffffffffb485ebb0 type: "gdb_readmem_callback"
crash: recursive temporary file usage
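As a sanity check on the oops above, the register dump can be compared with the reported fault address. The sketch below is an illustration only: the excerpt is copied from the dmesg output in this report, `parse_registers` is a made-up helper, and the 0x10 displacement is an assumption based on reading the marked bytes `<4c> 8b 78 10` in the Code line as `mov 0x10(%rax),%r15`. If that reading is right, a NULL RAX would give exactly the faulting address in CR2. It says nothing about why the pointer was NULL.

```python
import re

# Excerpt of the first register dump from the oops in this report.
OOPS_EXCERPT = """\
RIP: 0010:__cgroup_bpf_run_filter_skb+0xe7/0x3e0
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000013c
RDX: 0000000000000000 RSI: ffff99b36aebc000 RDI: ffff99b33fad3400
CR2: 0000000000000010 CR3: 0000001d9e26b000 CR4: 00000000003406e0
"""


def parse_registers(text):
    """Return {register name: int} for every 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS_EXCERPT)

# Assumption: displacement taken from a manual decode of the faulting
# instruction bytes, not from a disassembler run against this vmlinux.
DISPLACEMENT = 0x10

fault_matches = regs["CR2"] == regs["RAX"] + DISPLACEMENT
print(f"RAX={regs['RAX']:#x} CR2={regs['CR2']:#x} match={fault_matches}")
```

The RIP line is skipped by the parser because its value is a segment:symbol pair rather than a 16-digit hex number.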
Created attachment 289707
The patch fixes the issue

This is the patch Cong Wang provided to me. Apparently this was a known upstream issue. My box has been up for ~4 days now using this patch, with this feature enabled in the kernel, without issue.