Over the past year or so, I've heard several reports of this workqueue splat during transport disconnect:

 workqueue: WQ_MEM_RECLAIM xprtiod:xprt_autoclose [sunrpc] is flushing !WQ_MEM_RECLAIM events_highpri:rpcrdma_mr_refresh_worker [rpcrdma]
 WARNING: CPU: 1 PID: 20378 at kernel/workqueue.c:3728 check_flush_dependency+0x101/0x120
  ? check_flush_dependency+0x101/0x120
  ? report_bug+0x175/0x1a0
  ? handle_bug+0x44/0x90
  ? exc_invalid_op+0x1c/0x70
  ? asm_exc_invalid_op+0x1f/0x30
  ? __pfx_rpcrdma_mr_refresh_worker+0x10/0x10 [rpcrdma aefd3d1b298311368fa14fa93ae5fb3818c3aeac]
  ? check_flush_dependency+0x101/0x120
  __flush_work.isra.0+0x20a/0x290
  __cancel_work_sync+0x129/0x1c0
  cancel_work_sync+0x14/0x20
  rpcrdma_xprt_disconnect+0x229/0x3f0 [rpcrdma aefd3d1b298311368fa14fa93ae5fb3818c3aeac]
  xprt_rdma_close+0x16/0x40 [rpcrdma aefd3d1b298311368fa14fa93ae5fb3818c3aeac]
  xprt_autoclose+0x63/0x110 [sunrpc a04d701bce94b5a8fb541cafbe1a489d6b1ab5b3]
  process_one_work+0x19e/0x3f0
  worker_thread+0x340/0x510
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xf7/0x130
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x41/0x60
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30

Though alarming, it's relatively harmless.

xprtiod is a WQ_MEM_RECLAIM work queue because reconnecting to an NFS server might be necessary to handle a direct reclaim. The MR refresh worker uses a !WQ_MEM_RECLAIM work queue because the RDMA core does not yet implement MEM_RECLAIM safety.

To address this splat in the short term, the work of releasing hardware-related resources in rpcrdma_xprt_disconnect() can be deferred to a !WQ_MEM_RECLAIM work queue. We might accomplish that by moving the req, rep, and sendctx data structures to struct rpcrdma_ep, and then releasing that structure during connection tear-down via a normal work queue item or via an RCU-controlled delay.

Because rpcrdma_ep_destroy() is called directly by rpcrdma_xprt_disconnect(), it's also a potential source of these splats (it invokes RDMA core API functions, which don't generally tolerate being called in a MEM_RECLAIM context). Perhaps the first step, then, is to move rpcrdma_ep_destroy() to a deferred context.
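
For illustration only, the dependency rule behind the splat can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel's check_flush_dependency(): it ignores details such as legacy work queues and PF_MEMALLOC callers, and every name in it (struct model_wq, MODEL_WQ_MEM_RECLAIM, model_flush_would_warn(), and the "teardown" queue) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's WQ_MEM_RECLAIM flag bit. */
#define MODEL_WQ_MEM_RECLAIM 0x1u

/* Hypothetical, minimal model of a work queue: a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Return true when flushing or cancelling a work item queued on @target
 * from a worker running on @flusher would trip the dependency check.
 * @flusher is NULL when the caller is not a work queue worker at all.
 */
static bool model_flush_would_warn(const struct model_wq *flusher,
				   const struct model_wq *target)
{
	/* Work on a MEM_RECLAIM queue is guaranteed forward progress. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return false;

	/* Non-worker callers are outside the scope of this model. */
	if (flusher == NULL)
		return false;

	/* A MEM_RECLAIM worker waiting on a !MEM_RECLAIM item: warn. */
	return (flusher->flags & MODEL_WQ_MEM_RECLAIM) != 0;
}

/* The queues named in the splat. */
static const struct model_wq model_xprtiod = {
	"xprtiod", MODEL_WQ_MEM_RECLAIM
};
static const struct model_wq model_events_highpri = {
	"events_highpri", 0
};

/* Hypothetical normal queue that would run the deferred tear-down. */
static const struct model_wq model_teardown_wq = {
	"teardown", 0
};
```

In this model, model_flush_would_warn(&model_xprtiod, &model_events_highpri) is true, which corresponds to the reported splat. The same cancel issued from model_teardown_wq returns false, which is why moving the resource release (and rpcrdma_ep_destroy()) out of xprtiod and into a normal work queue item would be expected to silence the warning.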