Bug 219737

Summary: warning in nfsd4_cb_done
Product: File System Reporter: rik.theys
Component: NFSDAssignee: Filesystem/NFSD virtual assignee (filesystem_nfsd)
Status: NEW ---    
Severity: normal CC: cel, chuck.lever, jlayton
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:

Description rik.theys 2025-01-30 14:00:07 UTC
+++ This bug was initially created as a clone of Bug #218735 +++

The following warning was triggered while running 6.12.11:

[Wed Jan 29 10:11:17 2025] cb_status=-521 tk_status=-10036
[Wed Jan 29 10:11:17 2025] WARNING: CPU: 16 PID: 1670626 at fs/nfsd/nfs4callback.c:1339 nfsd4_cb_done+0x171/0x180 [nfsd]
[Wed Jan 29 10:11:17 2025] Modules linked in: dm_snapshot(E) nfsv4(E) dns_resolver(E) nfs(E) netfs(E) binfmt_misc(E) rpcsec_gss_krb5(E) rpcrdma(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) bonding(E) tls(E) rfkill(E) nft_ct(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nf_tables(E) nfnetlink(E) vfat(E) fat(E) dm_thin_pool(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) libcrc32c(E) dm_service_time(E) dm_multipath(E) intel_rapl_msr(E) intel_rapl_common(E) intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) skx_edac_common(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) mgag200(E) kvm(E) i2c_algo_bit(E) dell_pc(E) rapl(E) drm_shmem_helper(E) ipmi_ssif(E) intel_cstate(E) platform_profile(E) dell_smbios(E) dcdbas(E) dell_wmi_descriptor(E) intel_uncore(E) wmi_bmof(E) pcspkr(E) drm_kms_helper(E) i2c_i801(E) mei_me(E) mei(E) intel_pch_thermal(E) i2c_mux(E) lpc_ich(E) i2c_smbus(E) joydev(E)
[Wed Jan 29 10:11:17 2025]  acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) ipmi_devintf(E) ipmi_msghandler(E) nfsd(E) nfs_acl(E) lockd(E) auth_rpcgss(E) grace(E) fuse(E) drm(E) sunrpc(E) ext4(E) mbcache(E) jbd2(E) lpfc(E) sd_mod(E) sg(E) nvmet_fc(E) nvmet(E) nvme_fc(E) ahci(E) crct10dif_pclmul(E) nvme_fabrics(E) crc32_pclmul(E) libahci(E) crc32c_intel(E) polyval_clmulni(E) nvme_core(E) megaraid_sas(E) ixgbe(E) polyval_generic(E) ghash_clmulni_intel(E) mdio(E) libata(E) wdat_wdt(E) scsi_transport_fc(E) dca(E) wmi(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
[Wed Jan 29 10:11:17 2025] CPU: 16 UID: 0 PID: 1670626 Comm: kworker/u193:0 Kdump: loaded Tainted: G            E      6.12.11-1.el9.esat.x86_64 #1
[Wed Jan 29 10:11:17 2025] Tainted: [E]=UNSIGNED_MODULE
[Wed Jan 29 10:11:17 2025] Hardware name: Dell Inc. PowerEdge R740/00WGD1, BIOS 2.20.1 09/13/2023
[Wed Jan 29 10:11:17 2025] Workqueue: rpciod rpc_async_schedule [sunrpc]
[Wed Jan 29 10:11:17 2025] RIP: 0010:nfsd4_cb_done+0x171/0x180 [nfsd]
[Wed Jan 29 10:11:17 2025] Code: 0f 1f 44 00 00 e9 1d ff ff ff 80 3d 1c a7 01 00 00 0f 85 d9 fe ff ff 48 c7 c7 e5 b2 06 c1 c6 05 08 a7 01 00 01 e8 1f 4f ef d4 <0f> 0b 8b 75 54 e9 bc fe ff ff 0f 0b 0f 1f 00 90 90 90 90 90 90 90
[Wed Jan 29 10:11:17 2025] RSP: 0018:ffffa469b58c7e08 EFLAGS: 00010282
[Wed Jan 29 10:11:17 2025] RAX: 0000000000000000 RBX: ffff8a8f13ef6400 RCX: 0000000000000000
[Wed Jan 29 10:11:17 2025] RDX: 0000000000000002 RSI: ffffffff97398443 RDI: 00000000ffffffff
[Wed Jan 29 10:11:17 2025] RBP: ffff8a8c574515b8 R08: 0000000000000000 R09: ffffa469b58c7cb0
[Wed Jan 29 10:11:17 2025] R10: ffffa469b58c7ca8 R11: ffffffff97fdf688 R12: ffff8a7548f73f60
[Wed Jan 29 10:11:17 2025] R13: ffff8a8f13ef6400 R14: 0000000004248060 R15: ffffffffc0d66a40
[Wed Jan 29 10:11:17 2025] FS:  0000000000000000(0000) GS:ffff8a8ae0a00000(0000) knlGS:0000000000000000
[Wed Jan 29 10:11:17 2025] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Jan 29 10:11:17 2025] CR2: 00007f8a14576160 CR3: 000000274fa20004 CR4: 00000000007726f0
[Wed Jan 29 10:11:17 2025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Jan 29 10:11:17 2025] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Wed Jan 29 10:11:17 2025] PKRU: 55555554
[Wed Jan 29 10:11:17 2025] Call Trace:
[Wed Jan 29 10:11:17 2025]  <TASK>
[Wed Jan 29 10:11:17 2025]  ? __warn+0x84/0x130
[Wed Jan 29 10:11:17 2025]  ? nfsd4_cb_done+0x171/0x180 [nfsd]
[Wed Jan 29 10:11:17 2025]  ? report_bug+0x1c3/0x1d0
[Wed Jan 29 10:11:17 2025]  ? handle_bug+0x5b/0xa0
[Wed Jan 29 10:11:17 2025]  ? exc_invalid_op+0x14/0x70
[Wed Jan 29 10:11:17 2025]  ? asm_exc_invalid_op+0x16/0x20
[Wed Jan 29 10:11:17 2025]  ? __pfx_rpc_exit_task+0x10/0x10 [sunrpc]
[Wed Jan 29 10:11:17 2025]  ? nfsd4_cb_done+0x171/0x180 [nfsd]
[Wed Jan 29 10:11:17 2025]  ? nfsd4_cb_done+0x171/0x180 [nfsd]
[Wed Jan 29 10:11:17 2025]  rpc_exit_task+0x5b/0x170 [sunrpc]
[Wed Jan 29 10:11:17 2025]  __rpc_execute+0x9f/0x370 [sunrpc]
[Wed Jan 29 10:11:17 2025]  rpc_async_schedule+0x2b/0x40 [sunrpc]
[Wed Jan 29 10:11:17 2025]  process_one_work+0x179/0x390
[Wed Jan 29 10:11:17 2025]  worker_thread+0x239/0x340
[Wed Jan 29 10:11:17 2025]  ? __pfx_worker_thread+0x10/0x10
[Wed Jan 29 10:11:17 2025]  kthread+0xdb/0x110
[Wed Jan 29 10:11:17 2025]  ? __pfx_kthread+0x10/0x10
[Wed Jan 29 10:11:17 2025]  ret_from_fork+0x2d/0x50
[Wed Jan 29 10:11:17 2025]  ? __pfx_kthread+0x10/0x10
[Wed Jan 29 10:11:17 2025]  ret_from_fork_asm+0x1a/0x30
[Wed Jan 29 10:11:17 2025]  </TASK>
[Wed Jan 29 10:11:17 2025] ---[ end trace 0000000000000000 ]---

This was also reported in Bug #218735 but as it isn't certain that this warning is related to the callback workqueue stall from that bug, it was recommended to file a separate bug.
Comment 1 Chuck Lever 2025-01-30 14:08:15 UTC
I recommended a separate bug because this looks to me like the CB_GETATTR reply handling path needs some attention. I'm not sure the problems here will result in the hanging symptoms seen in Bug #218735.

First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR reply decoder. Should be EIO instead.

Second issue is the CB_GETATTR reply decoder does not seem capable of handling a non-zero status code in the reply.

Third issue is whether NFS4ERR_BADHANDLE means the server requested a CB_GETATTR for the wrong file, or if it is an expected situation.
Comment 2 rik.theys 2025-01-30 14:17:39 UTC
Hi,

It seems I've cloned the wrong bug. It should have been https://bugzilla.kernel.org/show_bug.cgi?id=219710 instead.

As mentioned by Chuck in https://bugzilla.kernel.org/show_bug.cgi?id=219710#c25 there are multiple events in the trace, but there is only a single WARNING logged? So is there still something different between the various events, or is the warning only logged once?

Regards,
Rik
Comment 3 Bugspray Bot 2025-01-30 15:10:03 UTC
Olga Kornievskaia <aglo@umich.edu> replies to comment #1:

On Thu, Jan 30, 2025 at 9:09 AM Chuck Lever via Bugspray Bot
<bugbot@kernel.org> wrote:
>
> Chuck Lever writes via Kernel.org Bugzilla:
>
> I recommended a separate bug because this looks to me like the CB_GETATTR
> reply handling path needs some attention. I'm not sure the problems here will
> result in the hanging symptoms seen in Bug #218735.
>
> First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR reply
> decoder. Should be EIO instead.
>
> Second issue is the CB_GETATTR reply decoder does not seem capable of
> handling a non-zero status code in the reply.
>
> Third issue is whether NFS4ERR_BADHANDLE means the server requested a
> CB_GETATTR for the wrong file, or if it is an expected situation.

Isn't this because 6.12.x is still missing the patch "NFSD: fix
decoding in nfs4_xdr_dec_cb_getattr" that just went into 6.14?

> View: https://bugzilla.kernel.org/show_bug.cgi?id=219737#c1
> You can reply to this message to join the discussion.
> --
> Deet-doot-dot, I am a bot.
> Kernel.org Bugzilla (bugspray 0.1-dev)
>
>

(via https://msgid.link/CAN-5tyFeGvihaucNL1rPPhbkxM04gMfOvDz04V0%2B-05mLbCS4w@mail.gmail.com)
Comment 4 Chuck Lever 2025-01-30 15:16:40 UTC
(In reply to Bugspray Bot from comment #3)
> Olga Kornievskaia <aglo@umich.edu> replies to comment #1:
> 
> > First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR reply
> > decoder. Should be EIO instead.
> >
> > Second issue is the CB_GETATTR reply decoder does not seem capable of
> > handling a non-zero status code in the reply.
> >
> > Third issue is whether NFS4ERR_BADHANDLE means the server requested a
> > CB_GETATTR for the wrong file, or if it is an expected situation.
> 
> Isn't this because 6.12.x is still missing the patch "NFSD: fix
> decoding in nfs4_xdr_dec_cb_getattr" that just went into 6.14?

Yes, second and third issues are addressed by 1b3e26a5ccbf ("NFSD: fix decoding in nfs4_xdr_dec_cb_getattr").

The first issue has not yet been addressed upstream.
Comment 5 Chuck Lever 2025-01-30 15:17:58 UTC
(In reply to Chuck Lever from comment #4)
> (In reply to Bugspray Bot from comment #3)
> > Olga Kornievskaia <aglo@umich.edu> replies to comment #1:
> > 
> > > First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR
> reply
> > > decoder. Should be EIO instead.
> > >
> > > Second issue is the CB_GETATTR reply decoder does not seem capable of
> > > handling a non-zero status code in the reply.
> > >
> > > Third issue is whether NFS4ERR_BADHANDLE means the server requested a
> > > CB_GETATTR for the wrong file, or if it is an expected situation.
> > 
> > Isn't this because 6.12.x is still missing the patch "NFSD: fix
> > decoding in nfs4_xdr_dec_cb_getattr" that just went into 6.14?
> 
> Yes, second and third issues are addressed by 1b3e26a5ccbf ("NFSD: fix
> decoding in nfs4_xdr_dec_cb_getattr").
> 
> The first issue has not yet been addressed upstream.

The first issue isn't a big deal, though. Applying 1b3e26a5ccbf should address this bug.
Comment 6 Chuck Lever 2025-01-30 15:52:15 UTC
(In reply to Chuck Lever from comment #4)
> (In reply to Bugspray Bot from comment #3)
> > Olga Kornievskaia <aglo@umich.edu> replies to comment #1:
> > 
> > > First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR
> reply
> > > decoder. Should be EIO instead.
> > >
> > > Second issue is the CB_GETATTR reply decoder does not seem capable of
> > > handling a non-zero status code in the reply.
> > >
> > > Third issue is whether NFS4ERR_BADHANDLE means the server requested a
> > > CB_GETATTR for the wrong file, or if it is an expected situation.
> > 
> > Isn't this because 6.12.x is still missing the patch "NFSD: fix
> > decoding in nfs4_xdr_dec_cb_getattr" that just went into 6.14?
> 
> Yes, second and third issues are addressed by 1b3e26a5ccbf ("NFSD: fix
> decoding in nfs4_xdr_dec_cb_getattr").

I take it back: The third issue is not addressed by 1b3e26a5ccbf. Why is the server sending the client a delegation that it rejects? And does the server recover properly in this case?

But again, 1b3e26a5ccbf should prevent the warning reported here.
Comment 7 rik.theys 2025-02-04 08:39:44 UTC
Hi,

(In reply to Chuck Lever from comment #4)
> (In reply to Bugspray Bot from comment #3)
> > Olga Kornievskaia <aglo@umich.edu> replies to comment #1:
> > 
> > > First issue is the explicit use of NFS4ERR_BAD_XDR in the CB_GETATTR
> reply
> > > decoder. Should be EIO instead.
> > >
> > > Second issue is the CB_GETATTR reply decoder does not seem capable of
> > > handling a non-zero status code in the reply.
> > >
> > > Third issue is whether NFS4ERR_BADHANDLE means the server requested a
> > > CB_GETATTR for the wrong file, or if it is an expected situation.
> > 
> > Isn't this because 6.12.x is still missing the patch "NFSD: fix
> > decoding in nfs4_xdr_dec_cb_getattr" that just went into 6.14?
> 
> Yes, second and third issues are addressed by 1b3e26a5ccbf ("NFSD: fix
> decoding in nfs4_xdr_dec_cb_getattr").
> 
> The first issue has not yet been addressed upstream.

Is it possible this patch has not (yet) been sent to stable@vger.kernel.org so it ends up in 6.12.y?

Regards,
Rik
Comment 8 Chuck Lever 2025-02-04 13:28:52 UTC
(In reply to rik.theys from comment #7)
> Is it possible this patch has not (yet) been sent to stable@vger.kernel.org
> so it ends up in 6.12.y?

Commit 1b3e26a5ccbf has been in a publicly released kernel for exactly two days (as of this writing). It will take some time before it appears in any LTS kernel. For now, if you would like to test the commit, you need to apply it manually.
Comment 9 Jeff Layton 2025-02-10 15:42:07 UTC
FYI:

521 == EBADHANDLE
10036 == NFS4ERR_BADXDR
3 == CB_GETATTR

The cb_status gets set via decode_cb_op_status() during the decode phase and is not cleared in nfsd4_cb_prepare.

Note that the code will call rpc_restart_call() in the NFS4ERR_DELAY case, which skips the  rpc_call_prepare phase when the task is restarted. So, I think this may go something like this:

Get a NFS4ERR_DELAY in a callback operation, that restarts the call but doesn't clear cb_status. Then on the next attempt, it gets a non-zero tk_status (and probably skips the decode phase). That would cause this warning to pop, I think.

In practice, I'm warming up to the idea that this entire if statement is just bogus:

        if (cb->cb_status) {
                WARN_ONCE(task->tk_status,
                          "cb_status=%d tk_status=%d cb_opcode=%d",
                          cb->cb_status, task->tk_status, cb->cb_ops->opcode);
                task->tk_status = cb->cb_status;
        }

Consider the case where we successfully decode a cb_status value but then some later field in the reply was corrupt. We will have set the cb_status, but the decode will fail which will cause the tk_status to be non-zero, and this warning will pop.

I think we should probably just remove the entire if statement above.
Comment 10 Jeff Layton 2025-02-10 16:30:35 UTC
(In reply to Chuck Lever from comment #8)
> (In reply to rik.theys from comment #7)
> > Is it possible this patch has not (yet) been sent to stable@vger.kernel.org
> > so it ends up in 6.12.y?
> 
> Commit 1b3e26a5ccbf has been in a publicly released kernel for exactly two
> days (as of this writing). It will take some time before it appears in any
> LTS kernel. For now, if you would like to test the commit, you need to apply
> it manually.

Now that I look, 1b3e26a5ccbf is wrong. The patch on the ml was correct, but the one that got committed is different. It should be:

    status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status);
    if (unlikely(status || cb->cb_status))

If "status" is non-zero, decoding failed (usu. BADXDR), but we also want to bail out and not decode the rest of the call if the decoded cb_status is non-zero. That's not happening here, cb_seq_status has already been checked and is non-zero, so this ends up trying to decode the rest of the CB_GETATTR reply when it doesn't exist.
Comment 11 Chuck Lever 2025-02-10 16:41:45 UTC
Also, -NFS4ERR_BADXDR is the wrong status code to use here. The correct code to use is -EIO. We're decoding the callback server's response, not a call (where BADXDR would indeed be appropriate).

For comparison, see how the NFS client handles a server response it cannot decode.
Comment 12 rik.theys 2025-02-24 11:52:39 UTC
Hi,

We're currently running 6.12.13 with the following 4 patches on top:

1. 1b3e26a5ccbfc2f85bda1930cc278e313165e353: NFSD: fix decoding in nfs4_xdr_dec_cb_getattr

This patch went into 6.13-rc7, but I don't think it was CC'ed to stable?
I also don't see it in nfsd-6.12.y branch on https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/log
Is it not needed on 6.12?

2. cedfbb92cf97a6bff3d25633001d9c44442ee854: NFSD: fix hang in nfsd4_shutdown_callback

This one is now in 6.12.16

3. 99e98a2312c8d08fba60d548009c03e7cfb1bf6b: NFSD: Skip sending CB_RECALL_ANY when the backchannel isn't up

This one seems to have a CC: stable but I assume it's not yet there as it isn't included in the mainline kernel yet? Is it expected to be merged into 6.14-rcX, or will it have to wait for 6.15?

4. 4990d098433db18c854e75fb0f90d941eb7d479e: NFSD: Fix CB_GETATTR status fix

This one seems to be in 6.14-rc, but also doesn't have a CC: stable? It fixes 1b3e26a5ccbfc2f85bda1930cc278e313165e353 above.


Are there any plans to include these patches in the 6.12.y kernel?

Some of the patches mentioned above are in the nfsd-{fixes,testing,next} branches on https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git, but I don't know what all of these branches are used for. Is the nfsd-next branch what's expected to land in the next upstream kernel (6.15-rc1), or also the current rc kernels (6.14-rcX)?

What is the difference between the nfsd-testing and nfsd-fixes branch?


This kernel seems to be running stable for now, but it's too soon to conclude it fixes the issue that sometimes comes up. It does show the following kernel message sometimes:

rpc-srv/tcp: nfsd: sent 106256 when sending 131204 bytes - shutting down socket

I don't recall seeing this message with previous (6.11 and earlier) kernels. Does this message mean a single connection to a client was shut down? Is this a message we can ignore?

Regards,
Rik
Comment 13 Jeff Layton 2025-02-24 12:15:46 UTC
> rpc-srv/tcp: nfsd: sent 106256 when sending 131204 bytes - shutting down
> socket

This is unrelated to the original bug report.

It means that the kernel tried to send 131204 bytes on the socket, but only 106256  was sent. It was a short send, IOW. Usually that means the TCP buffers are full and couldn't hold the whole message, possibly because the receiver isn't ACKing things quickly enough.

To try and remedy this, the server shuts down the socket. That should be harmless though -- the client should just reestablish it once it notices the socket is down. I wonder if we really even need that printk at all. It doesn't tell the admin anything actionable, especially since it doesn't tell you anything about the peer.