Bug 218382 - Possible discrepancy in CREATE_SESSION slot number accounting
Summary: Possible discrepancy in CREATE_SESSION slot number accounting
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: NFS (show other bugs)
Hardware: All Linux
: P3 normal
Assignee: Chuck Lever
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-16 14:30 UTC by Connor Smith
Modified: 2024-04-23 15:05 UTC (History)
1 user (show)

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Connor Smith 2024-01-16 14:30:52 UTC
I noticed what I think might be a discrepancy in the way Linux nfs and nfsd manage CREATE_SESSION reply cache sequence IDs.

In fs/nfs/nfs4proc.c (_nfs4_proc_create_session), the seqid looks to be incremented so long as the status is not one of NFS4ERR_STALE_CLIENTID, NFS4ERR_DELAY, or an RPC timeout. This is mostly due to:

commit b519d408ea32040b1c7e10b155a3ee9a36660947
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Sun Sep 11 14:50:01 2016 -0400

    NFSv4.1: Fix the CREATE_SESSION slot number accounting

    Ensure that we conform to the algorithm described in RFC5661, section
    18.36.4 for when to bump the sequence id. In essence we do it for all
    cases except when the RPC call timed out, or in case of the server returning
    NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID.

(Note that the comment /* Increment the clientid slot sequence id */ not being moved in the above commit has, I think, left it in the wrong place.)

Meanwhile, in fs/nfsd/nfs4state.c (nfsd4_create_session), the seqid looks to be incremented only when the session has been successfully created and the status will be NFS4_OK. This is mostly due to: 

commit 86c3e16cc7aace4d1143952813b6cc2a80c51295
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Sat Oct 2 17:04:00 2010 -0400

    nfsd4: confirm only on succesful create_session

    Following rfc 5661, section 18.36.4: "If the session is not successfully
    created, then no changes are made to any client records on the server."
    We shouldn't be confirming or incrementing the sequence id in this case.

My reading of RFC8881 18.36.4 suggests that the former is correct - after the sequence ID has been processed in phase 2...

> [...] the CREATE_SESSION processing goes to the next phase. A subsequent new
> CREATE_SESSION call over the same client ID MUST use a csa_sequenceid that is
> one greater than the sequence ID in the slot.

Client ID confirmation is in phase 3, for example, and returns NFS4ERR_CLID_INUSE. Since this is after phase 2, the sequence ID should be incremented on that error code. My understanding is that a "client record" refers specifically to the 5-tuple described in 18.35.4, which does not include the CREATE_SESSION reply cache, so although I think the latter commit was right to move the confirmation of the client record, I'm not sure about the incrementing of the sequence ID.
Comment 1 Connor Smith 2024-01-16 14:46:36 UTC
I originally sent the above to the linux-nfs mailing list, where Chuck Lever replied...

> On first blush, your interpretation of S18.36.4 looks correct
> to me. We need to study this further and also look into how
> pynfs treats CREATE_SESSION retransmits.

... and suggested that I file this bug report.
Comment 2 Chuck Lever 2024-01-18 23:28:44 UTC
There is possible ambiguity in Section 18.36.4.

There is a single CREATE_SESSION slot and sequence number per client. Thus it would be a short leap to assume that these are part of the "client record" that is frequently referred to in this section.

The description of Phase 2 of the CREATE_SESSION operation says:

> If csa_sequenceid is equal to the slot's sequence ID + 1 (accounting
> for wraparound), then the slot's sequence ID is set to csa_sequenceid,
> and the CREATE_SESSION processing goes to the next phase.

Which can be interpreted as an update to that client's record.

But later in Phase 3, the description of "Successful Confirmation" says:

> If the session is not successfully created, then no changes are made
> to any client records on the server.

and the description of "Unsuccessful Confirmation" likewise says:

> Neither of these cases is permissible. Processing stops and
> NFS4ERR_CLID_INUSE is returned to the client. No changes are made to
> any client records on the server.

It's possible that Bruce read this to mean that the slot sequence ID, as part of the client's record, is not updated in these two cases (including NFS4ERR_CLID_INUSE).
Comment 3 Chuck Lever 2024-01-22 15:21:26 UTC
There is growing consensus that Section 18.35.4 clearly defines a client record, and it does /not/ contain any session-related details.
Comment 4 Chuck Lever 2024-02-07 21:34:48 UTC
I've prototyped a fix for this issue, but now NFSD FAILS pynfs CSESS6.

This test checks whether a CREATE_SESSION that fails with NFS4ERR_CLID_INUSE is cached by sending a CREATE_SESSION with the wrong credential (leaving the client ID unconfirmed) and then replaying that CREATE_SESSION using the same CS slot sequence number.

This means that either the test is wrong or our understanding of the spec is wrong.

A colleague confirmed that the Solaris NFS server passes CSESS6.

Given the way the implementation guidance in RFC 8881 Section 18.36.4 is broken into phases and states the error code behavior in terms of BCP14 compliance keywords, I still believe that SEQ_MISORDERED is the correct error code. The sequence number check is supposed to be done first, so that:

1. The first CREATE_SESSION, even though unsuccessful, still advances the CS slot sequence number on the server.

2. The second CREATE_SESSION will fail early because the client presents an old slot sequence number.
Comment 5 Chuck Lever 2024-04-23 15:05:27 UTC
Fixed by commit e4469c6cc69b ("NFSD: Fix the NFSv4.1 CREATE_SESSION operation") and 99dc2ef0397d ("NFSD: CREATE_SESSION must never cache NFS4ERR_DELAY replies"), both merged in v6.9-rc.

Note You need to log in before you can comment on or make changes to this bug.