Assumptions:

1. A stateid is based on lockowner + file.
2. The stateid changes whenever the state of the file changes (on any
   operation that returns a stateid).
3. A compound request can NOT contain two operations that carry a seqid
   or a stateid.
4. The client increments the seqid before sending a request. It is also
   possible to increment the seqid only after OK is received from the
   server; some parts should be rechecked in that case.

Changes to Andy's diagram:

1. A stale stateid could cause a new SETCLIENTID or OPEN call. The NFS
   server may not always find the nfs_lockowner for a stale stateid;
   this could be implementation dependent.
2. Some changes to when the seq. number is incremented and the lease is
   renewed: don't renew the lease on replay. (See email to Noveck from
   1/31/2001.) Some cases are still not clear, mostly for OPEN_CONFIRM.
3. Don't check the lease on OPEN.
4. good stateid -> replay seqid = logical error
5. old stateid -> good seqid = logical error
6. Separate processing of OPEN_CONFIRM.
7. For operations that have a stateid and no sequence number, a bad
   stateid includes an unconfirmed seqid.

Operations that have a sequence number, and no stateid
------------------------------------------------------

OPEN (everything starts here)

check clientid
+-- bad
|     don't bump seq num
|     don't renew lease
|     NFS4ERR_STALE_CLIENTID
+-- good
      lockowner found
      +-- no
      |     save lockowner
      |     go to (A)
      +-- yes
            seq. num confirmed
            +-- no
            |     check sequence number
            |     +-- bad or good
            |     |     cancel prev. OPEN state for lockowner
            |     |     (whatever it means: CLOSE?)
            |     |     go to (A)
            |     +-- replay
            |           send saved reply
            |           server doesn't bump seq num
            |           server doesn't renew lease
            +-- yes
                  check sequence number
                  +-- bad
                  |     on server side:
                  |     don't bump seqnum
                  |     don't renew lease
                  |     NFS4ERR_BAD_SEQ_NUM
                  |     don't save reply???
                  +-- replay
                  |     on server side:
                  |     don't bump seq num
                  |     don't renew lease
                  |     return replay
                  +-- good
                        perform operation
                        server saves new seqid
                        renew lease
                        save reply

(A) server saves new seqid
    set seqid to unconfirmed
    perform operation
    renew lease
    save reply
    op. result OK
    +-- yes
    |     ask for OPEN_CONFIRM
    |     send reply
    +-- no
          send reply
          client should not decrement seqid, to avoid replay

Operation OPEN_CONFIRM that has a stateid and a sequence number
---------------------------------------------------------------

The stateid is carried in the verifier; otherwise the server can't
locate the lockowner.

It looks like OPEN_CONFIRM should be handled differently from the other
operations that take a stateid and a seq. number. Any comments are very
welcome, because I don't think that everything is correct here (let's
look together).

NFS4ERR_STALE_STATEID should be added to the protocol's return errors;
otherwise there is no way for the client to learn about a server reboot.
NFSERR_BAD_STATEID and NFSERR_OLD_STATEID are also missing. They could
be replaced by NFSERR_NOENT and NFSERR_INVAL, but with NFSERR_INVAL it
is not always clear what the error is about.
check stateid (verifier from OPEN)
+-- stale
|     seq knowledge is reset on client
|     don't renew lease
|     NFS4ERR_STALE_STATEID
+-- good
|     seq num confirmed
|     +-- yes
|     |     (logical error. we should never be here,
|     |     because stateid was changed on previous OK
|     |     of OPEN_CONFIRM and is not the same as verifier)
|     |     NFSERR_SERVERFAULT or NFSERR_INVAL or NFSERR_BAD_STATEID
|     |     renew lease
|     |     server doesn't bump seq num
|     |     client undo seq num increment
|     +-- no
|           check sequence number
|           +-- bad
|           |     don't bump seqnum
|           |     don't renew lease
|           |     NFS4ERR_BAD_SEQ_NUM
|           +-- replay
|           |     don't bump seq num
|           |     don't renew lease
|           |     send stored reply
|           |     (it was op. error)
|           +-- good
|                 check lease
|                 +-- bad
|                 |     NFS4ERR_EXPIRED
|                 |     client undo seq num increment
|                 |     server bumps seq num (for replay)
|                 |     server doesn't renew lease
|                 |     CLOSE file
|                 |     save reply
|                 |     server should keep info for some time
|                 +-- good
|                       perform operation
|                       renew lease
|                       server bumps seq num
|                       save reply
|                       op. result OK
|                       +-- yes
|                       |     set state to confirmed
|                       |     return new stateid
|                       |     (not same as verifier)
|                       +-- no
|                             NFSERR_xxxx
|                             CLOSE file
|                             server should keep info for some time
+-- old
|     check sequence number
|     confirmed
|     +-- no
|     |     NFSERR_BAD_STATEID
|     |     server can't bump seq num
|     |     client undo seq num increment
|     |     don't renew lease
|     +-- yes
|           +-- bad
|           |     server doesn't increment seq num
|           |     doesn't renew lease
|           |     NFSERR_BAD_SEQID
|           +-- good
|           |     server doesn't bump seq num
|           |     doesn't renew lease
|           |     client undo seq num increment
|           |     NFSERR_OLD_STATEID
|           +-- replay (should hit this)
|                 server doesn't bump seq num
|                 doesn't renew lease
|                 send from replay cache
+-- bad
      server can't bump seq num
      client undo seq num increment
      don't renew lease
      NFSERR_BAD_STATEID

NOTE: Unfortunately it is NOT possible to also set the state to
confirmed on an op. error, because the seqid is per lockowner but the
stateid is per lockowner+file.
OPEN_CONFIRM is used to confirm sequence id usage for a lockowner (but
it takes and returns a stateid), and the lockowner could still exist on
the client even if something went wrong with the file. (oohh)

Why not introduce a new operation for setting the seqid that is not tied
to the OPEN file operation? It could also help when the client and
server need to resynchronize. Right now, if the client receives a
BAD_SEQID error, the only way to continue is to start from the very
beginning (SETCLIENTID), which is too costly. (Don't say that we need a
perfect client and server; that never happens in real life.)

Operations that have a stateid and a sequence number
----------------------------------------------------

CLOSE, LOCK, LOCKU, OPEN_DOWNGRADE

NOTE: the server cannot deallocate the information associated with a
CLOSE'd state until another request with the next sequence number
arrives for the same lockowner, or the client reboots, or there is no
longer any locking state for that owner and the server receives nothing
from that lockowner for a while and forgets the associated lockowner
information, yada, yada, yada.

check stateid
+-- stale
|     seq knowledge is reset on client??? and server???
|     NFS4ERR_STALE_STATEID
|     don't renew lease
+-- good
|     check sequence number
|     confirmed
|     +-- no
|     |     NFSERR_BAD_STATEID
|     |     server can't bump seq num
|     |     client undo seq num increment
|     |     don't renew lease
|     +-- yes
|           +-- bad
|           |     don't bump seqnum
|           |     don't renew lease
|           |     NFS4ERR_BAD_SEQ_NUM
|           +-- replay
|           |     don't bump seq num
|           |     don't renew lease
|           |     NFSERR_BAD_SEQID
|           +-- good
|                 check lease
|                 +-- bad
|                 |     NFS4ERR_EXPIRED
|                 |     client undo seq num increment
|                 |     server doesn't bump seq num
|                 |     server doesn't renew lease
|                 +-- good
|                       perform operation
|                       renew lease
|                       server saves new seq num
|                       save reply
+-- old
|     check sequence number
|     confirmed
|     +-- no
|     |     NFSERR_BAD_STATEID
|     |     server can't bump seq num
|     |     client undo seq num increment
|     |     don't renew lease
|     +-- yes
|           +-- bad
|           |     server doesn't increment seq num
|           |     doesn't renew lease
|           |     NFSERR_BAD_SEQID
|           +-- good
|           |     server doesn't bump seq num
|           |     doesn't renew lease
|           |     client undo seq num increment
|           |     NFSERR_OLD_STATEID
|           +-- replay (should hit this)
|                 server doesn't bump seq num
|                 doesn't renew lease
|                 send from replay cache
+-- bad
      server can't bump seq num
      client undo seq num increment
      don't renew lease
      NFSERR_BAD_STATEID

Operations that have a stateid and no sequence number
-----------------------------------------------------

READ, WRITE, RENEW, SETATTR for TRUNCATE

This case is clear: the lease is renewed only if the stateid is valid.
NOTE: a bad stateid includes an unconfirmed seqid.

check stateid
+-- bad
|     don't renew lease
|     NFS4ERR_XXX_STATEID
+-- good
      check lease
      +-- bad
      |     NFS4ERR_EXPIRED
      +-- good
            renew lease
            perform operation
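
The following is a minimal Python sketch of two checks that recur in
the flows above: classifying a received seqid as good, replay, or bad,
and the stateid-only flow for READ/WRITE-style operations. It assumes
the client increments the seqid before sending (assumption 4), so a
good seqid is the server's saved value plus one and a replay equals the
saved value; the 32-bit wraparound is also an assumption. The names
SeqCheck, classify_seqid, OpenState, and check_stateid_only_op are
hypothetical, and keeping the lease expiry on the state record is a
simplification for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SeqCheck(Enum):
    GOOD = "good"
    REPLAY = "replay"
    BAD = "bad"


def classify_seqid(saved: int, received: int) -> SeqCheck:
    """Compare a request's seqid with the last seqid the server saved.

    Assumption: the client bumps its seqid before sending, so the next
    valid request carries saved + 1 (wrapping at 32 bits).
    """
    if received == (saved + 1) & 0xFFFFFFFF:
        return SeqCheck.GOOD
    if received == saved:
        return SeqCheck.REPLAY
    return SeqCheck.BAD


@dataclass
class OpenState:
    """Hypothetical server-side record for one lockowner+file state."""
    stateid: int
    confirmed: bool       # has the lockowner's seqid been confirmed?
    lease_expiry: float   # simplification: lease kept per state record


def check_stateid_only_op(state: Optional[OpenState], stateid: int,
                          now: float, lease_time: float) -> str:
    """Flow for operations with a stateid and no seqid (READ, WRITE, ...)."""
    # A bad stateid includes an unconfirmed seqid; the lease is not renewed.
    if state is None or state.stateid != stateid or not state.confirmed:
        return "NFS4ERR_XXX_STATEID"  # placeholder name used in the diagram
    # Good stateid: check the lease before doing anything else.
    if now > state.lease_expiry:
        return "NFS4ERR_EXPIRED"
    # Good lease: renew it, then the caller performs the operation.
    state.lease_expiry = now + lease_time
    return "OK"
```

The seqid classifier returns an enum rather than an error code because
the same three outcomes map to different actions in each flow (for
example, a replay is answered from the saved reply in OPEN but treated
as a logical error for a good stateid in CLOSE/LOCK/LOCKU).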