From: Noveck, Dave (Dave.Noveck@netapp.com)
Date: 05/30/02-11:19:13 AM Z
Message-ID: <8C610D86AF6CD4119C9800B0D0499E336A8E28@red.nane.netapp.com>
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
Subject: uids-as-names, compound, reply cache, cabbages and kings
Date: Thu, 30 May 2002 09:19:13 -0700

I had an interesting experience in implementing some elements of the v4 readdir in which there were some unexpected interactions (positive as well as negative) among protocol features, and I thought I'd share it with everybody (after all, you can hit delete), since it's kind of amusing, at least if you can be amused by weird protocol implementation issues (OK, so maybe everybody is going to hit delete). If you're going to be at the bake-off you might want to skip to the last paragraph, instead of hitting delete.

This is the kind of thing that might be an interesting section in that implementation RFC that we used to talk about doing after the spec was done. By the way, Spencer, how is the spec coming? I'm hoping that we will test the *final* protocol at the August bake-off.

Anyway, the issue is that for reasons that are too complicated to go into here, I may post-process some of the readdir output to deal with the fact that uids and gids need to be mapped to strings. I maintain a cache of this stuff and install the strings in the output if I can find them in the cache. In cases in which the name isn't in the cache, I leave some bread crumbs in what would otherwise be the readdir output and mark the request for later post-processing, after the request is finished, while creating the reply.

So here's the ugly case. Somebody gives me a readdir with a very small max output. It might be big enough to hold the output for a single file in the directory or it might not, depending on how long the string for the uid turns out to be. So at post-processing, I may install the actual string if it fits, or return a NOSPC error.

But wait a minute. I'm doing this after processing the request.
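[A rough sketch of the two-pass scheme described above, in Python. This is not the actual server code: the names `Entry`, `readdir_first_pass`, `post_process` and `NoSpace`, the size model (name bytes plus owner-string bytes), and the example owner strings are all assumptions made for illustration.]

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class NoSpace(Exception):
    """Stand-in for the NOSPC error mentioned in the message."""


@dataclass
class Entry:
    name: str
    uid: int
    owner: Optional[str] = None  # None is the "bread crumb": resolve later


def readdir_first_pass(raw: List[Tuple[str, int]],
                       cache: Dict[int, str]) -> Tuple[List[Entry], bool]:
    """Fill in owner strings found in the cache; flag the rest for later."""
    entries = [Entry(name, uid, cache.get(uid)) for name, uid in raw]
    needs_post = any(e.owner is None for e in entries)
    return entries, needs_post


def post_process(entries: List[Entry],
                 resolve: Callable[[int], str],
                 cache: Dict[int, str],
                 max_bytes: int) -> List[Entry]:
    """Resolve deferred owners, then check that the reply still fits."""
    total = 0
    for e in entries:
        if e.owner is None:
            e.owner = resolve(e.uid)  # slow path, e.g. a name-service lookup
            cache[e.uid] = e.owner
        # Hypothetical size model: only the name and owner bytes are counted.
        total += len(e.name.encode()) + len(e.owner.encode())
        if total > max_bytes:
            raise NoSpace(f"reply needs more than {max_bytes} bytes")
    return entries
```

[The point of the sketch is only that the size check cannot happen until the owner string is known, which is after the first pass has already finished.]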
This might have been a COMPOUND in which some operation followed the READDIR. If I fail the READDIR, then I shouldn't have done the following operation, and if it had visible consequences then I would have to undo that operation, which would be very difficult or impossible to do. So am I screwed here?

The answer is "No" because of some interesting serendipity. Suppose someone did a COMPOUND consisting of READDIR followed by RENAME. Or READ followed by WRITE or by REMOVE. How could this be accommodated by the reply cache? Not very easily. The COMPOUND is clearly not idempotent, so the reply cache information has to be saved. But that includes the idempotent operations before the non-idempotent one, including operations like READ and READDIR that can generate lots of output. Without COMPOUND, such operations would never have to be stored in the reply cache, and saving large amounts of data in a reply cache is highly undesirable.

So because of the above issues with the reply cache (which is not part of the spec), we decided way back that although such things as READDIR followed by RENAME are allowed by the spec, the server may reject a COMPOUND with NFS4ERR_RESOURCE if it is just too complicated to handle, for this or any other reason. RFC3010 says:

    It is the client's responsibility for recovering from any partially completed COMPOUND procedure. Partially completed COMPOUND procedures may occur at any point due to errors such as NFS4ERR_RESOURCE and NFS4ERR_LONG_DELAY. This may occur even given an otherwise valid operation string. ... Therefore, the client should avoid overly complex COMPOUND procedures in the event of the failure of an operation within the procedure.

Dealing with a non-idempotent op after READ, READLINK, READDIR seems just about impossible, from the reply cache point of view. Dealing with READ, READLINK, READDIR after a non-idempotent operation is more tractable but still difficult.
At the next bake-off, my server is going to return NFS4ERR_RESOURCE for either case. I don't expect this will be a problem for anyone, but you never know.

BTW, DAFS, which does define reply cache behavior as part of its architecture, specifies that the first of these combinations is an instance of invalid chaining (DAFS has chaining rather than COMPOUND).
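[A minimal sketch of the screening policy described above, in Python. It is illustrative only: `screen_compound` is a hypothetical name, status codes are plain strings, and the `NON_IDEMPOTENT` set is an assumed subset taken from the examples in the message. How a real server classifies operations is its own decision.]

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_RESOURCE = "NFS4ERR_RESOURCE"

# Operations named in the message as able to generate lots of output.
LARGE_OUTPUT = {"READ", "READLINK", "READDIR"}

# Illustrative subset only; an assumption of this sketch, not a spec list.
NON_IDEMPOTENT = {"RENAME", "REMOVE"}


def screen_compound(ops):
    """Reject a COMPOUND that mixes large-output and non-idempotent ops.

    Order does not matter here, since both orderings are rejected.
    """
    names = {op.upper() for op in ops}
    if names & LARGE_OUTPUT and names & NON_IDEMPOTENT:
        return NFS4ERR_RESOURCE
    return NFS4_OK
```

[With this check run before any operation executes, `screen_compound(["PUTFH", "READDIR", "RENAME"])` is refused up front, so a late READDIR failure never has a following operation to undo.]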