From: Carl Burnett (cburnett@us.ibm.com)
Date: 10/20/03-07:53:36 AM Z
Subject: Re: [nfsv4] Directory delegations, take 2
Message-ID: <OF121DC528.C89BE01A-ON87256DC5.00443F86@us.ibm.com>
Look below. My remarks are within [Carl: .... ].
Thanks,
Carl
Carl Burnett
AIX Kernel Architecture - Network File System
(512) 838-8498, TL 678-8498
(please reply to cburnett@us.ibm.com)
"Noveck, Dave" <Dave.Noveck@netapp.com>
Sent by: nfsv4-admin@ietf.org
10/17/2003 06:55 PM
To: <nfsv4@ietf.org>
cc:
Subject: [nfsv4] Directory delegations, take 2
So this is my attempt to update the directory delegation approach
to reflect comments I have recently received. In some cases, I've
changed things and in some cases I've simply tried to explain things
better. I'm not going to do extensive quoting of comments I've
received, but I hope my summaries are not misrepresenting anyone.
The first thing that has been mentioned, by David Robinson I believe,
is directory write delegation. I'd like to explain why I've stayed
away from directory write delegation.
Directory write delegations scare the hell out of me. There are two
important practical issues. First, you have the problem of an
unresponsive client stopping things on the server from proceeding.
It is true that this problem already exists with write delegations
for files, but the problem is much more likely if you have a whole
directory such that any lookup through it will cause people to wait
for a long time, while we decide whether the client holding the write
delegation is ever going to respond. The second issue, which exacerbates
the first, is that the effect of revoking a write directory delegation
is liable to be extremely disconcerting to the client/user, making the
server extremely reluctant to revoke the thing and so lengthening the
delay to other clients when there is an unresponsive client. The problem
is that once you do a directory-modifying operation, and it succeeds at
the application level, if you have your delegation ripped away, you are
in a tough situation. Your application syscall has succeeded, the
application may have terminated, and now you have created a file, other
clients have seen the directory without it, and thus you don't want
to push the create out and if you don't, you essentially have a
corrupt-fs/reboot situation. It is theoretically possible to embed the
directory-modifying operations in transactions such that we have a nice
recovery, but my guess is that there aren't going to be actual clients
able to use this kind of thing safely for a long long time, if ever.
Carl Burnett made reference to write directory delegation in DFS (I think
???*****). I'm guessing that the issues of lack of shared semantics that
he mentions could be worked around, if it was worth it. But I wonder
about the effects of communications problems, as I mentioned above,
particularly in an internet environment. So I would be interested in
hearing about actual experiences with this. Is it worth it, given that
spec-ing and implementing this is liable to be a lot of work?
[CARL: DFS did not use write guarantees on dirs for some of the reasons
you listed above. AFS had it. It was largely possible in AFS because the
server side filesystem was well known and it was the only one that AFS
worked with.]
There are a number of other issues that people have brought up that seem
to be inter-related:
Delegated directory contents as READDIR or READDIR+ (or what is
the role of attributes?).
Synchronous or asynchronous notification.
Notification of changes vs. a clear-your-dnlc-for-directory model.
(raised by Tom Talpey in some private e-mail).
These all relate to the issue of what write delegations are intended to
do (Oh Gosh, I need a Problem Statement :-). So the following things are
possibilities (not mutually exclusive). I'm particularly interested in
additions to this list, except when they complicate the design. Come to
think of it, I'm not interested in changes to the list :-), but I know
that won't stop anybody.
Enable files to be accessed without significant server interaction
when they exist in read-mostly directories, or rather, in directories
that are not being written by other clients.
Tracking changes in specific directories for programs that display
directories for GUI tools, without ugly polling.
Accessing non-existent files in a non-changing directory, presumably
one which exists :-), or the issue of ENOENT lookups/opens mentioned
by Carl Burnett.
Let's first consider the issue of asynchronous vs. synchronous
notification.
The motivation for asynchronous notification is that it is better from the
server's point of view in that operations will not be held up due to network
problems or a client not responding quickly to a callback (or being down),
and that with synchronous notification, even when everything is working OK,
there is a cost in that a delay equal to twice the latency to the most
distant notified client is added.
So the issue boils down to whether asynchronous notification will do the job.
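The latency cost described above can be put in numbers. This is a minimal sketch, assuming the server sends callbacks to all delegation holders in parallel and waits for every reply before completing the operation. The function name and the millisecond figures are illustrative and not part of the proposal.

```python
def sync_notification_delay(one_way_latencies_ms):
    """Extra delay added to a directory-modifying operation when the
    server must wait for every notified client to acknowledge.

    Assumes callbacks go out in parallel, so the slowest round trip
    (twice the largest one-way latency) dominates.
    """
    if not one_way_latencies_ms:
        return 0.0  # nobody holds a delegation, nothing to wait for
    return 2 * max(one_way_latencies_ms)


# Three delegation holders: one on the LAN, two across a WAN.
print(sync_notification_delay([0.5, 10.0, 40.0]))
```

With asynchronous notification the same operation would complete without this wait, at the price of clients briefly holding stale directory contents.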
Carl's comments have caused me to rethink the issue and I've decided that
it won't, at least for anything other than the case of the GUI tools. So
I think I'm back in the synchronous notification camp.
Tom Talpey (private e-mail) has raised the issue of whether change
notification is worth it at all, and whether instead you just have a
recall/revocation event and let the client just get his delegation again
and refetch the modified directory contents. This does make the feature
easier to spec (e.g. there is a tough case of sequencing notifications and
successive READDIR's when fetching a big directory, as well as a more
complicated set of callbacks to define). However, I worry about very large
directories and the effort of refetching, when there is a modest level of
exogenous directory change. What do other people think? I think I'm going
to go forward trying to do the notifications, unless it turns out that the
complexities make this too difficult to do for v4.1.
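The trade-off between the two models can be illustrated from the client side. This is a minimal sketch, assuming a cached directory is a mapping from name to fileid and that notifications arrive as small tuples. The event shapes ("add", "remove", "rename") are hypothetical, since no callback format has been defined here.

```python
def apply_notification(cache, event):
    """Update a cached directory listing in place from one change event.

    cache maps entry name -> fileid. The event tuples are an assumed
    format for illustration only.
    """
    kind = event[0]
    if kind == "add":
        _, name, fileid = event
        cache[name] = fileid
    elif kind == "remove":
        cache.pop(event[1], None)
    elif kind == "rename":
        _, old, new = event
        if old in cache:
            cache[new] = cache.pop(old)
    else:
        raise ValueError("unknown notification kind: %r" % (kind,))


cache = {"a": 1, "b": 2}
for ev in [("add", "c", 3), ("remove", "a"), ("rename", "b", "d")]:
    apply_notification(cache, ev)
print(sorted(cache.items()))
```

Under the recall-and-refetch model the client would instead discard the whole cache and re-READDIR, which costs one full directory transfer per change no matter how small the change is.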
One issue that Saadia Khan raised is the issue of directory changes made
by the delegate himself. I think we have to make clear that this is
allowed and the client is presumed to know about directory changes it
makes itself. Doing otherwise would compromise the usefulness of
directory delegations in the case in which a single client is modifying
the directory. My assumption is that it is just too difficult to do write
directory delegation, but exclusive use is still a very important case,
and we should do what we can to make read directory delegations useful in
the exclusive use environment.
This would be particularly important if we are not doing notifications
and have to recall the delegation and re-READDIR, but even if we do
notification, the thought of a RENAME on a high-latency link waiting
for a high-latency callback to the client doing the rename, makes me
kind of sick.
The issue of not sending callbacks to the client making the change
could require stateid's in all directory-modifying operations, but
with sessions, we can simply not notify delegations associated with
the session making the change.
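The session-based rule above can be sketched as a simple filter on the server. This is a minimal illustration, assuming each delegation records the session it was granted under. The DirDelegation type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirDelegation:
    session_id: str  # session the delegation was granted under (assumed field)
    directory: str   # identifies the delegated directory (assumed field)


def delegations_to_notify(delegations, directory, modifying_session):
    """Return the delegations on `directory` that need a callback,
    skipping any held by the session that made the change."""
    return [
        d for d in delegations
        if d.directory == directory and d.session_id != modifying_session
    ]


held = [
    DirDelegation("sess-A", "/export/proj"),
    DirDelegation("sess-B", "/export/proj"),
    DirDelegation("sess-B", "/export/other"),
]
# sess-A does a RENAME in /export/proj: only sess-B's delegation on that
# directory needs a callback.
print(delegations_to_notify(held, "/export/proj", "sess-A"))
```

When a single client is the only one using the directory, the list is empty and the modifying operation never waits on a callback.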
Regardless of all the IETF procedural stuff, my impression is that
sessions are going to be in v4.1 and I don't want to waste my time
defining new operations, that won't be needed for v4.1 if sessions
are present.
Another issue that was raised is the requirement that attributes not
be changed. There were some objections to this by David Robinson on
what I take to be architectural grounds, in that directories and
the attributes of files within them are just different sorts of
objects. Also, Rob Thurlow worries about the difficulty of
implementing the callback in response to, for example, a SETATTR
on a filehandle which just happens to be in the subject directory.
[CARL: I agree with Dave and Rob. You want to keep them separate, with
the possible exception for symlinks that you allude to below.]
So let me first explain my basic motivation for the attribute
requirement. The attributes I am basically concerned with are
those that have to do with access to the file: mode, owner,
group, and acl. Also the change attribute so that the client can
see if he has the right version of the file. We could try to
reduce the attributes guaranteed constant to the minimum, but
there doesn't seem to be a lot of reason to do that. This is the
same situation as with file delegation. Any SETATTR causes the
delegation to be recalled, even though it might be possible to
allow a few marginal attributes to be changed. Having the client
able to assume that all attributes remain unchanged just makes
things simpler.
I want the directory-delegated client to be able to access
(i.e. open and read) files within the directory without needing
to contact the server. So this is why it makes sense to impose
a similar attribute constancy requirement on directory delegation.
If you didn't, you could not determine whether a given user
could access the file and would have to contact the server
for each individual file.
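The kind of local check this would enable can be sketched as follows. It is a minimal illustration, assuming the client holds cached mode, owner, and group values that the delegation guarantees are unchanged. It covers only the POSIX mode bits, uses numeric ids for simplicity, and ignores ACLs and any server-side policy beyond the attributes.

```python
import stat


def may_read(mode, owner, group, uid, gids):
    """Decide read access from cached attributes alone.

    Follows the usual owner / group / other precedence of mode bits.
    All parameters are assumed to come from the client's cache.
    """
    if uid == owner:
        return bool(mode & stat.S_IRUSR)
    if group in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# File with mode 0640, owned by uid 1000, group 100.
print(may_read(0o640, 1000, 100, uid=1000, gids={100}))  # owner
print(may_read(0o640, 1000, 100, uid=2000, gids={100}))  # group member
print(may_read(0o640, 1000, 100, uid=3000, gids={200}))  # other
```

Without the attribute constancy guarantee, none of the inputs to such a check could be trusted and each decision would need a round trip to the server.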
Even when you have read delegations available for the individual
files, you have to get a delegation for each one being accessed,
and then return that delegation. Given that clients may cache
copies of infrequently changed files on disk, a simple way of
validating such copies and securing access would be very
nice indeed, especially without forcing a per-file state
housekeeping requirement. The number of directories you are
going to be accessing is much smaller than the number of files,
in almost all cases.
[Carl: I think one problem with the above is security. The security of the
directory and each object's attributes are not enough to determine the
security of each object and the client should not assume the security. At
a minimum it needs to check with the server to make sure access is allowed
by the entity operating on the file. The server could have security
policies that go beyond file attributes, including NFS V4 ACLs. For
example, time of day policies, etc.]
So I'd argue that the performance benefits of this override any
architectural reservations, but that is generally the way I lean
on these things. After all, READDIRPLUS (now READDIR) returns
attribute information together with directory information, in the
face of the same architectural disconnect for the same sorts of
pragmatic reasons. As far as the difficulty of implementation,
I'd say "No pain, no gain" but I would be open to an option that
would allow servers that couldn't implement this to obtain all the
benefits that they could get without it. Let me also offer the
following full disclosure. WAFL does not have pointers in the inode
back to the enclosing directory but it has been discussed. I'm
pretty firm in believing that this is something that filesystems
will just have to do. When things go wrong, for example, saying you
have a problem with inode xxx (as opposed to the file named
aaa/bbb/cc) as is part of the typical UNIX fs paradigm is not
something that users can or should be asked to accept.
I guess it is possible to reduce the attributes to the critical
set, if someone can make a strong case for this. However, once
you subtract what are basically filesystem attributes, get rid
of atime which has to be excluded, take away unchanging attributes
such as fileid and fsid, there isn't all that much left. Also,
the difficulty of implementation does not seem to be reduced with
fewer attributes.
One issue that has come up recently that we will have to resolve
for directory delegation, and appears particularly relevant to the
client looking at the acls and granting access to the individual
user processes, is the relation of credentials and state, particularly
delegation state. I haven't followed the ongoing discussion of
this issue well enough to determine my exact position on how it
does or should affect directory delegations, although it is clearly
quite relevant. This needs further discussion.
Carl also mentioned some ideas for structuring requests to get
directory delegation. I'm thinking a request to get a directory
delegation alone would work OK. You can add a READDIR to the
COMPOUND. There would have to be an option so that failure to
get the delegation would not cause an error so that you could
try for a delegation and get the directory information whether
you got the delegation or not. Mike Eisler has suggested (in
private e-mail), the possibility that this would fit well with
OPENDIR/CLOSEDIR operations in which a delegation request was
a client option. Since OPENDIR would allow the server to know
when the directory was open, it could make the cookie verifier
useful by enabling the server to switch the verifier only when
the directory was not open.
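The "try for a delegation, but get the directory either way" option can be sketched with a toy COMPOUND evaluator. This is a minimal illustration, assuming COMPOUND's usual stop-at-first-error behaviour. The op name DELEGATE_DIR, the fail_if_denied flag, and the status strings are all hypothetical.

```python
def run_compound(ops):
    """Evaluate (name, handler) pairs in order, stopping after the
    first op whose status is not "OK"."""
    results = []
    for name, handler in ops:
        status, payload = handler()
        results.append((name, status, payload))
        if status != "OK":
            break
    return results


def delegate_dir(server_can_grant, fail_if_denied):
    """Hypothetical delegation request. With fail_if_denied=False a
    refusal is reported in the payload instead of as an error."""
    if server_can_grant:
        return ("OK", {"delegated": True})
    if fail_if_denied:
        return ("DENIED", None)
    return ("OK", {"delegated": False})


def readdir():
    # Stand-in for a real READDIR reply.
    return ("OK", ["a.txt", "b.txt"])


results = run_compound([
    ("DELEGATE_DIR", lambda: delegate_dir(False, fail_if_denied=False)),
    ("READDIR", readdir),
])
for r in results:
    print(r)
```

With fail_if_denied=True the same request would stop at the delegation op and the client would need a second round trip to fetch the directory contents.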
Carl also mentioned the possibility of symlink delegations. I
don't think this is needed and it would be a lot of delegation
stateid's for the server and client to keep track of. At least
within the nfs protocols, there is no way to change a symlink
without changing the directory. Symlinks are not writable
objects. You have to delete the existing one and then create a
new one of the same name to get the effect of changing symlink
contents, and even this would change the filehandle of the
symlink, rather than being seen as modifying an existing object.
So changing a symlink is always going to involve a directory
delegation callback in any case. To deal with the possibility
that the local server OS has an API to modify a symlink, we merely
have to make the rule that a read directory delegation provides
an assurance that there be no change in symlinks within the directory
without a callback.
[CARL: Using the directory delegation to cover the validity of symlink
data sounds like it could work and it would provide the fundamental
benefit]
This archive was generated by hypermail 2.1.2 : 03/04/05-02:12:49 AM Z CST