From: Noveck, Dave (dave.noveck@netapp.com)
Date: 06/11/99-10:34:33 AM Z
Message-ID: <7F608EC0BDE6D111B53A00805FA7F7DA0330363C@TAHOE.netapp.com>
Subject: Replication/migration proposal
Date: Fri, 11 Jun 1999 08:34:33 -0700
This is a slight modification of a proposal I sent out a few months ago.
It addresses failover for replicated (almost certainly read-only) filesystems
and controlled migration of (possibly) read-write filesystems in order
to allow server maintenance, load-balancing, etc. This follows up on a
presentation beepy made at the last working group meeting (at Connectathon).
The basis of the proposal is a new attribute:
Name: server_locations
Data Type: utf8<>
Access: Read
Description: Provides a string which defines alternate locations for the
objects within this filesystem (i.e. all objects which have the
same fsid.major/fsid.minor as this one).
The alternate locations specified can be used to recover from the failure
of a server by re-directing the requests to one of a set of alternate
servers (this is most useful in the case of read-only filesystems) or
to allow the migration of the responsibility for serving a filesystem
to provide client-transparent load balancing or scheduled server
downtime.
To allow extensibility and flexibility, the format of the string is defined
as consisting of a short type string followed by an equals sign followed
by the data as defined for that particular type string. The type string
"hosts" is suggested for initial implementation. The string following
"hosts=" would consist of a comma-separated list of typical Unix
nfs mountpoint designators (hostname followed by colon followed by
location of the export within the server name space). Other type strings
may be defined in the future if needed.
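As an illustration of the format described above, here is a minimal parsing sketch in Python. The names Location and parse_server_locations are hypothetical and not part of the proposal, and the sketch assumes the only type string it needs to understand is "hosts".

```python
from typing import List, NamedTuple, Tuple


class Location(NamedTuple):
    """One alternate location: a hostname and an export path on that host."""
    host: str
    path: str


def parse_server_locations(value: str) -> Tuple[str, List[Location]]:
    """Split '<type>=<data>' and, for type 'hosts', parse the host:path list.

    Hypothetical helper for illustration; the proposal defines only the
    string format, not any particular parsing interface.
    """
    type_string, sep, data = value.partition("=")
    if not sep or not type_string:
        raise ValueError("expected '<type>=<data>'")
    if type_string != "hosts":
        # Other type strings may be defined later; this sketch rejects them.
        raise ValueError("unsupported type string: " + type_string)

    locations: List[Location] = []
    if not data:
        return type_string, locations  # "hosts=" with no alternates listed
    for entry in data.split(","):
        # Hostname, then a colon, then the export location.
        host, colon, path = entry.strip().partition(":")
        if not colon or not host or not path:
            raise ValueError("malformed location: " + entry)
        locations.append(Location(host, path))
    return type_string, locations
```

For example, parse_server_locations("hosts=srv1:/export/home,srv2:/vol/home") would yield the type string "hosts" and two locations, in the order the server listed them.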
When the client finds that a server is unresponsive, it can use
the server_locations attribute fetched at mount time in order
to try to find alternate servers to provide nfs service for the
filesystem. When the handles for this filesystem are volatile
(probably the most common case), the client will use the saved
pathnames to reestablish new replacement handles for all of the
old handles, which are assumed to have lost their validity. It
is also possible for a server to provide persistent handles which
can be carried over from server to server transparently.
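The volatile-handle failover described above might look roughly like the following on the client side. This is only a sketch: fail_over, probe and lookup are hypothetical names, and the two callables stand in for whatever the client uses to test a server and to walk a saved pathname to a new handle.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def fail_over(
    locations: Iterable[Tuple[str, str]],
    saved_paths: Dict[str, str],
    probe: Callable[[str, str], bool],
    lookup: Callable[[str, str, str], str],
) -> Optional[Tuple[str, str, Dict[str, str]]]:
    """Pick the first responsive alternate and re-establish handles by pathname.

    locations   -- (host, export) pairs taken from server_locations
    saved_paths -- old handle -> pathname saved when the handle was obtained
    probe       -- hypothetical: True if the given host/export responds
    lookup      -- hypothetical: returns a new handle for a saved pathname
    """
    for host, export in locations:
        if not probe(host, export):
            continue  # this alternate is unresponsive too, try the next one
        # Old volatile handles are assumed invalid, so every one is replaced.
        remapped = {
            old_handle: lookup(host, export, path)
            for old_handle, path in saved_paths.items()
        }
        return host, export, remapped
    return None  # no alternate server could be reached
```

With persistent handles the remapping step would not be needed, since the old handles would be carried over to the new server unchanged.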
In order to deal with the case of a server which is unresponsive
at mount time, a client implementation might provide a way of
specifying (e.g. mount option) an initial value of server_locations
to be used. Another alternative is to allow designation of some
sort of reliable name service to be used to provide that initial
information.
When multiple replicas do not exist (typical case of read-write
filesystems), migration of a filesystem may be initiated by the
existing server returning NFS4ERR_MOVED in response to every
attempt to use a filehandle in the filesystem being migrated,
with one exception. The exception is that a GETATTR interrogating
only server_locations will succeed and return information which
can be used to find the new server just as done in the replication
case. In order to reduce the burden on the server in terms of
what must be committed to stable storage (probably outside the
disk normally associated with the filesystem), it is not required
to carefully validate the filehandles in this case. In other words,
if it can determine that the filehandle belongs to the proper fs,
it does not have to make sure that the proper inode exists or that
the generation number is correct, for example.
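The server-side rule above can be sketched as follows. The function name, the string status codes and the shape of the migrated table are assumptions made for illustration; the point is only that the filesystem identity is checked while the inode and generation number are not.

```python
from typing import Dict, Iterable, Optional, Tuple

NFS4_OK = "NFS4_OK"              # symbolic names only; no numeric values assumed
NFS4ERR_MOVED = "NFS4ERR_MOVED"
NOT_MIGRATED = "NOT_MIGRATED"    # hypothetical marker: continue normal processing


def check_migrated(
    op: str,
    fh_fsid: Tuple[int, int],
    requested_attrs: Iterable[str],
    migrated: Dict[Tuple[int, int], str],
) -> Tuple[str, Optional[Dict[str, str]]]:
    """Decide the reply for a request whose filehandle names filesystem fh_fsid.

    migrated maps (fsid.major, fsid.minor) of each migrated filesystem to the
    server_locations string that points at its new home.
    """
    if fh_fsid not in migrated:
        return NOT_MIGRATED, None  # not being migrated; handled elsewhere
    # The one exception: a GETATTR asking for server_locations and nothing else.
    if op == "GETATTR" and set(requested_attrs) == {"server_locations"}:
        return NFS4_OK, {"server_locations": migrated[fh_fsid]}
    # Everything else fails, without validating the inode or generation number.
    return NFS4ERR_MOVED, None
```

In this sketch the only state the old server has to retain per migrated filesystem is its fsid and the server_locations string.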
Just as in the replica case, in the migration case, volatile file
handles are assumed to lose validity as part of migrating to a new
server and need to be looked up again. On a read-write filesystem, this
procedure can cause errors when renames have occurred after the filehandle
was obtained. Persistent handles retain their validity, which requires
that the old and new servers agree on a procedure to make sure that
the handles remain valid across the migration. The means by which
this is done are up to the server implementations and are outside the
scope of this document.
When a client transfers a given filesystem to a different server, either
for failover or migration, certain attributes of the files contained
within the filesystem may change. First of all, since the assignment
space of fsid.major and fsid.minor is host-specific, these attributes will
change uniformly for all files within the filesystem. Other attributes
may change, so it is advisable for the client to invalidate cached
attributes even before they would normally have timed out. When
the filehandles involved are volatile, it is likely that even such
attributes as fileid will be different when a handle is remapped. It
is up to the server to make sure that the actual contents of the object
behind the old filehandle and the new one are the same.