Going forward on replication-migration

From: Noveck, Dave (Dave.Noveck@netapp.com)
Date: 12/20/02-09:18:20 AM Z


Message-ID: <C8CF60CFC4D8A74E9945E32CF096548A0729DB@SILVER.nane.netapp.com>
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
Subject: Going forward on replication-migration
Date: Fri, 20 Dec 2002 07:18:20 -0800

This message is in some sense a response to a recent 
message by Brent, but rather than start the cycle of
quote and requote, I'll try instead to take this opportunity
to put forth my vision for where we should be going on
this.

The comment that got me going was Brent saying something
to the effect that nobody wants to prototype the migration-
replication protocol.  It's true everybody is busy, but I
don't see unwillingness to prototype as the main issue.

The problem is that we don't yet have anything that we can
reasonably prototype.  We are still arguing about things like
RPC/XDR that sit at the transition between requirements and
design.  We have a strawman and the straw is fairly lightweight
right now.

The fault in all this is not Rob's.  He put the strawman out
there and asked for, pleaded for, comments.  We all ignored him.
Yes, we had other stuff to do, but still we should have devoted
the time to advancing this.  We all know this is important.
I want to thank Mike Eisler for getting us started looking at
this.  I know his contribution shamed me into doing something.

The problem with the cthon '03 prototyping goal is that it has
become prototyping for prototyping's sake.  The purpose of
prototyping is to do some implementation to learn things about
the protocol when just thinking about it and making comments
no longer helps.  Clearly, we aren't at that stage yet.

I think we could produce something and convince ourselves that
we had done a prototype but there would be no benefit.  We would
be diverting effort that would better be used in getting the
protocol in shape.  Also, prototyping might prematurely freeze
the protocol in some undesirable ways.  People will be reluctant
to consider changes that would invalidate their prototypes.  It
is just too early to prototype.  For all the actual work that 
has been done on the protocol, this might as well be one week
after the strawman was published and we should recognize that.

We need to get serious.  Even though I think that the goal of 
prototyping at cthon '03 is no longer realistic, I do believe
we need to have some goals to force the pace.  I think it is 
reasonable to have a goal of a prototypable protocol by cthon 
'03. That is an ambitious goal, but if we work hard we could
reach it.  Let's have a meeting at cthon '03 to assess progress
toward that goal and plan the next steps (i.e. getting some
prototypes and testing them).  One thing to note here is that,
regardless of your opinion on the RPC issue, this server-server
protocol is going to be extremely latency-tolerant.  There should
be no need to wait for the next bakeoff to do interoperability 
testing.  Over-the-network will work fine.  Let's avoid artificial
constraints in making our plans.

As part of getting serious, I'd like to ask anybody who might
be interested in getting involved in this work to think about
actually contributing now.  This is the critical time to make this
stuff happen.  This is particularly addressed to those who work neither
at Sun nor at Netapp.  We need your input.

Brent has mentioned the possibility of involvement by groups who
have been doing user-level fs synchronization tools.  While I think
it is unrealistic to ask those groups to simply produce a protocol 
for us, soliciting their involvement and input seems like an 
excellent thing to do.  I'm wondering what we can do to make that
easier for them.  One suggestion I have is that maybe we need to
create a separate nfsv4-wg-mig-rep list.  I think some people, who
are not all that interested in the v4 protocol per se, might be
put off by getting lots of long emails where Neil Brown and I discuss
how the protocol might have been done (and maybe we are going to
get requests from others for nfsv4-wg-no-history).

One other issue on the migration-replication front is that we have
not done testing (that I know of) on the client-server aspects of
this that are already in the about-to-be-a-proposed-standard v4
protocol.  Cthon '03 seems like an excellent place to test those.
So I'd like to ask those with servers whether they would be prepared
to help test client support of this.  I'm talking about something
incredibly kludgy here, but it will still be of value in testing
clients.  For example, export a filesystem on a floppy, force the
server to return migrated and move the floppy to another v4 server.
See if the client can handle it.  Forcing the FH_VOL_MIGRATION bit
and changing the fsid could test the clients' volatile filehandle
handling path.  So the question I have is whether it is worth it
for the server to invest time in such a kludge.  Will the clients
be prepared to test using it?

One other thought is that e-mail is not the best method for some
of the discussions we are having (for/against RPC is a definite 
example).  Maybe a conference call to discuss things would help.  
Is there any interest in scheduling one sometime in the first half 
of January, as a way of discussing some of the troublesome issues, 
and focusing effort on getting some momentum for this protocol? 
 

