superlifter design notes (was Re: Latest rZync release: 0.06)

Sun Jul 21 01:14:02 EST 2002

On Thu, Jul 11, 2002 at 07:06:29PM +1000, Martin Pool wrote:
> I've put a cleaned-up version of my design notes up here
> 
>   http://samba.org/~mbp/superlifter/design-notes.html
> 
> It's very early days, but (gentle :-) feedback would be welcome.  It
> has some comments on Wayne's rzync design, which on the whole looks
> pretty clever.
> 
> I don't have any worthwhile code specifically towards this yet, but I
> have been experimenting with the protocol ideas in distcc

That's OK.  I don't have code yet either.  Let's get some
designing done first.

I promised Martin some specific feedback last week.  I'm finally
getting to it.  I have a design in mind, so some of my
comments will be biased toward that.

I will start with point-by-point so it might help anyone
following to have Martin's design notes handy.

> Performance
>
>    6. No arbitrary limits: this is related to scalability.
>       Filesizes and times should be 64-bit; names should be
>       arbitrarily long. 

File sizes, yes.  Times, no.  Unsigned 32-bit integers will
last us for another 90 years.  I suspect that by the time we
need 64-bit timestamps the units will be milliseconds.
I just don't see the need to waste an extra 4 bytes per
timestamp per file.  A protocol change here will be easily
implemented with up/down conversion.
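
A minimal sketch of what such up/down conversion could look
like, assuming timestamps are whole seconds since 1970.  The
function names are hypothetical and not part of any existing
protocol:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Up-convert: widen a 32-bit unsigned wire timestamp
 * (seconds since 1970) to a 64-bit value. */
static int64_t time_from_wire32(uint32_t wire)
{
    return (int64_t)wire;
}

/* Down-convert: narrow a 64-bit timestamp for a peer that only
 * speaks 32 bits.  Out-of-range values are clamped, not wrapped. */
static uint32_t time_to_wire32(int64_t t)
{
    if (t < 0)
        return 0;
    if (t > (int64_t)UINT32_MAX)
        return UINT32_MAX;
    return (uint32_t)t;
}
```

Note that an unsigned 32-bit field cannot represent times
before 1970, so the down-conversion here clamps them to zero.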

In general, though, I don't want to get stuck with undersized
fields.

> Qualities
>  4. In particular, make the output easy for both humans and
>     programs to read. 
>  5. Use systematic error logging, per distcc. 
>  6. Use UTF-8 everywhere. 

Yes, yes, YES!

Logs and error messages should be easy to read, understand
and parse, and should make the severity and the appropriate
reaction easy to identify.

If an error occurs, should the wrapper script retry or
give up?  See also "Quality #10"

A post-processor should know why a file is listed in the log.
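
A minimal sketch of a log record that a post-processor could
split on tabs, assuming a made-up three-field format of
severity, reason and path.  The enum, the field order and the
function names are all hypothetical:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical severity levels for illustration. */
enum severity { SEV_INFO, SEV_WARNING, SEV_ERROR, SEV_FATAL };

static const char *severity_name(enum severity sev)
{
    switch (sev) {
    case SEV_INFO:    return "info";
    case SEV_WARNING: return "warning";
    case SEV_ERROR:   return "error";
    case SEV_FATAL:   return "fatal";
    }
    return "unknown";
}

/* Format one record as "severity<TAB>reason<TAB>path".
 * Returns 0 on success, -1 if the record did not fit in buf. */
static int format_log_line(char *buf, size_t size, enum severity sev,
                           const char *reason, const char *path)
{
    int n = snprintf(buf, size, "%s\t%s\t%s",
                     severity_name(sev), reason, path);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

With a record like this, the reason field tells a
post-processor why the file is listed, and a wrapper script
can choose between retrying and giving up from the severity
field alone.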

>  11.  Try to be secure, even taking into account that the code
>        will have bugs.

But don't try to build security into it.  Leverage other
transports (ssh, https) where transport security is an issue.

If a new service needs to be created to provide a "lifterd",
it should have a separate codebase, use an existing
network protocol and not need a special client portion.

>  12. Have reasonable default behaviors, but be explicit rather
>      than implicit where necessary. 

Absolutely.  The most common case should require no options
at all.  I'm undecided regarding super-options (like -a) but
lean toward them as alternate semi-default behaviors.

> Not In scope

Good to address this.  However, possible interests outside
the scope are still worth recognizing and considering.

> Inspirations

What could I add?  

> Principles
> 
>   5. Similarly, no silly tricks with forking, threads, or
>      nonblocking IO: one process, one IO. 

For the most part I agree with this.  A framework might
violate the "one process, one IO" rule but the active work
should occur with a simple Input-Process-Output mindset.

>   6. Programs reading the protocol should have fixed readahead:
>      they're either reading a small fixed-size header, or a body
>      of data whose length is known. Scanning input as it comes in
>      is undesirable. 

Yes.  If we are processing binary streams I want to do simple
read(2) operations and cast the data onto structs.
get_the_next_thing((struct thing *)head, (void *)data)
is OK as well.

I do think that it would be best to use library routines for
reading and writing the datastreams and, as ugly as it is to
process, having an ASCII format as an option may be
desirable.  It would, however, be acceptable to have no
internal handling of an ASCII format if there is a companion
utility that can convert back and forth between the ASCII
and binary formats.  Being able to read the datastreams is
very helpful for debugging and for extensibility.
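
A minimal sketch of the fixed-readahead pattern: read a small
fixed-size header, then a body whose length the header gives.
The header layout and the names msg_header, read_full and
read_message are hypothetical.  Reading straight into the
struct assumes both ends agree on byte order and struct
layout:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical fixed-size wire header. */
struct msg_header {
    uint32_t type;    /* message kind */
    uint32_t length;  /* number of body bytes that follow */
};

/* Read exactly n bytes.  Returns 0 on success, -1 on error or EOF. */
static int read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;

    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        if (got == 0)
            return -1;          /* unexpected end of stream */
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/* Read one header plus its body.  No scanning of the input:
 * every read is for a length known in advance. */
static int read_message(int fd, struct msg_header *head,
                        void *body, size_t body_cap)
{
    if (read_full(fd, head, sizeof *head) != 0)
        return -1;
    if (head->length > body_cap)
        return -1;              /* body would not fit */
    return read_full(fd, body, head->length);
}
```

Keeping routines like these in a library would also give a
companion ASCII dump utility a single place to hook into.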

>   7. Use TCP the way it was intended: one client, one socket.
>      Don't open auxiliary sockets like FTP. 

Amen!  Like many noble experiments the FTP auxiliary sockets
were a mistake.  Ah, the benefit of hindsight :)

>  11. "Smart clients, dumb servers." This is claimed to be
>  a good design pattern for internet software. rsync at the
>  moment does not really adhere to it. Part of the point of
>  rsync is that having a smarter server can make things
>  much more efficient. A strength of this approach is that
>  to add features, you (often) only need to add them to the
>  client. 

Try "dumb servers, dumb components, smart client".  Keep it
simple.  Simple code is easier to debug and has fewer hidden
holes.

>  12. Try to keep the TCP pipe full in both directions at
>  all times. Pursuing this intently has worked well in
>  rsync, but has also led to a complicated design prone to
>  deadlocks. 

Processing in smaller units will reduce the biggest interval
of pipe emptiness presently in rsync.  Much of the
deadlocking is because of the complications and
bidirectionality of the pipeline.  You might not like my
unidirectional pipeline but it shouldn't suffer from deadlocks.

> General design ideas
>
>   3. Use network-endian binary integers. Don't be afraid

I'm not so sure about network-endian.  I cannot help but
think of Intel-to-Intel transfers where every integer will
need to be swabbed on both ends.  I'd rather have the initial
handshake determine byte-order (with a preference for the
server) and the I/O routines do in-place swabbing.
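
A minimal sketch of how a handshake could settle byte order,
assuming the server sends a known 32-bit probe value in its
own native order and the client swabs only when the orders
differ.  ORDER_PROBE and the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe value sent by the server in its native order. */
#define ORDER_PROBE 0x01020304u

/* Reverse the bytes of one 32-bit integer. */
static uint32_t swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Decide from the received probe whether this end must swab.
 * Returns 0 for same order, 1 for opposite order, -1 if the
 * value is not a recognisable probe. */
static int probe_needs_swab(uint32_t received)
{
    if (received == ORDER_PROBE)
        return 0;
    if (received == swab32(ORDER_PROBE))
        return 1;
    return -1;
}

/* Swab an array of integers in place, only if the handshake
 * said it is needed. */
static void swab32_in_place(uint32_t *vals, size_t count, int needs_swab)
{
    if (!needs_swab)
        return;
    for (size_t i = 0; i < count; i++)
        vals[i] = swab32(vals[i]);
}
```

When both ends share a byte order no integer is touched at
all, which is the saving over always converting to
network order.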

>   to use 32 or 64-bit integers: I am fairly convinced that
>   the increased network overhead will not cause a serious
>   performance problem, as long as the number of messages
>   per file is low. Multiplying the number of test domains
>   by having large and small files (or messages, or
>   whatever) is a serious problem. 

This I agree with.  A little transport overhead is minor
compared to having to determine field size every time.
Don't forget you need to set aside the bits for the field
size as well and that loses some of the savings.
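
A small sketch of that cost, assuming a made-up variable-size
encoding with one tag byte giving the payload width followed
by 1, 2, 4 or 8 payload bytes.  The encoding and the function
name are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to send v in the hypothetical variable-size
 * encoding: one tag byte plus the narrowest payload that fits. */
static size_t varsize_encoded_len(uint64_t v)
{
    size_t payload;

    if (v <= UINT8_MAX)
        payload = 1;
    else if (v <= UINT16_MAX)
        payload = 2;
    else if (v <= UINT32_MAX)
        payload = 4;
    else
        payload = 8;
    return 1 + payload;   /* the tag byte is the lost saving */
}
```

Under this encoding a value such as 70000 costs 5 bytes,
one more than a plain fixed 32-bit field, and the reader has
to branch on the tag for every integer.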

>   7. Errors should contain space for human-readable
>   explanations. 

Consideration should also be given to native languages.
Developers may be expected to know some English but users
should not.

>   9. Model files as composed of a stream of bytes, plus an
>   optional table of key-value attributes. Some of these
>   can be distinguished to model ownership, ACLs, resource
>   forks, etc. 

The lengths of names and of values should be separate for
forks.  ACLs should be processed with the stat structures.
Other forks should be processed as byte streams.  Linus has
specifically spoken of forks as containing rather large data
such as thumbnail images and other arbitrary files.  A fork
could be as large as a file or even larger than the file to
which it is attached.
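
One possible reading of this, as a sketch only: a fork gets a
short name length and a separate 64-bit data length, so the
fork body can be streamed like file data.  The struct and
function names are hypothetical, and the struct describes an
in-memory record rather than an exact wire layout:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-fork header. */
struct fork_header {
    uint16_t name_len;  /* fork names are short */
    uint64_t data_len;  /* fork data may be larger than the file */
};

/* Fill in a header for a fork.  Returns 0 on success, -1 if the
 * name is too long for the 16-bit length field. */
static int fork_header_init(struct fork_header *h, const char *name,
                            uint64_t data_len)
{
    size_t n = strlen(name);

    if (n > UINT16_MAX)
        return -1;
    h->name_len = (uint16_t)n;
    h->data_len = data_len;
    return 0;
}
```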

>  14. It may be necessary to make later operations
>  conditional on the success of earlier ones: if we fail to
>  open a file, then we can't do any later operations on it.
>  However, we don't want to drop the whole connection (per
>  HTTP), or abort operations on other files. Possibly an
>  operation that fails can invalidate its register and thus
>  cause later ones to fail? 

I'm not sure what design you have in mind.  The only
operations I can think of on which much later operations
depend are directory creations.  File creation
should be immediately followed by the chown, chmod and utime
operations.  Hardlinks (when dealt with) can do a
prophylactic fstat to confirm target existence if needed but
a return code check will do fine.
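
A minimal sketch of creating a file and applying its metadata
straight away, with a plain return code check at each step.
The function name is hypothetical, and the descriptor-based
calls fchown, fchmod and futimens stand in here for chown,
chmod and utime:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Create path and immediately set owner, mode and times.
 * Returns 0 on success, -1 if any step fails. */
static int create_with_metadata(const char *path, uid_t uid, gid_t gid,
                                mode_t mode, time_t mtime)
{
    struct timespec times[2] = {
        { .tv_sec = mtime, .tv_nsec = 0 },  /* access time */
        { .tv_sec = mtime, .tv_nsec = 0 },  /* modification time */
    };
    int rc = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;

    /* ... the file data would be written here ... */

    /* chown goes first because it may clear set-user-ID and
     * set-group-ID bits that chmod then puts back. */
    if (fchown(fd, uid, gid) != 0)
        rc = -1;
    else if (fchmod(fd, mode) != 0)
        rc = -1;
    else if (futimens(fd, times) != 0)
        rc = -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Because each file is finished before the next one starts, a
failure here affects only this file and needs no register to
be invalidated for later operations.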

I hope this feedback helps somewhat.
I'll post my design suggestion document separately and you
can chew that up, spit it out and tell me I'm way off track.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt