superlifter design notes (was Re: Latest rZync release: 0.06)

Fri Jul 26 00:40:05 EST 2002

On Thu, Jul 25, 2002 at 11:49:24AM -0400, Bennett Todd wrote:
> 2002-07-21-04:12:55 jw schultz:
> > On Thu, Jul 11, 2002 at 07:06:29PM +1000, Martin Pool wrote:
> > >    6. No arbitrary limits: this is related to scalability.
> > >       Filesizes and times should be 64-bit; names should be
> > >       arbitrarily long. 
> > 
> > File sizes, yes.  Times, no.  unsigned 32 bit integers will
> > last us for another 90 years.  I suspect that by the time we
> > need 64 bit timestamps the units will be milliseconds.
> > I just don't see the need to waste an extra 4 bytes per
> > timestamp per file.
> 
> If bandwidth is of any interest at all, compress; any compression
> algorithm will have no trouble making hay with bulky, redundant
> timestamp formats. Rather than trying to optimize the protocol for
> bandwidth without compression, wouldn't it be better to try to
> optimize to future-proof in the face of changing time
> representations across systems?

OK!  I'm convinced, no micro-optimization of the protocol
for network compactness.  We compress.  zlib should probably
be the default.  Let's stop arguing the point.  It just
isn't that important.  If we use library functions for I/O
we can plug in any scheme we want so long as the I/O libs
talk to each other and present consistent data to the
functional routines.
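
A minimal sketch of that layering, assuming fixed-width
64-bit timestamps on the wire and zlib as the default codec.
The function names (pack_times, send, receive) are
hypothetical, not part of any existing superlifter or rsync
code:

```python
import struct
import zlib


def pack_times(times):
    """Serialize timestamps as fixed-width signed 64-bit big-endian integers."""
    return struct.pack(">%dq" % len(times), *times)


def unpack_times(raw):
    """Inverse of pack_times."""
    return list(struct.unpack(">%dq" % (len(raw) // 8), raw))


def send(times, encode=zlib.compress):
    """I/O layer: the codec is pluggable, the callers never see it."""
    return encode(pack_times(times))


def receive(blob, decode=zlib.decompress):
    """Both ends must agree on the codec; the caller gets plain integers back."""
    return unpack_times(decode(blob))
```

The functional routines only ever see the list of integers,
so swapping zlib for another scheme means changing the
encode/decode pair on both ends and nothing else.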

> 
> If I were designing a protocol at this level, I'd be using TAI;
> there's 64-bit time with 1 second resolution covering pretty much
> all time (more or less, depending on the whimsies of
> cosmologists:-); there are also longer variations with finer
> resolution. TAI, with appropriately fine resolution, should be able
> to represent any time that any other representation can, closer than
> anyone could care.
> 
> TAI can be converted to other formats with more or less pain,
> depending on how demented the other formats are; djb's libtai is a
> reasonable starting point.
> 
> <URL:http://cr.yp.to/time.html> has links to some pages discussing
> time formats.
> 
> In short, though, "Time since the epoch" has a complication:
> leap-seconds. Either you end up having to move the epoch every time
> you bump into a leap-second, thereby redefining all times before
> that; or else you have duplicate times, where two different seconds
> have the same representation in seconds-since-the-epoch. Well,
> there's a third possibility, you could also let the current time
> drift further and further from what everybody else is using, but
> nobody seems to go for that one.

The pedantics of time representation are irrelevant.  No
offense intended.  Yes, every computer system on the planet
mishandles time in some way.  It doesn't matter.  All
time is relative and I don't wish to get into cosmology or
religious discussions of when time began.  We cannot know
that with sufficient precision to be of value.  What does
matter is that, jitter aside, our computers track time with
higher precision than accuracy, and we must maintain that
precision where reasonable.

All that matters is that we can represent the timestamps in
a way that allows consistent comparison, restoration and
transfer.  Each platform's file manipulation and query
routines need to support bi-directional transfer in a way
that loses a minimum of content.

I expect that 1 second precision is very soon going to be
insufficient.  The real question is what will replace it.
Are we talking centiseconds, milliseconds or microseconds,
and what is the epoch?  So far I haven't heard of any
reputable standard emerging.

A quick calculation shows that a uint64 storing microseconds
has a range in excess of 584,000 years.  As such we can pick
as an epoch any time in recorded human history.  I don't feel
qualified to impose any epoch myself.  I would be inclined to
stick with the UNIX epoch for the sake of convenience.  If we
allow signed values (roughly 292,000 years either side of the
epoch) we should be able to accommodate any time in recorded
history.  Conversion to or from any other time representation
should be a matter of t * scale + offset.
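
To make the arithmetic concrete, here is a rough sketch.  It
assumes a wire format of integer microseconds since the UNIX
epoch and an average year of 365.2425 days; the helper names
are hypothetical:

```python
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 31_556_952  # 365.2425 days, an assumed average year


def range_in_years(bits, signed=False):
    """Whole years covered by a microsecond counter of the given width.

    For a signed counter this is the reach on each side of the epoch.
    """
    span = 2 ** (bits - 1 if signed else bits)
    return span // MICROS_PER_SECOND // SECONDS_PER_YEAR


def to_wire(t, scale, offset):
    """Native timestamp to wire microseconds: t * scale + offset."""
    return t * scale + offset


def from_wire(w, scale, offset):
    """Wire microseconds back to native units, truncating finer detail."""
    return (w - offset) // scale


# A platform that counts whole seconds from the UNIX epoch:
# scale = 1,000,000 and offset = 0.
wire = to_wire(1_000_000_000, MICROS_PER_SECOND, 0)
native = from_wire(wire, MICROS_PER_SECOND, 0)
```

This only covers platforms whose unit divides evenly into
microseconds.  A platform with a finer or odd-sized tick
would need a rational scale, which the sketch leaves out.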

Differing precisions mean comparisons would have to involve
rounding both times to the coarsest precision, using integer
division:
	pc = max(p1, p2)
	t1/pc != t2/pc
Measuring the difference is not sufficient.  Where the
precision is 2, 5 =~ 4 but 5 !~ 6, although both pairs
differ by 1.
And remember that while scale is a property of the OS,
precision is a property of the filesystem and may change as
we cross mount points.
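
The comparison above, written out as a small sketch.  It
assumes both timestamps are already integers in a common
unit and that p1 and p2 give each filesystem's granularity in
that unit (larger means coarser); same_time is a hypothetical
name:

```python
def same_time(t1, p1, t2, p2):
    """True if t1 and t2 agree once both are rounded to the coarser precision."""
    pc = max(p1, p2)            # coarsest of the two granularities
    return t1 // pc == t2 // pc  # integer division drops the finer detail


# The example from the text, with precision 2 on one side:
five_vs_four = same_time(5, 1, 4, 2)  # 5 =~ 4
five_vs_six = same_time(5, 1, 6, 2)   # 5 !~ 6
```

Since precision may change at a mount point, p1 and p2 would
have to be looked up per file rather than once per run.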

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt