superlifter design notes (was Re: Latest rZync release: 0.06)
foner-rsync at media.mit.edu
Mon Jul 22 07:38:02 EST 2002
> Date: Mon, 22 Jul 2002 15:15:29 +1000
> From: Martin Pool <mbp at samba.org>
>
> Swabbing to/from network endianness is very cheap. On 486s and higher
> it is a single inlined instruction and I think takes about one cycle.
> On non-x86 it is free. The cost is barely worth considering: if you
> are flipping words as fast as you can you will almost certainly be
> limited by memory bandwidth, not by the work of swapping them.
I agree with all of this. And to add even more in that direction...
Just in case someone else is concerned about the trivial inefficiency
of byteswapping (which they shouldn't be; I'm sure that won't be the
hotspot in this design), consider this: we've had various requests
for a broadcast-like rsync that could update several mirrors in
parallel. Ignoring all the other reasons why that's hard (for
example, if the mirrors aren't all in sync with each other before you
start), you wouldn't be able to negotiate a consistent byte order if
the mirrors are different architectures.
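To make the broadcast case concrete, here is a rough Python sketch. It is not code from rsync or rZync; the 32-bit BLOCK_LEN field and the helper names are made up for illustration, and the two "mirrors" are simulated by decoding the same bytes with explicit little-endian and big-endian formats.

```python
import struct

# Hypothetical 32-bit field; the value is chosen so byte order is visible.
BLOCK_LEN = 0x00010203

def read_native(buf, mirror_is_little_endian):
    """Decode as a mirror would if it simply used its own native order."""
    fmt = "<I" if mirror_is_little_endian else ">I"
    return struct.unpack(fmt, buf)[0]

def read_network(buf):
    """Decode a field that is defined to be in network (big-endian) order."""
    return struct.unpack("!I", buf)[0]

# A little-endian sender broadcasting its native in-memory layout:
native_wire = struct.pack("<I", BLOCK_LEN)
# The same sender swabbing to network order before sending:
network_wire = struct.pack("!I", BLOCK_LEN)

# One broadcast, two mirrors: only the little-endian mirror gets it right.
little_mirror = read_native(native_wire, True)   # 0x00010203
big_mirror = read_native(native_wire, False)     # 0x03020100

# With a fixed wire order, every mirror runs the same decode and agrees.
agreed = read_network(network_wire)              # 0x00010203
```

With a single outgoing stream there is nobody to negotiate with: whichever native order the sender picks is wrong for some mirror, while a fixed wire order is right for all of them.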
Not to mention that, if the design actually produces some stream of
bytes that could land in a file somewhere and get used later (e.g.,
the strawman proposals that actually envision a unidirectional pipe
at some point between a pair of layers), it would sure be handy if
that pile o' bytes could get fed into a layer on a machine of any
architecture and Just Work. If the byte ordering were negotiated
instead, then (a) the generator would have to talk to the receiver,
even if the task was "dump everything without running the diff
algorithm", and (b) the resulting pile o' bytes would be unreadable
if it were later fed to a machine of a different architecture.
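A minimal sketch of the "dump everything" case, again in Python and again purely illustrative: the framing (a 4-byte network-order length followed by the payload) is an assumption made up for this example, not the superlifter or rsync wire format. The point is only that the writer never needs to hear from a reader.

```python
import io
import struct

# Hypothetical framing: each record is a 4-byte big-endian length + payload.
HEADER = struct.Struct("!I")

def write_record(out, payload):
    """Append one length-prefixed record to a binary stream."""
    out.write(HEADER.pack(len(payload)))
    out.write(payload)

def read_records(inp):
    """Yield payloads from a stream written by write_record."""
    while True:
        head = inp.read(HEADER.size)
        if not head:
            return  # clean end of stream
        if len(head) < HEADER.size:
            raise ValueError("truncated record header")
        (length,) = HEADER.unpack(head)
        payload = inp.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        yield payload

def dump_everything(chunks):
    """One-way dump: no receiver is consulted, no byte order is negotiated."""
    buf = io.BytesIO()
    for chunk in chunks:
        write_record(buf, chunk)
    return buf.getvalue()
```

Because the "!" prefix fixes the layout regardless of the host, the bytes returned by dump_everything() could sit in a file and be passed to read_records() later on any machine.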
This also means that some debugging tool run on the resulting
intermediate file/pipe wouldn't have to care about byte-ordering,
either, nor be a party to some negotiation. This is particularly
important if that tool is just a hex dumper or equivalent.
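For the hex dumper point, a small sketch (the hexdump() helper and the two-field layout are hypothetical, chosen only to show the effect): with network order, the bytes in a dump appear in the same order you would write the number, so no knowledge of the sending host is needed to read them.

```python
import struct

def hexdump(data, width=16):
    """Return a plain offset-plus-hex dump of a bytes object."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        lines.append(f"{off:08x}  {hexpart}")
    return "\n".join(lines)

# Hypothetical header: a 32-bit length (1024 = 0x400) and a 16-bit type (7).
network_dump = hexdump(struct.pack("!IH", 1024, 7))
# The same length as a little-endian host would lay it out natively:
little_dump = hexdump(struct.pack("<I", 1024))
```

Here network_dump shows the length as "00 00 04 00", which reads directly as 0x400, while little_dump shows "00 04 00 00", which only makes sense if you already know which kind of machine wrote it.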
Network byte order was invented for a reason; we should use it.