Improving the rsync protocol (RE: Rsync dies)

Donovan Baarda abo at minkirri.apana.org.au
Mon May 20 06:02:02 EST 2002


On Mon, May 20, 2002 at 09:35:04PM +1000, Martin Pool wrote:
> On 17 May 2002, Wayne Davison <wayned at users.sourceforge.net> wrote:
> > On Fri, 17 May 2002, Allen, John L. wrote:
[...]
> I've been thinking about this too.  I think the top-level question is
> 
>   Start from scratch with a new protocol, or try to work within the
>   current one?

Tough question. To avoid backwards breakage and yet implement something
significantly better you would probably have to make two rsyncs in one
executable; the new protocol, and the old one for a compatible fallback
talking to old versions. After enough time had passed and all old rsync
implementations had been purged, the old code could be dropped, leaving a
nice clean small solution.
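
As a rough illustration of the "two rsyncs in one executable" idea, the
fallback could hinge on a version exchange at connection time. This is only a
sketch under assumptions: the constant NEW_PROTOCOL_MIN, the version numbers
and the function choose_protocol are hypothetical and do not come from rsync
itself.

```python
# Hypothetical threshold: the first version that speaks the new protocol.
# This number is an assumption for illustration, not a real rsync value.
NEW_PROTOCOL_MIN = 100


def choose_protocol(local_max: int, remote_max: int):
    """Pick the code path to use for this connection.

    Each side advertises the highest protocol version it supports; both
    then agree on the lower of the two. Anything below NEW_PROTOCOL_MIN
    falls back to the legacy implementation.
    """
    agreed = min(local_max, remote_max)
    kind = "new" if agreed >= NEW_PROTOCOL_MIN else "legacy"
    return kind, agreed


if __name__ == "__main__":
    print(choose_protocol(100, 100))  # both ends are new
    print(choose_protocol(100, 26))   # remote end is an old version
```

Once no peers advertise a version below the threshold, the legacy branch (and
the code behind it) can be deleted without touching the new path.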

I tend to think that once a delta compressing http extension gets mainstream
acceptance, rsync will fade away _unless_ it offers significantly better
performance by avoiding http overheads (which is why ftp lives on, despite
being a bastard protocol from hell).

> algorithm or codebase, or need to evolve the current one.  I think the
> nature of the current protocol is that it will be hard to make really
> fundamental improvements without rewriting it.
[...]

My feelings too.

> I wrote librsync.  There is some documentation and I can add more if
> there's anything undocumented.

I really want to do some work on librsync. I recently did some work writing
a Python SWIG wrapper for it and identified several areas where it could be
improved. I already have a more modular implementation of the rolling
checksum that is 2~3x faster that I want to integrate with it.
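
For readers unfamiliar with the rolling checksum, here is a minimal pure
Python sketch of the rsync-style weak checksum. It is not the faster
implementation mentioned above, just the textbook form; the function names
(weak_checksum, roll, digest, rolling_checksums) are made up for this example.

```python
M = 1 << 16  # both halves of the checksum are kept modulo 2**16


def weak_checksum(block: bytes):
    """Compute the (a, b) pair for one block from scratch."""
    n = len(block)
    a = sum(block) % M
    # b weights each byte by its distance from the end of the block.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def digest(a: int, b: int) -> int:
    """Pack the two 16-bit halves into one 32-bit value."""
    return (b << 16) | a


def rolling_checksums(data: bytes, n: int):
    """Yield the checksum of every n-byte window of data."""
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    yield digest(a, b)
    for k in range(len(data) - n):
        a, b = roll(a, b, data[k], data[k + n], n)
        yield digest(a, b)
```

The point of the roll step is that moving the window costs a few arithmetic
operations instead of a full pass over the block, which is what makes
checking every byte offset affordable.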

> I haven't looked at pysync as much as it deserves, but it could be a
> good foundation.

Pysync now includes a Python extension module to librsync. I have also
implemented a Python wrapper around the rolling checksum mentioned above
that makes the Python version run nearly 2x as fast as the pure Python
adler32 version. This will be included in the next release.

I don't think Python is viable for a final rsync solution, even using
librsync as an extension module. The Python interpreter is an unnecessary
overhead for a lean-mean-data-transfer-machine. However, Python is _perfect_
as a means of experimenting with protocols and algorithms. Pysync was
written with this in mind, and in ~400 lines of heavily commented Python
implements both the rsync and xdelta algorithms. The next release will add
inverse-rsync (for client-side delta calculation). Note that pysync _does_
implement the context-compression used by rsync that is not included in
librsync yet.
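
To give a feel for how little code such an experiment needs, here is a
deliberately simplified block-matching delta in the spirit of the rsync
algorithm. It is not pysync's code: the names signature, delta and patch are
invented for this sketch, it hashes every window with MD5 instead of
filtering with a rolling weak checksum first, and it does no compression.

```python
import hashlib


def signature(old: bytes, n: int):
    """Map the strong hash of each full n-byte block of old to its offset."""
    sig = {}
    for off in range(0, len(old) - n + 1, n):
        sig.setdefault(hashlib.md5(old[off:off + n]).digest(), off)
    return sig


def delta(new: bytes, sig, n: int):
    """Describe new as ("copy", offset_in_old) and ("data", bytes) ops."""
    ops, literal, i = [], bytearray(), 0
    while i + n <= len(new):
        off = sig.get(hashlib.md5(new[i:i + n]).digest())
        if off is None:
            # No block of old matches here: emit one literal byte, move on.
            literal.append(new[i])
            i += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", off))
            i += n
    literal += new[i:]  # tail shorter than one block
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops, n: int) -> bytes:
    """Rebuild new from old plus the delta ops."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg:arg + n] if kind == "copy" else arg
    return bytes(out)
```

A real implementation would use the rolling checksum to reject most offsets
cheaply and only compute the strong hash on a weak match.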

-- 
----------------------------------------------------------------------
ABO: finger abo at minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------



