Release 3 of "rzync" new-protocol test

Donovan Baarda abo at minkirri.apana.org.au
Fri Jun 21 04:37:04 EST 2002


On Fri, Jun 21, 2002 at 03:46:39AM -0700, Wayne Davison wrote:
> The count of transferred bytes in the latest protocol is now below what
> rsync sends for many commands -- both a start-from-scratch update and a
> fully-up-to-date update are usually smaller, for instance.  This is
> mainly because my file-list data is smaller, but it's also because I
> reduced the protocol overhead quite a bit.  Transferred bytes for
> partially-changed files are still bigger than rsync because librsync
> creates unusually large delta sizes (though there's a patch that makes
> it work much better, it's still not as good as rsync).

I believe the remaining difference is that rsync does "context compression"
using zlib, while librsync does no compression at all yet. Even if you
zlib-compress librsync's deltas, they will still be bigger than rsync's
because of the "context" rsync uses: it runs the whole file, hits and
misses, through the compressor, but only sends the compressed output for the
misses. This means the compressor is "primed" with data from the hits.

I think the best solution is to do what xdelta is planning to do: toss zlib,
include target references as well as source references in the delta
instruction stream, and do the compression yourself.
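
As a rough sketch of what such an instruction stream could look like, the
Python below applies a delta made of three kinds of instruction. The tuple
format and the names `apply_delta`, "source", "target" and "literal" are
hypothetical, chosen for illustration; they are not the xdelta or librsync
wire format.

```python
def apply_delta(source, delta):
    """Rebuild the target from `source` and a list of delta instructions.

    Each instruction is one of (hypothetical format):
      ("source", offset, length)  copy from the old file
      ("target", offset, length)  copy from output already produced
      ("literal", data)           raw bytes carried in the delta
    """
    out = bytearray()
    for op in delta:
        kind = op[0]
        if kind == "source":
            _, offset, length = op
            if offset + length > len(source):
                raise ValueError("source reference out of range")
            out += source[offset:offset + length]
        elif kind == "target":
            _, offset, length = op
            if offset >= len(out):
                raise ValueError("target reference points past the output")
            # Copy byte by byte so a reference may overlap the bytes it is
            # producing, which gives run-length-style repetition for free.
            for i in range(length):
                out.append(out[offset + i])
        elif kind == "literal":
            out += op[1]
        else:
            raise ValueError("unknown instruction: %r" % (kind,))
    return bytes(out)


source = b"hello world"
delta = [
    ("source", 0, 6),        # "hello "
    ("literal", b"rsync "),  # new data
    ("target", 6, 6),        # repeat "rsync " from the output so far
    ("target", 0, 5),        # repeat "hello" from the output so far
]
result = apply_delta(source, delta)
```

The target references are what replace the general-purpose compressor here:
repeated data inside the new file is sent once as a literal and then referred
to by offset.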

One way to do this is to implement xdelta-style non-block-aligned matches
against the target, building a rollsum hash-tree as you go through it, and
run that alongside the rsync block-match algorithm. However, this might not
work well in practice...
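
A minimal sketch of the idea, under some loud assumptions: it uses the plain
rsync-style weak checksum (two 16-bit sums) as the rollsum, a Python dict
instead of a hash-tree, and a fixed window size. The names `weak_sum`, `roll`
and `find_target_matches` are made up for the example.

```python
def weak_sum(block):
    """rsync-style weak checksum of a block, as two 16-bit halves."""
    n = len(block)
    a = sum(block) & 0xFFFF
    b = sum((n - i) * x for i, x in enumerate(block)) & 0xFFFF
    return a, b


def roll(a, b, out_byte, in_byte, window):
    """Slide the window one byte: drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) & 0xFFFF
    b = (b - window * out_byte + a) & 0xFFFF
    return a, b


def find_target_matches(target, window):
    """Return (pos, earlier_pos) pairs where the window at pos repeats an
    earlier window of the same data, at any byte offset."""
    matches = []
    if len(target) < window:
        return matches
    seen = {}  # rollsum -> offset of the first window with that sum
    a, b = weak_sum(target[:window])
    for pos in range(len(target) - window + 1):
        if pos > 0:
            a, b = roll(a, b, target[pos - 1], target[pos + window - 1], window)
        key = (b << 16) | a
        prev = seen.get(key)
        # The weak sum can collide, so confirm with a byte comparison.
        if prev is not None and target[prev:prev + window] == target[pos:pos + window]:
            matches.append((pos, prev))
        else:
            # Simplification: on a collision the newer window is not indexed;
            # a real implementation would chain entries.
            seen.setdefault(key, pos)
    return matches


target = b"abcdefgh" + b"XY" + b"abcdefgh"
matches = find_target_matches(target, 4)
```

Every byte position costs a hash lookup here, whereas the block-match side
only indexes block-aligned signatures, which may be part of why running both
together is not obviously cheap.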

> In my speed testing, one test was sending around 8.5 meg of data on a
> local system, and while rsync took only .5 seconds, my rzync app took
> around 2 seconds.  A quick gprof run reveals that 98% of the runtime is
> being spent in 2 librsync routines, so it looks like librsync needs to
> be optimized a bit.
> 
> One potential next step might be optimizing rsync to make the
> transferred file-list size a little smaller (e.g. making the transfer of
> the "size" attribute only as long as needed to store the number would
> save ~4-5 bytes per file entry on typical files).
> 
> It looks like work needs to be done on making librsync more efficient.

I'm going to get onto this after this weekend. I know what needs to be
done... I just need the time to do it.

-- 
----------------------------------------------------------------------
ABO: finger abo at minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------



