reducing file list bytes transferred

Wayne Davison wayned at samba.org
Tue Oct 2 08:22:32 GMT 2007


On Mon, Oct 01, 2007 at 08:21:25PM -0500, Stephen Zemlicka wrote:
> I thought it was supposed to do it more efficiently

Yes, protocol 30 has several improvements that reduce the number of
bytes sent over the wire, in addition to its incremental recursion mode.
The latter is slightly less byte-efficient than a non-incremental pr.30
recursion, but it is more disk I/O efficient the larger your file list
gets (not to mention more memory efficient).

For example, I did an rsync on a hierarchy of 30,103 files that didn't
need any updates.  This resulted in the following transfer counts:

Protocol 29:                     sent 533,947 bytes, received 20 bytes
Protocol 30, --inc-recursive:    sent 503,658 bytes, received 676 bytes
Protocol 30, --no-inc-recursive: sent 501,189 bytes, received 11 bytes
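
To put those counts in perspective, here is a small Python sketch that
works out the total bytes in both directions, the bytes sent per file,
and the saving relative to protocol 29.  The figures are the ones quoted
above; the names (runs, summarize) are only illustrative.

```python
# Transfer counts quoted above: (bytes sent, bytes received).
FILES = 30_103
runs = {
    "protocol 29": (533_947, 20),
    "protocol 30, --inc-recursive": (503_658, 676),
    "protocol 30, --no-inc-recursive": (501_189, 11),
}

def summarize(runs, files, baseline="protocol 29"):
    """Return total bytes, sent bytes per file, and saving vs. the baseline."""
    base_total = sum(runs[baseline])
    rows = {}
    for name, (sent, received) in runs.items():
        total = sent + received
        rows[name] = {
            "total": total,
            "sent_per_file": sent / files,
            "saving_pct": 100.0 * (base_total - total) / base_total,
        }
    return rows

summary = summarize(runs, FILES)
for name, row in summary.items():
    print(f"{name:32s} total={row['total']:>8,d}  "
          f"sent/file={row['sent_per_file']:.2f}  "
          f"saving={row['saving_pct']:.2f}%")
```

By this arithmetic, protocol 29 sends about 17.7 bytes per file and
protocol 30 about 16.7, a total saving of roughly 5.5% with
--inc-recursive and roughly 6.1% with --no-inc-recursive.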

You'll note that the inc-recursive mode has a little more "back chatter"
due to the generator letting the sender know about its progress through
the file list.  Since that direction of flow is more lightly loaded than
the flow from sender to receiver, the small amount of extra data should
not adversely affect rsync's speed.

So, the inc-recursive transfer may be slightly less efficient over the
wire than pr.30 no-inc-recursive, but because the sending and receiving
sides tend to be doing more disk I/O at the same time, the total
transfer time can be less (depending on how much directory data is in
the disk cache, how large the transfer is, and how many files need to
be transferred).

It would be interesting to look into applying more general compression
to the file list in a future version.  It would also be interesting to
see if an rsync-checksum-delta technique helps out on an inc-recursive
file list.  I used that idea in my protocol-experiment software a while
back, and it may be a net win (though it has tradeoffs, as chunks of the
file list must be buffered and sorted before being delta-difference
transferred to the other side, while the current code sends file list
info as soon as it is scanned).
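
As a rough illustration of what an rsync-checksum-delta pass over a file
list could look like, the Python sketch below buffers and sorts a chunk
of entries, serializes it, and sends only the bytes the other side
cannot rebuild from its own copy of the list.  It is not the
protocol-experiment software mentioned above and not rsync's wire
format: the entry encoding, the 64-byte block size, the use of MD5 as
the strong checksum, and the names serialize, signatures, delta, and
patch are all assumptions made for the example.

```python
import hashlib

BLOCK = 64  # hypothetical block size in bytes

def serialize(entries):
    """Sort a buffered chunk of (path, size, mtime) entries and encode it."""
    return "".join(f"{p}\t{s}\t{m}\n" for p, s, m in sorted(entries)).encode()

def weak_sum(window):
    """rsync-style weak checksum of one window, as a pair (a, b)."""
    a = sum(window) & 0xFFFF
    b = sum((len(window) - k) * byte for k, byte in enumerate(window)) & 0xFFFF
    return a, b

def signatures(old):
    """Receiver side: weak and strong checksums for each full block of old."""
    table = {}
    for index in range(len(old) // BLOCK):
        block = old[index * BLOCK:(index + 1) * BLOCK]
        table.setdefault(weak_sum(block), []).append(
            (hashlib.md5(block).digest(), index))
    return table

def find_block(table, weak, window):
    """Return the index of a block identical to window, or None."""
    candidates = table.get(weak)
    if not candidates:
        return None
    strong = hashlib.md5(window).digest()
    for digest, index in candidates:
        if digest == strong:
            return index
    return None

def delta(new, table):
    """Sender side: encode new as ("copy", block_index) and ("data", bytes)."""
    ops, literal, pos, weak = [], bytearray(), 0, None
    while pos + BLOCK <= len(new):
        if weak is None:
            weak = weak_sum(new[pos:pos + BLOCK])
        index = find_block(table, weak, new[pos:pos + BLOCK])
        if index is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", index))
            pos += BLOCK
            weak = None  # recompute from scratch at the new position
            continue
        literal.append(new[pos])
        if pos + BLOCK < len(new):
            # Roll the window forward by one byte without rescanning it.
            a, b = weak
            a = (a - new[pos] + new[pos + BLOCK]) & 0xFFFF
            b = (b - BLOCK * new[pos] + a) & 0xFFFF
            weak = (a, b)
        pos += 1
    literal += new[pos:]  # tail shorter than one block is sent literally
    if literal:
        ops.append(("data", bytes(literal)))
    return ops

def patch(old, ops):
    """Receiver side: rebuild the sender's list from old plus the delta."""
    out = bytearray()
    for kind, value in ops:
        out += old[value * BLOCK:(value + 1) * BLOCK] if kind == "copy" else value
    return bytes(out)

# Made-up example: the receiver's list lacks one file the sender has.
old_entries = [(f"src/file{n:05d}.c", 1000 + n, 1191312000) for n in range(300)]
new_entries = old_entries + [("src/file00150a.c", 42, 1191312152)]
old_list, new_list = serialize(old_entries), serialize(new_entries)
ops = delta(new_list, signatures(old_list))
literal_bytes = sum(len(v) for kind, v in ops if kind == "data")
print(f"list is {len(new_list)} bytes, literal data sent: {literal_bytes}")
```

The sketch also shows the tradeoff: serialize() cannot run until the
whole chunk has been scanned and sorted, and the checksums themselves
have to travel in the other direction before any delta can be computed.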

..wayne..

