proposal to speed rsync with lots of files

Fabian Cenedese Cenedese at
Mon Mar 9 11:20:17 GMT 2009

At 07:58 06.03.2009 -0800, Wayne Davison wrote:
>On Thu, Mar 05, 2009 at 03:27:50PM -0800, Peter Salameh wrote:
>> My proposal is to first send a checksum of the file list for each
>> directory.  If it is found to be identical to the same checksum on the
>> remote side then the list need not be sent for that directory!
>My rZync source does something like that for directories:  it treats a
>directory-list transfer like a file transfer.  That means that the
>receiving side sends a set of checksums to the sending side telling it
>what its version of the directory looks like, and then the sender sends
>a normal set of delta data that lets the receiver reconstruct the
>sender's version of the directory (which it compares to its own).  One
>potential drawback is having to deal with false checksum-matches (which
>should be rare, but would require the dir data to be resent).  I hadn't
>optimized it for block size or (possibly) data order to make it more
>efficient, but it is an interesting idea for speeding up a slow
>connection.  I'm not sure if it would really help out that much for a
>more modern, faster connection, because rsync sends the file-list data
>at the same time as it is being scanned, and sometimes the scan is the
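
The per-directory checksum idea quoted above can be sketched roughly as follows. This is only an illustration, not rsync's or rZync's actual protocol: the function names, the choice of SHA-256, and the record layout (name, size, mtime) are all assumptions made for the example.

```python
import hashlib
import os
import stat


def dir_listing_digest(path):
    """Digest of one directory's own entries (not recursive).

    Hypothetical record layout: type, name, size, mtime in nanoseconds.
    """
    h = hashlib.sha256()
    for name in sorted(os.listdir(path)):
        st = os.lstat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            # Subdirectories get their own digest, so only the name counts here.
            fields = [b"d", os.fsencode(name)]
        else:
            fields = [b"f", os.fsencode(name),
                      str(st.st_size).encode(), str(st.st_mtime_ns).encode()]
        h.update(b"\0".join(fields) + b"\n")
    return h.hexdigest()


def dirs_needing_full_list(sender_digests, receiver_digests):
    """Return the directories whose listing would still have to be sent."""
    return sorted(d for d, digest in sender_digests.items()
                  if receiver_digests.get(d) != digest)
```

Each side would compute one digest per directory and exchange only the digests; a full listing is sent only for directories returned by `dirs_needing_full_list`. As noted in the quote, a false match is unlikely but possible, so a real implementation would still need a way to resend the directory data.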

To find out whether scanning or transferring is the bottleneck, would it
be possible for the statistics to show which threads had to wait longest
and which actions took the most time? Something like that could hint
that, e.g., enabling or disabling compression might give a faster overall
transfer. I don't know if this internal data can be collected, or if
trial and error is the only way to find out.
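
As a rough illustration of the kind of per-phase accounting asked about here, the sketch below accumulates wall-clock time per named phase and reports the largest one. It is not based on rsync's internals; `PhaseTimer` and the phase names "scan" and "send" are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the phase raised an exception.
            self.totals[name] += time.perf_counter() - start

    def bottleneck(self):
        """Name of the phase with the largest total, or None if nothing ran."""
        if not self.totals:
            return None
        return max(self.totals, key=self.totals.get)


timer = PhaseTimer()
with timer.phase("scan"):
    time.sleep(0.05)   # stand-in for walking the directory tree
with timer.phase("send"):
    pass               # stand-in for sending file-list data
```

After a run, `timer.bottleneck()` names the phase that took longest, and `timer.totals` holds the raw numbers a statistics report could print.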


bye  Fabi
