Any way to predict the amount of data to be copied when re-copying a file?

Eliot Moss moss at cs.umass.edu
Sun Nov 29 12:58:46 MST 2009


I can't answer your question directly, but I can say that
it is not strictly the number of bytes that are different
that matters, but also how the differences are distributed
in the file.  Unless you explicitly set the block size,
rsync picks one that is roughly the square root of the
file size, which bounds the worst case for the total volume
of data transmitted (block summaries *plus* block data
for changed blocks).  If the changes touch many of these
sqrt(n)-sized blocks, then many blocks will be transmitted. If
you know more about what tends to happen with your files,
you can adjust the block size (the -B / --block-size option).
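
To make the point about distribution concrete, here is a rough
sketch of the estimate in Python. It is a simplified model, not
rsync's actual accounting: it assumes changes are in-place
overwrites (no insertions or deletions), that every block touched
by a change is resent whole, and that each block summary costs a
fixed number of bytes. The function name estimate_transfer and the
summary_bytes=20 default are made up for illustration.

```python
import math

def estimate_transfer(file_size, changed_offsets, block_size=None,
                      summary_bytes=20):
    """Estimate bytes sent under a simplified block-based model.

    Assumptions (not taken from rsync's source): in-place changes
    only, each touched block is resent in full, and each block
    summary costs `summary_bytes` bytes (a hypothetical figure).
    """
    if block_size is None:
        # Default modeled as the integer square root of the file size.
        block_size = max(1, math.isqrt(file_size))
    # Number of blocks, rounding up for a short final block.
    num_blocks = -(-file_size // block_size)
    # Distinct blocks that contain at least one changed byte.
    dirty_blocks = {offset // block_size for offset in changed_offsets}
    summaries = num_blocks * summary_bytes
    data = len(dirty_blocks) * block_size
    return summaries + data

file_size = 1_000_000  # block size defaults to 1000 in this model

# 100 changed bytes packed together at the start of the file.
clustered = estimate_transfer(file_size, range(100))

# 100 changed bytes spread out, one every 10,000 bytes.
scattered = estimate_transfer(file_size, range(0, file_size, 10_000))

print(clustered, scattered)
```

In this model the same 100 changed bytes cost one block when they
are clustered and 100 blocks when they are scattered, so the
estimate differs by several times even though the number of
differing bytes is identical.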

(This is all from memory from reading the rsync tech
report some time ago, but I think it remains sound.
I'm sure someone will correct me if I am off base.)

Best wishes -- Eliot Moss

