How to make big MySQL database more diffable/rsyncable? (aka rsyncing big files)

Jamie Lokier jamie at shareable.org
Mon Jul 13 15:54:40 MDT 2009


Ryan Malayter wrote:
> Your log file indicates that rsync is indeed working as designed,
> finding lots of data matches:
> 
>    Literal data: 123736377 bytes
>    Matched data: 17889663500 bytes
> 
> This means that rsync only had to transfer 118 MB instead of 16+ GB.
> It does this by trading CPU and disk operations for network bytes.
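
The arithmetic behind those figures can be checked with a short Python
sketch. The function name transfer_summary is made up for illustration,
and it only looks at the two statistics quoted above; the real number of
bytes on the wire is somewhat higher, since it also includes checksums
and protocol overhead.

```python
def transfer_summary(literal_bytes, matched_bytes):
    """Summarise rsync's 'Literal data' / 'Matched data' statistics.

    Hypothetical helper, not part of rsync. Literal bytes are sent over
    the network; matched bytes are rebuilt from the receiver's old copy.
    """
    total = literal_bytes + matched_bytes
    return {
        "file_gib": total / 2**30,
        "sent_mib": literal_bytes / 2**20,
        "sent_fraction": literal_bytes / total,
    }


# Figures taken from the log excerpt quoted above.
summary = transfer_summary(literal_bytes=123_736_377, matched_bytes=17_889_663_500)
print(f"file size : {summary['file_gib']:.1f} GiB")
print(f"literal   : {summary['sent_mib']:.0f} MiB")
print(f"fraction  : {summary['sent_fraction']:.2%}")
```

That comes to roughly 118 MiB of literal data, under 1% of the file.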

> It would be a big boost for large files if rsync "remembered" the
> hashes on each end, so it didn't have to re-read the files on every
> run if the files were unchanged. This is a feature that rsync's
> developers have rejected, since rsync is designed to be stateless
> between runs. I believe Unison does keep state at both ends; you might
> want to look at that.

Remembering hashes only helps on the side where the file has not
changed at all between runs.

In this case, the sender would still have to read the whole file to
find the changes, but the receiver could remember file hashes because
the backup hasn't changed.

So remembering hashes makes no difference to speed if the bottleneck
is the sending side.
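
To make the asymmetry concrete, here is a minimal Python sketch of the
kind of hash cache being discussed. It is purely hypothetical: rsync has
no such feature, the names block_checksums and cached_checksums are
invented, and SHA-256 over fixed 64 KiB blocks stands in for rsync's
real checksums. The cache assumes a file is unchanged when its size and
mtime are unchanged.

```python
import hashlib
import os

BLOCK_SIZE = 64 * 1024  # illustrative value, not rsync's block-size heuristic


def block_checksums(path, block_size=BLOCK_SIZE):
    """Read the whole file and hash each fixed-size block (the expensive part)."""
    sums = []
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            sums.append(hashlib.sha256(chunk).hexdigest())
    return sums


def cached_checksums(path, cache):
    """Return (checksums, was_cache_hit).

    The file is re-read only when its size or mtime differs from what
    was recorded on an earlier run (assumption: unchanged size and
    mtime mean unchanged content).
    """
    st = os.stat(path)
    key = (os.path.abspath(path), st.st_size, st.st_mtime_ns)
    if key in cache:
        return cache[key], True   # untouched file: no disk read needed
    sums = block_checksums(path)  # new or modified file: full read
    cache[key] = sums
    return sums, False
```

On the receiver, where the backup copy is untouched between runs, the
second call is a cache hit and the file is never read. On the sender,
the database file has a new size or mtime every time, so the lookup
misses and the whole file is read anyway.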

-- Jamie

