Rsync efficiency on SQL dump files

Paul Slootman paul+rsync at wurtel.net
Wed Oct 1 14:29:20 GMT 2008


On Wed 01 Oct 2008, Callum Macdonald wrote:
> 
> I'm backing up production MySQL database servers. The tables are almost
> all MyISAM.
> 
> My plan is to use MySQL binary logging and then rsync the binary logs
> offsite hourly. The binary log files are only appended to, with new
> queries logged at the end of the file. So I'm assuming the rsync
> algorithm will be highly effective at reducing the bandwith required to
> transfer these files.

I'd recommend using --append --inplace with these binlogs (not with most
other data!).


> My question is regarding MySQL dump files. From one backup to the next,
> I'd estimate that the MySQL data will be 90% consistent. So the dump
> file will have 10% new data. That new data will be scattered at random
> points through the file.
> 
> Typically I'd pipe the output from mysqldump into gzip. Then I'd copy
> the gzipped file offsite. I wonder if saving the file uncompressed, and
> then using rsync to copy the file offsite will be more efficient.
> 
> Any advice? Will rsync be effective at finding the 90% consistency
> between the new dump file and the old dump file? Will it be able to
> transfer only the 10% of the file that has changed?

That depends on how large the files are (which affects the block size
rsync uses) and how close together the changes are.

Also consider using --order-by-primary with mysqldump, which will at
least ensure that the order of rows won't change, which may help.
However, if you database is very large, this may impact the dump time
too much.

Your gzip may also have a --rsyncable option which optimizes the
compressed file so that small changes in the original lead to smaller
changes in the compressed result, at a cost of a slightly larger
compressed file.  Experimenting with different combinations of things
will show the best strategy in your particular case.

> PS> Apologies if this double posted. I tried it a few days ago but
> haven't seen it on the list yet.

Didn't see it either


Paul Slootman


More information about the rsync mailing list