rsync algorithm for large files

eharvey at lyricsemiconductors.com
Fri Sep 4 16:00:56 MDT 2009


I thought rsync would calculate checksums of large files that have changed
timestamps or file sizes, and send only the chunks that changed.  Is this
not correct?  My goal is to come up with a reasonable (fast and efficient)
way to incrementally back up my Parallels virtual machine every day (a
directory structure containing mostly small files, and one 20G file).
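
(For what it's worth, here is the kind of run I was planning to use to confirm
that behaviour.  --stats is a standard rsync option that reports how much data
was sent literally versus matched against the copy already on the destination;
the paths are just my placeholders.)

time rsync -a --delete --stats MyVirtualMachine/ myserver:MyVirtualMachine/

                If only a few blocks of the 20G file changed, I would expect
                the "Literal data" figure in the summary to be far smaller
                than 20G.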



I'm on OS X 10.5, using rsync 2.6.9, and the destination machine runs the same
versions.  I configured ssh keys, and this is my result:



(Initial sync)

time rsync -a --delete MyVirtualMachine/ myserver:MyVirtualMachine/

                20G

                ~30 minutes



(Second time I ran it, with no changes to the VM)

time rsync -a --delete MyVirtualMachine/ myserver:MyVirtualMachine/

                2 seconds



(Then I made some minor changes inside the VM, and I want to send just the
changed blocks)

time rsync -a --delete MyVirtualMachine/ myserver:MyVirtualMachine/

                After waiting 50 minutes, I cancelled the job.
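
(In case it helps narrow things down, my next step was going to be rerunning
with more verbosity; -v, --progress, and --stats are standard options, and the
rest is the same command as above.)

time rsync -av --delete --progress --stats MyVirtualMachine/ myserver:MyVirtualMachine/

                My assumption is that --progress would at least show whether
                rsync is still checksumming the 20G file or is actually
                sending data.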



Why does it take longer the third time I run it?  Shouldn't the performance
always be **at least** as good as the initial sync?



Thanks for any help…