rsync algorithm for large files

Matthias Schniedermeyer ms at citd.de
Fri Sep 4 16:34:41 MDT 2009


On 04.09.2009 18:00, eharvey at lyricsemiconductors.com wrote:
> 
> Why does it take longer the 3rd time I run it?  Shouldn't the performance
> always be **at least** as good as the initial sync?

Not per se.

First you have to determine THAT the file has changed, then the file is 
synced if there was a change. At least that's what you have to do when 
the file-size is unchanged and only the timestamp differs.
(Which is unfortunately often the case for Virtual Machine images.)

Worst case: it takes double the time, when the change is at the end of 
the file.
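The doubling can be seen with a toy model (purely illustrative, not 
rsync's actual pipeline): the comparison pass has to read every byte 
before it finds the mismatch at the end, and the sync pass then reads 
the file again.

```python
import hashlib

READ_BYTES = 0  # total bytes read from the source file


def read_file(data):
    global READ_BYTES
    READ_BYTES += len(data)
    return data


src = b"A" * 1000 + b"X"   # the only change is the very last byte
dst = b"A" * 1000 + b"Y"   # same size, so mtime alone triggers a check

# Pass 1: decide whether the file changed at all. With the difference
# at the end, the whole source is read before the mismatch shows up.
changed = hashlib.md5(read_file(src)).digest() != hashlib.md5(dst).digest()

# Pass 2: the actual sync reads the source again to produce the new copy.
if changed:
    result = read_file(src)

assert changed and result == src
assert READ_BYTES == 2 * len(src)  # every source byte was read twice
```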

When the filesize differs, rsync immediately knows that the file has 
actual changes and starts the sync right away.

If I understand '--ignore-times' correctly, it forces rsync to always 
regard the files as changed and so start a sync right away, without 
first checking for changes.
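That decision logic can be sketched roughly like this (hypothetical 
names, modeled on rsync's documented "quick check" behaviour, not its 
source code):

```python
from typing import NamedTuple


class FileInfo(NamedTuple):
    size: int
    mtime: int


def needs_transfer(src: FileInfo, dst: FileInfo,
                   ignore_times: bool = False) -> bool:
    """Sketch of the quick check: a file is skipped only when size
    and mtime both match; --ignore-times bypasses the check."""
    if ignore_times:
        return True                   # always treat the file as changed
    if src.size != dst.size:
        return True                   # size differs: definitely changed
    return src.mtime != dst.mtime     # same size: fall back to the mtime


a = FileInfo(size=1024, mtime=100)
b = FileInfo(size=1024, mtime=100)
c = FileInfo(size=1024, mtime=200)

assert not needs_transfer(a, b)               # identical metadata: skip
assert needs_transfer(a, c)                   # same size, newer mtime
assert needs_transfer(a, b, ignore_times=True)
```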


There are also some other options that may or may not have a speed 
impact for you:
--inplace, so that rsync doesn't create a temporary copy that is later 
renamed over the previous file on the target side.
--whole-file, so that rsync skips the delta-transfer algorithm and 
copies the whole file instead.
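To get a feel for what --whole-file turns off, here is a heavily 
simplified delta-transfer sketch: it only matches fixed, block-aligned 
chunks by MD5, whereas real rsync uses a rolling weak checksum plus a 
strong hash so matches can be found at any byte offset.

```python
import hashlib

BLOCK = 4  # toy block size; rsync derives this from the file size


def signature(data: bytes) -> dict:
    """Receiver side: hash each block of its (old) copy of the file."""
    return {hashlib.md5(data[i:i + BLOCK]).hexdigest(): i
            for i in range(0, len(data), BLOCK)}


def delta(new: bytes, sig: dict) -> list:
    """Sender side: emit block references where possible, literals
    otherwise. (Block-aligned only; rsync slides a window byte by byte.)"""
    ops = []
    for i in range(0, len(new), BLOCK):
        chunk = new[i:i + BLOCK]
        if hashlib.md5(chunk).hexdigest() in sig:
            ops.append(("copy", sig[hashlib.md5(chunk).hexdigest()]))
        else:
            ops.append(("literal", chunk))  # must be sent over the wire
    return ops


def apply_delta(old: bytes, ops: list) -> bytes:
    out = b""
    for kind, arg in ops:
        out += old[arg:arg + BLOCK] if kind == "copy" else arg
    return out


old = b"AAAABBBBCCCCDDDD"
new = b"AAAAXXXXCCCCDDDD"  # only the second block changed

ops = delta(new, signature(old))
assert apply_delta(old, ops) == new
sent = sum(len(arg) for kind, arg in ops if kind == "literal")
assert sent == 4           # only one 4-byte block crosses the wire
```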

Also you may want to separate the small files from the large ones with:
--min-size
--max-size
so that you can use different options for the small and the large files.




Until then

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


