How rsync performs synchronization

Matt McCutchen hashproduct at verizon.net
Sun Nov 6 19:36:18 GMT 2005


On Sat, 2005-11-05 at 19:52 -0800, Jeffrey Ellis wrote:
> Can someone tell me how rsync performs its synchronization feature?
> What basic procedure does it use to check if files are different?

By default, rsync does a "quick check" and assumes that the data
portions of two files are identical if the files have the same size and
last-modified time.  If this is the case, rsync doesn't look at the data
portions at all, but it does copy attributes that you have told it to
preserve (times, permissions, etc.).
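
As a rough illustration of the quick check described above, here is a
minimal Python sketch.  It is not rsync's code: the function name
quick_check_passes is made up for this example, and comparing times in
whole seconds is an assumption of the sketch.

```python
import os

def quick_check_passes(src_path, dst_path):
    """Return True if the two files have the same size and the same
    last-modified time (hypothetical helper, whole seconds assumed)."""
    try:
        src = os.stat(src_path)
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return False  # a missing file can never pass the check
    # The file contents are never opened or read here.
    return (src.st_size == dst.st_size
            and int(src.st_mtime) == int(dst.st_mtime))
```

Note that two files with different contents but equal size and time
pass this test, which is exactly the assumption the quick check makes.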

If the sizes or last-modified times do not match, the "rsync
algorithm" described in tech_report.tex will be carried out to transfer
any differences.  If the data portions are actually identical, only a
small amount of information goes over the connection; however, the rsync
on each end must read the entire file on that end to conclude that the
files are identical, so it is better if the quick check succeeds.
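
To make the cost concrete, here is a heavily simplified Python sketch of
the block-matching idea, not the real implementation.  Assumptions of
the sketch: a tiny 4-byte block size, MD5 standing in for the strong
checksum, a weak checksum recomputed at every offset instead of being
rolled, and made-up names (signature, delta, patch).

```python
import hashlib

BLOCK = 4  # tiny block size for illustration only

def weak_sum(data):
    """Cheap two-part checksum (recomputed here, not rolled)."""
    a = sum(data) & 0xFFFF
    b = sum((len(data) - i) * x for i, x in enumerate(data)) & 0xFFFF
    return (b << 16) | a

def signature(old):
    """Receiver: checksum every full block of the file it already has."""
    sigs = {}
    for index in range(len(old) // BLOCK):
        block = old[index * BLOCK:(index + 1) * BLOCK]
        entry = (hashlib.md5(block).digest(), index)
        sigs.setdefault(weak_sum(block), []).append(entry)
    return sigs

def find_block(window, sigs):
    """Return the index of a matching old block, or None."""
    candidates = sigs.get(weak_sum(window))
    if not candidates:
        return None
    strong = hashlib.md5(window).digest()
    for cand_strong, index in candidates:
        if cand_strong == strong:
            return index
    return None

def delta(new, sigs):
    """Sender: scan the whole new file, emitting block references where
    a block matches and literal bytes where nothing matches."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        index = find_block(window, sigs) if len(window) == BLOCK else None
        if index is None:
            literal.append(new[pos])
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", index))
            pos += BLOCK
    if literal:
        ops.append(("data", bytes(literal)))
    return ops

def patch(old, ops):
    """Receiver: rebuild the new file from old blocks and literal data."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value * BLOCK:(value + 1) * BLOCK]
        else:
            out += value
    return bytes(out)
```

For identical files, delta() yields only "copy" entries, so very little
would need to be sent, yet signature() and delta() have each read every
byte of their own copy of the file.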

Several options change the behavior of the quick check.
--modify-window=N will allow the times to differ by up to N seconds.
--size-only requires only the sizes, not the last-modified times, to
match.  --ignore-times disables the quick check, so all files will be
read and compared.  --checksum makes the quick check compare MD4 checksums of
the files; since all files must be read completely to compute the
checksums, it isn't clear to me how this is better than --ignore-times.
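
The effect of these options on the skip decision can be sketched in
Python as follows.  This is only an illustration: can_skip and
file_digest are made-up names, MD5 stands in for the checksum, and the
order in which the options are tested when several are given together
is an assumption of the sketch.

```python
import hashlib
import os

def file_digest(path):
    """Checksum of a whole file; the entire file must be read."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.digest()

def can_skip(src_path, dst_path, modify_window=0, size_only=False,
             ignore_times=False, checksum=False):
    """Return True if the data transfer would be skipped for this file."""
    src, dst = os.stat(src_path), os.stat(dst_path)
    if src.st_size != dst.st_size:
        return False              # different sizes never match
    if checksum:                  # like --checksum
        return file_digest(src_path) == file_digest(dst_path)
    if size_only:                 # like --size-only
        return True
    if ignore_times:              # like --ignore-times
        return False
    # Like --modify-window=N: allow the times to differ by N seconds.
    return abs(int(src.st_mtime) - int(dst.st_mtime)) <= modify_window
```

For example, can_skip(a, b, modify_window=2) accepts two equally sized
files whose times differ by two seconds, while the default call does
not.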
-- 
Matt McCutchen, ``hashproduct''
hashproduct at verizon.net -- http://mysite.verizon.net/hashproduct/


