rsync fails to sync files

Paul Slootman paul at debian.org
Tue May 8 11:07:51 GMT 2007


On Tue 08 May 2007, Atte Peltomaki wrote:
> 
> I'm seeing a weird problem with rsync 2.6.9 protocol version 29 on
> Debian Sarge. When copying a file from one location to another between
> two Debian boxes, if the destination includes a file with the same size
> and name, rsync fails to see that they are not exactly the same file.
> 
> The situation originates from copying a file to a place which is
> periodically rsynced onwards, and the rsync operation takes place before
> the original file transfer to the rsync source is complete.
> 
> Example:
> 
> Source:
> 
> md5sum:
> 2ac4a4ad88da17f49d26c9e578ce5432  somefile.exe
> sha1sum:
> eaabb30b716e993be000b89208e2d9f63e78f052  somefile.exe
> ls -l:
> -rwxrw----  1 user group 109819105 Apr  2 10:48 somefile.exe*
> 
> Destination:
> 
> md5sum:
> 72c116866a75f859a19a150216768e52  somefile.exe
> sha1sum:
> 33b7d91fc6bd2a5bff292258c7d6eeb7db0aec8a  somefile.exe
> ls -l:
> -rwxrw----  1 user group 109819105 2007-04-02 10:48 somefile.exe*

The files apparently both have the same size, timestamp, and other
attributes.  Only the contents differ.

> rsync is executed from the source with the following flags:
> 
> -alvv --delete --exclude-from=file 
> 
> where 'file' includes three lines:
> /upload
> /upload/*
> upload/
> 
> rsync itself says about somefile.exe: 
> 
> somefile.exe is uptodate
> 
> As you can see, md5sums and sha1sums reveal that the file is not the
> same, even though timestamps and sizes match. 

You did not ask rsync to checksum the files...

> What is the exact algorithm rsync uses to determine whether a file is up
> to date or not? 

Hmmm... I thought this would have been in the manpage, but apparently
it's not spelled out.
By default, rsync compares only size and mtime to determine whether a
file needs to be transferred. (I think the other attributes, such as
owner and permissions, are simply updated without triggering a contents
sync, although I confess I'm not absolutely sure.)
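
To make the size-and-mtime comparison concrete, here is a rough Python
sketch of that kind of check. It is only an illustration, not rsync's
actual code: the function names and the make_file helper are made up
for this example, and comparing mtimes at whole-second resolution is an
assumption of the sketch.

```python
import os
import tempfile


def quick_check_uptodate(src, dst):
    """Mimic a size+mtime comparison; file contents are never read."""
    try:
        s, d = os.stat(src), os.stat(dst)
    except FileNotFoundError:
        return False  # a missing file always needs a transfer
    # Whole-second mtime comparison is an assumption of this sketch.
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


def make_file(directory, name, data, mtime):
    """Hypothetical helper: write data and force a given mtime."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (mtime, mtime))
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Same length (13 bytes), same mtime, different bytes.
        src = make_file(tmp, "src.bin", b"complete file", 1175510880)
        dst = make_file(tmp, "dst.bin", b"truncated!!!!", 1175510880)
        print(quick_check_uptodate(src, dst))
```

Run as a script, the check reports the two files as up to date even
though their bytes differ, which is the situation described in the
original report.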

You need to supply the --checksum option if you want to make sure that
the contents are indeed identical. This is not done by default because
it would cause a massive I/O load, since the files have to be read in
full on both ends. Cases where size and timestamp are identical but the
contents differ don't usually happen...
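
For illustration, a content comparison of the kind --checksum (short
form -c) asks for might look like the Python sketch below. This is not
rsync's implementation: SHA-256 is used here only as a stand-in digest,
and the function names are invented for the example. It does show where
the I/O cost comes from, because both files are read from start to end.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()  # stand-in digest, not the one rsync uses
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_contents(src, dst):
    """Cheap size test first, then a full read of both files."""
    if os.path.getsize(src) != os.path.getsize(dst):
        return False
    return file_digest(src) == file_digest(dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(b"complete file")
        with open(dst, "wb") as f:
            f.write(b"truncated!!!!")  # same length, different bytes
        print(same_contents(src, dst))
```

Unlike a size and timestamp test, this comparison notices that the two
equally sized files differ, at the price of reading every byte.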


Paul Slootman

