checksum-xattr.diff [CVS update: rsync/patches]

Matt McCutchen hashproduct+rsync at gmail.com
Sat Jun 30 20:17:29 GMT 2007


On 6/30/07, Wayne Davison <wayned at samba.org> committed:
> Added Files:
>         checksum-xattr.diff
> Log Message:
> A simple patch that lets rsync use cached checksum values stored in
> each file's extended attributes.  A perl script is provided to create
> and update the values.

Wayne,

You should be aware of two drawbacks of caching checksums in xattrs:

First, setting the xattr updates the file's ctime.  Thus, in exchange for
rsync being able to skip the file, other tools that use ctime (such as
GNU tar incremental backups) unnecessarily reprocess it.  Beagle also
caches checksums in xattrs, and one of its users complained about the
effect on the ctime:

http://www.mail-archive.com/dashboard-hackers@gnome.org/msg03251.html
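
For anyone who wants to check the ctime effect on their own filesystem,
here is a minimal sketch in Python (not part of the patch).  It assumes
Linux, where os.setxattr is available.  The helper names and the
attribute name "user.example.checksum" are made up for illustration;
they are not what checksum-xattr.diff uses.

```python
import os
import tempfile
import time


def ctime_bumped_by(path, change, wait=1.1):
    """Return True if change(path) advances ctime but leaves mtime alone.

    `wait` lets the clock move past coarse timestamp granularity
    before the change is made, so a bump is actually observable.
    """
    before = os.stat(path)
    time.sleep(wait)
    change(path)
    after = os.stat(path)
    return (after.st_ctime_ns > before.st_ctime_ns
            and after.st_mtime_ns == before.st_mtime_ns)


def store_checksum(path, digest=b"0123abcd"):
    # Hypothetical attribute name and placeholder value.
    # os.setxattr is Linux-only and needs user xattr support.
    os.setxattr(path, "user.example.checksum", digest)


def demo():
    """Try the xattr case on a temp file; None means xattrs are unavailable."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"data")
        f.flush()
        try:
            return ctime_bumped_by(f.name, store_checksum)
        except (OSError, AttributeError):
            return None
```

Calling demo() on a filesystem with user xattrs enabled should show the
ctime moving while the mtime stays put, which is exactly what makes a
ctime-based tool reprocess the file.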

Second, it is impossible to make xattr-based checksum caching
foolproof against same-second modification.  Suppose a file is written
during second 5 and then rsync caches its checksum during second 8;
now the file has mtime 5 and ctime 8.  Sometime later, rsync notices
that the file still has mtime 5 and ctime 8.  Does rsync trust the
cached checksum?  It must; otherwise the benefit of caching checksums
would be lost.  However, rsync will be fooled if the file was modified
and then touched back to mtime 5 during second 8, right after the
checksum was cached.  This concern may not matter in practice when the
content changes slowly.
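
The scenario can be replayed with a toy model.  The sketch below is not
rsync's logic: FakeFile, its whole-second timestamps, and the rule
"trust the cache if mtime and ctime are unchanged" are assumptions made
only to mirror the example above, and SHA-256 stands in for whatever
checksum is actually cached.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeFile:
    """Toy stand-in for a file with one-second timestamp resolution."""
    data: bytes
    mtime: int
    ctime: int
    cached_digest: Optional[str] = None


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def cache_checksum(f, now):
    f.cached_digest = checksum(f.data)
    f.ctime = now  # storing the xattr updates ctime


def write(f, data, now):
    f.data = data
    f.mtime = now
    f.ctime = now


def touch_back(f, mtime, now):
    f.mtime = mtime  # like `touch -d`: sets mtime ...
    f.ctime = now    # ... but ctime always becomes the current time


def run_scenario():
    f = FakeFile(b"old contents", mtime=5, ctime=5)
    cache_checksum(f, now=8)
    seen = (f.mtime, f.ctime)           # (5, 8), as recorded after caching
    write(f, b"new contents", now=8)    # modified within the same second
    touch_back(f, mtime=5, now=8)       # mtime restored, still second 8
    trusted = (f.mtime, f.ctime) == seen
    stale = f.cached_digest != checksum(f.data)
    return trusted, stale
```

run_scenario() returns (True, True): the timestamps match what was seen
right after caching, so the cache is trusted, yet the cached digest no
longer describes the file's contents.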

Matt

