checksum-xattr.diff [CVS update: rsync/patches]
wayned at samba.org
Mon Jul 2 17:28:25 GMT 2007
On Mon, Jul 02, 2007 at 08:43:39AM -0400, Matt McCutchen wrote:
> What do you mean? There's no way to fix the example I gave with
Not so. I went on to explain how that is possible in my prior email
(i.e. avoiding caching a checksum on a "now" mtime file is all that is
> That's easy to fix: get your "now" by touching a file on the
> filesystem and reading the resulting mtime.
Yes, that's one solution that I had already thought of. You'd also need
to do that for every filesystem in the transfer, so you need to add
filesystem checking and hope that you always have write permissions to
the dirs holding the files (or have a work-around algorithm if you
don't). As I said, it's complicated (and quite a bit of hassle).
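
For illustration, the touch-a-file idea might look roughly like the Python sketch below. The helper name filesystem_now and the fallback to the local clock when the directory is not writable are assumptions made for this example, not anything in rsync or the patch.

```python
import os
import tempfile
import time


def filesystem_now(directory):
    """Return the filesystem's idea of "now" (whole seconds) for `directory`.

    Hypothetical helper: creates a probe file so the filesystem stamps it
    with its own clock, then reads that mtime back.
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".now-probe-", dir=directory)
    except OSError:
        # Assumed work-around: no write permission (or no such directory),
        # so fall back to the local clock, which may disagree with the server.
        return int(time.time())
    try:
        return int(os.fstat(fd).st_mtime)
    finally:
        os.close(fd)
        os.unlink(path)
```

A caller would have to repeat this once per filesystem in the transfer (for example, once per distinct st_dev value), which is where the extra bookkeeping comes from.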
> >Also, being off by a second might still be "now" if the value of the
> >seconds field rolled over during the check.
> I don't think this is a problem if you stat the file just once before
> reading it.
It is if you're doing one check to see if a file is being updated (e.g.
stat() followed by time() to compute "now"). If time rolls over between
the two calls, you may have just missed that the mtime would now match
if you did another stat(). Because of this, you can't be sure if you
read the file prior to the last change, or after the last change.
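
One way to guard against the rollover is to treat an mtime that is a second behind "now" as still being "now". The sketch below assumes a one-second slack and uses made-up helper names (mtime_may_be_now, stat_if_settled); it illustrates the idea and is not code from the patch.

```python
import os
import time


def mtime_may_be_now(mtime, now, slack=1):
    """True if a file with this mtime could still change within the same second.

    The slack (an assumed 1 second) covers the case where the clock's seconds
    field rolled over between the stat() and the time() call.
    """
    return int(mtime) >= int(now) - slack


def stat_if_settled(path):
    """Return the stat result if the mtime is safely in the past, else None."""
    st = os.stat(path)
    now = time.time()  # taken after the stat, so it may already have rolled over
    if mtime_may_be_now(st.st_mtime, now):
        return None
    return st
```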
> And that's probably fine for rsync's purposes. However, I still think
> it might be cool if I made a foolproof checksum-caching library and
> rsync used it...
I don't see any need for that for the xattr version, since rsync isn't
going to update the checksums (just optionally create them on its temp
files). For the non-xattr version it would be nice to have a better
cache mechanism than the simple per-dir .rsyncsums files I implemented
in my patch: having a library that implemented a checksum lookup/update
by dev+inode using a global checksum cache would be cool, and avoid the
file droppings. Making it so that different programs could request a
checksum of a particular type concurrently (which the server/library
would return from cache, if possible, or compute and store in the cache,
if safe) would make it generally useful for a variety of programs. That
would be quite easy for rsync to support, if it existed.
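
As a rough picture of what such a lookup could look like, here is an in-memory Python sketch keyed by dev+inode. Everything in it is hypothetical: the ChecksumCache class, the use of size and mtime to validate an entry, the choice of sha256, and the one-second rule for deciding when storing is "safe". A real library would also need persistence and some way to serve concurrent requests.

```python
import hashlib
import os
import time


class ChecksumCache:
    """Hypothetical in-memory cache keyed by (st_dev, st_ino, algorithm)."""

    def __init__(self):
        # key -> (size, mtime_ns, hexdigest)
        self._entries = {}

    def is_cached(self, path, algo="sha256"):
        st = os.stat(path)
        entry = self._entries.get((st.st_dev, st.st_ino, algo))
        return entry is not None and entry[:2] == (st.st_size, st.st_mtime_ns)

    def checksum(self, path, algo="sha256"):
        before = os.stat(path)
        key = (before.st_dev, before.st_ino, algo)
        entry = self._entries.get(key)
        if entry is not None and entry[:2] == (before.st_size, before.st_mtime_ns):
            return entry[2]  # served from cache

        h = hashlib.new(algo)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        digest = h.hexdigest()

        # Store only if "safe": the file did not change while being read,
        # and its mtime is far enough in the past not to count as "now".
        after = os.stat(path)
        unchanged = (after.st_size, after.st_mtime_ns) == (before.st_size, before.st_mtime_ns)
        settled = int(after.st_mtime) < int(time.time()) - 1
        if unchanged and settled:
            self._entries[key] = (before.st_size, before.st_mtime_ns, digest)
        return digest
```

A freshly written file still gets its checksum computed and returned, but nothing is stored until its mtime is no longer "now".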