Syncing moved files

Fri Sep 22 23:42:13 GMT 2006

On Fri, Sep 22, 2006 at 07:17:02PM -0400, Matt McCutchen wrote:
> I suppose you want the receiver to skip the transfer entirely and just
> move the file if --checksum is on and the checksums match.

That doesn't really save you time (over leaving --checksum off), since
computing the checksums means reading every file in its entirety, not
just the ones that got moved.

One thing that can be done with an unmodified rsync is to copy each dir
outside of the tmp dir, one at a time, giving the option
--link-dest=/tmp so that rsync hard-links any file that got moved out of
the tmp dir.  Then, copy the tmp dir to update/remove the files there.
To copy a single dir without recursing, use -d and avoid -r, like this:

 rsync -avd --no-r --link-dest=/tmp dir1/ host:/dir1
 rsync -avd --no-r --link-dest=/tmp dir1/sub1/ host:/dir1/sub1

(If your version of rsync doesn't understand --no-r, manually expand -a
into its constituent options and leave out the -r.)
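
For a tree with many directories, the per-directory commands above could
be generated rather than typed by hand.  The following is only a sketch
under some assumptions: gen_sync_cmds is a made-up helper name (not part
of rsync), the paths contain no spaces or newlines, and it is run only
on directories outside of the tmp dir.  It prints the commands instead
of running them, so they can be reviewed first.

```sh
# Sketch only: print one non-recursive rsync command per directory under
# SRC, mirroring the two hand-written commands above.
# gen_sync_cmds is a hypothetical helper name, not part of rsync.
# usage: gen_sync_cmds SRC DEST LINKDEST
gen_sync_cmds() (
    src=${1%/}          # strip one trailing slash, if any
    dest=$2
    linkdest=$3
    # List SRC and every directory below it.  Sorting puts each parent
    # before its children, so a parent dir is copied before its subdirs.
    find "$src" -type d | LC_ALL=C sort | while IFS= read -r dir; do
        rel=${dir#"$src"}   # "" for SRC itself, "/sub1" for SRC/sub1
        printf 'rsync -avd --no-r --link-dest=%s %s/ %s%s\n' \
            "$linkdest" "$dir" "$dest" "$rel"
    done
)
```

Running "gen_sync_cmds dir1 host:/dir1 /tmp" in a tree where dir1
contains only sub1 prints the two commands shown above.  The tmp dir
itself would still be copied separately afterwards.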

> --fuzzy and --detect-renamed are very similar; they just consider
> different sets of potential basis files.  Perhaps their
> implementations should be merged.

While they have a little overlap in functionality (since fuzzy can find
a renamed file in the same dir), the main purpose of --fuzzy (as I see
it, at least) is an attempt to use an older version of a file to speed
up the transfer of a newer version.  For instance, if someone uses the
--rsyncable gzip on their tar files in a release dir, when they release
a new version of a source tar, --fuzzy would try to use the previous
version (in the same dir) as a source of matching data for the new
version.  This name matching is the slowest part of the current --fuzzy
algorithm (for large directories), and is the main reason why the two
algorithms should probably not be merged.

..wayne..