--fuzzy search over to-be-deleted files to catch moved files and directories

Matt McCutchen matt at mattmccutchen.net
Thu Nov 12 21:20:19 MST 2009


I'll try to address each of your questions, here and then in a reply to
your other message.

On Wed, 2009-11-11 at 12:17 +0100, H. Langos wrote: 
> > It will find moved files that match exactly
> > according to the "quick check" in effect (size + mtime or checksum). 
> 
> That is basename+size+mtime  or basename+checksum, right?

No, a basename match is not a requirement (hence the ability to detect
renames), but it is a tie-breaker. 

> How does "--detect-renamed" interact with "--fuzzy" and "--delete-after"? 

--detect-renamed and --fuzzy are two different means of finding basis
files. They overlap in some cases but do not really interact.
--detect-renamed searches the whole destination using the quick check,
while --fuzzy looks only in the same destination directory, first for a
size+mtime match and otherwise for the most similar name.
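
As a rough illustration of the difference (this is not rsync's code), here
is a small Python sketch of the two lookups. The names FileInfo,
find_renamed_basis and find_fuzzy_exact are made up for this example, and
the quick check is assumed to be size+mtime rather than checksum.

```python
import os
from typing import List, NamedTuple, Optional


class FileInfo(NamedTuple):
    path: str   # path relative to the transfer root (hypothetical model)
    size: int
    mtime: int


def quick_key(f: FileInfo):
    # Assumption: the quick check in effect is size + mtime.
    return (f.size, f.mtime)


def find_renamed_basis(src: FileInfo, dest_files: List[FileInfo]) -> Optional[FileInfo]:
    """Search the whole destination; a basename match only breaks ties."""
    candidates = [d for d in dest_files if quick_key(d) == quick_key(src)]
    if not candidates:
        return None
    want = os.path.basename(src.path)
    for d in candidates:
        if os.path.basename(d.path) == want:
            return d
    return candidates[0]


def find_fuzzy_exact(src: FileInfo, dest_files: List[FileInfo]) -> Optional[FileInfo]:
    """Search only the destination directory the file is going into."""
    src_dir = os.path.dirname(src.path)
    for d in dest_files:
        if os.path.dirname(d.path) == src_dir and quick_key(d) == quick_key(src):
            return d
    return None


dest = [
    FileInfo("old/a.txt", 10, 100),
    FileInfo("old/b.txt", 10, 100),
    FileInfo("new/c.txt", 5, 50),
]
src = FileInfo("moved/b.txt", 10, 100)
renamed = find_renamed_basis(src, dest)
fuzzy = find_fuzzy_exact(src, dest)
```

Here two destination files pass the quick check, and the whole-destination
search picks old/b.txt because its basename matches. The same-directory
search finds nothing, since no candidate lives under moved/.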

--delete-before and --delete-during may reduce the effectiveness of
--fuzzy, as stated in the man page description of --fuzzy, but they do
not affect --detect-renamed since --detect-renamed actually works during
the delete pass.

> > It doesn't calculate name similarity like --fuzzy because that would
> > be prohibitively expensive in the current implementation.

> Only files of the same size should be
> candidates to start with, right.

No, the name similarity calculation I'm talking about is the fallback to
select a similar basis file when no available destination file passes
the quick check, so it does not require a size match.
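
A minimal sketch of that kind of fallback, for illustration only: the
function pick_similar_name is hypothetical, and difflib's ratio stands in
for whatever scoring rsync really uses. Note that sizes are never
consulted, only names.

```python
import difflib
from typing import Iterable, Optional


def pick_similar_name(src_name: str, dir_names: Iterable[str]) -> Optional[str]:
    """Return the name in dir_names most similar to src_name, or None."""
    best, best_score = None, 0.0
    for name in dir_names:
        # Stand-in similarity measure; not rsync's actual scoring.
        score = difflib.SequenceMatcher(None, src_name, name).ratio()
        if score > best_score:
            best, best_score = name, score
    return best


choice = pick_similar_name(
    "report-2009-11.tar", ["report-2009-10.tar", "notes.txt"]
)
```

Every source file without a quick-check match has to be scored against
every candidate name, so the work grows with the size of the candidate
set.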

> Why would it be so expensive?

Wayne said so here:

https://bugzilla.samba.org/show_bug.cgi?id=3392#c11

I haven't run any tests myself.

-- 
Matt


