Renaming a directory results in an expensive retransmission

Matt McCutchen hashproduct+rsync at gmail.com
Sat Oct 6 15:24:03 GMT 2007


On 10/5/07, N.J. van der Horn (Nico) <nico at vanderhorn.nl> wrote:
> It is a tricky problem to deal with, I think. It is tempting to keep a
> checksummed file/directory list on both sides with information like:
>
> * a fingerprint/signature/checksum to identify each file or directory
> * inode number
> * timestamp
> * filesize
>
> In case a file appears to be deleted because its name/path has changed,
> it could possibly be identified by its fingerprint and used to sync
> cleverly ;-)
> This is with the thought of expanding --fuzzy, giving it more functionality
> (hint).
>
> For some time I have been experimenting with a solution to this problem:
> some sort of "preprocessor" that tries to identify files in the way
> described, creating hardlinks (ln) so that rsync thinks the files are
> already in the new location.

The --detect-renamed option provided by the patch
"patches/detect-renamed.diff" in the rsync source package does
essentially this.
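
For illustration, the preprocessor idea quoted above could be sketched
roughly as follows. This is not how the detect-renamed patch is
implemented. It is a minimal sketch that assumes both trees are
reachable as local directories, and it uses file size plus a SHA-256
digest as the fingerprint. The function names (fingerprint, index_tree,
prelink_renames) are hypothetical.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size, SHA-256 hex digest) for a regular file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def index_tree(root):
    """Map each relative path of a regular file under root to its fingerprint."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                index[os.path.relpath(full, root)] = fingerprint(full)
    return index


def prelink_renames(src_root, dst_root):
    """Hard-link destination files to the paths they now have in the source.

    Assumption: both trees are local directories. Returns a list of
    (old_relative_path, new_relative_path) pairs that were linked.
    """
    src_index = index_tree(src_root)
    dst_index = index_tree(dst_root)

    # Reverse lookup: fingerprint -> one destination path with that content.
    by_print = {}
    for rel, fp in sorted(dst_index.items()):
        by_print.setdefault(fp, rel)

    linked = []
    for rel, fp in sorted(src_index.items()):
        new_path = os.path.join(dst_root, rel)
        if os.path.lexists(new_path) or fp not in by_print:
            continue  # something is already there, or no content match
        os.makedirs(os.path.dirname(new_path), exist_ok=True)
        os.link(os.path.join(dst_root, by_print[fp]), new_path)
        linked.append((by_print[fp], rel))
    return linked
```

After such a pass, an ordinary rsync run should find the renamed files
already in place on the destination, and --delete would remove the
old paths. Hashing every file on both sides is expensive, which is
where a stored list of fingerprints would help.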

> The cost of keeping a database in this scenario would be truly justified
> for me.

Wayne is considering adding support for a file database, which would
be used to make --detect-renamed work somewhat better:

http://lists.samba.org/archive/rsync/2007-October/018780.html

Matt

