Backing up two trees with corresponding files hard linked

Wayne Davison wayned at samba.org
Fri Feb 24 18:29:41 GMT 2006


On Thu, Jan 19, 2006 at 10:20:48AM -0500, Matt McCutchen wrote:
> I had the understanding that -H used an O(n^2) algorithm to match up
> hard links and it would be prohibitively expensive to use this option on
> a filesystem with about 100,000 files.  Is this true?

It used to be true, but it is not anymore.  The old code kept an
entire extra file-list array sorted by inode and did a binary search
into that list for every hard-linked file.  The current code has been
optimized in several ways:

 - We only save inode information for files that have more than one
   file-system link.

 - After doing a qsort() by inode on the potentially linked files, rsync
   replaces the inode data in the file-list with linked-list data
   (without using any extra memory) that allows rsync to know all the
   files that are linked together and which file is the "master" (the
   one that got updated or is up-to-date).  This allows us to handle all
   the hard-linking without doing any binary searching (see the
   sketch below).
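
To make that concrete, here is a minimal sketch in C of that second
step (hypothetical structures and names throughout; rsync's real
file-list code is more involved).  The union shows how the (device,
inode) data can be overwritten in place by linked-list data, using no
extra memory, once qsort() has grouped the candidates:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

struct file_entry {                 /* hypothetical file-list entry */
    const char *name;
    union {
        struct { dev_t dev; ino_t ino; } idev;            /* before */
        struct { struct file_entry *head, *next; } links; /* after  */
    } u;
    int has_idev;   /* set during the scan only if st_nlink > 1 */
};

/* Order the potentially linked files by (device, inode). */
static int cmp_idev(const void *a, const void *b)
{
    const struct file_entry *x = *(struct file_entry *const *)a;
    const struct file_entry *y = *(struct file_entry *const *)b;
    if (x->u.idev.dev != y->u.idev.dev)
        return x->u.idev.dev < y->u.idev.dev ? -1 : 1;
    if (x->u.idev.ino != y->u.idev.ino)
        return x->u.idev.ino < y->u.idev.ino ? -1 : 1;
    return 0;
}

/* One pass over the sorted candidates rewrites each (dev, ino) pair
 * in place as linked-list data: every entry points at the head of
 * its group (the "master") and at the next file on the same inode. */
static void link_identical(struct file_entry **cand, size_t n)
{
    dev_t prev_dev = 0;
    ino_t prev_ino = 0;
    struct file_entry *head = NULL;
    size_t i;

    qsort(cand, n, sizeof *cand, cmp_idev);
    for (i = 0; i < n; i++) {
        struct file_entry *f = cand[i];
        int same = head != NULL
            && f->u.idev.dev == prev_dev
            && f->u.idev.ino == prev_ino;
        prev_dev = f->u.idev.dev;   /* save before overwriting */
        prev_ino = f->u.idev.ino;
        if (!same)
            head = f;               /* first in group: the "master" */
        f->u.links.head = head;
        f->u.links.next = NULL;
        if (same)                   /* chain onto the previous entry */
            cand[i-1]->u.links.next = f;
    }
}

int main(void)
{
    /* Three fake entries; "a" and "b" share an inode. */
    struct file_entry files[] = {
        { .name = "a", .u.idev = { .dev = 1, .ino = 100 }, .has_idev = 1 },
        { .name = "b", .u.idev = { .dev = 1, .ino = 100 }, .has_idev = 1 },
        { .name = "c", .u.idev = { .dev = 1, .ino = 200 }, .has_idev = 1 },
    };
    struct file_entry *cand[3];
    size_t i, ncand = 0;

    for (i = 0; i < sizeof files / sizeof files[0]; i++)
        if (files[i].has_idev)      /* skip files with only one link */
            cand[ncand++] = &files[i];

    link_identical(cand, ncand);

    for (i = 0; i < ncand; i++)
        printf("%s: master is %s\n", cand[i]->name,
               cand[i]->u.links.head->name);
    return 0;
}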

So, these days -H should be nice and fast.
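
For example, to back up two trees whose corresponding files are hard
linked, both trees need to be part of the same transfer (rsync can
only preserve hard links among the files it is copying), e.g.:

  rsync -aH tree1 tree2 /backup/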

..wayne..

