TODO hardlink performance optimizations
John Van Essen
vanes002 at umn.edu
Wed Dec 17 08:18:15 EST 2003
On Mon, 15 Dec 2003, jw schultz <jw at pegasys.ws> wrote:
> OK, first pass on TODO complete.
> PERFORMANCE ----------------------------------------------------------
> Traverse just one directory at a time
> Traverse just one directory at a time. Tridge says it's possible.
> At the moment rsync reads the whole file list into memory at the
> start, which makes us use a lot of memory and also not pipeline
> network access as much as we could.
An additional comment should be added observing that this will affect
hardlink processing, since it relies on the entire flist array being
present in order to match dev and inode numbers. But perhaps the
required hlist array could be saved and built on the fly as items
with node counts > 1 are encountered.
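The on-the-fly idea above could look roughly like this. This is only a sketch under assumed names (`hl_list`, `hl_add` and the `struct hl_item` fields are hypothetical, not rsync's actual `file_struct`): candidates are appended to a growing array as file-list entries arrive, instead of scanning the completed flist afterwards.

```c
#include <stdlib.h>

/* Hypothetical per-candidate record: enough to match names of the
 * same file later (device + inode) plus the flist index. */
struct hl_item { long dev, ino; int idx; };

struct hl_list { struct hl_item *v; int n, cap; };

/* Append one entry as it is received; files with only one name are
 * skipped, so the list stays small. */
static void hl_add(struct hl_list *l, long dev, long ino, int idx, int nlink)
{
    if (nlink <= 1)
        return;                       /* only multiply-linked files matter */
    if (l->n == l->cap) {
        l->cap = l->cap ? l->cap * 2 : 16;
        l->v = realloc(l->v, l->cap * sizeof *l->v);
    }
    l->v[l->n].dev = dev;
    l->v[l->n].ino = ino;
    l->v[l->n].idx = idx;
    l->n++;
}
```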
> Hard-link handling
> At the moment hardlink handling is very expensive, so it's off by
> default. It does not need to be so.
> Since most of the solutions are rather intertwined with the file
> list it is probably better to fix that first, although fixing
> hardlinks is possibly simpler.
> We can rule out hardlinked directories since they will probably
> screw us up in all kinds of ways. They simply should not be used.
> At the moment rsync only cares about hardlinks to regular files. I
> guess you could also use them for sockets, devices and other beasts,
> but I have not seen them.
> When trying to reproduce hard links, we only need to worry about
> files that have more than one name (nlinks>1 && !S_ISDIR).
It would be very helpful if file_struct.flags could have a bit set to
indicate that the node count was greater than 1. This info could be
used later to optimize the hardlink search by only considering those
flist entries with this flag bit set.
It'd be nice to implement this bit setting in the current protocol version
so it can be widely distributed before 2.6.1 is released, which could then
have the code to actually make use of it. I'd be interested in doing the
later changes, but if Martin or jw could at least get the bit set...
It doesn't even have to depend on the --hard-links option. Just examine
the node count and set the bit.
> The basic point of this is to discover alternate names that refer to
> the same file. All operations, including creating the file and
> writing modifications to it need only to be done for the first name.
> For all later names, we just create the link and then leave it
An earlier thread started 11/25/2003 points out that in certain cases
a hardlinked file is unnecessarily transferred in full. This is due
to the algorithm described above: if the first file in the sorted list
is missing on the receiver, but a later one exists, then that existing
file should be used as the master. I've been thinking of solutions to
this as well, but not until after 2.6.0 is released.
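One possible fix, sketched under assumed names (`pick_master` and the existence array are hypothetical, not rsync code): within a sorted hardlink group, prefer the first name that already exists on the receiver rather than always index 0, so an up-to-date copy is not retransferred in full.

```c
/* Hypothetical sketch: choose the master for one hardlink group.
 * exists_on_receiver[i] is nonzero if the i-th name in the sorted
 * group already exists on the receiving side. */
static int pick_master(const int *exists_on_receiver, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (exists_on_receiver[i])
            return i;     /* transfer/check this one; link the rest to it */
    return 0;             /* none exist: fall back to the first name */
}
```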
> If hard links are to be preserved:
> Before the generator/receiver fork, the list of files is received
> from the sender (recv_file_list), and a table for detecting hard
> links is built.
> The generator looks for hard links within the file list and does
> not send checksums for them, though it does send other metadata.
> The sender sends the device number and inode with file entries, so
> that files are uniquely identified.
> The receiver goes through and creates hard links (do_hard_links)
> after all data has been written, but before directory permissions
> are set.
> At the moment device and inum are sent as 4-byte integers, which
> will probably cause problems on large filesystems. On Linux the
> kernel uses 64-bit ino_t's internally, and people will soon have
> filesystems big enough to use them. We ought to follow NFS4 in
> using 64-bit device and inode identification, perhaps with a
> protocol version bump.
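A 64-bit identifier can be carried over a 32-bit-oriented wire format by splitting it into two words. This is only an illustration of the idea, not rsync's actual encoding or protocol:

```c
#include <stdint.h>

/* Hypothetical sketch: split a 64-bit inode or device number into
 * two 32-bit words for transmission, and rejoin on receipt. */
static void split64(uint64_t v, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(v >> 32);
    *lo = (uint32_t)(v & 0xffffffffu);
}

static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```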
> Once we've seen all the names for a particular file, we no longer
> need to think about it and we can deallocate the memory.
> We can also have the case where there are links to a file that are
> not in the tree being transferred. There's nothing we can do about
> that. Because we rename the destination into place after writing,
> any hardlinks to the old file are always going to be orphaned. In
> fact that is almost necessary because otherwise we'd get really
> confused if we were generating checksums for one name of a file and
> modifying another.
> At the moment the code seems to make a whole second copy of the file
> list, which seems unnecessary.
Indeed! It does! Very wasteful. It should only need a list of pointers
to the flist entries and sort that list. Furthermore, with the addition
of the new multiple nodes flag bit requested above, the list of pointers
would only contain pointers to flist entries with that bit set, resulting
in a much smaller list.
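The pointer-list approach above might look like this sketch (the `struct entry` stand-in and function names are hypothetical, not rsync's `file_struct`): collect pointers to only the multiply-linked entries, then sort them by (dev, ino) so all names of the same file become adjacent.

```c
#include <stdlib.h>

struct entry {              /* stand-in for rsync's file_struct */
    long dev, ino;
    int nlink;
};

/* Order pointed-to entries by device, then inode. */
static int cmp_dev_ino(const void *a, const void *b)
{
    const struct entry *x = *(struct entry * const *)a;
    const struct entry *y = *(struct entry * const *)b;
    if (x->dev != y->dev) return x->dev < y->dev ? -1 : 1;
    if (x->ino != y->ino) return x->ino < y->ino ? -1 : 1;
    return 0;
}

/* Fill out[] with pointers to multiply-linked entries only, sorted by
 * (dev, ino); returns the (much smaller) list length. */
static int build_hlink_list(struct entry *flist, int n, struct entry **out)
{
    int i, m = 0;
    for (i = 0; i < n; i++)
        if (flist[i].nlink > 1)
            out[m++] = &flist[i];
    qsort(out, m, sizeof *out, cmp_dev_ino);
    return m;
}
```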
John Van Essen Univ of MN Alumnus <vanes002 at umn.edu>