TODO hardlink performance optimizations
jw schultz
jw at pegasys.ws
Wed Dec 17 08:49:46 EST 2003
On Tue, Dec 16, 2003 at 03:18:15PM -0600, John Van Essen wrote:
> On Mon, 15 Dec 2003, jw schultz <jw at pegasys.ws> wrote:
>
> > OK, first pass on TODO complete.
> ....
> > PERFORMANCE ----------------------------------------------------------
> ....
> > Traverse just one directory at a time
> >
> > Traverse just one directory at a time. Tridge says it's possible.
> >
> > At the moment rsync reads the whole file list into memory at the
> > start, which makes us use a lot of memory and also not pipeline
> > network access as much as we could.
>
> An additional comment should be added observing that this will affect
> hardlink processing, since it relies on the entire flist array being
> present in order to match dev and inode numbers. But perhaps the
> required hlist array could be saved and built on the fly as items
> with node counts > 1 are encountered.
Dynamic creation of the hlist set (hash perhaps) would deal with that, yes.
>
> ....
> > Hard-link handling
> >
> > At the moment hardlink handling is very expensive, so it's off by
> > default. It does not need to be so.
> >
> > Since most of the solutions are rather intertwined with the file
> > list it is probably better to fix that first, although fixing
> > hardlinks is possibly simpler.
> >
> > We can rule out hardlinked directories since they will probably
> > screw us up in all kinds of ways. They simply should not be used.
> >
> > At the moment rsync only cares about hardlinks to regular files. I
> > guess you could also use them for sockets, devices and other beasts,
> > but I have not seen them.
> >
> > When trying to reproduce hard links, we only need to worry about
> > files that have more than one name (nlinks>1 && !S_ISDIR).
>
> It would be very helpful if file_struct.flags could have a bit set to
> indicate that the node count was greater than 1. This info could be
> used later to optimize the hardlink search by only considering those
> flist entries with this flag bit set.
>
> It'd be nice to implement this bit setting in this protocol number so
> it can be widely distributed before 2.6.1 is released which could have
> the code to actually make use of it. I'd be interested in doing the
> later changes, but if Martin or jw could at least get the bit set...
> It doesn't even have to be --hwlink option dependent. Just examine
> the node count and set the bit.
I'm not keen on squeezing that in at this time. Lets get it
out the door, hardlink performance improvements can be made
in a minor release. I'm also a bit more inclined to pass
nlinks (IFF non-zero and ~IS_DIR).
>
> > The basic point of this is to discover alternate names that refer to
> > the same file. All operations, including creating the file and
> > writing modifications to it need only to be done for the first name.
> > For all later names, we just create the link and then leave it
> > alone.
>
> An earlier thread started 11/25/2003 points out that in certain cases,
> a hardlinked file is unnecessarily transferred in full. This is due
> to the algortihm described above. If the first file in the sorted list
> is missing, but a later one exists, then that file should be used as
> the master. I've been thinking of solutions to this as well. But
> not until after 2.6.0 is released.
>
> > If hard links are to be preserved:
> >
> > Before the generator/receiver fork, the list of files is received
> > from the sender (recv_file_list), and a table for detecting hard
> > links is built.
> >
> > The generator looks for hard links within the file list and does
> > not send checksums for them, though it does send other metadata.
> >
> > The sender sends the device number and inode with file entries, so
> > that files are uniquely identified.
> >
> > The receiver goes through and creates hard links (do_hard_links)
> > after all data has been written, but before directory permissions
> > are set.
> >
> > At the moment device and inum are sent as 4-byte integers, which
> > will probably cause problems on large filesystems. On Linux the
> > kernel uses 64-bit ino_t's internally, and people will soon have
> > filesystems big enough to use them. We ought to follow NFS4 in
> > using 64-bit device and inode identification, perhaps with a
> > protocol version bump.
> >
> > Once we've seen all the names for a particular file, we no longer
> > need to think about it and we can deallocate the memory.
> >
> > We can also have the case where there are links to a file that are
> > not in the tree being transferred. There's nothing we can do about
> > that. Because we rename the destination into place after writing,
> > any hardlinks to the old file are always going to be orphaned. In
> > fact that is almost necessary because otherwise we'd get really
> > confused if we were generating checksums for one name of a file and
> > modifying another.
> >
> > At the moment the code seems to make a whole second copy of the file
> > list, which seems unnecessary.
>
> Indeed! It does! Very wasteful. It should only need a list of pointers
> to the flist entries and sort that list. Furthermore, with the addition
> of the new multiple nodes flag bit requested above, the list of pointers
> would only contain pointers to flist entries with that bit set, resulting
> in a much smaller list.
You do mean multiple paths, same node? :)
Lets take this up after the release, shall we?
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt
More information about the rsync
mailing list