scan for first existing hard-link file
jw schultz
jw at pegasys.ws
Sun Jan 25 19:35:37 GMT 2004
On Sun, Jan 25, 2004 at 02:05:21AM -0800, Wayne Davison wrote:
> Here's a patch that makes rsync try to find an existing file in a group
> of hard-linked files so that it doesn't create the first one in the
> group from scratch if a later file could be used instead.
>
> Details: I decided to avoid having the code do an extra scan down the
> list when we encounter the lead file in the list. This is because it
> would be bad to have to do the same scan in the receiver that the
> generator just performed, especially since there's no guarantee that it
> will get identical results (if a file pops up at the wrong moment). My
> solution just keeps moving the master file in the group down the list,
> causing it to be processed in turn as we go through the normal flist
> scan. This ensures that we use the right basis file in the receiver,
> and keeps the code simple. The only complicating factor was that the
> hard-link post-processing pass was being done by the receiver, while the
> generator is the one that keeps track of the updated master. To deal
> with this, I moved the hard-link post-processing loop and the final
> touch-up of the directory permissions into the final work that the
> generator does after it gets the "end of phase 2" indicator from the
> receiver.
>
> Some simple changes to the hard-link data structures were needed. I
> got rid of the "head" pointer, replacing it with an index into the
> hlink_list array. This lets us update this "first item" pointer to
> point to the current master. I then made the single-linked list of
> hard-linked items circular, and added a flag to mark the last item in
> the original list (so we know when to give up our search and just ask
> for the file to be created).
>
Nice. It took me a few minutes to grok that you made the
generator skip the file until you got a hit or it was the
last in the original list.
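For anyone else reading along, here is roughly how I picture that
skip logic. This is only a sketch of my reading of the description,
not the code from the patch: struct hl_entry, hl_visit() and the
HL_* names are made up for illustration, and it assumes the entries
of a group are visited in list order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one member of a hard-link group. */
struct hl_entry {
    int next;          /* index of the next member; the list is circular */
    bool last_in_orig; /* set on the final member of the original list */
    bool exists;       /* true if this file is already present on disk */
};

enum hl_action {
    HL_DEFER,        /* master role passed on; link this entry later */
    HL_USE_EXISTING, /* this entry exists and stays the master */
    HL_CREATE,       /* nothing in the group exists; create this one */
    HL_LINK_LATER    /* not the master; handled in post-processing */
};

/* Decide what to do with entry i, moving *master down the list as needed. */
enum hl_action hl_visit(struct hl_entry *e, int *master, int i)
{
    assert(e != NULL && master != NULL);

    if (i != *master)
        return HL_LINK_LATER;
    if (e[i].exists)
        return HL_USE_EXISTING;
    if (e[i].last_in_orig)
        return HL_CREATE;   /* end of the original list: give up searching */
    *master = e[i].next;    /* hand the master role to the next member */
    return HL_DEFER;
}
```

With a group of three where only the second file exists, the first
entry is deferred, the second becomes the master, and the third is
left for the post-processing pass.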
Having moved the hard-link creation to the generator, I don't
think the receiver does anything with hlink_list or related
data. This means we can, later, free that memory in the
receiver after the fork, reducing the memory footprint a bit
more and avoiding the COW of the hlink_list that you have
just introduced.
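A minimal sketch of the idea, assuming the child of the fork plays
the receiver and the parent plays the generator. hlink_table and
split_roles() are hypothetical names, not rsync's, and whether
free() really hands pages back to the system depends on the
allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for hlink_list; not rsync's real declaration. */
static int *hlink_table;

/* Returns 0 if the child exited cleanly, -1 on any failure. */
int split_roles(size_t count)
{
    int status;
    pid_t pid;

    assert(count > 0);
    hlink_table = calloc(count, sizeof *hlink_table);
    if (hlink_table == NULL)
        return -1;

    pid = fork();
    if (pid < 0) {
        free(hlink_table);
        hlink_table = NULL;
        return -1;
    }
    if (pid == 0) {
        /* Child (receiver role): it never reads the table, so drop it
         * right away instead of keeping a shared copy mapped. */
        free(hlink_table);
        hlink_table = NULL;
        _exit(0);
    }

    /* Parent (generator role): keeps the table and writes to it. */
    hlink_table[0] = 1;

    if (waitpid(pid, &status, 0) < 0) {
        free(hlink_table);
        hlink_table = NULL;
        return -1;
    }
    free(hlink_table);
    hlink_table = NULL;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The point is only the ordering: the process that no longer needs the
table releases it before the other side starts writing to it.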
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt