scan for first existing hard-link file

jw schultz jw at pegasys.ws
Sun Jan 25 19:35:37 GMT 2004


On Sun, Jan 25, 2004 at 02:05:21AM -0800, Wayne Davison wrote:
> Here's a patch that makes rsync try to find an existing file in a group
> of hard-linked files so that it doesn't create the first one in the
> group from scratch if a later file could be used instead.
> 
> Details:  I decided to avoid having the code do an extra scan down the
> list when we encounter the lead file in the list.  This is because it
> would be bad to have to do the same scan in the receiver that the
> generator just performed, especially since there's no guarantee that it
> will get identical results (if a file pops up at the wrong moment).  My
> solution just keeps moving the master file in the group down the list,
> causing it to be processed in turn as we go through the normal flist
> scan.  This ensures that we use the right basis file in the receiver,
> and keeps the code simple.  The only complicating factor was that the
> hard-link post-processing pass was being done by the receiver, while the
> generator is the one that keeps track of the updated master.  To deal
> with this, I moved the hard-link post-processing loop and the final
> touch-up of the directory permissions into the final work that the
> generator does after it gets the "end of phase 2" indicator from the
> receiver.
> 
> Some simple changes to the hard-link data structures were needed.  I
> got rid of the "head" pointer, replacing it with an index into the
> hlink_list array.  This lets us update this "first item" pointer to
> point to the current master.  I then made the single-linked list of
> hard-linked items circular, and added a flag to mark the last item in
> the original list (so we know when to give up our search and just ask
> for the file to be created).
> 

Nice.  Took a few minutes to grok that you made the generator
skip each file in the group until it gets a hit or reaches
the last one in the original list.
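
A minimal sketch of that skip logic as I read it.  The struct, field
and function names are made up for illustration and are not taken from
the patch; the real code works on the flist and the hlink_list array.

```c
#include <assert.h>

/* Hypothetical stand-in for one member of a hard-link group. */
struct hl_entry {
    int next;    /* index of the next member; the last one wraps to the first */
    int last;    /* nonzero on the last member of the original list */
    int exists;  /* nonzero if this file already exists on the receiving side */
};

enum hl_action {
    HL_DEFER,      /* was master, nothing on disk: pass the role on, link later */
    HL_TRANSFER,   /* current master: update it if present, create it if absent */
    HL_LINK_LATER  /* ordinary member: hard-link to the master afterwards */
};

/* Called once per group member, in file-list order.  *master holds the
 * index of the group's current master and may be advanced by this call. */
static enum hl_action hl_visit(const struct hl_entry *e, int *master, int idx)
{
    assert(idx >= 0);
    if (idx != *master)
        return HL_LINK_LATER;
    if (!e[idx].exists && !e[idx].last) {
        *master = e[idx].next;  /* move the master down the list */
        return HL_DEFER;
    }
    /* Either we got a hit, or this is the last chance to create the file. */
    return HL_TRANSFER;
}
```

For a three-member group where only the second file exists, visiting
the members in order gives defer, transfer, link-later, and the master
index ends up on the second member.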

With the hard-link creation moved to the generator, I don't
think the receiver does anything with hlink_list or related
data.  This means we can, later, free that memory in the
receiver after the fork, reducing the memory footprint a bit
more and avoiding the copy-on-write (COW) of the hlink_list
that you have just introduced.
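
Roughly what I have in mind, as a sketch only.  It assumes the
receiver is the child side of the fork; hlink_table, drop_hlink_table()
and fork_receiver() are hypothetical names, not existing rsync code.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the hard-link bookkeeping. */
struct hlink_table {
    int *list;      /* stand-in for the hlink_list array */
    size_t count;
};

/* Release the table and leave it in a safe, empty state. */
void drop_hlink_table(struct hlink_table *t)
{
    free(t->list);
    t->list = NULL;
    t->count = 0;
}

/* Fork; the child (receiver) drops its copy of the table before doing
 * any work, while the parent (generator) keeps and updates its own. */
pid_t fork_receiver(struct hlink_table *t, int (*receiver_main)(void))
{
    pid_t pid = fork();
    if (pid == 0) {
        drop_hlink_table(t);
        _exit(receiver_main());
    }
    return pid;  /* child's pid in the parent, or -1 on failure */
}
```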


-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt

