TODO hardlink performance optimizations

jw schultz jw at pegasys.ws
Mon Jan 5 08:01:14 GMT 2004


On Mon, Jan 05, 2004 at 01:44:37AM -0600, John Van Essen wrote:
> On Sun, 4 Jan 2004, jw schultz <jw at pegasys.ws> wrote:
> > On Sun, Jan 04, 2004 at 06:35:03AM -0600, John Van Essen wrote:
> ....
> >> I've modified hlink.c to use a list of file struct pointers instead of
> >> copies of the actual file structs themselves, so that will save memory.
> >> I'll submit that patch for review in a day or two after I've tested it.
> > 
> > I've just done the same.  It reduces the memory requirements
> > of the hlink list to 1/18th.  It is also somewhat faster to
> > build that way because we don't have to walk the list.
> > 
> > If we built the hlink_list one element at a time the way we
> > do the file_list only putting those files that we might link
> > in it it would be smaller but building it would be slower.
> 
> Right.  But as you noted in a later followup, it's a real time-saver
> for searches in a filelist with a typical percentage of hardlinks.
>  
> > I've only done a little testing but it seems to be working
> > and warnings about theory v. practice aside it should be
> > good.
> 
> Your changes are almost identical to mine, so I will address only the
> main differences...
> 
[snip]
> > +       
> > +        memcpy(hlink_list, flist->files, sizeof(hlink_list[0]) * flist->count);
> 
> 
> Nice optimization but we will need to walk it anyway to extract only
> the candidates for hardlinking (discussed separately):

My current tree, by contrast, walks the list and has:

	hlink_count = 0;
	for (i = 0; i < flist->count; i++) {
		if (S_ISREG(flist->files[i]->mode))
			hlink_list[hlink_count++] = flist->files[i];
	}

If we get a purge function that eliminates the non-linked
files, I'd go back to the memcpy and do the !S_ISREG purge in
the post-qsort pass.



-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt

