TODO hardlink performance optimizations
jw schultz
jw at pegasys.ws
Mon Jan 5 08:01:14 GMT 2004
On Mon, Jan 05, 2004 at 01:44:37AM -0600, John Van Essen wrote:
> On Sun, 4 Jan 2004, jw schultz <jw at pegasys.ws> wrote:
> > On Sun, Jan 04, 2004 at 06:35:03AM -0600, John Van Essen wrote:
> ....
> >> I've modified hlink.c to use a list of file struct pointers instead of
> >> copies of the actual file structs themselves, so that will save memory.
> >> I'll submit that patch for review in a day or two after I've tested it.
> >
> > I've just done the same. It reduces the memory requirements
> > of the hlink list to 1/18th. It is also somewhat faster to
> > build that way because we don't have to walk the list.
> >
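The roughly 1/18th figure above is just the ratio of a pointer to a
full struct copy. A rough stand-in sketch (the field list here only
mimics rsync's real file_struct in rsync.h, which is larger still):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Simplified stand-in for rsync's file_struct; the real one has
 * more fields, so the per-copy cost is even higher. */
struct file_struct {
	char *basename, *dirname, *link;
	off_t length;
	time_t modtime;
	mode_t mode;
	uid_t uid;
	gid_t gid;
	dev_t dev;
	ino_t inode;
};

/* Memory needed to keep n full struct copies in the hlink list. */
size_t bytes_for_copies(size_t n)
{
	return n * sizeof(struct file_struct);
}

/* Memory needed to keep n pointers into the existing file_list. */
size_t bytes_for_pointers(size_t n)
{
	return n * sizeof(struct file_struct *);
}
```

The exact ratio depends on the platform's pointer size and struct
padding, but a pointer list is always a small fraction of the copies.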
> > If we built the hlink_list one element at a time, the way we
> > do the file_list, putting into it only those files that we might
> > link, it would be smaller, but building it would be slower.
>
> Right. But as you noted in a later followup, it's a real time-saver
> for searches in a filelist with a typical percentage of hardlinks.
>
> > I've only done a little testing, but it seems to be working,
> > and warnings about theory v. practice aside, it should be
> > good.
>
> Your changes are almost identical to mine, so I will address only the
> main differences...
>
[snip]
> > +
> > + memcpy(hlink_list, flist->files, sizeof(hlink_list[0]) * flist->count);
>
>
> Nice optimization, but we will need to walk it anyway to extract only
> the candidates for hardlinking (discussed separately):
Meanwhile, my current tree has:

	hlink_count = 0;
	for (i = 0; i < flist->count; i++) {
		if (S_ISREG(flist->files[i]->mode))
			hlink_list[hlink_count++] = flist->files[i];
	}

If we get a purge function that eliminates the non-linked
files, I'd go back to the memcpy and do the !S_ISREG purge
with the post-qsort pass.
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt