Premature optimization in f_name_cmp()?

Andrew Bartlett abartlet at samba.org
Sun Mar 13 23:40:00 GMT 2005


I appear to be hitting a case of premature optimization at my site
with rsync 2.6.4pre2.

I back up a number of very large systems using rsync, and have been
updating as I try to fix various weird problems.  It appears that rsync
2.6.4pre2 fixes some of my other issues (timeouts in particular), but
now I'm being bitten by very high loads and slow transfers.

My current theory is that the offending code is from this checkin:

http://cvs.samba.org/cgi-bin/cvsweb/rsync/flist.c.diff?r1=1.272&r2=1.273

> If f_name_cmp() discovers that two directory strings compare to an 
> equal value without being equal pointers, substitute one of the
> pointers for the other in the file list.  This optimizes future name
> comparisons.  Note also that this optimization won't be triggered
> very often (because rsync tends to send the names grouped by dir-
> name at transmission time), but it's nice to be able to assume that
> all files in the same dir have identical dir-name pointers after the
> qsort is finished.
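
To make the quoted description concrete, here is a minimal sketch of a
comparison function that does this kind of pointer substitution.  It is
not the actual code from flist.c: the struct, its field names and the
function name entry_cmp() are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical file-list entry; NOT rsync's real struct. */
struct entry {
    const char *dirname;   /* may be shared between entries */
    const char *basename;
};

/* qsort() comparator over an array of `struct entry *`.
 * The array holds pointers, so changing a field of the pointed-to
 * struct does not alter the array contents being sorted. */
static int entry_cmp(const void *a, const void *b)
{
    struct entry *e1 = *(struct entry *const *)a;
    struct entry *e2 = *(struct entry *const *)b;

    if (e1->dirname != e2->dirname) {
        int diff = strcmp(e1->dirname, e2->dirname);
        if (diff != 0)
            return diff;
        /* Equal strings behind different pointers: make both entries
         * share one pointer, so a later comparison of this pair is
         * settled by the cheap pointer check above. */
        e2->dirname = e1->dirname;
    }
    return strcmp(e1->basename, e2->basename);
}
```

A caller would pass this to qsort() along with an array of entry
pointers.  The substitution only changes which copy of an equal string
an entry refers to, so the sort order is unaffected.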

What seems to be happening for me is that rsync then walks, rather
often, the half-million files in the directory tree I am syncing to my
backup server.  This causes the server to spend a lot of time in a CPU
loop, and slows down the process just a little...

Is there a way to make this look only at the 'right' part of the sorted
flist, given that it is sorted?  And is this really needed at all?
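
As a rough sketch of what I mean by looking only at the 'right' part:
once the list is sorted with the dir-name as the leading key, entries
with equal dir-names sit next to each other, so a single pass that
compares each entry only with its predecessor would be enough to share
the pointers.  The type and the function share_dirnames() below are
hypothetical and are not taken from rsync.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry type for illustration; NOT rsync's struct. */
struct sorted_entry {
    const char *dirname;
    const char *basename;
};

/* One linear pass over a list already sorted by dirname.  Each entry
 * is compared only with the entry just before it; when the two
 * dirname strings are equal but held in different pointers, the
 * current entry takes over its predecessor's pointer.
 * Returns the number of pointers replaced. */
static size_t share_dirnames(struct sorted_entry **list, size_t count)
{
    size_t replaced = 0;

    for (size_t i = 1; i < count; i++) {
        struct sorted_entry *prev = list[i - 1];
        struct sorted_entry *cur = list[i];

        if (cur->dirname != prev->dirname
            && strcmp(cur->dirname, prev->dirname) == 0) {
            cur->dirname = prev->dirname;
            replaced++;
        }
    }
    return replaced;
}
```

This does at most one string comparison per entry after the sort,
rather than any repeated walk over the whole list.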

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net