Time rsYnc Machine (tym)

Linda Walsh rsync at tlinx.org
Thu Aug 9 14:54:48 MDT 2012



Dan Stromberg wrote:
> 
> I may be mistaken, but I heard at one time that rsync was noticeably 
> slower when asked to preserve hard links.  I'm guessing this is a matter 
> of CPU requirements rather than I/O requirements, but that's just a guess.
---
	I didn't realize at the time I wrote it, that rsync
had switched to an "incremental_recursive" algorithm, where it
doesn't know the full dataset on each machine before starting synchronization.

That makes it impossible to *easily* manage hardlinks -- and, in fact, as
indicated elsewhere, causes hardlink copying to fail when making a differential
rsync from A->C not including anything on "B" (where B is the argument
for the "--compare-dest" option, and you copy from A->C).

(Note: A is an older version of B (a snapshot), so I'm copying the differences
between a live snapshot taken by lvm at some point in the past and 'now'
to a static volume 'B'.)

FWIW, I save that off into its own resized volume mounted under
a date-stamped dir like /A/snapdir/@GMT-2012.08.05-00.21.43.

That can give me the "previous versions of files" option on my Samba
shares on Windows...


I didn't understand why a bloom filter would be needed if you knew all
the inodes, but given their approach with the incremental recursion I can
see why they are reaching for arcane methods of handling hard links.

Fortunately my problem with it crashing goes away and hardlinks work fine if
incremental recursion is turned off.
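
As a rough illustration of the setup described above, the Python sketch below
only assembles an rsync command line for an A->C copy with B as the
--compare-dest, hard links preserved, and incremental recursion disabled via
--no-inc-recursive. The function name build_rsync_argv and the paths /A/, /B/,
/C/ are placeholders chosen for illustration, not anything from the actual
setup, and nothing is executed.

```python
import shlex


def build_rsync_argv(src, dest, compare_dest=None, incremental=False):
    """Assemble an rsync command line as a list of arguments (hypothetical helper)."""
    argv = ["rsync", "--archive", "--hard-links", "--acls", "--xattrs"]
    if not incremental:
        # Build the complete file list before transferring, instead of
        # recursing incrementally.
        argv.append("--no-inc-recursive")
    if compare_dest is not None:
        # Files that already match under compare_dest are not copied to dest.
        argv.append("--compare-dest=" + compare_dest)
    argv += [src, dest]
    return argv


if __name__ == "__main__":
    # Placeholder paths: A is the source, B the compare-dest, C the target.
    print(shlex.join(build_rsync_argv("/A/", "/C/", compare_dest="/B/")))
```

Printing the command rather than running it makes it easy to check the flags
before pointing them at real volumes.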

I'd be surprised if there was any hard-link overhead -- as, in doing a dup-file
detector, a version that pre-read the filenames to sort by size & inode ran
significantly faster than one of the standard chksum/md5sum based detectors,
which tried to build a list as it went...
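
The size-and-inode idea can be sketched roughly as follows. This is not the
detector mentioned above, just a minimal Python illustration under stated
assumptions: the names find_duplicates and _digest are made up, and SHA-256 is
an arbitrary choice of hash. It stats every file first, groups by size and then
by (device, inode), and only reads file contents for sizes shared by more than
one inode, reading each inode once however many hard links point at it.

```python
import hashlib
import os
import stat
from collections import defaultdict


def _digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks (SHA-256 is an arbitrary choice here)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted path groups with identical content on 2+ distinct inodes."""
    # Pass 1: stat everything up front; no file contents are read yet.
    by_size = defaultdict(dict)  # size -> {(device, inode): [paths]}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, devices, ...
            by_size[st.st_size].setdefault((st.st_dev, st.st_ino), []).append(path)

    # Pass 2: hash only the sizes shared by more than one inode.
    groups = []
    for inodes in by_size.values():
        if len(inodes) < 2:
            continue  # unique size, or only hard links to a single inode
        by_digest = defaultdict(list)
        for paths in inodes.values():
            by_digest[_digest(paths[0])].append(paths)  # one read per inode
        for link_sets in by_digest.values():
            if len(link_sets) > 1:
                groups.append(sorted(p for paths in link_sets for p in paths))
    return sorted(groups)
```

Names that are hard links to one another are reported together with their
content duplicates, but a set of hard links on its own is not counted as a
duplicate, since it takes up the space of one file.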

Anyway, thanks for the history update.   I have a feeling rsync is afraid to use
memory -- and really, it should try to use a lot of memory to optimize transfers.

Turning off partial sends (--whole-file) (using 8-bit-io) seems to really help
speed things up on a same-system copy... In doing a full sync with a backup
of a 6G HD (I used --drop-cache, --inplace and --del as well), doing an
archive diff with --acls --xattrs + --hard-links, rsync averaged 125MB/s
for the actual IO... (about 30% of the disk)...



