Time rsYnc Machine (tym)

M. Carrasco ca at dragoman.org
Thu Aug 9 16:23:18 MDT 2012

Reading the rsync man page, it seems that the -H option with --link-dest is tricky. The price to pay for not using -H is that hard-linked files are treated as separate files: I could not find any mechanism in rsync to improve the time-machine effect; I would appreciate hints on how to improve it. I highlight this in the tym man page and inside the program.


On 9 Aug 2012, at 22:54, Linda Walsh wrote:

> Dan Stromberg wrote:
>> I may be mistaken, but I heard at one time that rsync was noticeably slower when asked to preserve hard links.  I'm guessing this is a matter of CPU requirements rather than I/O requirements, but that's just a guess.
> ---
> 	I didn't realize at the time I wrote it, that rsync
> had switched to an "incremental_recursive" algorithm, where it
> doesn't know the full dataset on each machine before starting synchronization.
> That makes it impossible to *easily* manage hardlinks -- and, in fact, as
> indicated elsewhere, causes hardlink copying to fail when making a differential
> rsync from A->C that doesn't include anything already on "B" (where B is the
> argument to the "--compare-dest" option in the A->C copy).
> (Note: A is an older version of B (a snapshot), so I'm copying the differences
> between a live snapshot taken by LVM at some point in the past and 'now'
> to a static volume 'B'.)
> FWIW, I save that off into its own resized volume mounted under
> a date-stamped dir like /A/snapdir/@GMT-2012.08.05-00.21.43.
> That can give me the "previous versions of files" option on my Samba
> shares on Windows...
> I didn't understand why a bloom filter would be needed if you knew all
> the inodes, but given their approach with the incremental recursion I can
> see why they are reaching for arcane methods of handling hard links.
> Fortunately my problem with it crashing goes away and hardlinks work fine if
> incremental recursion is turned off.
> I'd be surprised if there was any hard-link overhead -- when writing a
> dup-file detector, a version that pre-read the filenames and sorted by size &
> inode ran significantly faster than one of the standard chksum/md5sum-based
> detectors that tried to build its list as it went...
> Anyway, thanks for the history update.   I have a feeling rsync is afraid to use
> memory -- and really, it should try to use a lot of memory to optimize transfers.
> Turning off partial sends (--whole-file) and using 8-bit I/O seemed to really help
> speed things up on a same-system copy... In doing a full sync of a 6G HD
> to a backup (I used --drop-cache, --inplace and --del as well), doing an
> archive diff with --acls --xattrs + --hard-links, rsync averaged 125MB/s
> for the actual IO... (about 30% of the disk)...