rsync'ing large files

Brian Cuttler brian at wadsworth.org
Thu Apr 22 15:23:18 GMT 2004


I can't fully address the algorithm questions, but I can tell you that
we saw a tremendous improvement in speed when we switched to a newer
version of rsync.

We are using it (in this case) to rsync our Oracle files to a
separate partition on the same system.
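
In case a concrete example helps, here is roughly what such a copy
looks like, driven from Python just for illustration (a minimal
sketch: the paths are made up, and --block-size is the knob
question 2 below asks about, not necessarily something you need):

    import subprocess

    # Sketch only: source/destination paths are hypothetical. rsync's
    # -B/--block-size forces a fixed checksum block size (8 KiB here)
    # instead of the default it would otherwise pick.
    subprocess.run(
        ["rsync", "-av", "--block-size=8192",
         "/u01/oradata/", "/backup/oradata/"],
        check=True,
    )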

> I'm using rsync to copy some large (>1GB) oracle datafiles. I've noticed
> that sometimes it transfers some of the files twice.
> 
> Some earlier posts to this list that I saw in the archives seemed to
> indicate that this is a problem with the rsync algorithm itself when
> dealing with large files. Some of the mails seemed to indicate that this
> can be mitigated by using larger block sizes, though there were some
> caveats that increasing block size without increasing checksum size
> might cause more hash collisions.
> 
> My questions:
> 
> 1) Can anyone explain the problem to me in layman's terms? Is the
> initial bad transfer due to hash collisions?
> 
> 2) If I'm transferring files that are 1-2GB, would increasing the
> block-size parameter to 8k or so help here? Or would I be creating more
> chances for hash collisions since I can't increase the checksum size?
> 
> 3) I'm using 2.5.5 (yeah, ancient I know, I'll be upgrading it soon).
> Are later versions better at dealing with this problem?
> 
> Any help is appreciated!
> 
> Thanks,
> Jeff
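
On the algorithm questions, for anyone finding this in the archives:
I can't claim authority here, but rsync matches blocks using two
checksums per block, a cheap rolling ("weak") sum and a stronger hash
that is only computed when the weak sum hits. As I understand it,
2.5.x-era rsync also truncated the per-block strong checksum to save
bandwidth, and fell back to a second full-checksum pass when the
whole-file check failed, which is why a large file can appear to
transfer twice. A much-simplified Python sketch of the two-level
matching (not rsync's actual code; its weak sum is an Adler-32
variant and its strong sum back then was MD4):

    import hashlib

    BLOCK = 8192  # hypothetical block size

    def weak_sum(data):
        # Simplified Adler-32-style sum. rsync's real rolling checksum
        # can be updated in O(1) as the window slides one byte.
        a = sum(data) & 0xFFFF
        b = sum((len(data) - i) * x for i, x in enumerate(data)) & 0xFFFF
        return (b << 16) | a

    def strong_sum(data):
        # MD5 stands in here for rsync's strong checksum (MD4 in 2.5.x).
        return hashlib.md5(data).digest()

    def block_signatures(path):
        # Receiver side: checksum every block of the file it already has.
        sigs = {}
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(BLOCK)
                if not block:
                    break
                sigs.setdefault(weak_sum(block), []).append(
                    (offset, strong_sum(block)))
                offset += len(block)
        return sigs

    def find_match(sigs, window):
        # Sender side: the weak sum filters first; the strong sum then
        # catches weak-checksum collisions (question 1).
        for offset, strong in sigs.get(weak_sum(window), []):
            if strong_sum(window) == strong:
                return offset
        return None  # no block matches: send literal bytes instead

On question 2: for a well-mixed checksum, collision odds depend on
the number of blocks rather than their size, so doubling the block
size halves the signatures per file and, if anything, reduces the
chance of a collision; the cost is that any mismatch forces a larger
literal retransmit. On question 3: as far as I know, later versions
choose block and checksum sizes dynamically from the file size, so
upgrading is the real fix.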
