rsync'ing large files

Jeffrey Layton jtlayton at poochiereds.net
Thu Apr 22 15:19:54 GMT 2004


I'm using rsync to copy some large (>1GB) Oracle datafiles. I've noticed
that it sometimes transfers some of the files twice.

Some earlier posts to this list that I found in the archives suggest
that this is a problem with the rsync algorithm itself when dealing with
large files. Some of them also suggest that it can be mitigated by using
larger block sizes, with the caveat that increasing the block size
without also increasing the checksum size might cause more hash
collisions.

My questions:

1) Can anyone explain the problem to me in layman's terms? Is the
initial bad transfer due to hash collisions?

2) If I'm transferring files that are 1-2GB, would increasing the
block-size parameter to 8k or so help here? Or would I be creating more
chances for hash collisions since I can't increase the checksum size?

3) I'm using 2.5.5 (yeah, ancient I know, I'll be upgrading it soon).
Are later versions better at dealing with this problem?
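
To put rough numbers on the trade-off in question 2, here is a
back-of-the-envelope sketch. It assumes a deliberately crude model: every
byte offset in the file is compared against every block checksum, and
each comparison has a 2^-bits chance of a false match. The function
names are made up for illustration, and the 700-byte block size and
48-bit checksum width are placeholders, not values confirmed for rsync
2.5.5.

```python
def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed to cover the file (ceiling division)."""
    return (file_size + block_size - 1) // block_size


def expected_false_matches(file_size: int, block_size: int,
                           checksum_bits: int) -> float:
    """Crude estimate of false block matches for one file.

    Assumption (not taken from rsync's source): each of the file_size
    byte offsets is tested against every block checksum, and each test
    falsely matches with probability 2**-checksum_bits.
    """
    comparisons = file_size * block_count(file_size, block_size)
    return comparisons / 2 ** checksum_bits


if __name__ == "__main__":
    one_gb = 2 ** 30
    checksum_bits = 48  # placeholder width, an assumption
    for block_size in (700, 8192):
        estimate = expected_false_matches(one_gb, block_size, checksum_bits)
        print(block_size, block_count(one_gb, block_size), estimate)
```

Under this model, a larger block size at a fixed checksum width means
fewer blocks and therefore fewer comparisons, so the estimate goes down
rather than up. Whether that holds for the real implementation is
exactly what I'm unsure about.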

Any help is appreciated!

Thanks,
Jeff



