max file size

Heinz-Josef Claes hjclaes at web.de
Fri Nov 13 04:36:59 MST 2009


On Fri, 13 Nov 2009 01:38:48 -0500
Matt McCutchen <matt at mattmccutchen.net> wrote:

> On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
> > On Monday, 9 November 2009 17:48:35, Matt McCutchen wrote:
> > > On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
> > > > does anybody know what the maximum file size is (terabytes?) when using
> > > > rsync with options --checksum and / or --inplace?
> > > >
> > > > What file sizes have been tested in reality? Are there any experiences
> > > > using rsync (with --checksum and / or --inplace) for big files with
> > > > several or even dozens of terabytes?
> > > 
> > > I don't believe rsync has a fixed maximum size other than "what can fit
> > > in 64 bits", but I can't speak to any reliability issues that might come
> > > up with extremely large files.
> > > 
> > I've read about a fix for checksum buffer overruns with files larger than a few
> > hundred terabytes, but the description was rather vague . . .
> 
> Indeed, I forgot about that.  The delta-transfer algorithm doesn't work
> for files longer than 2^31 blocks.  With the default maximum block size
> of 2^17, the limit is 2^48 bytes or 256 TB.  You could stretch the limit
> by setting a larger fixed block size with --block-size.  See:
> 
> https://bugzilla.samba.org/show_bug.cgi?id=5459#c2
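
The arithmetic quoted above can be checked with a short Python sketch. The constant and function names are made up for illustration and are not rsync identifiers; the two numbers are simply the figures from the quoted text, and whether a given rsync version accepts a larger --block-size is not verified here.

```python
# Figures taken from the quoted text; the names are illustrative only.
MAX_BLOCKS = 2**31               # block-count limit of the delta-transfer algorithm
DEFAULT_MAX_BLOCK_SIZE = 2**17   # default maximum block size in bytes (128 KiB)


def max_delta_file_size(block_size: int = DEFAULT_MAX_BLOCK_SIZE) -> int:
    """Largest file size in bytes that still fits in MAX_BLOCKS blocks."""
    return MAX_BLOCKS * block_size


def min_block_size(file_size: int) -> int:
    """Smallest block size in bytes that keeps file_size within MAX_BLOCKS blocks."""
    return -(-file_size // MAX_BLOCKS)  # ceiling division


limit = max_delta_file_size()
print(limit == 2**48)     # the limit expressed as a power of two
print(limit // 2**40)     # the same limit in units of 2^40 bytes
```

Under the same assumptions, min_block_size() gives the block size a larger file would need, for example 2^18 bytes for a 2^49-byte file.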

Thanks for that information!

Have you (or anybody) ever done a test with big file sizes?

> 
> > > For what purpose are you considering --checksum?  In the case where the
> > > file's size hasn't changed (probably true for large image files), it
> > > will add an extra full read of the file on both sides before the
> > > transfer begins, which would be very expensive for multi-terabyte files.
> > 
> > I want to check if the following is possible:
> > 
> > 1. transport a big block of data (several terabytes) physically from location 
> > A to location B (very long distance) via tapes (or disks).
> > (Location A and B use different storage technologies.)
> > 
> > By the time the tapes arrive in location B, the block of data will have changed
> > in location A (a program / OS is running and storing data in it).
> > 
> > 2. shut down the application / OS in location A, rsync the delta between location A
> > and B online, then restart the system in location B.
> > 
> > (Perhaps step 2 has to be done multiple times.)
> 
> Since the source and destination versions are practically certain to
> differ, --checksum would serve no purpose.  See the man page description
> of --checksum.
> 

I don't understand what you mean. Between steps 1 and 2, only a few percent of the data will change, so the idea is to transfer only the differences. Transferring the whole file online would take too long.
How can this be done without checksums (either --checksum or --inplace)?
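
For context on this question: according to the rsync man page, --checksum only changes how rsync decides whether a file needs updating at all, while the delta-transfer algorithm computes its own per-block checksums whenever it runs. A minimal Python sketch of assembling such a command follows, assuming the destination file was pre-seeded from tape. The helper name, host name, and paths are hypothetical; only the rsync options --inplace, --no-whole-file, and --block-size are real.

```python
import shlex


def build_rsync_command(source, destination, block_size=None):
    """Build an rsync argv for an in-place delta update of one large file.

    Hypothetical helper for illustration; it does not run rsync.
    """
    # --inplace updates the destination file directly instead of writing a copy;
    # --no-whole-file asks for the delta-transfer algorithm explicitly.
    argv = ["rsync", "--inplace", "--no-whole-file"]
    if block_size is not None:
        # Optional fixed block size in bytes.
        argv.append(f"--block-size={block_size}")
    argv += [source, destination]
    return argv


# Hypothetical source path and destination host.
cmd = build_rsync_command("/data/image.bin", "siteb:/data/image.bin",
                          block_size=2**17)
print(shlex.join(cmd))
```

The sketch leaves out --checksum on purpose, since the source and destination are already known to differ.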

I'll probably be able to run a test with a file of several terabytes in the next few weeks, but that's not guaranteed.

Regards, HJC

