Block Size

David Bolen db3l at fitlinxx.com
Fri Nov 16 10:42:16 EST 2001


Thomas Lambert [thomas at dscassociates.com] writes:

> What is the default block size?  I have a few files of 30+ MB, and data
> is just added to the end of them.  It seems like it takes longer to
> sync them than it did to send them initially.  Should I change the
> block size or something else?

The default is an adaptive block size.  It's based on the file size
divided by 10000, truncated to a multiple of 16, with a minimum of 700
and a maximum of 16K (16384).
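
As a rough illustration, the heuristic just described can be written out in a few lines of Python. This is only a sketch of the rule as stated in this message, not a copy of rsync's source; the function and constant names are made up for the example.

```python
MIN_BLOCK = 700      # minimum stated above
MAX_BLOCK = 16384    # maximum stated above (16K)


def default_block_size(file_size: int) -> int:
    """Sketch of the adaptive default described above (names are hypothetical)."""
    size = file_size // 10000      # file size divided by 10000
    size -= size % 16              # truncate to a multiple of 16
    # Clamp to the stated minimum and maximum.
    return max(MIN_BLOCK, min(size, MAX_BLOCK))


if __name__ == "__main__":
    for mb in (1, 30, 100, 200):
        n = mb * 1024 * 1024
        print(f"{mb:4d} MB -> {default_block_size(n)} byte blocks")
```

Running it for a handful of file sizes shows where the minimum and the cap take over.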

So your 30MB file ought to be using blocks of roughly 3K (30MB / 10000
is about 3145, truncated to 3136); the 16K cap only kicks in at around
160MB.  And yes, depending on your machines (memory and CPU), it can
take a while to synchronize such files because rsync has to compute two
checksums for every block of the existing file, keeping them all in
memory, before making the transfer.  During the first transfer rsync
knows there is no target file, so it doesn't bother with any of that
but just sends the bytes.
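
To get a feel for how much per-block work that is, here is a small Python sketch that counts blocks and checksums for a given file and block size. It only does the arithmetic implied above (two checksums per block); the helper names are hypothetical and nothing here measures rsync's real memory use.

```python
def block_count(file_size: int, block_size: int) -> int:
    """Blocks needed to cover the file; the last block may be short."""
    return (file_size + block_size - 1) // block_size


def checksum_count(file_size: int, block_size: int) -> int:
    """Two checksums per block, as described above."""
    return 2 * block_count(file_size, block_size)


if __name__ == "__main__":
    size = 30 * 1024 * 1024  # a 30MB file
    for bs in (700, 4096, 16384):
        print(f"block size {bs:5d}: {block_count(size, bs):6d} blocks, "
              f"{checksum_count(size, bs):6d} checksums")
```

The block count, and with it the number of checksums held in memory, falls in proportion to the block size.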

If you know something about the construction of your file, manually
selecting a block size can be very helpful, since it limits how many
blocks rsync sees as changed.  For example, when transferring database
files that I know have a 1K page size, I always keep the block size a
multiple of 1K, since otherwise a single page change in the database
might affect two rsync blocks.  I then scale the block size by
database size to help keep the total number of blocks down, since that
burns memory and computation time.
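
One way to sketch that approach in Python: pick the smallest multiple of the page size that keeps the block count under some target, then pass it to rsync's `--block-size` (`-B`) option. The helper names, the target of 10000 blocks, the 16K cap, and the file and host names are all assumptions for illustration, not values rsync prescribes.

```python
def page_aligned_block_size(file_size: int,
                            page_size: int = 1024,
                            target_blocks: int = 10000,
                            max_block: int = 16384) -> int:
    """Hypothetical helper: a multiple of page_size, scaled by file size.

    Chooses the smallest multiple of page_size that keeps the block count
    at or below target_blocks, capped at max_block (itself rounded down
    to a multiple of page_size).
    """
    # Ceiling division: pages per block needed to stay under the target.
    pages = -(-file_size // (target_blocks * page_size))
    size = max(1, pages) * page_size
    cap = max(page_size, (max_block // page_size) * page_size)
    return min(size, cap)


def rsync_command(src: str, dst: str, block_size: int) -> list[str]:
    """Build (but do not run) an rsync command line with a fixed block size."""
    return ["rsync", f"--block-size={block_size}", src, dst]


if __name__ == "__main__":
    bs = page_aligned_block_size(30 * 1024 * 1024)
    # "db.dat" and "backuphost:/data/" are placeholder names.
    print(" ".join(rsync_command("db.dat", "backuphost:/data/", bs)))
```

Keeping the result a whole number of pages means a change to one database page touches exactly one rsync block.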

My database transaction log files are very similar to your file - they
constantly grow so I'm really always only catching up the tail end of
the file.  For those, I use as large a block size as feasible.
However, my files aren't as large (we truncate a lot) so I use 16K
myself.  I believe I've had it work closer to 32K but then had some
problems, so there may be some signed-number issues (i.e., stick just
below 32K).

Not sure how much that would help, although each doubling of the block
size (say, from 16K to just under 32K) cuts your block count roughly
in half.

-- David

/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/



