[RFC] dynamic checksum size

Donovan Baarda abo at minkirri.apana.org.au
Mon Mar 24 01:02:28 EST 2003


On Mon, Mar 24, 2003 at 12:54:26AM +1100, Donovan Baarda wrote:
> On Sun, Mar 23, 2003 at 03:46:34AM -0800, jw schultz wrote:
> > On Sun, Mar 23, 2003 at 05:45:47PM +1100, Donovan Baarda wrote:
> > > On Sun, 2003-03-23 at 07:40, jw schultz wrote:
[...]
> The block_size heuristic is pretty arbitrary, but the blocksum_size
> calculation is not, and calculates the minimum blocksum size to achieve a
> particular probability of success for given file and blocksum sizes. You can
> replace the block_size heuristic with anything, and the second part will
> calculate the required blocksum size.
> 
> > I do have a variant that scales the length divisor for block
> > size from 1024 to 16538 as the length progresses from 1MB to
> > 128MB.  This in conjunction with your sum2 length formula
> > produces this:
> 
> That sounds pretty arbitrary to me... I'd prefer something like having it
> grow as the square root of the filesize... so 1K for 1M, 8K for 64M, 16K for
> 256M, 32K for 1G, etc.
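
A minimal sketch of that square-root idea, just to show the arithmetic. The
function name and the 1K floor are my own assumptions for illustration, not
anything in rsync:

```python
import math


def sqrt_block_size(file_size, min_block=1024):
    """Hypothetical heuristic: block size grows as the square root of file size.

    min_block is an assumed floor so that small files still get 1K blocks.
    """
    return max(min_block, math.isqrt(file_size))


# Reproduce the figures quoted above: 1M, 64M, 256M and 1G files.
for file_size in (1 << 20, 1 << 26, 1 << 28, 1 << 30):
    print(file_size, sqrt_block_size(file_size))
```

This gives 1024, 8192, 16384 and 32768 for the four sizes, matching the
1K/8K/16K/32K progression.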

A thought occurred to me after writing this; a viable blocksize heuristic is
just a fixed block size. This makes the signature size almost proportional
to the filesize, except for the growth in blocksum size.
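
To put rough numbers on that, here is a small sketch comparing signature
size for a fixed block size against a square-root block size. It assumes a
4-byte rolling checksum per block and takes blocksum_size as a plain
parameter rather than computing it; the helper name and the 2-byte blocksum
are illustrative only:

```python
import math

ROLLSUM_BYTES = 4  # assumed: one 32-bit rolling checksum per block


def signature_size(file_size, block_size, blocksum_size):
    """Approximate signature bytes: one rolling sum plus one blocksum per block."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * (ROLLSUM_BYTES + blocksum_size)


# blocksum_size is held at an illustrative 2 bytes here; in practice it
# would grow slowly with file size, which this sketch does not model.
for file_size in (1 << 20, 1 << 30):
    fixed = signature_size(file_size, 1024, 2)
    scaled = signature_size(file_size, math.isqrt(file_size), 2)
    print(file_size, fixed, scaled)
```

With a fixed 1K block the signature grows linearly with the file, while with
a square-root block size it grows only as the square root of the file size.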

I don't necessarily advocate it though. I think increasing the blocksize is a
good idea as files grow because file and signature size also contribute to
CPU load.

-- 
----------------------------------------------------------------------
ABO: finger abo at minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------


More information about the rsync mailing list