Problem with checksum failing on large files
jw at pegasys.ws
Sun Oct 13 02:30:01 EST 2002
On Sat, Oct 12, 2002 at 11:13:50AM -0700, Derek Simkowiak wrote:
> > My theory is that this is expected behavior given the checksum size.
> Excellent analysis!
> Assuming your hypothesis is correct, I like the adaptive checksum
> idea. But how much extra processor overhead is there with a larger
> checksum bit size? Is it worth the extra code and testing to use an
> adaptive algorithm?
> I'd be more inclined to say "This ain't the 90's anymore", realize
> that overall filesizes have increased (MP3, MS-Office, CD-R .iso, and DV)
> and that people are moving from dialup to DSL/Cable, and then make either
> the default (a) initial checksum size, or (b) block size, a bit larger.
I lean toward making the block-size adaptive. Perhaps
something on the order of block-size = filesize / 15000,
clamped to the range 700 bytes to 8KB.
Maybe both the checksum size and the block-size should be
adaptive, so that both track the file size: large files get
larger but fewer checksums (compared to current defaults),
and small files retain their current advantages.
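A minimal sketch of what such an adaptive block-size might
look like, assuming the filesize/15000 heuristic above with
the stated 700-byte and 8KB clamps (function and constant
names here are illustrative, not taken from the rsync
source):

```c
#include <stdint.h>

/* Hypothetical clamps from the proposal above: block-size
 * grows as filesize / 15000 but never drops below 700 bytes
 * (the small-file case) nor exceeds 8KB (the large-file
 * case). */
#define MIN_BLOCK_SIZE 700
#define MAX_BLOCK_SIZE 8192

static int32_t adaptive_block_size(int64_t filesize)
{
	int64_t blen = filesize / 15000;

	if (blen < MIN_BLOCK_SIZE)
		blen = MIN_BLOCK_SIZE;
	if (blen > MAX_BLOCK_SIZE)
		blen = MAX_BLOCK_SIZE;
	return (int32_t)blen;
}
```

With these numbers a 1MB file would still use the 700-byte
floor, a 15MB file would get 1000-byte blocks, and anything
past roughly 123MB would hit the 8KB ceiling, which keeps
the total block count (and so the checksum collision
exposure) from growing without bound.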
Any change like this will require a protocol bump so should
include the MD4 sum corrections as well.
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt