checksum_seed

Craig Barratt cbarratt at users.sourceforge.net
Wed Feb 11 05:56:00 GMT 2004


On Mon, Feb 09, 2004 at 09:14:06AM -0500, Jason M. Felice wrote:

> I got the go-ahead from the client on my --link-by-hash proposal, and
> the seed is making the hash unstable.  I can't figure out why the seed
> is there so I don't know whether to circumvent it in my particular case
> or calculate a separate, stable hash.

I believe the checksum seed is meant to reduce the chance that different
data could repeatedly produce the same MD4 digest over multiple runs.
If a collision happens, the hope is that a different checksum seed will
break the collision on the next run.

However, my guess is that it doesn't make any difference.  Certainly
adding the seed at the end of the block won't change a collision even
if the seed changes over multiple runs: MD4 processes its input
sequentially, so blocks whose internal state already collides will
still collide after the same seed is appended.  File MD4 checksums add
the seed at the beginning, which might help break collisions, although
I'm not sure.
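
To illustrate the suffix-versus-prefix point, here is a minimal sketch
in Python.  It does not use MD4.  It uses a made-up 16-bit hash
(toy_hash, compress, and find_collision are all hypothetical names)
that has the same chained structure, so a collision can be found by
brute force in well under a second.  The 4-byte little-endian seed
encoding is also an assumption for illustration only.

```python
# Toy chained hash, NOT MD4: 4-byte blocks, 16-bit chaining state.
BLOCK = 4
IV = 0x1234
MASK = 0xFFFF


def compress(state, block):
    """Fold one block into the 16-bit chaining state."""
    for byte in block:
        state = ((state ^ byte) * 0x9E37 + 0x79B9) & MASK
        state ^= state >> 7
    return state


def toy_hash(msg):
    """Pad with 0x80, zeros, and a 2-byte length, then chain over blocks."""
    padded = msg + b"\x80"
    padded += b"\x00" * (-(len(padded) + 2) % BLOCK)
    padded += (len(msg) & MASK).to_bytes(2, "little")
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state


def find_collision():
    """Find two distinct one-block messages with the same chaining state."""
    seen = {}
    # 65537 messages into 65536 possible states: a collision must occur.
    for n in range(MASK + 2):
        msg = n.to_bytes(BLOCK, "little")
        state = compress(IV, msg)
        if state in seen:
            return seen[state], msg
        seen[state] = msg
    raise AssertionError("unreachable by the pigeonhole principle")


m1, m2 = find_collision()
seeds = [s.to_bytes(4, "little") for s in (0, 32761, 0xDEADBEEF)]

# Seed appended after the data: the collision survives every seed.
suffix_collides = [toy_hash(m1 + s) == toy_hash(m2 + s) for s in seeds]
# Seed placed before the data: the chaining state differs before the
# colliding blocks are read, so the collision will usually not survive.
prefix_collides = [toy_hash(s + m1) == toy_hash(s + m2) for s in seeds]

print("suffix seed keeps collision:", suffix_collides)
print("prefix seed keeps collision:", prefix_collides)
```

The suffix result is forced by the construction: both messages have
the same length and reach the same state, and everything hashed after
that point is identical.  The prefix result depends on the particular
hash, which matches my uncertainty above.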

Wayne Davison writes:

> There was some talk last year about adding a --fixed-checksum-seed
> option, but no consensus was reached.  It shouldn't hurt to make the
> seed value constant for certain applications, though, so you can feel
> free to proceed in that direction for what you're doing for your client.
> 
> FYI, I just checked in some changes to the checksum_seed code that will
> make it easier to have other options (besides the batch ones) specify
> that a constant seed value is needed.

I would really like a --fixed-csumseed option to become a standard
feature in rsync.  Just using the batch value (32761) is fine.
Can I contribute a patch?  The reason I want this is the next
release of BackupPC will support rsync checksum caching, so that
backups don't need to recompute block or file checksums.  This
requires a fixed checksum seed on the remote rsync, hence the
need for --fixed-csumseed.  This feature is already in the
pre-built rsync for Cygwin that I provide with the BackupPC
downloads on SourceForge.
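
As a rough sketch of why caching needs a fixed seed (this is not the
BackupPC code): the Python below caches seeded block checksums and
counts how often they have to be recomputed.  ChecksumCache,
block_checksum, and run_backups are hypothetical names, SHA-256 stands
in for MD4 because MD4 is often unavailable in current Python builds,
and appending the seed as a 4-byte little-endian integer is an
assumption.

```python
import hashlib

FIXED_SEED = 32761  # the batch-mode value mentioned above


def block_checksum(block, seed):
    """Seeded block checksum; SHA-256 is a stand-in for MD4 here."""
    return hashlib.sha256(block + seed.to_bytes(4, "little")).digest()


class ChecksumCache:
    """Remembers (seed, digest) per block; reusable only if the seed matches."""

    def __init__(self):
        self._store = {}
        self.recomputed = 0

    def checksum(self, key, block, seed):
        entry = self._store.get(key)
        if entry is not None and entry[0] == seed:
            return entry[1]  # cache hit: same seed as when it was stored
        self.recomputed += 1
        digest = block_checksum(block, seed)
        self._store[key] = (seed, digest)
        return digest


# Three unchanged 700-byte blocks of one hypothetical file.
blocks = {("file.txt", i): bytes([i]) * 700 for i in range(3)}


def run_backups(seeds):
    """Run one backup per seed and return how many checksums were computed."""
    cache = ChecksumCache()
    for seed in seeds:
        for key, block in blocks.items():
            cache.checksum(key, block, seed)
    return cache.recomputed


fixed = run_backups([FIXED_SEED] * 3)        # same seed on every run
varying = run_backups([1001, 1002, 1003])    # new seed on every run
print(fixed, varying)  # prints "3 9"
```

With the seed held constant, only the first backup computes anything.
With a new seed on each run, every cached digest is stale and all nine
checksums are recomputed.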

Craig

