recent discussion regarding 'checksums'

grarpamp grarpamp at gmail.com
Thu Sep 23 22:31:07 MDT 2010


Hi. I wanted to add something to the recent talk about
how rsync uses 'checksums' to decide which files, or
which parts of a file, to copy.

Anyway, it just so happens that I have a number of
files here that rsync completely fails to update, even with -c...

l -isT */* ; md5 -r */* ; sha1 -r */*
117969 9 -rw-r--r--  1 a a 6144 Sep 21 03:05:37 2010 1/a
117970 9 -rw-r--r--  1 a a 6144 Sep 21 03:05:37 2010 2/a
cdc47d670159eef60916ca03a9d4a007 1/a
cdc47d670159eef60916ca03a9d4a007 2/a
84d0d03198b3952b7648d9ac468684fc42771a58 1/a
fa90bd7b1205c4dd452769f73e69108233726462 2/a
rsync -Haxi --delete ./1/ ./2/
 [silence, proper: size and mtime match, so the file is skipped]
rsync -Haxic --delete ./1/ ./2/
 [silence, whoops! caveat: rtfm, -c compares md5; sha1 would have caught this]
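
The same comparison can be reproduced outside rsync. What follows is
only a minimal sketch, assuming Python 3 and its standard hashlib and
filecmp modules; the helper names digest() and compare() are made up
for illustration and are not part of rsync.

```python
import filecmp
import hashlib


def digest(path, algo):
    """Hex digest of a file, read in chunks; algo is a hashlib name."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def compare(path_a, path_b):
    """Report, per method, whether the two files are considered equal."""
    return {
        "md5": digest(path_a, "md5") == digest(path_b, "md5"),
        "sha1": digest(path_a, "sha1") == digest(path_b, "sha1"),
        # shallow=False forces a byte-by-byte comparison of the contents
        "bitwise": filecmp.cmp(path_a, path_b, shallow=False),
    }


if __name__ == "__main__":
    import sys
    print(compare(sys.argv[1], sys.argv[2]))
```

Run against a pair like 1/a and 2/a, the "md5" entry would be the only
one that disagrees with the bitwise result.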


So in short, be careful what you ask for in the name of speed.
MD5 is completely broken in the real world: colliding files
like the pair above can be produced on purpose.
SHA1 is so far only encountering problems on paper.
Please don't weaken the 'checksums' for those of us who value
integrity. Why not just give the user some more options...

--hash <type>, where <type> is one of:
o bitwise compare
o whatever the outcome of the sha-3 contest is
o sha256
o sha1
o md5
o crc[x]/etc, lol :)
o none
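
To make the idea concrete, here is a rough sketch of how such a
selector might behave. It is written in Python 3 rather than rsync's C,
and everything in it is an assumption for illustration: the function
files_match(), the type names, and the reading of "none" as "skip the
content check entirely". The sha-3 entry is left out because no
algorithm is named above, and crc32 stands in for crc[x].

```python
import filecmp
import hashlib
import zlib

# Hash names passed straight through to hashlib (assumed spelling).
HASHLIB_TYPES = ("sha256", "sha1", "md5")


def _hashlib_digest(path, name):
    """Digest of a file's contents using the named hashlib algorithm."""
    h = hashlib.new(name)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.digest()


def _crc32(path):
    """Running CRC-32 of a file's contents."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def files_match(path_a, path_b, hash_type):
    """Hypothetical '--hash <type>' check: True if the files count as equal."""
    if hash_type == "none":
        # Assumption: no content check at all, so nothing is flagged.
        return True
    if hash_type == "bitwise":
        return filecmp.cmp(path_a, path_b, shallow=False)
    if hash_type == "crc32":
        return _crc32(path_a) == _crc32(path_b)
    if hash_type in HASHLIB_TYPES:
        return _hashlib_digest(path_a, hash_type) == _hashlib_digest(path_b, hash_type)
    raise ValueError("unknown hash type: %r" % (hash_type,))
```

Only "bitwise" is immune to collisions; every other choice trades some
integrity for speed, which is the point of letting the user pick.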

You can probably find suitable implementations of these in OpenSSL, and
I'd suggest linking against that library rather than copying its code.

