Optimising the Rsync algorithm for speed by reverting to MD4 hashing

andrew.marlow at uk.bnpparibas.com
Wed Aug 4 05:30:09 MDT 2010


I don't know why rsync moved away from MD4. My guess is that it does not 
look good for rsync to use a discredited algorithm. See 
http://tools.ietf.org/html/draft-turner-md4-to-historic-00.

Creating secure hashing functions is notoriously difficult. Several times, 
algorithms previously thought secure have been shown to be vulnerable to 
certain attacks. MD5 has also been found to be vulnerable. See the 
article "MD5 considered harmful today" at 
http://www.win.tue.nl/hashclash/rogue-ca.

So the question is, does rsync need a hashing algorithm that is 
cryptographically secure? I suppose the answer depends in part on the 
likelihood of different chunks hashing to the same value. With the MD5 
vulnerability, a collision has to be specially engineered. IMO it is 
extremely unlikely that one would happen by chance when MD5 is used by 
rsync. If anyone worries about this then maybe rsync will move to SHA-1 at 
some point. And then what if someone finds a problem with SHA-1? Indeed, 
Bruce Schneier has an article on this at 
http://www.schneier.com/blog/archives/2005/02/sha1_broken.html. Again, 
I reckon that the SHA-1 vulnerability would have no practical effect if 
SHA-1 were used in rsync. Just my $0.02.
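
To put a rough number on "by chance", here is a minimal Python sketch of 
the birthday bound. It assumes the hash behaves like a uniform random 
function on non-adversarial input, and it treats every pair of chunks as a 
potential clash, which is more pessimistic than the way rsync actually 
matches chunks. The function name and the chunk count are made up for 
illustration; nothing here is taken from the rsync source.

```python
def accidental_collision_probability(num_chunks: int, hash_bits: int) -> float:
    """Birthday-bound estimate: (number of chunk pairs) / 2**hash_bits.

    This is an upper-bound style approximation that is only meaningful
    while the result is much smaller than 1.
    """
    pairs = num_chunks * (num_chunks - 1) // 2
    return pairs / 2 ** hash_bits


if __name__ == "__main__":
    # Hypothetical workload: 2**30 chunks (about a billion).
    chunks = 2 ** 30
    for name, bits in (("128-bit digest (MD4, MD5)", 128),
                       ("160-bit digest (SHA-1)", 160)):
        p = accidental_collision_probability(chunks, bits)
        print(f"{name}: {p:.3e}")
```

For a billion chunks and a 128-bit digest the estimate is on the order of 
1e-21, which is why the known attacks only matter if somebody is 
deliberately constructing colliding data.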

rsync uses the hashing function to fingerprint the chunks. I do not see 
why this needs to have all the strengths and safeguards of a cryptographic 
algorithm, unless rsync is supposed to be defending against an attack on 
the protocol. Is it? I didn't think so but I could be wrong; I don't know 
enough about this bit of the rsync code. If it is trying to defend against 
this then IMO it should be using an HMAC rather than just a plain hash. 
Assuming it doesn't need these strengths/safeguards then maybe it should 
use a cheaper (i.e. quicker) hashing algorithm.
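
The difference between a plain fingerprint and an HMAC can be sketched 
with Python's standard hashlib and hmac modules. This is only an 
illustration of the distinction: the function names, the choice of 
HMAC-SHA256 and the idea of a shared key are assumptions made for the 
example, not a description of what rsync does or should do.

```python
import hashlib
import hmac


def fingerprint(chunk: bytes) -> bytes:
    """Plain, unkeyed digest: anyone can compute it for any data."""
    return hashlib.md5(chunk).digest()


def keyed_fingerprint(key: bytes, chunk: bytes) -> bytes:
    """HMAC tag: computing a valid one requires the shared secret key."""
    return hmac.new(key, chunk, hashlib.sha256).digest()


def verify_chunk(key: bytes, chunk: bytes, tag: bytes) -> bool:
    """Check a received tag using a constant-time comparison."""
    return hmac.compare_digest(keyed_fingerprint(key, chunk), tag)


if __name__ == "__main__":
    key = b"hypothetical shared secret"
    chunk = b"some block of file data"
    print(fingerprint(chunk).hex())
    print(verify_chunk(key, chunk, keyed_fingerprint(key, chunk)))
```

A plain digest only detects accidental differences, since an attacker who 
can alter the data can also recompute the digest. The HMAC tag cannot be 
recomputed without the key, but it costs more per chunk, which is the 
trade-off in question.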

Regards,

Andrew Marlow




Nick.McCarthy at replify.com
Sent by: rsync-bounces at lists.samba.org
04/08/2010 09:46

To: rsync at lists.samba.org
Subject: Optimising the Rsync algorithm for speed by reverting to MD4 hashing

Hi,
 

