Optimising the Rsync algorithm for speed by reverting to MD4 hashing
andrew.marlow at uk.bnpparibas.com
Wed Aug 4 05:30:09 MDT 2010
I don't know why rsync made this move. My guess is that it does not look
good for rsync to use a discredited algorithm. See
http://tools.ietf.org/html/draft-turner-md4-to-historic-00.
Creating secure hashing functions is notoriously difficult. Several times,
algorithms previously thought secure have been shown to be vulnerable to
certain attacks. MD5 has also been discovered to be vulnerable. See the
article "MD5 considered harmful today" at
http://www.win.tue.nl/hashclash/rogue-ca.
So the question is, does rsync need a hashing algorithm that is
cryptographically secure? I suppose the concern is due in part to the
likelihood of different chunks hashing to the same value. With the MD5
vulnerability, an attacker has to specially engineer a collision. IMO it is
extremely unlikely that one would happen by chance when used by rsync. If
anyone worries about this then maybe rsync would move to SHA-1 at some
point. And then what if someone finds a problem with SHA-1? Indeed, Bruce
Schneier has an article on this at
http://www.schneier.com/blog/archives/2005/02/sha1_broken.html. Again,
I reckon that the SHA-1 vulnerability would have no practical effect if
SHA-1 were used in rsync. Just my $0.02.
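To put a rough number on "extremely unlikely", here is a back-of-envelope
sketch using the standard birthday (union) bound. It assumes the hash behaves
like a uniformly random function, which is an idealisation, and the chunk
count of 2**30 is a made-up figure for illustration, not something taken from
rsync. The function name is hypothetical too.

```python
from fractions import Fraction


def accidental_collision_bound(num_chunks: int, hash_bits: int) -> Fraction:
    """Upper bound on the chance that any two of num_chunks distinct chunks
    share a hash value, assuming an ideal hash with hash_bits of output.

    Union bound: (number of chunk pairs) / (number of possible hash values).
    """
    pairs = num_chunks * (num_chunks - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


if __name__ == "__main__":
    # 2**30 chunks is an assumed, deliberately generous figure.
    for name, bits in (("MD4/MD5 (128-bit)", 128), ("SHA-1 (160-bit)", 160)):
        bound = accidental_collision_bound(2 ** 30, bits)
        print(f"{name}: at most {float(bound):.3e}")
```

This only covers collisions that happen by chance. It says nothing about
collisions that somebody constructs on purpose, which is what the published
MD5 and SHA-1 attacks are about.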
rsync uses the hashing function to fingerprint the chunks. I do not see
why this needs to have all the strengths and safeguards of a cryptographic
algorithm, unless rsync is supposed to be defending against a protocol
attack. Is it? I didn't think so, but I could be wrong; I don't know enough
about this bit of the rsync code. If it is trying to defend against this
then IMO it should be using an HMAC rather than just a hash code. Assuming
it doesn't need these strengths/safeguards, then maybe it should use a
cheaper (i.e. quicker) hashing algorithm.
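To illustrate the difference between a plain fingerprint and an HMAC, here is
a minimal sketch using Python's standard hashlib and hmac modules. It is not
how rsync does it: the fixed chunk size, the function names and the choice of
MD5 and HMAC-SHA1 are all assumptions made for the example.

```python
import hashlib
import hmac
from typing import List

CHUNK_SIZE = 4  # assumed, unrealistically small so the example stays readable


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_fingerprints(data: bytes) -> List[str]:
    """Plain hash per chunk: anyone who knows the chunk can recompute it."""
    return [hashlib.md5(chunk).hexdigest() for chunk in split_chunks(data)]


def chunk_macs(data: bytes, key: bytes) -> List[str]:
    """Keyed HMAC per chunk: recomputing it requires the shared secret key."""
    return [hmac.new(key, chunk, hashlib.sha1).hexdigest()
            for chunk in split_chunks(data)]


if __name__ == "__main__":
    payload = b"hello rsync"
    print(chunk_fingerprints(payload))
    print(chunk_macs(payload, b"hypothetical shared key"))
```

A plain fingerprint is enough to tell chunks apart when nobody is tampering.
The keyed version only earns its extra cost if both ends share a secret and
the goal is to detect deliberate forgery.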
Regards,
Andrew Marlow
From: Nick.McCarthy at replify.com
Sent by: rsync-bounces at lists.samba.org
Date: 04/08/2010 09:46
To: rsync at lists.samba.org
Subject: Optimising the Rsync algorithm for speed by reverting to MD4 hashing
Hi,