Optimising the Rsync algorithm for speed by reverting to MD4 hashing

Mike Bombich mike at bombich.com
Wed Aug 4 09:23:43 MDT 2010


I agree! I was planning to implement an alternative for my own use to improve performance, though it would be nice to at least have an option in the standard distribution to use something cheaper. The performance hit is very noticeable, especially when the storage itself is fast. On my system, rsync spends about a quarter of its time in md5 methods. I can understand why this is perhaps more important for a tool that does IPC transfers of large amounts of data, but I agree that it doesn't need to be cryptographically secure. We need to verify that what the sender sends is the same as what the receiver receives, but a cheaper algorithm would surely suffice.
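For anyone who wants a rough feel for the relative cost on their own machine, here is a minimal sketch in Python. It is not rsync's code and says nothing about where rsync itself spends time; it only times MD5 against two cheap non-cryptographic checksums (CRC-32 and Adler-32 from zlib, picked purely for illustration) on an in-memory buffer. MD4 is included only if the local OpenSSL build still offers it. The function names and the 8 MiB buffer size are assumptions of this sketch.

```python
import hashlib
import time
import zlib


def time_digest(fn, data, rounds=5):
    """Return the best wall-clock time in seconds of fn(data) over several rounds."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best


def candidates():
    """Map a label to a callable that checksums a bytes object."""
    algos = {
        "md5": lambda d: hashlib.md5(d).digest(),
        "crc32": lambda d: zlib.crc32(d),
        "adler32": lambda d: zlib.adler32(d),
    }
    try:
        # MD4 is disabled in many current OpenSSL builds, so probe for it first.
        hashlib.new("md4", b"")
        algos["md4"] = lambda d: hashlib.new("md4", d).digest()
    except ValueError:
        pass
    return algos


def benchmark(size=8 * 1024 * 1024):
    """Time every available candidate on a zero-filled buffer of `size` bytes."""
    data = bytes(size)
    return {name: time_digest(fn, data) for name, fn in candidates().items()}


if __name__ == "__main__":
    for name, secs in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {secs * 1000:8.2f} ms")
```

The numbers depend heavily on the CPU and on how each library is built, so treat the output as a local data point rather than a general ranking.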

Mike

On Aug 4, 2010, at 6:30 AM, andrew.marlow at uk.bnpparibas.com wrote:

> 
> I don't know why rsync made this move. My guess is that it does not look good for rsync to use a discredited algorithm. See http://tools.ietf.org/html/draft-turner-md4-to-historic-00. 
> 
> Creating secure hashing functions is notoriously difficult. Several times, algorithms previously thought secure have been shown to be vulnerable to certain attacks. MD5 has also been discovered to be vulnerable. See the article "MD5 considered harmful today" at http://www.win.tue.nl/hashclash/rogue-ca. 
> 
> So the question is, does rsync need a hashing algorithm that is cryptographically secure? I suppose the answer depends in part on the likelihood of different chunks hashing to the same value. With the MD5 vulnerability, a collision has to be specially engineered. IMO it is extremely unlikely that it would happen by chance when used by rsync. If anyone worries about this then maybe rsync would move to SHA-1 at some point. And then what if someone finds a problem with SHA-1? Indeed, Bruce Schneier has an article on this at http://www.schneier.com/blog/archives/2005/02/sha1_broken.html. Again, I reckon that the SHA-1 vulnerability would have no practical effect if SHA-1 were used in rsync. Just my $0.02. 
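To put a rough number on "by chance", here is a small sketch using the standard birthday approximation. It assumes the hash output is uniformly distributed and that any pair of blocks colliding counts as a failure, which is a simplification and not a description of how rsync actually compares blocks. The function name and parameters are made up for this illustration.

```python
import math


def accidental_collision_probability(num_blocks, hash_bits):
    """Approximate chance that at least two of num_blocks random blocks
    share the same hash_bits-wide hash (birthday approximation)."""
    pairs = num_blocks * (num_blocks - 1) / 2
    # 1 - exp(-x), written with expm1 so very small probabilities keep precision.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    for bits in (32, 128, 160):
        p = accidental_collision_probability(2 ** 32, bits)
        print(f"{bits:3d}-bit hash, 2^32 blocks: p ~ {p:.3g}")
```

Under these assumptions a 128-bit hash gives a vanishingly small accidental collision probability even for billions of blocks, whereas a 32-bit checksum on its own would not.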
> 
> rsync uses the hashing function to fingerprint the chunks. I do not see why this needs to have all the strengths and safeguards of a cryptographic algorithm. Unless rsync is supposed to be defending against a protocol attack? Is it? I didn't think so but I could be wrong; I don't know enough about this bit of the rsync code. If it is trying to defend against this then IMO it should be using an HMAC rather than just a hash code. Assuming it doesn't need these strengths/safeguards, then maybe it should use a cheaper (i.e. quicker) hashing algorithm. 
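The difference between a plain hash and an HMAC can be sketched in a few lines of Python. This is only an illustration of the general idea, not anything rsync does: the function names, the choice of SHA-256 for the HMAC, and the existence of a secret key shared by both ends are all assumptions here.

```python
import hashlib
import hmac


def plain_fingerprint(chunk):
    """Unkeyed fingerprint: anyone who can alter the chunk can recompute it."""
    return hashlib.md5(chunk).digest()


def keyed_fingerprint(key, chunk):
    """HMAC tag: producing a valid tag requires knowing the shared key."""
    return hmac.new(key, chunk, hashlib.sha256).digest()


def verify_keyed(key, chunk, tag):
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(keyed_fingerprint(key, chunk), tag)


if __name__ == "__main__":
    key = b"hypothetical shared secret"
    chunk = b"some block of file data"
    tag = keyed_fingerprint(key, chunk)
    print(verify_keyed(key, chunk, tag))         # matching chunk and key
    print(verify_keyed(key, chunk + b"x", tag))  # altered chunk
```

A plain fingerprint is enough to detect accidental corruption; the keyed variant only matters if someone on the path may deliberately alter the data and recompute the checksum.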
> 
> Regards,
> 
> Andrew Marlow
> 
> From: Nick.McCarthy at replify.com
> Sent by: rsync-bounces at lists.samba.org
> Date: 04/08/2010 09:46
> To: rsync at lists.samba.org
> Subject: Optimising the Rsync algorithm for speed by reverting to MD4 hashing
> 
> Hi, 
>   
> From v3.0.0 onwards the hash function implemented by Rsync was changed from MD4 to MD5 (http://rsync.samba.org/ftp/rsync/src/rsync-3.0.0-NEWS). My understanding is that MD5 is a more secure, slower version of MD4 but I am not convinced that the added security of MD5 would alone have merited the change from MD4 (particularly since MD4 is ~30% faster than MD5). I wonder if I am missing other reasons which made the change necessary/desirable? 
>   
> I am looking at ways to optimise Rsync (for speed) hence my interest in this, 
>   
> Thanks, 
>   
> Nick 
>  -- 
> Please use reply-all for most replies to avoid omitting the mailing list.
> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html 
