recent discussion regarding 'checksums'

Benjamin R. Haskell rsync at benizi.com
Mon Sep 27 20:33:06 MDT 2010


On Mon, 27 Sep 2010, grarpamp wrote:

> > If
> 
> Ad nauseam... to each shall entertain their own use case scenarios. 
> The overall point is that MD5 is not suitable for data integrity 
> beyond its known [and unknown] weaknesses.

But the flip side is that rsync is not a security tool.  MD5 is fine for 
rsync for the same two reasons SHA-1 (which, as with all hashes, will 
eventually be "broken") is fine for git:

1) Security is handled at a different layer.  (In any case where a file 
could be switched out for a malicious one with the same MD5, you're 
already screwed, because the attacker can do much more malicious things 
much more easily.)

2) The checksum is designed to protect against inadvertent 
(non-malicious) changes.  (The odds of a random error in a file 
producing a broken file with the same MD5 checksum are so utterly, 
incomprehensibly small that worrying about it is beyond useless.)

Also, as I think Wayne pointed out, using "-c" isn't very common in the 
first place.  Given that, see the handwavy explanation at Wikipedia[1] 
of why it helps that rsync (usually) isn't relying on a straight MD5 
alone.  (Basically, you get some extra bits of security from the fact 
that it's also using the rolling checksum.)

[1] (near the end of) http://en.wikipedia.org/wiki/Rsync#Algorithm ("The 
sender then sends the recipient...")
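
To make the "rolling checksum" part concrete, here is a rough sketch of 
the weak checksum as I understand it from the rsync technical report. 
It is not rsync's actual code: the function names and the 16-byte 
window are my own assumptions, chosen only for illustration.

```python
# Sketch of an rsync-style weak rolling checksum (illustrative only).
M = 1 << 16  # both halves of the checksum are kept modulo 2^16


def weak_checksum(block):
    """Compute the (a, b) pair for a whole block from scratch."""
    n = len(block)
    a = sum(block) % M
    # The first byte is weighted by n, the last byte by 1.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one byte forward in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def combine(a, b):
    """Pack the two 16-bit halves into one 32-bit value."""
    return a | (b << 16)


if __name__ == "__main__":
    data = bytes(range(256)) * 2
    n = 16  # hypothetical window size
    a, b = weak_checksum(data[:n])
    for k in range(len(data) - n):
        a, b = roll(a, b, data[k], data[k + n], n)
    print(hex(combine(a, b)))
```

The point is only that a block has to match on this cheap 32-bit value 
as well as on the strong hash before it is treated as identical.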


> I've no faith in an algorithm with such freely generatable collisions 
> to not have other collisions/rot with any of the other 5M inodes 
> turning over at this one particular site each week.

See both of the git-related points above.  What's the attack here (point 
#1)?  And if it's not an attack you're worried about (but rather random 
corruption), it's not worth worrying about (point #2).
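
For a feel of the numbers behind point #2, a back-of-the-envelope 
sketch.  It assumes MD5 behaves like a uniform random 128-bit function 
on non-malicious input, and that each check compares one source file 
against one destination file.  The 1000-year horizon is an arbitrary 
figure of mine; the 5M files per week is taken from the quote above.

```python
from fractions import Fraction

MD5_BITS = 128  # size of an MD5 digest in bits


def pair_collision_odds(bits=MD5_BITS):
    """Chance that two different files share a digest, assuming the
    hash acts like a uniform random function (non-adversarial input)."""
    return Fraction(1, 2 ** bits)


def expected_false_matches(pairs_compared, bits=MD5_BITS):
    """Expected number of corrupted files wrongly reported as identical."""
    return pairs_compared * pair_collision_odds(bits)


if __name__ == "__main__":
    files_per_week = 5_000_000   # figure from the quoted message
    weeks = 52 * 1000            # assumed: a thousand years of weekly runs
    print(float(expected_false_matches(files_per_week * weeks)))
```

Even if every one of those comparisons involved a corrupted file, the 
expected number of misses comes out on the order of 1e-27.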


> Everything's a trade off, features move that decision to the users.

Nonetheless, I do agree with that, and think that having the option to 
use a different hash is a good idea.  Heck, the option to run as many 
different hashes as you want.  I like that Gentoo uses RIPEMD-160, 
SHA-1, and SHA-256 for every file, but it's of questionable utility[2].

[2] http://www.gentoo.org/proj/en/glep/glep-0059.html
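
Running several hashes over the same data costs little extra I/O, since 
one read pass can feed all of them.  A minimal sketch using Python's 
hashlib; the helper name multi_digest is hypothetical, and I've left 
RIPEMD-160 out of the defaults because whether hashlib offers it 
depends on the local OpenSSL build.

```python
import hashlib


def multi_digest(chunks, names=("sha1", "sha256")):
    """Feed every chunk to several hash objects in a single pass and
    return a dict mapping algorithm name to hex digest."""
    hashers = {name: hashlib.new(name) for name in names}
    for chunk in chunks:
        for h in hashers.values():
            h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


if __name__ == "__main__":
    # In real use the chunks would come from reading a file in blocks.
    for name, digest in multi_digest([b"hello ", b"world"]).items():
        print(name, digest)
```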

That said, if future-proofing the security of the --checksum mode is 
your concern, the bugs Matt pointed out are useful.

RFE to rsync to support SHA-256
https://bugzilla.redhat.com/show_bug.cgi?id=483056

Port rsync to use NSS library for cryptography (which gets SHA-256 and 
then some) [linked to by Matt from the above bug]
https://bugzilla.redhat.com/show_bug.cgi?id=348161


> Enjoy :)

I do, but probably won't follow up with much.  Mainly because I'm not 
sure 'rsync' has the same "official" stance on security as what I've put 
forth above.  Secondarily because cryptography gets me sidetracked from 
ye olde daye jobbe.

-- 
Best,
Ben

