MD4 bug in rsync for lengths = 64 * n

Donovan Baarda abo at minkirri.apana.org.au
Sun Sep 1 17:25:59 EST 2002


On Thu, Aug 29, 2002 at 03:17:34PM -0500, Dave Dykstra wrote:
> On Sun, Aug 04, 2002 at 01:19:43PM -0700, Craig Barratt wrote:
[...]
> > However, as I'm sure is well-known,
> > the Adler crc32 and MD4 computed by librsync don't match those in
> > rsync 2.5.5.
> 
> I do not recall hearing anybody mention that before.

I thought that the librsync and rsync delta etc was so radicaly different
that you could never get them talking... I assume that this is all going
towards making librsync and rsync talk, which would be nice.

> > After swapping the crc32 endianess, changing RS_CHAR_OFFSET from 31 to
> > 0, and adding rsync's checksum_seed to librsync's MD4 they now agree,
> > except in one case: when the total data length (including the 4 byte
> > checksum_seed) is a multiple of 64, the MD4 checksums in librsync and
> > rsync don't agree.  After reviewing the code in librsync, rsync and the
> > original RSA implementation, I believe the bug is in rsync.  It doesn't
> > call mdfour_tail() when the last fragment is empty.  Unfortunately
> > this happens in the particularly common case of 700 + 4 = 64 * 11.
> > The same bug occurs in both the block MD4s and the entire-file MD4.

This is the first detailed description of the problem I've seen. I've heard
it mentioned several times before, and thought that the md4 code in librsync
was the same as in rsync. I've looked and tweaked the md4 code in librsync
and could never see the bug so I thought it was a myth. I also thought that
samba used this code.... I wonder what variant it is using :-)

> > The bug is benign in the sense that it is on both sides so rsync works
> > correctly.  But it is possible (I am certainly not a crypto expert) that
> > missing the trailing block (that includes the bit length) substantially
> > weakens the computed MD4.
> > 
> > The fix is easy: a couple of ">" checks should be ">=".  I can send
> > diffs if you want.  But of course this can't be rolled in unless it
> > is coupled with a bump in the protocol version.  
> 
> Another bump in the protocol version is no problem.  Please submit a patch.

I can submit patches if required for the md4code as tweaked/fixed for
librsync. The fixed code is faster as well as correct :-)

> > I saw some earlier
> > email about fixing MD4 to handle files >= 512MB (I presume this
> > relates to the 64-bit bit count in the final block).  Perhaps this
> > change can be made at the same time?
> 
> Could you please post a reference to that email?  It isn't familiar to me
> and I didn't find it through google.  There have been other problems we've
> been seeing with with the end of large files and zlib compression, though.
> I wonder if it can somehow be related.

It may not have been on the rsync list, but on the librsync list... Please
note that there are several variants of the md4 patch floating around. I've
been meaning to seperate the latest md4 patch from my bigger librsync "delta
refactor patch" for some time.

One of my goals for librsync is to make it use the same md4 API as the RSA
code, as this seems to be more widely used and would allow drop-in use of
the RSA implementation for verification. I'm planning to make the API
changes then submit this implementation to libmd. 

Before the API change I'll release a final bugfix/optimize version using the
old API.

-- 
----------------------------------------------------------------------
ABO: finger abo at minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------



More information about the rsync mailing list