MD4 bug in rsync for lengths = 64 * n

Craig Barratt craig at atheros.com
Mon Sep 2 03:46:01 EST 2002


> This is the first detailed description of the problem I've seen. I've heard
> it mentioned several times before, and thought that the md4 code in librsync
> was the same as in rsync. I've looked and tweaked the md4 code in librsync
> and could never see the bug so I thought it was a myth. I also thought that
> samba used this code.... I wonder what variant it is using :-)

Samba looks right to me.  Anyhow, I looked at the archives and found
this message, so I have simply rediscovered the same bug as Tridge:

    http://www.mail-archive.com/rsync@lists.samba.org/msg03919.html

> > > The fix is easy: a couple of ">" checks should be ">=".  I can send
> > > diffs if you want.  But of course this can't be rolled in unless it
> > > is coupled with a bump in the protocol version.  
> > 
> > Another bump in the protocol version is no problem.  Please submit a patch.
> 
> I can submit patches if required for the md4code as tweaked/fixed for
> librsync. The fixed code is faster as well as correct :-)

Sure, that would be great.  Otherwise, I would be happy to recreate
and test a patch.

> > > email about fixing MD4 to handle files >= 512MB (I presume this
> > > relates to the 64-bit bit count in the final block).  Perhaps this
> > > change can be made at the same time?
> > 
> > Could you please post a reference to that email?  It isn't familiar to me
> > and I didn't find it through google.  There have been other problems we've
> > been seeing with the end of large files and zlib compression, though.
> > I wonder if it can somehow be related.
> 
> It may not have been on the rsync list, but on the librsync list... Please
> note that there are several variants of the md4 patch floating around. I've
> been meaning to separate the latest md4 patch from my bigger librsync "delta
> refactor patch" for some time.

I must be spacing.  I can't find the earlier post either.  And I also
can't find my original post in the archives...

Anyhow, the bug occurs in the MD4 file digest for file lengths >= 512MB.
Step 2 in the RFC for the MD4 algorithm specifies that the lower 64 bits
(not 32 bits) of the data's bit length are embedded in the tail buffer;
see:

    http://www.faqs.org/rfcs/rfc1186.html
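
For illustration, here is a rough Python sketch of the tail the RFC
describes (steps 1 and 2: padding, then the 64-bit length).  This is
not the rsync or librsync code; the function name md4_tail is made up
for this example.

```python
import struct

def md4_tail(message_len_bytes: int) -> bytes:
    """Return the bytes MD4 appends to a message of the given length.

    Hypothetical helper for illustration only; not taken from rsync.
    """
    # Step 1: one 0x80 byte, then zero bytes until the total length
    # is congruent to 56 modulo 64 (padding is always added).
    zero_count = (55 - message_len_bytes) % 64
    padding = b"\x80" + b"\x00" * zero_count

    # Step 2: the low-order 64 bits of the length in *bits*,
    # low-order word first (little-endian).
    bit_len = (message_len_bytes * 8) & 0xFFFFFFFFFFFFFFFF
    return padding + struct.pack("<Q", bit_len)
```

Note that the length field is a full 8 bytes, so a 512MB input
(2^32 bits) needs a nonzero upper word in the tail.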

Both librsync and rsync use a 32 bit unsigned int for counting the
number of bytes processed.  This is then multiplied by 8 (to get
bits) and the result is embedded in the tail buffer when MD4 finishes up.
So for files of 4 Gbits (512MB) or more, the 32 bit unsigned int
overflows.  Again, a benign bug but a little disconcerting if you
are using another program to check MD4 digests of large files.
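
The arithmetic can be checked with a small Python sketch.  It only
models the counters (a 32 bit wraparound versus the 64 bit value the
RFC asks for); the function names are invented for this example and
it assumes the 32 bit result is what ends up in the tail buffer.

```python
MB = 1024 * 1024

def bit_count_32(length_bytes: int) -> int:
    """Bit count as a 32 bit unsigned int would hold it (wraps mod 2**32)."""
    byte_count = length_bytes & 0xFFFFFFFF   # 32 bit byte counter
    return (byte_count * 8) & 0xFFFFFFFF     # multiply by 8, still 32 bits

def bit_count_64(length_bytes: int) -> int:
    """Bit count as RFC step 2 specifies it: the low-order 64 bits."""
    return (length_bytes * 8) & 0xFFFFFFFFFFFFFFFF

def counts_agree(length_bytes: int) -> bool:
    """True when the 32 bit count matches the 64 bit one."""
    return bit_count_32(length_bytes) == bit_count_64(length_bytes)
```

With these definitions, counts_agree(512 * MB - 1) is true, while
counts_agree(512 * MB) is false: the 32 bit count wraps to 0 at
exactly 512MB, where the 64 bit count is 2^32.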

Craig


