MD4 bug in rsync for lengths = 64 * n
craig at atheros.com
Mon Sep 2 03:46:01 EST 2002
> This is the first detailed description of the problem I've seen. I've heard
> it mentioned several times before, and thought that the md4 code in librsync
> was the same as in rsync. I've looked and tweaked the md4 code in librsync
> and could never see the bug so I thought it was a myth. I also thought that
> samba used this code.... I wonder what variant it is using :-)
Samba looks right to me. Anyhow, I looked at the archives and found
this message, so I have simply rediscovered the same bug as Tridge:
> > > The fix is easy: a couple of ">" checks should be ">=". I can send
> > > diffs if you want. But of course this can't be rolled in unless it
> > > is coupled with a bump in the protocol version.
> > Another bump in the protocol version is no problem. Please submit a patch.
> I can submit patches if required for the md4 code as tweaked/fixed for
> librsync. The fixed code is faster as well as correct :-)
Sure, that would be great. Otherwise, I would be happy to recreate
and test a patch.
> > > email about fixing MD4 to handle files >= 512MB (I presume this
> > > relates to the 64-bit bit count in the final block). Perhaps this
> > > change can be made at the same time?
> > Could you please post a reference to that email? It isn't familiar to me
> > and I didn't find it through google. There have been other problems we've
> > been seeing with the end of large files and zlib compression, though.
> > I wonder if it can somehow be related.
> It may not have been on the rsync list, but on the librsync list... Please
> note that there are several variants of the md4 patch floating around. I've
> been meaning to separate the latest md4 patch from my bigger librsync "delta
> refactor patch" for some time.
I must be spacing. I can't find the earlier post either. And I also
can't find my original post in the archives...
Anyhow, the bug occurs in the file MD4 digest for file lengths >= 512MB.
Step 2 in the RFC for the MD4 algorithm specifies that the lower 64 bits
(not 32 bits) of the data's bit length are embedded in the tail buffer.
Both librsync and rsync use a 32 bit unsigned int for counting the
number of bytes processed. This is then multiplied by 8 (to get
bits) and this is embedded in the tail buffer when MD4 finishes up.
So for files of 4 Gbits (512MB) or more, the multiplication by 8
overflows the 32 bit unsigned int. Again, a benign bug but a little
disconcerting if you are using another program to check MD4 digests
of large files.
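
To make the arithmetic concrete, here is a small Python sketch of the
length field only. It is not the rsync or librsync code: the function
names are made up for illustration, the 32 bit counter is simulated by
masking, and I am assuming the upper 32 bits of the field end up zero
in the buggy case. The 64 bit field is packed low-order byte first, as
the RFC describes.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def length_field_32bit_counter(nbytes):
    """Hypothetical: tail length field built from a 32 bit byte counter."""
    count = nbytes & MASK32         # byte count held in a 32 bit unsigned int
    bits = (count * 8) & MASK32     # the multiply by 8 wraps modulo 2**32
    return struct.pack("<Q", bits)  # assumption: upper 32 bits left as zero

def length_field_rfc(nbytes):
    """Tail length field per the RFC: low 64 bits of the bit length."""
    bits = (nbytes * 8) & MASK64
    return struct.pack("<Q", bits)

# Compare the two just below, at, and just above the 512MB boundary.
for nbytes in (2**29 - 1, 2**29, 2**29 + 64):
    same = length_field_32bit_counter(nbytes) == length_field_rfc(nbytes)
    print(nbytes, "agree" if same else "differ")
```

The two fields agree for every length below 2**29 bytes and differ from
there on, which is why only the digests of files >= 512MB are affected.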