[ccache] Questions about two hot functions in ccache

Michael Niedermayer michaelni at gmx.at
Wed Oct 20 09:03:02 MDT 2010


On Tue, Oct 19, 2010 at 11:18:06PM -0200, Ramiro Polla wrote:
> Hi,
> 
> On Tue, Oct 19, 2010 at 7:15 PM, Joel Rosdahl <joel at rosdahl.net> wrote:
> > On 2010-10-18 03:44, Justin Lebar wrote:
> >> So it appears that 13% of my CPU time is spent computing md4 hashes,
> >> while another 25% is spent in hash_source_code_string but outside the
> >> MD4 code.
> >>
> >> To someone new to the code like me, it appears that there's some room
> >> for optimization here.
> >
> > Indeed. (This was by the way mentioned on the list a couple of weeks
> > ago: http://www.mail-archive.com/ccache@lists.samba.org/msg00532.html)
> [...]
> >> * Why does ccache still use MD4? Surely there's a better / faster hash
> >> out there. I noticed that ccache includes murmurhash, but it doesn't
> >> seem like it's used in too many places. There's probably a good reason
> >> for this, but it's not too apparent to me.
> >> [...]
> >
> > I added murmurhash to get a good hash function for some hash tables I
> > introduced when implementing the direct mode, which is a relatively late
> > invention. I'm far from an expert on hash functions, but surely
> > murmurhash and similar functions (that are designed to be good hash
> > functions for a hash table) aren't suitable to use for identifying
> > (without verification) arbitrary content like a cryptograhic hash
> > function is. Even the 64-bit version of murmurhash has way too high
> > collision rate.
> >
> > MD4 has been there from the start and neither Tridge or I have seen any
> > reason to switch it. MD5, SHA1 and other even more modern cryptograhic
> > hash functions are indeed stronger but also slower, and the increased
> > resistance against various crypto attacks doesn't seem necessary in a
> > tool like ccache. That said, I'm sure there nowadays may exist hash
> > functions that are both better (i.e., with lower collision rate) AND
> > faster than MD4. Do you (or anyone else) know of any with properties
> > that would be a good fit for ccache?
> 
> I'm CC'ing a couple of gurus from FFmpeg in the hope that they can help us.

Iam not really a hash function expert but i know enough math and random
number stuff that i think iam able to comment ...

I assume crypto strength isnt needed and about 64bit collision protection
should be there. That would require at least a 128bit hash.

The first obvious choice is a 128bit CRC this requires a table lookup per
byte though and some 128bit xor and is likely not the fastest but there should
be no doubt about it being good enough

The next idea i have is something along the lines of adler32
first we would use 32bit instead of 8bit input (makes it 4 times faster and
requires a 32bit prime) at this point we would have a 64bit checksum, luckily
adler32 can easily be extended by adding 2 higher order stages
the result would look like this:

#define BASE 0x10000000F // 32bit prime
adler128(const uint32_t *input, int len){
    uint64_t a,b,c,d
    while (len>0) {
        while(len>0 && d<CONSTANT_TO_PREVENT_OVERFLOW){
            uint32_t in=*input++; len--;
            d+=c;
            c+=b;
            b+=a;
            a+=in
        }
        a %= BASE;
        b %= BASE;
        c %= BASE;
        d %= BASE;
    }
    return "a,b,c,d"
}

The above checksum should have the property that any 2 inputs that produce the same
checksum must either be equal or differ in at least 4 32bit words and it should
be stronger than adler32

yet another way to get a fast 128bit hash function is to use a LFG
heres an example:

tmp[i] = input[i] + tmp[i-7] + tmp[i-10]
with input and tmp being uint64_t
this can be implemented extreemly efficient by putting all 10 tmp values in
registers requireing only 1 64bit read and 2 64bit additions for each 8 bytes
of input if the loop is unrolled
the output of that hash function would be 10*64bit and could either be used
directly or passed through a second hash to get 128bit out of it
other values instead of 7 and 10 can be used to improve the strength of this
but then they wont fit in the 16 available registers on x86_64 anymore


-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.samba.org/pipermail/ccache/attachments/20101020/6b8a91a5/attachment.pgp>


More information about the ccache mailing list