[ccache] Questions about two hot functions in ccache
ramiro.polla at gmail.com
Tue Oct 19 19:18:06 MDT 2010
On Tue, Oct 19, 2010 at 7:15 PM, Joel Rosdahl <joel at rosdahl.net> wrote:
> On 2010-10-18 03:44, Justin Lebar wrote:
>> So it appears that 13% of my CPU time is spent computing md4 hashes,
>> while another 25% is spent in hash_source_code_string but outside the
>> MD4 code.
>> To someone new to the code like me, it appears that there's some room
>> for optimization here.
> Indeed. (This was by the way mentioned on the list a couple of weeks
> ago: http://email@example.com/msg00532.html)
>> * Why does ccache still use MD4? Surely there's a better / faster hash
>> out there. I noticed that ccache includes murmurhash, but it doesn't
>> seem like it's used in too many places. There's probably a good reason
>> for this, but it's not too apparent to me.
> I added murmurhash to get a good hash function for some hash tables I
> introduced when implementing the direct mode, which is a relatively late
> invention. I'm far from an expert on hash functions, but surely
> murmurhash and similar functions (that are designed to be good hash
> functions for a hash table) aren't suitable to use for identifying
> (without verification) arbitrary content like a cryptographic hash
> function is. Even the 64-bit version of murmurhash has way too high a
> collision rate.
> MD4 has been there from the start and neither Tridge nor I have seen any
> reason to switch it. MD5, SHA1 and other even more modern cryptographic
> hash functions are indeed stronger but also slower, and the increased
> resistance against various crypto attacks doesn't seem necessary in a
> tool like ccache. That said, I'm sure there nowadays may exist hash
> functions that are both better (i.e., with lower collision rate) AND
> faster than MD4. Do you (or anyone else) know of any with properties
> that would be a good fit for ccache?
I'm CC'ing a couple of gurus from FFmpeg in the hope that they can help us.
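
To put rough numbers on the collision-rate concern quoted above, here is a
minimal sketch of a birthday-style upper bound. It assumes an idealized hash
whose output is uniformly distributed, so it says nothing about structural
weaknesses of murmurhash or MD4, nor about deliberate attacks. The function
name collision_bound and its parameters are made up for illustration and are
not part of ccache.

```c
#include <assert.h>

/* Upper bound on the probability that at least two of n distinct inputs
 * share a digest, for an idealized hash with uniformly distributed
 * `bits`-bit output. There are n*(n-1)/2 input pairs and each pair
 * collides with probability 2^-bits, so by the union bound
 * p <= n*(n-1) / 2^(bits+1). Hypothetical helper, not ccache code. */
double collision_bound(double n, int bits)
{
    assert(n >= 0.0 && bits > 0);

    double p = n * (n - 1.0) / 2.0;   /* number of distinct input pairs */
    for (int i = 0; i < bits; i++)
        p /= 2.0;                     /* scale by 2^-bits, one bit at a time */

    if (p < 0.0)                      /* n between 0 and 1 has no pairs */
        p = 0.0;
    if (p > 1.0)                      /* a probability cannot exceed 1 */
        p = 1.0;
    return p;
}
```

Calling collision_bound(1e6, 64) and collision_bound(1e6, 128) compares a
64-bit digest with a 128-bit one (MD4's digest size) for a cache holding a
million distinct entries. Under this model the 128-bit bound is many orders
of magnitude smaller; whether the 64-bit figure is acceptable for a tool that
never verifies content is a judgment call this sketch cannot settle.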