[ccache] Using git file hashes for ccache

Wilson Snyder wsnyder at wsnyder.org
Fri Dec 31 05:27:52 MST 2010

>> Maybe the right thing to do would be to have ccache keep track of the
>> source files' attributes.  If some environment variable was set,
>> ccache would treat a file with unchanged attributes as unchanged.
>> (ccache could maintain a new index into its cache, indexed on absolute
>> path, or it could hash a string "magic-bitstring | file-path | file
>> attributes" and use the current cache infrastructure.)  This seems a
>> lot simpler than trying to interface with git.
>I think that is a better approach too.  It's probably enough to just
>store the mtime and (on unix) ctime.  There are a couple of tricks to
>doing this safely: if the time == the current time, you can't trust it
>because the file could be modified again before the end of the current
>second.  On some filesystems (e.g. vfat) the fs actually only stores
>timestamps at 2-second granularity.  On some Linux systems you get sub-second
>accuracy while the file is in cache but not when it's been flushed to
>disk (this might have been fixed.)

I also think this is a good approach.  Having been down this
road before, though: mtime alone isn't always enough, as you
noted, but also including the size makes it *almost*
perfect, since most edits change the size.
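
A minimal sketch of that check in Python, just to make the idea
concrete.  The function name attr_key and the 2-second window are
my own assumptions for illustration (the window is chosen to cover
the vfat case mentioned above); none of this is existing ccache code.

```python
import os
import time

# Assumed safety margin: 2 seconds covers filesystems such as vfat
# that only store timestamps at 2-second granularity.
RACY_WINDOW_NS = 2 * 1_000_000_000


def attr_key(path, now_ns=None):
    """Return a key built from path, size, mtime and ctime.

    Returns None when the newest timestamp is too close to "now"
    (or lies in the future), because the file could be modified
    again without its timestamp visibly changing.
    """
    st = os.stat(path)
    if now_ns is None:
        now_ns = time.time_ns()
    newest = max(st.st_mtime_ns, st.st_ctime_ns)
    if now_ns - newest < RACY_WINDOW_NS:
        return None  # untrusted: caller should hash the file contents
    fields = [
        os.path.abspath(path),
        str(st.st_size),
        str(st.st_mtime_ns),
        str(st.st_ctime_ns),
    ]
    return "|".join(fields)
```

When attr_key returns None the caller would fall back to hashing
the file contents as usual, so a racy timestamp only costs speed,
not correctness.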

Note that several tools, such as scons, use this technique, and
some store the hashes in a single hash file inside each source
directory.  That has the nice advantage of allowing sharing,
but the downside of polluting the source areas, so I don't
really like it.  I think putting it into the ccache
infrastructure is nicer, but you may still want the hashes
for all files in a directory stored together under a hash of
the directory name, instead of separately under a hash of each
filename, because that allows reading fewer files.  (Otherwise
reading the hundreds of hash files will become the new bottleneck.)
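
Roughly, the per-directory layout could look like the Python sketch
below.  Everything here is hypothetical: the function names, the
JSON format and the choice of SHA-256 for the directory name are
assumptions for illustration, not how ccache stores anything.

```python
import hashlib
import json
import os


def manifest_path(cache_dir, src_dir):
    """One manifest per source directory, named by a hash of its path."""
    name = os.path.abspath(src_dir).encode("utf-8")
    return os.path.join(cache_dir, hashlib.sha256(name).hexdigest() + ".json")


def load_manifest(cache_dir, src_dir):
    """Read every record for src_dir with a single file open."""
    try:
        with open(manifest_path(cache_dir, src_dir), encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_manifest(cache_dir, src_dir, manifest):
    """Write to a temporary file, then rename it over the old manifest."""
    os.makedirs(cache_dir, exist_ok=True)
    final = manifest_path(cache_dir, src_dir)
    tmp = final + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(manifest, f)
    os.replace(tmp, final)


def record(manifest, path, content_hash):
    """Remember the content hash together with the file's size and mtime."""
    st = os.stat(path)
    manifest[os.path.basename(path)] = {
        "size": st.st_size,
        "mtime_ns": st.st_mtime_ns,
        "hash": content_hash,
    }


def lookup(manifest, path):
    """Return the stored hash if size and mtime still match, else None."""
    entry = manifest.get(os.path.basename(path))
    if entry is None:
        return None
    st = os.stat(path)
    if entry["size"] == st.st_size and entry["mtime_ns"] == st.st_mtime_ns:
        return entry["hash"]
    return None
```

A build touching a hundred sources in one directory then opens one
manifest instead of a hundred small hash files, and the source tree
itself stays clean.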
