[ccache] Optimizing MD4
Andrew Stubbs
ams at codesourcery.com
Fri Dec 11 10:16:11 UTC 2015
On 10/12/15 17:16, Anders Björklund wrote:
> Andrew Stubbs wrote:
>> Most of the rest of the time is spent doing MD4. I have some ideas how
>> to optimize that (by sharing them across runs), but nothing ready to post.
>
> I would be interested in your thoughts on how to speed that part up.
My implementation does a bunch of other things besides, which is why
it's not fit to post[*]. It launches a background task that creates a
Unix domain socket in the cache directory (the Windows version uses plain
old TCP).
Each invocation of ccache then connects to that socket and asks the
daemon to do the MD4 scan on its behalf. The daemon checks the mtime on
the file and serves the MD4 from its memory cache if nothing has
changed. The stat call could probably be optimized away if the cache
entry is very fresh (<1s?).
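
As a rough illustration of the lookup described above (this is not the
CodeBench code), here is a minimal Python sketch of an mtime-validated
digest cache. The class name DigestCache, the fresh_seconds parameter and
the hash_count counter are all hypothetical. MD4 is not guaranteed to be
available through Python's hashlib, so the sketch uses SHA-256 as a
stand-in; the caching logic does not depend on the hash.

```python
import hashlib
import os
import time


class DigestCache:
    """In-memory file digests, revalidated by mtime (hypothetical helper)."""

    def __init__(self, fresh_seconds=1.0, algorithm="sha256"):
        # Assumption: SHA-256 stands in for MD4, which hashlib may not provide.
        self.fresh_seconds = fresh_seconds
        self.algorithm = algorithm
        self.entries = {}  # path -> (mtime_ns, digest, checked_at)
        self.hash_count = 0  # how many times a file was actually hashed

    def _hash_file(self, path):
        h = hashlib.new(self.algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        self.hash_count += 1
        return h.hexdigest()

    def digest(self, path):
        now = time.monotonic()
        entry = self.entries.get(path)
        # Very fresh entry: skip even the stat call.
        if entry is not None and now - entry[2] < self.fresh_seconds:
            return entry[1]
        mtime_ns = os.stat(path).st_mtime_ns
        # Unchanged mtime: serve the digest from memory.
        if entry is not None and entry[0] == mtime_ns:
            self.entries[path] = (mtime_ns, entry[1], now)
            return entry[1]
        # New or modified file: hash it and remember the result.
        digest = self._hash_file(path)
        self.entries[path] = (mtime_ns, digest, now)
        return digest
```

The freshness shortcut trades correctness for speed: a file edited within
fresh_seconds of the last check is served with its old digest, so the
window would need to stay small.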
The daemon is single-threaded, but I still found a useful speed-up after
all the headers had been scanned. For many projects it's basically the
same set of maybe 20 headers that get scanned over and over again. I
think the hashing of non-header files is not repetitive, and probably
best handled in ccache itself.
The daemon exits after 10 minutes of inactivity.
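
The socket setup and the idle exit could be sketched roughly as follows
in Python. This is only an illustration under stated assumptions: the
function name serve, the lookup callback, and the wire format (one
newline-terminated path per connection, answered with one line) are all
invented here and say nothing about the real protocol. It uses an AF_UNIX
socket, so it covers only the Unix case, not the TCP variant.

```python
import os
import socket


def serve(sock_path, lookup, idle_timeout=600.0):
    """Answer one lookup per connection; return the number served once idle.

    Hypothetical wire format: the client sends a path terminated by a
    newline and receives lookup(path) followed by a newline.
    """
    if os.path.exists(sock_path):
        os.unlink(sock_path)  # clear a stale socket left by an earlier daemon
    served = 0
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(16)
        # accept() raises socket.timeout after idle_timeout seconds with no client.
        server.settimeout(idle_timeout)
        try:
            while True:
                try:
                    conn, _ = server.accept()
                except socket.timeout:
                    return served  # nobody connected for a while: exit
                # Single-threaded: each client is handled to completion in turn.
                with conn, conn.makefile("rwb") as stream:
                    path = stream.readline().rstrip(b"\n").decode()
                    stream.write(lookup(path).encode() + b"\n")
                    stream.flush()
                served += 1
        finally:
            os.unlink(sock_path)  # remove the socket file on the way out
```

The default of 600 seconds matches the 10 minutes mentioned above. Since
the loop handles one client at a time, a stalled client would block
everyone else, so a real daemon would want a per-connection timeout too.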
In theory, what you get is ccache spending less time in MD4, but more
time in I/O wait. It does seem to be faster overall, but that might
depend on your hardware.
However, even if the latency of each ccache invocation is the same, the
fact that the ccache processes are basically idle while they wait means
you can usefully crank up the parallelism for all but the initial build.
You could, in principle, use this communication to limit how many
cache-miss compilations are permitted to run in parallel, and therefore
run "make -j" for maximum parallelism without fear of melting your memory.
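
One way the daemon might do that bookkeeping is a simple slot counter
with a FIFO queue of waiting clients. The sketch below is purely
hypothetical (nothing like it is described as implemented above); the
names SlotLimiter, request and release are made up for illustration, and
how the grant is actually communicated to a waiting ccache process is
left out.

```python
from collections import deque


class SlotLimiter:
    """Caps concurrent cache-miss compilations (hypothetical sketch)."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.running = set()    # clients currently allowed to compile
        self.waiting = deque()  # clients queued in arrival order

    def request(self, client_id):
        """Return True if the client may compile now, False if it was queued."""
        if len(self.running) < self.max_slots:
            self.running.add(client_id)
            return True
        self.waiting.append(client_id)
        return False

    def release(self, client_id):
        """Free the client's slot; return the next client granted, or None."""
        self.running.discard(client_id)
        if client_id in self.waiting:
            # The client gave up (or disconnected) before being granted.
            self.waiting.remove(client_id)
        if self.waiting and len(self.running) < self.max_slots:
            granted = self.waiting.popleft()
            self.running.add(granted)
            return granted
        return None
```

For example, with SlotLimiter(2) the third request is queued and is
granted only when one of the first two clients releases its slot. Cache
hits would never call request, so they would stay unthrottled.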
Unfortunately, I've moved on to other projects and don't have much time
to work on this stuff any more.
Andrew
[*] The implementation was included in some editions of the Sourcery
CodeBench toolchain, so you can find it in the source package if you
really want to. I think you can find it here:
https://sourcery.mentor.com/GNUToolchain/release3047 (free registration
required).