[ccache] Caching failed compilations

Joel Rosdahl joel at rosdahl.net
Mon Jul 6 22:44:50 CEST 2015


>
> After thinking further, I'd be tempted to say that ccache should *not*
> cache failures with exit codes other than "1" as they're likely not
> repeatable (OOM, Ctrl-C, etc.).


> Perhaps just signal a failed compile with a cache result that is present
> but zero-length? (We could also say that it failed if the binary cache
> is missing, but the stderr cache is present, but that might be problematic.)


That sounds like a reasonable idea, but I have occasionally seen empty
object files in large and busy caches (it could be due to filesystem
failure, hardware failure or hard system reset), so I'm afraid that using
zero-length object files won't work out in practice. See also
https://bugzilla.samba.org/show_bug.cgi?id=9972. But maybe writing some
special content to the object file would be OK?
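
To make the distinction concrete, here is a minimal sketch in Python (not
ccache code). The marker value and the function name are made up for
illustration; the only point is that an empty file is treated as suspected
corruption, while a failed compile is signalled by explicit content.

```python
# Hypothetical marker written in place of an object file when the compile
# failed. ccache defines no such constant; this is an assumption.
FAILURE_MARKER = b"CCACHE_FAILED_COMPILE\n"


def classify_cached_object(data):
    """Classify the raw bytes of a cached object file."""
    if len(data) == 0:
        # Empty files show up after filesystem or hardware trouble, so they
        # cannot double as a "failed compile" signal.
        return "corrupt"
    if data == FAILURE_MARKER:
        return "failed"
    return "object"
```

A lookup would then treat "corrupt" as a cache miss and "failed" as a cached
failure.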

> Sorry, I don't see any advantage in this scheme. You might save a few bytes
> of disk space, and maybe a few inodes, but I've not seen any evidence that
> those are a problem. You'll also add extra file copies to every cache miss,
> and those are already expensive enough.


My primary motivation for considering the mentioned scheme is to reduce
disk seeks, not disk space. If you have a cold disk cache (on a rotating
device), every new inode that needs to be visited likely needs a new disk
seek, which is slow. If all parts of the result are stored in one
contiguous file, it should be quicker to retrieve. But as mentioned
earlier, I have no data to back up this theory yet.

A secondary motivation for the scheme is that various code paths in ccache
need to handle multiple files for a single result. There can now be between
two (stderr, object) and six (stderr, object, dependency, coverage,
diagnostics, split dwarf) files for each cached result. If one of those
files is missing, then the result should be invalid. This is quite painful
and there are most likely some lurking bugs related to this.
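
As a rough illustration of the bookkeeping involved, the following Python
sketch (not ccache code) checks that every file expected for a result is
present. The function name, the key-plus-suffix naming and the suffixes in the
usage note are assumptions made for the example.

```python
import os


def result_is_complete(cache_dir, key, expected_suffixes):
    """Return True only if every expected file for this result exists."""
    return all(
        os.path.isfile(os.path.join(cache_dir, key + suffix))
        for suffix in expected_suffixes
    )
```

For example, result_is_complete(cache_dir, key, [".o", ".stderr", ".d"]) is
False as soon as one of the three files has gone missing, and every code path
that reads a result would need an equivalent check.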

A third motivation is that it would be easier to include a checksum of the
cached data to detect corruption so that ccache won't repeatedly deliver a
bad object file (due to hardware error or whatnot).
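
A minimal sketch of what such a single-file result could look like, in Python
(not ccache code). The layout is an assumption chosen for illustration: a part
count, then each part as name length, data length, name and data, followed by
a SHA-256 digest of everything before it. The function and part names are
hypothetical.

```python
import hashlib
import struct

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes
PART_HEADER = "<HQ"  # name length (uint16), data length (uint64)


def pack_result(parts):
    """Pack {name: bytes} into one blob with a trailing SHA-256 digest."""
    body = bytearray(struct.pack("<I", len(parts)))
    for name, data in parts.items():
        encoded = name.encode("utf-8")
        body += struct.pack(PART_HEADER, len(encoded), len(data))
        body += encoded
        body += data
    return bytes(body) + hashlib.sha256(body).digest()


def unpack_result(blob):
    """Verify the digest and return {name: bytes}; raise ValueError if bad."""
    if len(blob) < 4 + DIGEST_SIZE:
        # Also covers the zero-length files seen in damaged caches.
        raise ValueError("truncated result")
    body, digest = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    if hashlib.sha256(body).digest() != digest:
        raise ValueError("checksum mismatch")
    (count,) = struct.unpack_from("<I", body, 0)
    offset = 4
    parts = {}
    for _ in range(count):
        name_len, data_len = struct.unpack_from(PART_HEADER, body, offset)
        offset += struct.calcsize(PART_HEADER)
        name = body[offset:offset + name_len].decode("utf-8")
        offset += name_len
        parts[name] = body[offset:offset + data_len]
        offset += data_len
    return parts
```

With this layout a reader opens one file, and a ValueError from
unpack_result can be handled as a cache miss instead of delivering a damaged
object file.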

Does this sound reasonable? What disadvantages do you see?

-- Joel

