[ccache] Caching failed compilations
Andrew Stubbs
andrew_stubbs at mentor.com
Wed Jul 8 13:44:30 UTC 2015
On 08/07/15 14:04, Joel Rosdahl wrote:
> On 7 July 2015 at 10:58, Andrew Stubbs <ams at codesourcery.com
> <mailto:ams at codesourcery.com>> wrote:
> > On 06/07/15 21:44, Joel Rosdahl wrote:
> > > But maybe writing some special content to the object file would be OK?
> >
> > OK, fair enough, but I'd say that once you've opened the file and checked
> > the magic data then you've already killed performance.
>
> On a cache miss, the object file doesn't exist, so it doesn't need to be
> opened. On a cache hit, we need to open and read the file regardless of
> whether it's a real object file or special data encoding an exit code.
> In what way would this kill performance?
On cache-hit, there's currently no reason to actually look inside the
file, right? It just does the copy blind (I forget exactly how). Reading
the initial data from every binary on every cache-hit (the case we want
to be most optimal) sounds like a Bad Thing.
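A minimal sketch (in Python, with hypothetical file names and a hypothetical magic string -- nothing here is ccache's actual format) of the two hit-path strategies under discussion: the current blind copy that never inspects the cached object, versus a variant that reads a magic header on every hit and so adds an extra open+read to the most common case.

```python
import shutil

# Hypothetical marker text; ccache defines no such format today.
MAGIC = b"ccache-failed-compilation"

def restore_blind(cached, dest):
    # Current behaviour (as described above): copy the cached object
    # to the destination without ever looking inside it.
    shutil.copyfile(cached, dest)

def restore_with_header_check(cached, dest):
    # Proposed variant: peek at the first bytes of *every* cached
    # file before deciding what it is -- an extra open and read on
    # the quick path, which is the cost being objected to.
    with open(cached, "rb") as f:
        head = f.read(len(MAGIC))
    if head == MAGIC:
        raise RuntimeError("cached failure marker, not a real object file")
    shutil.copyfile(cached, dest)
```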
>> A failure can be confirmed by a read, if and only if the length matches, but
>> a compile success will remain on the quick path.
>
> You lost me there. :-) I don't understand what you think would be a slow
> path. Please expand on this.
The most common case must always be the "quick path"; i.e. we should
try to handle it with the fewest file stats, opens, reads, etc. Any
other case is necessarily slower (because it requires more decision
making), and we should ensure that those extra decisions are not
pushed up into the quick path.
So, if the cached binary has some property that says "this isn't the
most common case", then we need to be able to identify that with as
little additional overhead as possible. The cost of system calls
massively dwarfs the cost of simple logic comparisons, so an optimal
solution would use an indicator that is already available.
For example, if the binary file does not exist then we don't need an
extra system call to figure out that we don't have a plain
old-fashioned cache-hit.
For another example, if the binary file has a very specific size then we
can see that from the stat call the code already does (at least, I think
it does). The file size itself could be a coincidence, so in that
case we'd have to read the file to check for the magic text. However,
since this is the unlikely case -- the slow path -- that's OK.
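The scheme above could be sketched as follows (in Python, with a hypothetical magic string and marker size -- this is an illustration of the idea, not ccache's actual code or format). A single stat call resolves the two common cases; only the coincidental-size case pays for an extra open and read.

```python
import os

MAGIC = b"ccache-exit:"   # hypothetical marker prefix
MARKER_SIZE = 16          # hypothetical fixed size of a failure marker file

def classify(cached):
    # One stat call covers the common cases: a missing file is a
    # plain miss, and any size other than MARKER_SIZE is a plain hit.
    try:
        size = os.stat(cached).st_size
    except FileNotFoundError:
        return "miss"
    if size != MARKER_SIZE:
        return "hit"      # quick path: no extra open/read needed
    # Unlikely case (slow path): the size could be a coincidence,
    # so read the file and check for the magic text.
    with open(cached, "rb") as f:
        data = f.read(MARKER_SIZE)
    if data.startswith(MAGIC):
        return "cached-failure"
    return "hit"          # a real object that happens to match the size
```

Only cached objects whose size happens to equal MARKER_SIZE ever take the extra open+read, so the quick path stays at one stat.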
> For the standard code paths, yes (barring bugs), but e.g. when doing
> cleanup it has no information about which files to expect so it has to
> try to delete all known file types for a given result.
The performance of cleanup is not important (within reason).
Of primary importance is how quickly we can speed through a build
consisting entirely of cache-hits.
Of secondary importance is keeping the overhead of a build consisting
entirely of cache-misses to a minimum.
Everything else is the unlikely case, and therefore need only be "not
terrible". :-)
Andrew