[ccache] Why not cache link commands?

Andrew Stubbs ams at codesourcery.com
Wed Sep 19 03:43:21 MDT 2012


On 18/09/12 22:59, Mike Frysinger wrote:
> the linker's --build-id and associated .note.gnu.build-id section.  you can't
> hash the entire object because it can change between compiles.  build-id lets
> you say "regardless of the hash of the entire object, we know the content that
> matters is unchanged".

Ah, excellent, this is the sort of detail I was looking for!

My own brief experimentation shows that static libraries contain 
troublesome datestamps, but object files appear to be reproducible, 
given the same source and command line (the case ccache handles).

Under what circumstances can the binary change but the build-id remain
the same? I'm aware of line-number and file-path differences in the
debug info. Is there anything else?

Anyway, as I understand it, ccache could dump and hash the build-id
section if there is one, and fall back to hashing the entire binary if
there isn't one.
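
A minimal sketch of that two-step scheme, in Python for brevity (ccache itself is C). It assumes binutils' "readelf -n" is on the PATH and prints a "Build ID: <hex>" line for files that carry one; the function names and the key prefixes are made up for illustration.

```python
import hashlib
import re
import subprocess


def read_build_id(path):
    """Return the build-id as lowercase hex, or None if none is found.

    Assumption: `readelf -n` is available and reports the note as
    "Build ID: <hex>".
    """
    try:
        result = subprocess.run(
            ["readelf", "-n", path],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None  # readelf missing: behave as if there were no build-id
    match = re.search(r"Build ID:\s*([0-9a-fA-F]+)", result.stdout)
    return match.group(1).lower() if match else None


def link_input_key(path, build_id_reader=read_build_id):
    """Prefer the build-id; otherwise hash the whole file."""
    build_id = build_id_reader(path)
    if build_id is not None:
        return "build-id:" + build_id
    with open(path, "rb") as handle:
        return "sha256:" + hashlib.sha256(handle.read()).hexdigest()
```

The reader is passed in as a parameter so that the fallback path can be exercised without a real ELF file.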

I'm a bit concerned about the build-id though. As I read it, the 
build-id can't tell the difference between a stripped binary and one 
with full debug, and the two certainly produce different output (OK, a 
*very* smart tool could determine that, with a certain link command or 
script, two different inputs are equivalent, but let's not go there). It
can't even tell either of those apart from an object with *only* debug
info.

Hashing the entire binary could lead to additional cache misses in the 
case that the user has made minor, unimportant changes to the build, but 
in the normal case the object file will have come from the cache anyway 
so this won't be a problem.

The library datestamps problem can be worked around by hashing the output
of "ar p libNAME.a" (perhaps combined with "ar t libNAME.a", just to be
safe, but certainly not with "-v", which prints the datestamps), or
perhaps "objdump -j .note.gnu.build-id -s libNAME.a" if we want to use
build-ids.
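
To illustrate the idea without shelling out to "ar", here is a rough Python sketch that reads the common ar layout directly and hashes only each member's name, size and contents, skipping the date, uid, gid and mode fields. That is roughly what hashing "ar t" plus "ar p" would cover. The function name is hypothetical, and it treats the symbol table and long-name table as ordinary members.

```python
import hashlib

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name[16] date[12] uid[6] gid[6] mode[8] size[10] magic[2]


def archive_digest(data):
    """Hash an ar archive's member names and bodies, ignoring datestamps."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    digest = hashlib.sha256()
    offset = len(AR_MAGIC)
    while offset < len(data):
        header = data[offset:offset + HEADER_SIZE]
        if len(header) < HEADER_SIZE or header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        name_field = header[0:16]
        size_field = header[48:58]
        size = int(size_field.decode("ascii").strip())
        body_start = offset + HEADER_SIZE
        body = data[body_start:body_start + size]
        if len(body) != size:
            raise ValueError("truncated member")
        # Deliberately skipped: date (16:28), uid, gid and mode (28:48).
        digest.update(name_field)
        digest.update(size_field)
        digest.update(body)
        # Member bodies are padded to an even offset.
        offset = body_start + size + (size % 2)
    return digest.hexdigest()
```

Two archives that differ only in member datestamps then produce the same digest, while any change to a member's name or contents changes it.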

>> "-###" isn't meant to be a wildcard. That's an actual GCC option. I put
>> quotes around it because most shells would interpret the hashes as the
>> start of a comment.
>
> hmm, gotcha.  it does seem to include all the necessary info.  whether it's
> easy for a machine to parse across gcc versions is a diff question :).  seems
> to have changed subtly over time between 3.3.6 and 4.7.1.

Probably true, but it ought to be possible to determine if we do 
understand it, or not, and fall back to the old behaviour if not.
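
As a rough sketch of "understand it or fall back", again in Python: the function below looks for a line in the "-###" output whose first word is a linker program and returns its arguments, or None if nothing recognisable is found. The set of program names and the assumption that the line is shell-quoted are guesses on my part, not a description of any particular GCC version.

```python
import os
import shlex

# Assumption: the link step shows up as a line starting with one of these.
LINKER_NAMES = {"collect2", "ld"}


def extract_link_command(driver_output):
    """Return the linker argv found in the driver output, or None.

    None means "not understood": the caller should run the link
    uncached, as before.
    """
    for line in driver_output.splitlines():
        try:
            words = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a line we can parse
        if words and os.path.basename(words[0]) in LINKER_NAMES:
            return words
    return None
```

A caller would treat a None result as a cache bypass, so an unfamiliar output format costs a missed caching opportunity and never a wrong result.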

Andrew

