[ccache] basename of source file in hashed name in ccache
Frank Klotz
frank.klotz at alcatel-lucent.com
Tue Nov 8 15:18:33 MST 2011
On 11/08/2011 01:49 PM, Joel Rosdahl wrote:
> On 5 November 2011 01:26, Frank Klotz<frank.klotz at alcatel-lucent.com> wrote:
>> [...] I remember quite clearly (and I just confirmed with a colleague who is
>> still there) that the file names in the cache contained BOTH the hash AND the
>> basename of the object file.
> As far as I know, the object files have always been stored using only the hash.
> However, temporary files (stored in $CCACHE_DIR in ccache<=2.4 and
> $CCACHE_DIR/tmp in ccache>=3.0) include (a prefix of) the basename.
>
>> [...] (and another string that the ccache code refers to as "size", although
>> I can't quite figure out what it's the size OF)
> It's the size of the hashed text, i.e. output from the preprocessor. This is
> just a way of making filename collisions somewhat less likely.
>
>> One place we found the basenames invaluable was tracking down a corrupted
>> object file in the cache. Once we confirmed that we had a corrupt object file
>> foo.o, we simply searched for all "*foo.o" files in the cache, compared those
>> in size and content to an actual corrupted object file in the user directory,
>> and easily removed the corrupted file from the cache. Much harder (not
>> impossible, but harder) to do this without the basenames.
> An easy way to do that is:
>
> 1. Remove foo.o from the build tree.
> 2. Build with CCACHE_LOGFILE set to a log file.
> 3. Look for "Created foo.o from X" (where X is a file in the cache) in the log
> file.
> 4. Remove X.
>
> Or even easier:
>
> 1. Remove foo.o from the build tree.
> 2. Build with CCACHE_RECACHE set.
>
>> [...] Anyway, is there a general consensus on whether this would be valuable?
> It doesn't sound like a good idea to me, at least, since you would need to
> store duplicate copies of the object file for two compilations where the source
> content is the same but the file names differ.
1: Would that EVER happen? (I am having trouble visualizing a
situation where this would be a good thing.)
2: If it DID happen, rather than storing 2 copies, ccache could store one
inode with 2 directory entries (hard links), one for each name.
So while I understand your objection, I don't think it is a total
deal-killer.
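To illustrate point 2, here is a minimal shell sketch of one inode with
two directory entries. The file names are made up, and nothing here is
actual ccache behaviour; it only shows that a hard link adds a second name
without duplicating the data.

```shell
# Sketch only: file names are hypothetical, and this is not what ccache does today.
demo_dir=$(mktemp -d)
printf 'object code\n' > "$demo_dir/abc123-foo.o"
# Add a second directory entry for the same inode (a hard link, no data copied).
ln "$demo_dir/abc123-foo.o" "$demo_dir/abc123-bar.o"

# ls -i prints the inode number first; both names should report the same one.
inode_foo=$(ls -i "$demo_dir/abc123-foo.o" | awk '{print $1}')
inode_bar=$(ls -i "$demo_dir/abc123-bar.o" | awk '{print $1}')
echo "foo: $inode_foo  bar: $inode_bar"

# Removing one name leaves the data reachable through the other.
rm "$demo_dir/abc123-foo.o"
cat "$demo_dir/abc123-bar.o"
```

One consequence of this layout is that cleaning up the cache would have to
remove every name before the space is actually freed.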
Of course, I also understand that the case FOR the change I proposed
isn't necessarily all that compelling either! It was more that I was
accustomed to seeing it that way, and thought it useful. But I do see
from your other responses that there are other easy ways to accomplish
the specific rationale I gave for it.
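For the record, the log-based approach from the quoted steps could be
sketched roughly like this. The helper name cache_file_for is made up,
and the "Created foo.o from X" wording is taken from Joel's description,
so the exact log format is an assumption that may differ between ccache
versions.

```shell
# Hypothetical helper: print the cache file X named in log lines of the
# form "Created <object> from X". The log wording is an assumption.
cache_file_for() {
    object=$1
    logfile=$2
    # Keep only the text that follows "Created <object> from ".
    sed -n "s|.*Created $object from \(.*\)\$|\1|p" "$logfile"
}

# Intended use (not run here), after removing foo.o from the build tree
# and rebuilding with CCACHE_LOGFILE pointing at ccache.log:
#   cache_file_for foo.o ccache.log
# The printed path is the cache entry to inspect and remove.
```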
It was also somewhat interesting to be able to find all the cached
copies of foo.o in the cache -
find $CCACHE_DIR -name '*foo*.o' -ls
And once you have basenames in the ccache, you can find other things to
do with them too....
Not a major issue for me, but wanted to get the suggestion out there.
Thanks,
Frank
> -- Joel