[ccache] Duplicate object files in the ccache - possible optimization?

Janne Johansson icepic.dz at gmail.com
Sun Feb 19 03:03:07 MST 2012


2011/11/9 Joel Rosdahl <joel at rosdahl.net>:
> On 8 November 2011 23:18, Frank Klotz <frank.klotz at alcatel-lucent.com> wrote:
>> On 11/08/2011 01:49 PM, Joel Rosdahl wrote:
>>> It doesn't sound like a good idea to me, at least, since you would need to
>>> store duplicate copies of the object file for two compilations where the
>>> source content is the same but the file names differ.
>
>> 2: If it DID happen, rather than 2 copies, could store one inode with 2
>> directory entries (hard links) with the 2 names.
>
> I don't think you can in an acceptable way find an existing file to reuse when
> you're about to store a new file -- you'll have to list the directory and
> iterate through its items to look for a reusable file. (That is, unless you
> introduce some kind of higher-level index that makes it possible to efficiently
> list existing files with a given hash.)

Then again, filesystems with built-in deduplication will do this for
you. If you really care about not wasting space on identical files
stored in different places under different names, you could hand this
problem off to the filesystem layer.
I don't know of many such file systems, but ZFS definitely does
this if asked.
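
For what it's worth, the "higher-level index" Joel mentions could be
sketched roughly as below. This is only an illustration and not how
ccache works: the names content_hash, store and the in-memory index
dict are hypothetical, and a real cache would have to persist the index
and cope with concurrent writers. It assumes source and destination are
on the same filesystem, since hard links cannot cross filesystems.

```python
import hashlib
import os


def content_hash(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def store(index, src, dest):
    """Store src at dest, reusing an identical cached file when one is indexed.

    index is a hypothetical dict mapping content hash -> path already stored.
    Returns True if dest was created as a hard link, False if it was copied.
    """
    digest = content_hash(src)
    existing = index.get(digest)
    if existing is not None and os.path.exists(existing):
        # Second directory entry pointing at the same inode: no extra data blocks.
        os.link(existing, dest)
        return True
    # First time this content is seen: copy it and remember where it lives.
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        fout.write(fin.read())
    index[digest] = dest
    return False
```

The lookup is a single dict access, so no directory listing is needed
when storing a new file. The price is keeping that index in sync with
the cache directory, which is the extra machinery Joel alludes to.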

-- 
 To our sweethearts and wives.  May they never meet. -- 19th century toast

