[ccache] Multiple directories for CCACHE_DIR?

Anders Björklund anders at itension.se
Tue May 17 19:39:08 UTC 2016


Wilson Hong wrote:

> Hi ccache, I am trying to improve ccache build performance by
> improving cache hit rate. One thing I wanna try is to have multiple
> folders specified in CCACHE_DIR. Where first level is local cache,
> and second level in LAN NFS, third level on AWS s3. Ccache will start
> looking for local ccache first, if not found, then search in NFS
> folder and then S3, analogy CPU L1-L2-RAM cache architecture. I take
> a look ccache man page: https://ccache.samba.org/manual.html, but
> still cannot figure out how to do that. Is that supported in ccache?
> If not, would be a good idea to implement that? Any advice is
> welcome. Thanks!

It is not supported out-of-the-box, but it makes perfect sense.

You can put your primary cache directory on NFS; this is described
in the manual (but then *everything* will go out over the network).

While working on what would become the "dev/memcached" branch,
we used something called an external cache that does what you want.
It designates a second directory (or several), where ccache also
looks for cache hits. The implementation of it has varied a bit...

At first we introduced a new kind of hit, a "half hit".
But that was too much hassle, so we just copied the external file
to the local cache and called that a "hit" too (somewhat slower).
Eventually one would probably want to have separate statistics?
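
As a rough sketch of the "copy the external file to the local cache
and call it a hit" approach, written in Python rather than ccache's C.
The names cache_path and lookup, and the one-level directory layout,
are assumptions for illustration only, not the code in the branch:

```python
import shutil
from pathlib import Path
from typing import Optional, Sequence


def cache_path(root: Path, key: str) -> Path:
    # Hypothetical layout: first character of the hash as a subdirectory.
    # Assumes the key has at least two characters.
    return root / key[0] / key[1:]


def lookup(key: str, local: Path, externals: Sequence[Path]) -> Optional[Path]:
    """Return the local path of a cached object, or None on a miss."""
    local_file = cache_path(local, key)
    if local_file.is_file():
        return local_file  # ordinary local hit

    # Fall back to the external directories, in the order given.
    for ext in externals:
        ext_file = cache_path(ext, key)
        if ext_file.is_file():
            local_file.parent.mkdir(parents=True, exist_ok=True)
            # Copy to a temporary name first, then rename, so that a
            # concurrent reader never sees a half-written file.
            tmp = local_file.with_name(local_file.name + ".tmp")
            shutil.copyfile(ext_file, tmp)
            tmp.replace(local_file)
            return local_file  # counted as a plain "hit", only slower

    return None  # miss everywhere: compile as usual
```

Separate statistics would amount to returning which level produced
the hit instead of only the path.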

Now with the memcached support, it could be that one would want to
make the "external" support more generic - perhaps also include S3.


The newer version (for 3.2) is available from here:
https://github.com/itensionanders/ccache/tree/external

It allows for one "external" directory, using the regular file layout.
The older 3.1 version allowed a comma-separated list of directories.

This version has the new option refactored into what *could* become a
storage backend, that would allow for other SQL and NoSQL variants...
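
To illustrate what a storage backend could look like in general, here
is a minimal Python sketch. The Backend, DictBackend and TieredCache
names and the get/put pair are hypothetical and are not the interface
used in the branch; DictBackend is an in-memory stand-in for a
directory, memcached or SQL store:

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class Backend(ABC):
    """Hypothetical minimal storage interface: fetch and store by key."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...


class DictBackend(Backend):
    """In-memory stand-in for a real store, for illustration only."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._items.get(key)

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data


class TieredCache:
    """Query backends from fastest to slowest, promoting hits upward."""

    def __init__(self, backends: List[Backend]) -> None:
        self.backends = backends

    def get(self, key: str) -> Optional[bytes]:
        for level, backend in enumerate(self.backends):
            data = backend.get(key)
            if data is not None:
                # Copy the object into every faster level that missed.
                for faster in self.backends[:level]:
                    faster.put(key, data)
                return data
        return None
```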


I'm not sure what library is best for talking ("directly") with S3,
but it shouldn't be impossible to adapt - it's a simple 4 step API.

The theory is that NFS and HTTP (or SQL) will "lose" compared with
memcached, but it would be nice to have some more solid statistics ;-)

The memcached branch (for master) is available here:
https://github.com/jrosdahl/ccache/tree/dev/memcached


There are alternative backends for MySQL and for Couchbase (NoSQL),
but those have not yet been refactored... So only the FS, for now.

/Anders

