[ccache] Combining multiple ccache into one

Anders Björklund anders at ecsit.se
Sun Mar 18 12:32:39 UTC 2018


Jason Zhou wrote:
> I am looking for an efficient way to correctly combine multiple
> ccache from hundreds of build machines into a single ccache to build
> a super set ccache. We use 200+ autoscaled cloud machines in our
> build farm and each machine builds a random subset of the source
> tree. ccache size on each machine is ~70GB and contains ~500K files.
> Having a superset ccache pre-built in the cloud image will greatly
> improve our build time.

Including a pre-populated cache in the OS image is a novel idea,
but I wonder if you really have to resort to that "workaround"?

You could keep a local cache, and sync it from a "secondary cache".
We have some code for this, but none of it is up to date with master.
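
To make the "local cache plus secondary cache" idea concrete, here is a
minimal sketch of a read-through lookup. It is not the code mentioned
above and not ccache's implementation: the class name TwoLevelCache and
the dict-like secondary store are hypothetical, chosen only to show the
lookup order (local first, then secondary, then a real compile).

```python
class TwoLevelCache:
    """Illustrative two-level cache; all names here are hypothetical."""

    def __init__(self, secondary):
        self.local = {}             # stands in for the on-disk local cache
        self.secondary = secondary  # any dict-like shared store (assumption)

    def get(self, key):
        # 1. A local hit is the cheapest case.
        if key in self.local:
            return self.local[key]
        # 2. On a local miss, ask the shared secondary cache.
        value = self.secondary.get(key)
        if value is not None:
            # Keep a local copy so the next lookup stays on this machine.
            self.local[key] = value
        # 3. None means a miss everywhere: the caller has to compile.
        return value

    def put(self, key, value):
        # Store a fresh result in both levels so other machines can reuse it.
        self.local[key] = value
        self.secondary[key] = value
```

With this shape, a freshly booted machine starts with an empty local
cache and fills it on demand from the shared one, which would avoid
baking a superset cache into the image.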

> I noticed the same ccache filename (*.o, *.manifest, *.d) does not
> necessarily have the same content (md5sum) on different machines and
> wonder if rsync is the right tool to do this, or is it feasible at
> all to combine ccache.

This is normal. The created files might have different timestamps
and such, which makes their checksums differ. But they are supposed
to be interchangeable, so none of those differences should *matter*
(if they do, then we are failing to hash something important...)
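
If entries with the same name really are interchangeable, a merge can
simply keep whichever copy arrives first. The sketch below shows that
"first copy wins" rule in Python. It is only an illustration under
stated assumptions: the function name merge_cache is made up, and the
SKIP_NAMES set assumes that bookkeeping files such as "stats" and
"ccache.conf" should not be copied from one cache into another.

```python
import os
import shutil

# Assumption: these are bookkeeping files, not cache entries, so they are
# left alone. Adjust the set to match the cache layout actually in use.
SKIP_NAMES = {"stats", "ccache.conf"}


def merge_cache(src, dst):
    """Copy entries from src into dst, keeping any entry dst already has.

    Returns the number of files copied.
    """
    copied = 0
    for root, _dirs, files in os.walk(src):
        # Mirror the source subdirectory structure under dst.
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            if name in SKIP_NAMES:
                continue
            target = os.path.join(target_dir, name)
            if os.path.exists(target):
                # Same name: treated as interchangeable, keep the existing one.
                continue
            shutil.copy2(os.path.join(root, name), target)
            copied += 1
    return copied
```

rsync with --ignore-existing applies the same rule. Either way, the
size and file counters of the merged cache will be stale afterwards;
running "ccache -c" on the result should recalculate them.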

> I am trying to avoid ccache on NFS mount due to the number of machines
> we are dealing with, and the performance of NFS is not promising.

Have you tried out the memcached version? It was developed for
that reason... You can have a cluster of such servers, if needed.

https://github.com/ccache/ccache/tree/dev/memcached

To further scale out, one can keep a local memcached proxy ("moxi")
and have the cluster be disk-backed (using couchbase) for restarts.

https://www.couchbase.com/memcached

/Anders

