[ccache] Combining multiple ccache into one

Jason Zhou jzhou at magicleap.com
Mon Mar 19 23:01:08 UTC 2018


Thank you Anders for your response! More questions inline:

> On Mar 18, 2018, at 5:32 AM, Anders Björklund via ccache <ccache at lists.samba.org> wrote:
> 
> Jason Zhou wrote:
>> I am looking for an efficient way to correctly combine multiple
>> ccache from hundreds of build machines into a single ccache to build
>> a super set ccache. We use 200+ autoscaled cloud machines in our
>> build farm and each machine builds a random subset of the source
>> tree. ccache size on each machine is ~70GB and contains ~500K files.
>> Having a superset ccache pre-built in the cloud image will greatly
>> improve our build time.
> 
> Including a pre-populated cache in the OS image is a novel idea,
> but I wonder if you would have to resort to that "workaround" ?
> 
> You could keep a local cache, and sync it from a "secondary cache".
> We have some code for this, but none of it is in sync with master.

We are doing this because we need to launch new cloud instances quickly in response to build requests. Syncing from a secondary cache would take much longer.

> 
>> I noticed that the same ccache filename (*.o, *.manifest, *.d) does
>> not necessarily have the same content (md5sum) on different machines,
>> and I wonder if rsync is the right tool to do this, or whether it is
>> feasible at all to combine ccache.
> 
> This is normal. The created files might have different timestamps
> and such, that makes their checksum different. But they are supposed
> to be interchangeable, so none of those differences should *matter*
> (if they do, then we are failing to hash something important...)

So ccache files are interchangeable as long as they have the same filename (hash name), and this applies to all ccache files: *.o, *.manifest, *.d? If so, we can use "rsync -a --ignore-existing" to combine multiple caches much faster, since rsync then skips any file that already exists at the destination and has no need to compare timestamp and size.
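
To make the intended merge semantics concrete, here is a minimal Python sketch of "copy only what is missing", which is roughly what --ignore-existing does. It assumes the answer to the question above is yes, i.e. that two files with the same relative path are interchangeable. The function name merge_cache and the example paths are hypothetical, and the sketch treats every file in the tree alike; it is not an official ccache tool.

```python
import os
import shutil
from pathlib import Path


def merge_cache(src: Path, dst: Path) -> int:
    """Copy files from src into dst, keeping any file that already exists in dst.

    Hypothetical helper: assumes files with the same relative path are
    interchangeable, so an existing destination file is never overwritten.
    Returns the number of files copied.
    """
    copied = 0
    for root, _dirs, files in os.walk(src):
        # Mirror the source directory layout under the destination.
        target_dir = dst / Path(root).relative_to(src)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in files:
            target = target_dir / name
            if target.exists():
                # Same relative path already present: skip without comparing
                # size, timestamp, or content.
                continue
            # copy2 also preserves the modification time where possible.
            shutil.copy2(Path(root) / name, target)
            copied += 1
    return copied
```

Calling merge_cache(Path("/tmp/cache-from-machine-a"), Path("/tmp/superset-cache")) once per source machine would build up the superset one cache at a time; both paths are made up for illustration.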

> 
>> I am trying to avoid ccache on NFS mount due to number of machines we
>> are dealing with and performance of NFS is not promising. 
> 
> Have you tried out the memcached version ? It was developed for
> that reason... You can have a cluster of such servers, if needed.
> 
> https://github.com/ccache/ccache/tree/dev/memcached

I'd love to explore the memcached version. The page linked above doesn't mention any memcached information. Do you have a link that describes how to use memcached with ccache?

Thanks again for your help!

Jason

> 
> To further scale out, one can keep a local memcached proxy ("moxi")
> and have the cluster be disk-backed (using couchbase) for restarts.
> 
> https://www.couchbase.com/memcached
> 
> /Anders



