[ccache] HOWTO Use CCache + NFS + distributed builds

John Coiner john.coiner at amd.com
Thu Mar 29 16:40:03 GMT 2007


I got CCache working in this scenario:

   * Parallel builds, distributed across many hosts.
   * CCACHE_DIR is located on NFS.
   * Output files are hardlinks into the cache to save space.

Maybe this will help someone else get it working faster than I did :)

There were two hurdles along the way:

1.
CCache-2.4 isn't NFS-safe out of the box. The suggested workaround 
is the "no_subtree_check" option (an export option set on the NFS 
server), which my NFS server doesn't support.

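This patch fixes CCache on NFS, regardless of mount options. It replaces 
rename() with link() followed by unlink(), so an existing cache entry is 
never overwritten. It should work on any filesystem that supports hard 
links.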


diff -u /home/jcoiner/ccache-2.4/ccache.c ccache-2.4_nfs_fix/ccache.c
--- /home/jcoiner/ccache-2.4/ccache.c	Mon Sep 13 06:38:30 2004
+++ ccache-2.4_nfs_fix/ccache.c	Tue Mar 27 12:39:35 2007
@@ -149,6 +149,22 @@
   	return ret;
   }

+static int safe_rename(const char* oldpath, const char* newpath)
+{
+    /* safe_rename is for creating entries in the cache.
+
+       Works like rename(), but it never overwrites an existing
+       cache entry. This avoids corruption on NFS. */
+    int status = link( oldpath, newpath );
+    if( status == 0 || errno == EEXIST )
+    {
+	return unlink( oldpath );
+    }
+    else
+    {
+	return -1;
+    }
+}

   /* run the real compiler and put the result in cache */
   static void to_cache(ARGS *args)
@@ -232,8 +248,8 @@

   	if (stat(tmp_stderr, &st1) != 0 ||
   	    stat(tmp_hashname, &st2) != 0 ||
-	    rename(tmp_hashname, hashname) != 0 ||
-	    rename(tmp_stderr, path_stderr) != 0) {
+	    safe_rename(tmp_hashname, hashname) != 0 ||
+	    safe_rename(tmp_stderr, path_stderr) != 0) {
   		cc_log("failed to rename tmp files - %s\n", strerror(errno));
   		stats_update(STATS_ERROR);
   		failed();
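
For anyone who wants to try the link-then-unlink idea outside of ccache, 
here is a minimal standalone sketch of the same technique. The function 
name publish_no_overwrite is made up for this example and is not part of 
ccache. It assumes the temp file and the final name are on the same 
filesystem, since link() can't cross filesystems.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of ccache: move tmp_path to final_path
   without ever replacing an existing final_path.

   Returns 0 if final_path exists afterwards (either we created it or
   someone else got there first) and tmp_path was removed; -1 otherwise. */
int publish_no_overwrite(const char *tmp_path, const char *final_path)
{
    /* link() fails with EEXIST instead of overwriting, unlike rename(). */
    if (link(tmp_path, final_path) != 0 && errno != EEXIST) {
        return -1;  /* real failure: leave tmp_path for the caller */
    }

    /* The entry is in place under final_path; drop the temporary name. */
    return unlink(tmp_path);
}
```

If two hosts race to publish the same entry, the loser's link() fails 
with EEXIST, the loser just deletes its temp file, and the winner's entry 
stays untouched.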


2.
When CCache creates a hardlinked output file, it calls utime() to update 
the timestamp on the object, so that Make realizes that the object has 
changed.

On NFS, utime() has no coherency guarantee, AFAIK. When utime() runs on 
host A, and our parallel implementation of Make is running on host B, 
sometimes Make doesn't see the new timestamp soon enough -- and neglects 
to relink the final binary. That's a one-way ticket to Silent Mysterious 
Failure Town.

Instead of relying on the object file timestamp, we create a dummy file 
with a reliable timestamp:

objs/foo.o objs/foo.o.built : foo.c
	if ( ccache gcc -o objs/foo.o -c foo.c ) ; \
	then touch objs/foo.o.built ; \
	else exit 1; \
	fi

binary : objs/foo.o.built
	gcc -o binary objs/foo.o

NFS does make one coherency guarantee, usually called close-to-open 
consistency: if a file is written and close()d on host A, and 
subsequently open()ed on host B, the second open() will reflect all 
modifications and attributes from the close(). Since Make does open() 
when checking timestamps, and the dummy file is close()d when it's 
created, the binary will always relink after the object is recompiled.

Good luck.

John



