[ccache] PATCH: reduce stat() calls on multilevel caches

Wilson Snyder wsnyder at wsnyder.org
Mon Oct 4 15:08:35 MDT 2010


I'm trying to reduce the load a bit on our NFS ccache server.  I noted the
following pattern of accesses on a cache hit:

# stat("$CACHE_DIR", {...}) = 0
# stat("$CACHE_DIR/tmp", {...}) = 0
# stat("$CACHE_DIR/CACHEDIR.TAG", {...}) = 0
* stat("$CACHE_DIR/3", {...}) = 0
* stat("$CACHE_DIR/3/d", {...}) = 0
# stat("$CACHE_DIR/3/d/3", {...}) = 0
  open("$CACHE_DIR/3/d/3/df7ee0d1f41405e97cc3c5a639dba-14689.manifest", O_RDONLY) = 4
* stat("$CACHE_DIR/f", {...}) = 0
* stat("$CACHE_DIR/f/6", {...}) = 0
# stat("$CACHE_DIR/f/6/c", {...}) = 0
* stat("$CACHE_DIR/f", {...}) = 0
* stat("$CACHE_DIR/f/6", {...}) = 0
# stat("$CACHE_DIR/f/6/c", {...}) = 0
* stat("$CACHE_DIR/f", {...}) = 0
* stat("$CACHE_DIR/f/6", {...}) = 0
# stat("$CACHE_DIR/f/6/c", {...}) = 0
  stat("$CACHE_DIR/f/6/c/51577239e589de02d30b94152958b-2123817.o", {...}) = 0
  open("$CACHE_DIR/f/6/c/51577239e589de02d30b94152958b-2123817.o", O_RDONLY) = 4
  utimes("$CACHE_DIR/f/6/c/51577239e589de02d30b94152958b-2123817.o", NULL) = 0
  utimes("$CACHE_DIR/f/6/c/51577239e589de02d30b94152958b-2123817.stderr", NULL) = -1 ENOENT (No such file or directory)
  open("$CACHE_DIR/f/6/c/51577239e589de02d30b94152958b-2123817.stderr", O_RDONLY) = -1 ENOENT (No such file or directory)
  symlink("lion02:19843:1286222896", "$CACHE_DIR/f/stats.lock") = 0
# stat("$CACHE_DIR/f/stats", {...}) = 0
  open("$CACHE_DIR/f/stats", O_RDONLY) = 4
  open("$CACHE_DIR/f/stats.tmp.lion02.19843", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4
  rename("$CACHE_DIR/f/stats.tmp.lion02.19843", "$CACHE_DIR/f/stats") = 0
  unlink("$CACHE_DIR/f/stats.lock") = 0

This patch removes the calls marked with *.

Extra calls remain (those marked with #). It would be best if
the code assumed the directories existed and just tried to read
the manifest or .o, creating the directories only when a
new file is added.  But that does more violence than I'm
willing to undertake at the moment.  Improvements along those
lines are welcome.
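
For anyone who wants to pick that up, here is a rough, standalone sketch of
the "assume the directories exist" idea. It is not part of this patch and
does not use ccache's own helpers; the names make_parent_dirs(),
open_cached_for_read() and open_cached_for_write() are made up for
illustration. Reads never stat() anything, and the directory levels are
created only after a write fails with ENOENT.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create every missing parent directory of path.
 * Returns 0 on success, -1 on failure (errno set by strdup() or mkdir()).
 */
static int
make_parent_dirs(const char *path)
{
	char *copy, *p;
	int rc = 0;

	assert(path != NULL);
	copy = strdup(path);
	if (!copy) {
		return -1;
	}
	for (p = copy; *p; ++p) {
		if (*p != '/' || p == copy) {
			continue; /* not a separator, or the leading '/' */
		}
		*p = '\0'; /* temporarily cut the path at this level */
		if (mkdir(copy, 0777) != 0 && errno != EEXIST) {
			rc = -1;
			break;
		}
		*p = '/';
	}
	free(copy);
	return rc;
}

/*
 * Cache hit path: no stat() at all. A missing directory simply shows up
 * as open() failing with ENOENT, i.e. an ordinary cache miss.
 */
static int
open_cached_for_read(const char *path)
{
	return open(path, O_RDONLY);
}

/*
 * Cache store path: try to create the file first; build the directory
 * levels only if that fails with ENOENT, then retry once.
 */
static int
open_cached_for_write(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);

	if (fd < 0 && errno == ENOENT) {
		if (make_parent_dirs(path) != 0) {
			return -1;
		}
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	}
	return fd;
}
```

With something along these lines, a hit would cost one open() per file and
no directory stat()s, and the mkdir() calls would only be paid the first
time a given subdirectory is populated. Hooking this into
get_path_in_cache() and its callers is the part I have not attempted.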

Thanks


diff --git a/ccache.c b/ccache.c
index b84c84e..eed076f 100644
--- a/ccache.c
+++ b/ccache.c
@@ -253,6 +253,10 @@ clean_up_tmp_files()
 /*
  * Transform a name to a full path into the cache directory, creating needed
  * sublevels if needed. Caller frees.
+ *
+ * stat() calls can be expensive on the cache, so we both avoid checking a
+ * single path multiple times, and also avoid stat()ing each level in a
+ * many level cache.
  */
 static char *
 get_path_in_cache(const char *name, const char *suffix)
@@ -260,17 +264,32 @@ get_path_in_cache(const char *name, const char *suffix)
 	int i;
 	char *path;
 	char *result;
+	struct stat st;
 
+	/* Form the entire path */
 	path = x_strdup(cache_dir);
 	for (i = 0; i < nlevels; ++i) {
 		char *p = format("%s/%c", path, name[i]);
 		free(path);
 		path = p;
-		if (create_dir(path) != 0) {
-			cc_log("Failed to create %s", path);
-			failed();
+	}
+
+	/* First check whether the whole path exists, to avoid per-level checks */
+	if (!(stat(path, &st) == 0
+	      && S_ISDIR(st.st_mode))) {
+		free(path);
+		path = x_strdup(cache_dir);
+		for (i = 0; i < nlevels; ++i) {
+			char *p = format("%s/%c", path, name[i]);
+			free(path);
+			path = p;
+			if (create_dir(path) != 0) {
+				cc_log("Failed to create %s", path);
+				failed();
+			}
 		}
 	}
+
 	result = format("%s/%s%s", path, name + nlevels, suffix);
 	free(path);
 	return result;


