[ccache] Bug? ccache: error: Could not stat <cache file> (permission denied when creating?): No such file or directory

Michael Ellerman michael at ellerman.id.au
Thu Dec 6 19:51:40 MST 2012


Hi folks,

We are using ccache for building the Linux kernel, and recently we've
started getting errors from ccache.

Our cache is 5GB, of which around 3-4GB seems to be in use in practice.
We are also using CCACHE_HARDLINK and CCACHE_NOCOMPRESS.

The error we see is like:
                                   
        ccache: error: Could not stat /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282
        (permission denied when creating?): No such file or directory                

It started appearing with ccache v3.1.7, which is the first release to
contain commit 3595be0.

(http://gitweb.samba.org/?p=ccache.git;a=commitdiff;h=3595be06de93cb8c4255171be8a647af4e5d460d)

We have been testing with the latest upstream commit, 8ff1dac5.

It seems the cause is not that the stdout file couldn't be created, but
that the file has been deleted by another ccache process.

Here is a log from iwatch (inotify watch), watching the ccache
directory:

[ 7/Dec/2012 13:35:33] IN_CREATE /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282

[ 7/Dec/2012 13:36:00] IN_MOVED_FROM /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282
[ 7/Dec/2012 13:36:00] IN_MOVED_TO /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282.Sprygo.11295.rmXXXXXX
[ 7/Dec/2012 13:36:00] IN_DELETE /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282.Sprygo.11295.rmXXXXXX

[ 7/Dec/2012 13:36:31] IN_CLOSE_WRITE /scratch/kisskb/ccache/0/d/01dfc7d17f46f8d9b361ba9323e8c9-1921690.o.tmp.stdout.Sprygo.8282.Sprygo.11295.rmXXXXXX


You can see the stdout file is created by pid 8282, then 27 seconds
later it is renamed and deleted by pid 11295 using x_unlink().

Then it is closed (by 8282), but because it has been renamed and deleted,
8282 will not be able to stat() it under its original name, and we hit
the error case.

So basically 11295 is running cleanup_dir() and deleting files that are
still being written to. The full log shows 11295 deleting lots of files.
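
The sequence in the iwatch log can be reproduced in isolation. The
following is a minimal Python sketch, not ccache code: the file names
and the function reproduce_race() are made up for illustration, and the
rename-then-unlink step only imitates what the log shows pid 11295
doing. It plays both roles in one process, which is enough to show why
the writer's later stat() by name fails with "No such file or directory"
even though its descriptor is still usable.

```python
import errno
import os
import tempfile


def reproduce_race():
    """Return the errno the 'writer' gets from stat() after 'cleanup' ran."""
    workdir = tempfile.mkdtemp()
    # Hypothetical temp file name, loosely modelled on the one in the log.
    path = os.path.join(workdir, "example.o.tmp.stdout.host.8282")

    # Writer (pid 8282 in the log): create the temp file and keep it open.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)

    # Cleaner (pid 11295 in the log): rename the file, then delete it.
    doomed = path + ".host.11295.rmXXXXXX"
    os.rename(path, doomed)
    os.unlink(doomed)

    # Writer: writing and closing still work, the open descriptor is valid.
    os.write(fd, b"compiler output\n")
    os.close(fd)

    # Writer: looking the file up by its original name now fails.
    try:
        os.stat(path)
        result = None
    except OSError as exc:
        result = exc.errno

    os.rmdir(workdir)
    return result


stat_errno = reproduce_race()
print(errno.errorcode.get(stat_errno, stat_errno))
```

Nothing in this sketch depends on timing: once the cleaner has renamed
the file, any later lookup by the original name fails, however long the
writer keeps its descriptor open.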

From a brief look at the code I don't see any logic which would prevent
this from happening, other than that usually there is enough space in
the cache so that only old files are deleted.

I'm not sure what the best way to avoid this is. The old logic of
noticing the stdout is missing and calling failed() at least doesn't
break the build.

Taking a write lease on the stdout file before deleting it should work,
but not on NFS or Windows.
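
As a rough illustration of the write-lease idea, here is a Linux-only
Python sketch using fcntl() with F_SETLEASE. The helper name
unlink_if_unused() is hypothetical and this is not how ccache behaves.
It assumes the cleaning process owns the file (or has CAP_LEASE), and it
treats any failure to obtain the lease as "leave the file alone".

```python
import fcntl
import os
import tempfile


def unlink_if_unused(path):
    """Delete path only if a write lease can be taken (hypothetical helper).

    Returns True if the file was deleted, False if it was left in place.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # Linux refuses a write lease while any other descriptor for
            # the file is open, e.g. a compiler still writing to it.
            fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_WRLCK)
        except OSError:
            # File in use, or leases not supported on this filesystem.
            return False
        os.unlink(path)
        fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_UNLCK)
        return True
    finally:
        os.close(fd)


# Demo: a "writer" still holds the temp file open, so cleanup skips it.
workdir = tempfile.mkdtemp()
tmp_path = os.path.join(workdir, "example.o.tmp.stdout")
writer_fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)

deleted_while_open = unlink_if_unused(tmp_path)
still_there = os.path.exists(tmp_path)

os.close(writer_fd)
if still_there:
    os.unlink(tmp_path)
os.rmdir(workdir)
```

A real implementation would also have to handle the signal the kernel
sends to the lease holder if another process opens the file while the
lease is held, and would need a fallback for filesystems without lease
support.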

Leaving the stdout files alone would fix the immediate problem, but then
the .o etc. would be missing.

Or we could just make our cache bigger and the problem would probably go
away :)

cheers




