[ccache] Corrupt objects from three colliding compiles

Joel Rosdahl joel at rosdahl.net
Sun May 9 07:47:48 MDT 2010


On 2010-05-06 19:14, Wilson Snyder wrote:

> [...]
> [1273161404.432174,2424 ] === 2010-05-06 11:56:44 ===
> [1273161404.432636,2424 ] Hostname: maserati
> [1273161404.432643,2424 ] Cwd: /nfs/compiling/dir02
> [1273161404.432646,2424 ] Source file: INV.c
> [1273161404.432648,2424 ] Object file: INV.o
> [1273161404.432657,2424 ] Trying direct lookup
> [1273161404.439730,2424 ] Looking for object file hash in /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> [1273161404.440126,2424 ] Did not find object file hash in manifest
> [1273161404.440134,2424 ] Running preprocessor
> [1273161404.469188,2084 ] === 2010-05-06 11:56:44 ===
> [1273161404.469515,2084 ] Hostname: bird07
> [1273161404.469519,2084 ] Cwd: /nfs/compiling/dir03
> [1273161404.469521,2084 ] Source file: INV.c
> [1273161404.469523,2084 ] Object file: INV.o
> [1273161404.469532,2084 ] Trying direct lookup
> [1273161404.471093,2084 ] Looking for object file hash in /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> [1273161404.475427,2084 ] Did not find object file hash in manifest
> [1273161404.475434,2084 ] Running preprocessor
> [1273161404.487509,2424 ] Got object file hash from preprocessor
> [1273161404.492494,2424 ] Object file /nfs/ccache-dir/2/b/2/eedf0591498af8e5f3a27fcc89c4f-69975.o not in cache

Job 2424 doesn't see the object file in the cache...

> [1273161404.492522,2424 ] Running real compiler

...and decides to create it.

> [1273161404.511123,21936] === 2010-05-06 11:56:44 ===
> [1273161404.511506,21936] Hostname: bat05
> [1273161404.511510,21936] Cwd: /nfs/compiling/dir04
> [1273161404.511513,21936] Source file: INV.c
> [1273161404.511515,21936] Object file: INV.o
> [1273161404.511523,21936] Trying direct lookup
> [1273161404.514058,21936] Looking for object file hash in /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> [1273161404.517648,21936] Did not find object file hash in manifest
> [1273161404.517654,21936] Running preprocessor
> [1273161404.523658,2084 ] Got object file hash from preprocessor
> [1273161404.542730,2084 ] Created INV.o from /nfs/ccache-dir/2/b/2/eedf0591498af8e5f3a27fcc89c4f-69975.o

But here, job 2084 sees the object file in the cache ~34 ms before the
object file was created...

> [1273161404.564688,21936] Got object file hash from preprocessor
> [1273161404.569955,2084 ] Added object file hash to /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> [1273161404.570387,2084 ] Succeded getting cached result
> [1273161404.576412,2424 ] Stored in cache: /nfs/ccache-dir/2/b/2/eedf0591498af8e5f3a27fcc89c4f-69975.o

...by job 2424 here. Maybe the clocks are more than 12 ms apart?

> [1273161404.588776,21936] Created INV.o from /nfs/ccache-dir/2/b/2/eedf0591498af8e5f3a27fcc89c4f-69975.o   ***BAD OBJECT***
> [1273161404.596098,2424 ] Created INV.o from /nfs/ccache-dir/2/b/2/eedf0591498af8e5f3a27fcc89c4f-69975.o
> [1273161404.630619,21936] Added object file hash to /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> [1273161404.634864,21936] Succeded getting cached result
> [1273161404.638443,2424 ] Added object file hash to /nfs/ccache-dir/0/3/8/f19fe0a63fc088a9c570845a0a4eb-584.manifest
> 
> It looks like the third job's read of the cache is stuck in
> between the first's write and the second's.  Note that times
> may be off by >8ms, this would put the bad object read right
> on top of the object update by the second compile when it
> hits the NFS server.

Since renames should be atomic on NFS, I don't see any problem here, but
I'm by no means an NFS expert.

-- Joel


More information about the ccache mailing list