[ccache] Stumbling blocks with ccache and embedded/encapsulated environments

Wed Nov 10 16:56:29 MST 2010

I don't want to rain on anyone's parade here, because ccache is a
great product that has real benefits, but I do want to share some of
our findings regarding the use of ccache in our very large product --
we were surprised by them, and you may be as well.  These findings are
specifically for *large products*.  In our case, the total source code
file size is on the order of 3 gigabytes (which includes not only
C/C++ but also Java source files, a couple hundred thousand lines of
makefiles, etc).  It's the Android mobile phone OS, fwiw: it builds
something like 1-2 gigabytes of .o files from C/C++ during a full
build, and does a ton of Java compilation, resource compilation,
Dalvik compilation, etc as well.

Very short version:  if your 'make' dependencies or equivalent are
well-written, using ccache will almost always *increase* your
incremental build times.  This wasn't immediately obvious to us but
makes sense in hindsight: if your dependencies are well constructed,
then when you run 'make' it won't try to build something unless it
really has changed.  We see *very* few ccache hits over time when
doing incremental builds.
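
A minimal sketch of why that is, assuming make's usual timestamp
comparison (the function names here are illustrative, not part of make
or ccache): a target reaches the compiler only when it is missing or
older than one of its prerequisites, which is exactly the case where
the inputs changed and a ccache lookup is most likely to miss.

```python
import os


def needs_rebuild(target, prerequisites):
    """Mimic make's staleness rule: rebuild if the target is missing
    or any prerequisite has a newer modification time."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


def stale_targets(rules):
    """rules maps each target path to its list of prerequisite paths.
    Returns the targets that would actually be handed to the compiler."""
    return [t for t, prereqs in rules.items() if needs_rebuild(t, prereqs)]
```

Everything that stale_targets() leaves out never invokes the compiler
at all, so ccache has nothing to save there; it only adds a lookup to
the compiles that remain.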

Slightly longer version:  if your 'make' dependencies are
well-written, then using ccache will almost always increase your
incremental build times.  Even if your dependencies are slightly
inefficient (i.e. you're getting some unnecessary compilation on a
regular basis, but not tons), ccache may well still be slowing you
down overall on incremental builds unless your computers have lots of
RAM relative to the size of your project.  It turns out that fitting
the build inputs *and* outputs into the VM/filesystem buffer cache
usually provides much more build-time benefit than ccache.  (Unless
your project is very large, you're almost certainly fitting everything
in RAM as a matter of course and so ccache is a fine idea.)

If you're regularly doing something equivalent to a 'clean' build from
make's perspective, but with a hot ccache, then ccache is solving
exactly your problem and you definitely want to use it.

Long answer; only applicable to very large projects:

The issue is around VM/file system buffer cache management.  If you're
using ccache, then you'll effectively be doubling the number of .o
files that are paged into memory during the course of a build.  If the
extra VM paging winds up pushing source code out of the VM buffer
cache, then that is going to be a significant hit when your build
system actually needs to build that source file -- it'll have to go to
the disk for it rather than just reading it out of memory.  Ideally
your build machines will have enough RAM to hold in the buffer cache
the entire source base, plus the entire set of .o / .class / etc
intermediate build output files, *plus* the entire ccache contents for
the project.  That's on top of the actual memory usage of the
compilers etc that run during the course of the build.  As soon as
things are being kicked out of the buffer cache during the course of a
build, you'll take a speed hit that will more than eradicate the
benefits due to using ccache.
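
The eviction effect can be illustrated with a toy model.  This is a
sketch under loud assumptions: the real kernel buffer cache is not a
pure LRU, and the file counts and sizes below are made-up units chosen
only to show the shape of the problem.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU cache with a size budget; a stand-in for the buffer cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # file name -> size
        self.used = 0

    def touch(self, name, size):
        """Access a file. Returns True if it was already cached."""
        if name in self.entries:
            self.entries.move_to_end(name)
            return True
        self.entries[name] = size
        self.used += size
        # Evict least recently used files until we are within budget.
        while self.used > self.capacity:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        return False


def sources_still_cached(capacity, use_ccache, n_files=100, src=3, obj=2):
    """Run a full build, then count sources that remain in the cache."""
    cache = BufferCache(capacity)
    for i in range(n_files):
        cache.touch(f"src{i}", src)      # compiler reads the source
        cache.touch(f"obj{i}", obj)      # build writes the .o
        if use_ccache:
            cache.touch(f"ccache{i}", obj)  # ccache stores a second copy
    return sum(1 for i in range(n_files) if f"src{i}" in cache.entries)
```

With a budget that just fits sources plus objects (500 units in this
toy setup), sources_still_cached(500, use_ccache=False) keeps every
source resident, while use_ccache=True pushes a sizable fraction of
them out, and those are the files the next incremental build has to
fetch from disk.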

Our product may be a bit pathological in this way: a lot of the
RAM-hungry tools and source files are not C/C++, which means a heavy
demand on RAM in ways that ccache can't help with.  The ordering of
RAM usage in a typical build makes things worse, in fact.  The basic
C/C++ compilation steps, during which ccache pushes source files and .o
contents through the buffer cache, typically run early in the build.
Later on, things like linkers run; they are RAM-hungry themselves and
wind up pushing the sources out of the buffer cache on
memory-constrained machines.  Then when you want to
run an incremental build the sources have to be paged right back in
from disk, and there goes your time benefit.

Android's ccache footprint is one or two gigabytes(!) of .o files.
We've found that on computers with less than around 20-24 gigs of RAM
(!!), ccache tends to increase build times -- 12 gigs isn't enough to
hold everything in the buffer cache; 24 is.  Once everything is in the
buffer cache, ccache's ability to avoid running gcc is a win,
especially for clean builds with a hot cache.
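
As a back-of-the-envelope check on those numbers, here is a small
sketch that uses only the sizes quoted in this message (about 3 GB of
sources, 1-2 GB of .o files, 1-2 GB of ccache contents).  The function
name is illustrative, and the calculation ignores the OS's own memory
use, so treat the result as a rough bound rather than a measurement.

```python
# (low, high) size estimates in gigabytes, taken from the figures above.
KNOWN_GB = {
    "sources": (3.0, 3.0),
    "object_files": (1.0, 2.0),
    "ccache": (1.0, 2.0),
}


def unitemized_range_gb(too_small_ram_gb, big_enough_ram_gb, known=KNOWN_GB):
    """Bound the memory demand not itemized above (Java/resource outputs,
    compiler and linker working memory, and so on).

    If too_small_ram_gb does not hold everything, the remainder must
    exceed that RAM minus the largest known total; if big_enough_ram_gb
    does hold everything, the remainder is at most that RAM minus the
    smallest known total.
    """
    low_known = sum(lo for lo, _ in known.values())
    high_known = sum(hi for _, hi in known.values())
    return (too_small_ram_gb - high_known, big_enough_ram_gb - low_known)
```

Calling unitemized_range_gb(12, 24) suggests that somewhere between
roughly 5 and 19 GB of demand comes from things other than the C/C++
sources, objects and ccache contents, which is consistent with the
non-C/C++ tools being the heavy RAM users.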

--
chris tate
android framework engineer

On Wed, Nov 10, 2010 at 2:54 PM, Paul Smith <paul at mad-scientist.net> wrote:
> Hi all; I've been considering for a long time enabling ccache support in
> our build environment.  However, our environment has a few unusual
> aspects and I wondered if anyone had any thoughts about steps we might
> take to ensure things are working properly.  The documentation I've
> found is just not _quite_ precise enough about exactly what is added to
> the hash.
>
> Very briefly we have multiple "workspaces" per user, mostly stored on
> their own systems.  These workspaces are typically pointing to different
> lines of development, and in those some files are the same and some are
> different (pretty basic).  What I'd like to do is have one ccache per
> user per host, so that all the workspaces for a given user on a given
> host share the cache (rather than, for example, one cache per workspace
> or sharing caches between users and/or hosts--that could come later).
> Again, pretty straightforward.
>
> The first interesting bit is that in our build environment we have a set
> of (multiple different) cross-compilers, along with completely
> encapsulated environments (usr/include, usr/lib, etc.) for different
> targets.  These compilers and environments are packed up into tarballs
> which are kept in our source tree, and unpacked by our build system when
> our build starts.  We do not use the native compiler at all.
>
> The second interesting bit is that the actual file that is invoked is
> not the actual compiler, but a symlink to a shell script wrapper that
> invokes the real compiler with a set of extra command-line arguments.
> So we invoke a command like "i686-rhel4-linux-glibc-g++", which is a
> symlink to a generic shell script like "toolchain-wrapper.sh", which
> unpacks the symlink name ("i686-rhel4-linux-glibc-g++") to determine
> that we want to run the g++ compiler to generate 32bit code compiled
> against a Red Hat EL 4/GNU libc environment, then invokes a real
> compiler with the right options to make that happen.  A different
> command (say "x86_64-rhel5-linux-glibc-gcc") is a symlink to the same
> "toolchain-wrapper.sh" file, but you get very different results.
>
> The final interesting thing is that when we unpack these compiler
> tarballs we use the -m option so that all the files we unpack have their
> times set to "now", rather than the times they had when they were packed
> up.  Thinking about this I believe we could remove this in this case, so
> the timestamps would be preserved, if that would be useful.
>
>
> So, a few things: first the default mtime/size to determine if compilers
> have changed won't work well for us.  Every time I do a clean build and
> my compilers are unpacked again, the timestamp on them will change (due
> to tar -m), so I won't get any cache hits (right?)
>
> If I remove the -m so that the timestamps in the tarball are preserved,
> then the timestamps will always be identical, unless I load up a new
> compiler build.  So that's actually nice.
>
> What about the script wrapper?  Loading a new compiler will change the
> timestamp (at least) on the script wrapper as well but here I worry
> about incorrect duplication in the same build.  For example suppose I
> build a file into two objects like this:
>
>        i686-rhel4-linux-glibc-g++ -o 32bit/foo.o -c foo.c
>        x86_64_rhel4-linux-glibc-g++ -o 64bit/foo.o -c foo.c
>
> Now both of these are symlinks to the same wrapper script so ccache will
> cache the same mtime/size for both compilers.  Also, they have the same
> flags at this level.  Underneath, of course, the wrapper script will
> invoke completely different compilers with different flags but that's
> not visible to ccache.  Suppose the preprocessor output was the same in
> both cases so that's not an issue: only the compiler generated 32bit .o
> for the first and a 64bit .o for the second.
>
> So, my question is, is the NAME of the compiler part of the hash as well
> as the mtime/size, so that ccache won't consider the second one to be a
> copy of the first?
>
> Of course I can always resort to CCACHE_COMPILERCHECK='%compiler% -v'
> which will tell me what I want to know (that these compilers are very
> different).  But it would be nice to avoid that overhead if possible.
>
> Also if I DO go with CCACHE_COMPILERCHECK, is ONLY that output part of
> the hash?  Or is the mtime/size for the compiler also part of the hash?
>
> It would be nice for debugging/etc. if there was a way to see exactly
> what elements are going into the hash for a given target.
>
>
> Sorry for the long email; thanks for any pointers or tips!