[distcc] distcc in disguise (a suggestion for all "wrappers")

Wayne Davison wayned at users.sourceforge.net
Mon Feb 3 23:39:45 GMT 2003

I've been using distcc quite a bit lately because I've been playing
around with gentoo linux (and having a compile farm is the one thing
that makes the compiling of larger packages happen in a reasonable
amount of time on gentoo). Why disguise distcc as gcc (etc.)? I've
noticed several packages that do not honor the setting of CC or mess
up if it is set to something weird. I'm currently using the gentoo
wrapper setup that was previously mentioned on this list, but I'd like
to see something cleaner -- something that could be implemented for
all compiler-munging packages (distcc, ccache, etc.) and would let
each user choose what to use and in what order to use it.

Here's my idea:

We create a new idiom for each gcc-in-disguise package which describes
how it munges the PATH before running the next step in the chain. The
idea is that we strip all directories from the start of the PATH up to
and including the directory in which our executable was found. This
means that each package's set of scripts needs to go into its own
directory, but this restriction makes it easy for the user to
configure things fully.
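
To make the stripping rule concrete, here is a rough sketch of it as a
POSIX shell function. The name strip_path_through is made up for this
illustration and is not taken from the patch; /usr/lib/distcc in the
usage line is likewise only an assumed directory.

```shell
# strip_path_through DIR PATHLIST  (hypothetical helper, not from the patch)
# Print PATHLIST with every entry up to and including the first
# occurrence of DIR removed. If DIR does not occur, print PATHLIST
# unchanged. POSIX sh has no "local", so the variables are global.
strip_path_through() {
    dir=$1
    rest=$2
    while [ -n "$rest" ]; do
        # Split off the first colon-separated entry.
        case $rest in
            *:*) head=${rest%%:*}; rest=${rest#*:} ;;
            *)   head=$rest; rest= ;;
        esac
        if [ "$head" = "$dir" ]; then
            # Everything after our own directory is the new PATH.
            printf '%s\n' "$rest"
            return 0
        fi
    done
    # DIR was never found: leave the list alone.
    printf '%s\n' "$2"
}

# Usage (the directories are assumptions for the example):
strip_path_through /usr/lib/ccache "/home/me/bin:/usr/lib/ccache:/usr/lib/distcc:/usr/bin"
```

Note that the entries in front of the wrapper's own directory are
dropped too, which is what lets a later wrapper skip anything that
came before it.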

For example:

export PATH
gcc -c foo.c

When the command is run, it finds gcc in /usr/lib/ccache, which (if it
needs to run something) removes its directory from the PATH, exports
the result, and then calls gcc. This results in the distcc version of
gcc getting called (this presumes that distcc gets some gcc-cloaking
logic -- see below), which calls "gcc" with the distcc directory
stripped from the start, and so on. Eventually the gcc in /usr/bin
gets called (or wherever gcc actually lives).
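
The hand-off can be tried out with throwaway stand-ins, roughly as
sketched below. Everything here is an assumption made for the demo:
the temporary directories, the two fake wrappers named after ccache
and distcc, and the fake "real" gcc that only echoes its arguments.
The stand-ins are shell scripts, so they take their own directory from
$0; a compiled wrapper would have to locate itself in the PATH
instead.

```shell
# Build a scratch tree with two fake wrappers and a fake compiler.
demo=$(mktemp -d)
mkdir "$demo/ccache" "$demo/distcc" "$demo/bin"

for name in ccache distcc; do
    cat > "$demo/$name/gcc" <<'EOF'
#!/bin/sh
# Stand-in wrapper: report, strip the PATH through our own directory,
# then re-run "gcc" by its bare name so the next PATH entry is used.
mydir=${0%/*}
echo "wrapper in ${mydir##*/}"
case ":$PATH" in
    *":$mydir:"*) p=":$PATH"; PATH=${p#*":$mydir:"} ;;
esac
export PATH
exec gcc "$@"
EOF
    chmod +x "$demo/$name/gcc"
done

cat > "$demo/bin/gcc" <<'EOF'
#!/bin/sh
# Stand-in for the real compiler: just show what it was asked to do.
echo "real gcc: $*"
EOF
chmod +x "$demo/bin/gcc"

# Run the chain in a subshell so the caller's PATH is left alone.
chain_output=$(
    PATH="$demo/ccache:$demo/distcc:$demo/bin:$PATH"
    export PATH
    gcc -c foo.c
)
printf '%s\n' "$chain_output"
rm -rf "$demo"
```

Each stand-in prints one line before handing off, so the output shows
the ccache stand-in first, then the distcc one, then the fake
compiler receiving "-c foo.c".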

This logic improves on the current wrapper ideas I've seen around in
the following ways:

 - Different packages don't need to know about each other -- it's up
   to the user to choose the right PATH sequence, not the wrapper

 - The wrapper doesn't have to search for the next item it's calling
   and try to figure out if it's us or not -- we just tweak the PATH
   and go. (This would simplify ccache, for instance.)

 - A package can intercept only some of the gcc/cc/etc. command names
   (by not having a full set of compiler names in its dir) and things
   still behave properly (since the one that gets found strips off all
   the starting dirs down to the directory where it was found, this
   ensures that the next-called command continues on in the PATH).

 - The user can configure something like colorgcc into the chain
   wherever they like, potentially adding extra meaning to the colors.
   For instance, in the above setup, the colored compiler output is
   cached into ccache, potentially allowing you to switch colors on
   each successive run. It would also be possible to put colorgcc
   after distcc (and in the path of each distccd) and have different
   colors for each host's output.

 - We avoid the use of absolute pathnames on the command-line, which
   can currently cause problems for distcc between hosts that have the
   compiler in different places.

 - We configure how it all works based on one standard environment
   variable, allowing each person to customize their setup and
   avoiding any need for root permissions to tweak /etc files.

 - The wrapper is actually ccache/distcc/colorgcc in disguise, so
   there's no extra cost to running each bit, as there would be with
   one or more separate wrapper programs. I.e., even if a single
   wrapper script created the command
   "ccache colorgcc distcc -c foo.c" in one go and ran it, that's
   still one more program-run/exec than just having each piece pass
   the "gcc -c foo.c" on to whoever is next.

So, what do you think? Any downside I'm missing?

I've created a patch for distcc-1.1 that implements this (which should
apply to the CVS version without problem, I think):


[Non gentoo users can tune out now, if they like.]

I've also created a few things for gentoo linux:

 - A new distcc ebuild:

 - A new ccache ebuild:

 - An ebuild.sh patch:

The ebuild.sh patch just tweaks the PATH to call ccache and distcc if
they are enabled in the FEATURES.
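
As a rough idea of what such a tweak might look like (this is not the
actual patch), the function below prepends the wrapper directories
when the matching words appear in FEATURES. The function name and
/usr/lib/distcc are assumptions for the sketch; /usr/lib/ccache is the
directory used by the ccache ebuild.

```shell
# path_for_features FEATURES PATHLIST  (hypothetical, not the real patch)
# Print PATHLIST with wrapper directories prepended for each enabled
# feature. distcc is added first so that ccache ends up in front of
# it and a cache hit never reaches distcc.
path_for_features() {
    features=" $1 "
    result=$2
    case $features in
        *" distcc "*) result=/usr/lib/distcc:$result ;;  # assumed dir
    esac
    case $features in
        *" ccache "*) result=/usr/lib/ccache:$result ;;
    esac
    printf '%s\n' "$result"
}

# Usage:
path_for_features "ccache distcc" "/usr/bin:/bin"
```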

The ccache ebuild just changes where the wrapper symlinks are stored
to /usr/lib/ccache from /usr/bin/ccache -- you can omit installing
this if you want to manually tweak the ebuild.sh.patch to change
/usr/lib/ccache back into /usr/bin/ccache.

