[distcc] distcc in disguise (a suggestion for all "wrappers")
Wayne Davison
wayned at users.sourceforge.net
Mon Feb 3 23:39:45 GMT 2003
I've been using distcc quite a bit lately because I've been playing
around with gentoo linux (and having a compile farm is the one thing
that makes the compiling of larger packages happen in a reasonable
amount of time on gentoo). Why disguise distcc as gcc (etc.)? I've
noticed several packages that do not honor the setting of CC or mess
up if it is set to something weird. I'm currently using the gentoo
wrapper setup that was previously mentioned on this list, but I'd like
to see something cleaner -- something that could be implemented for
all compiler-munging packages (distcc, ccache, etc.) and would let
each user choose what to use and in what order to use it.
Here's my idea:
We create a new idiom for each gcc-in-disguise package that describes
how it munges the PATH before running the next step in the chain. The
idea is that we strip all directories from the start of the PATH up to
and including the directory in which our executable was found. This
means that each package's set of scripts needs to go into its own
directory, but this restriction makes it easy for the user to
configure things fully.
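As a rough illustration of the stripping rule (a Python sketch, not
the code in my patch; the function name strip_path_through is made up
for this example, and it assumes a ":"-separated POSIX PATH):

```python
import os

SEP = ":"  # POSIX PATH separator, as used in the examples in this message


def strip_path_through(path, wrapper_dir):
    """Return PATH with every entry up to and including wrapper_dir removed.

    If wrapper_dir does not appear on the PATH, the PATH is returned
    unchanged (an assumption of this sketch; a real wrapper might
    prefer to fail instead).
    """
    entries = path.split(SEP)
    target = os.path.normpath(wrapper_dir)
    for i, entry in enumerate(entries):
        # Empty entries mean "current directory" and never match a wrapper dir.
        if entry and os.path.normpath(entry) == target:
            return SEP.join(entries[i + 1:])
    return path


full = "/usr/lib/ccache:/usr/lib/colorgcc:/usr/lib/distcc:/usr/bin:/bin"
print(strip_path_through(full, "/usr/lib/ccache"))
```

A wrapper would export the returned value as its new PATH and then
run the same command name it was invoked as.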
For example:
    PATH=/usr/lib/ccache:/usr/lib/colorgcc:/usr/lib/distcc:/usr/bin:/bin
    export PATH
    gcc -c foo.c
When the command is run, it finds gcc in /usr/lib/ccache, which (if it
needs to run something) strips the PATH through its own directory,
exports the result, and then calls gcc. This results in the next gcc in
the PATH getting called: colorgcc's here, and then the distcc version
(this presumes that distcc gets some gcc-cloaking logic -- see below),
which calls "gcc" with everything through the distcc directory stripped
from the start, and so on. Eventually the gcc in /usr/bin gets called
(or wherever gcc actually lives).
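To make the walk down the chain concrete, here is a small Python
sketch that builds throwaway directories containing stub "gcc" files
and follows the lookups the way the wrappers would. Everything in it
(make_stub, trace_chain, the temp directory layout) is invented for
the illustration; it only models the PATH handling and never runs a
compiler. It assumes a POSIX system.

```python
import os
import shutil
import tempfile


def make_stub(directory, name):
    """Create an empty executable script called `name` in `directory`."""
    os.makedirs(directory, exist_ok=True)
    stub = os.path.join(directory, name)
    with open(stub, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(stub, 0o755)


def trace_chain(command, path, wrapper_dirs):
    """Return the directories in which `command` is found, in call order.

    Each hit inside a wrapper directory strips the PATH through that
    directory and looks the command up again; the first hit outside
    the wrapper directories is treated as the real compiler.
    """
    visited = []
    while True:
        found = shutil.which(command, path=path)
        if found is None:
            return visited
        found_dir = os.path.dirname(found)
        visited.append(found_dir)
        if found_dir not in wrapper_dirs:
            return visited
        entries = path.split(os.pathsep)
        path = os.pathsep.join(entries[entries.index(found_dir) + 1:])


with tempfile.TemporaryDirectory() as root:
    names = ["ccache", "colorgcc", "distcc", "bin"]
    dirs = {n: os.path.join(root, n) for n in names}
    for d in dirs.values():
        make_stub(d, "gcc")
    path = os.pathsep.join(dirs[n] for n in names)
    wrappers = {dirs[n] for n in names[:3]}  # "bin" holds the real gcc
    chain = [os.path.basename(d) for d in trace_chain("gcc", path, wrappers)]
    print(" -> ".join(chain))  # ccache -> colorgcc -> distcc -> bin
```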
This logic improves on the current wrapper ideas I've seen around in
the following ways:
- Different packages don't need to know about each other -- it's up
to the user to choose the right PATH sequence, not the wrapper
writer.
- The wrapper doesn't have to search for the next item it's calling
and try to figure out if it's us or not -- we just tweak the PATH
and go. (This would simplify ccache, for instance.)
- A package can intercept only some of the gcc/cc/etc. command names
(by not having a full set of compiler names in its dir) and things
still behave properly (since the one that gets found strips off all
the starting dirs down to the directory where it was found, this
ensures that the next-called command continues on in the PATH
chain).
- The user can configure something like colorgcc into the chain
wherever they like, potentially adding extra meaning to the colors.
For instance, in the above setup, the colored compiler output is
cached into ccache, potentially allowing you to switch colors on
each successive run. It would also be possible to put colorgcc
after distcc (and in the path of each distccd) and have different
colors for each host's output.
- We avoid the use of absolute pathnames on the command-line, which
can currently cause problems for distcc between hosts that have the
compiler in different places.
- We configure how it all works based on one standard environment
variable, allowing each person to customize their setup and
avoiding any need for root permissions to tweak /etc files.
- The wrapper is actually ccache/distcc/colorgcc in disguise, so
there's no extra cost to running each bit like there would be with
one or more separate wrapper programs. I.e., even if a single wrapper
script created the command "ccache colorgcc distcc -c foo.c" in one
go and ran it, that's still one more program-run/exec than just
having each piece pass the "gcc -c foo.c" on to whoever is next.
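The partial-interception point in the list above can be checked with
a tiny in-memory model. This is only a sketch: the contents table
(which command names live in which directory) is made up for the
example, with colorgcc providing "gcc" but not "cc", and the helper
names are hypothetical.

```python
def next_hop(command, path_entries, contents):
    """Return (dir, rest) for the first PATH entry that provides command.

    `rest` is the PATH with everything through that directory stripped,
    including earlier directories that did not provide the command.
    """
    for i, d in enumerate(path_entries):
        if command in contents.get(d, set()):
            return d, path_entries[i + 1:]
    return None, []


def trace(command, path_entries, contents, real_dir):
    """List the directories visited until the real compiler is reached."""
    hops = []
    while path_entries:
        d, path_entries = next_hop(command, path_entries, contents)
        if d is None:
            break
        hops.append(d)
        if d == real_dir:
            break
    return hops


# Assumed layout: colorgcc wraps only "gcc", the others wrap both names.
contents = {
    "/usr/lib/ccache": {"gcc", "cc"},
    "/usr/lib/colorgcc": {"gcc"},
    "/usr/lib/distcc": {"gcc", "cc"},
    "/usr/bin": {"gcc", "cc"},
}
path = ["/usr/lib/ccache", "/usr/lib/colorgcc", "/usr/lib/distcc",
        "/usr/bin", "/bin"]

print(trace("gcc", path, contents, "/usr/bin"))
print(trace("cc", path, contents, "/usr/bin"))
```

In this model "cc" simply skips the colorgcc directory and the chain
still ends at /usr/bin, with no package knowing about any other.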
So, what do you think? Any downside I'm missing?
I've created a patch for distcc-1.1 that implements this (which should
apply to the CVS version without problem, I think):
http://www.clari.net/~wayne/wrapper.patch
[Non-gentoo users can tune out now, if they like.]
I've also created a few things for gentoo linux:
- A new distcc ebuild:
http://www.clari.net/~wayne/distcc-1.1-r1.tar.bz2
- A new ccache ebuild:
http://www.clari.net/~wayne/ccache-2.1.1-r1.tar.bz2
- An ebuild.sh patch:
http://www.clari.net/~wayne/ebuild.sh.patch
The ebuild.sh patch just tweaks the PATH to call ccache and distcc if
they are enabled in FEATURES.
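The effect is roughly the following, modeled here in Python rather
than in the shell code of the actual patch. The function name and
the ccache-before-distcc ordering are assumptions of this sketch
(the ordering matches the PATH example earlier in this message); the
directories are the ones mentioned above.

```python
def wrapper_path(features, base_path):
    """Prepend the wrapper directories enabled in FEATURES to base_path.

    `features` is a whitespace-separated string like Gentoo's FEATURES.
    """
    # (feature name, wrapper directory), in the order they should run.
    wrapper_dirs = [
        ("ccache", "/usr/lib/ccache"),
        ("distcc", "/usr/lib/distcc"),
    ]
    enabled = set(features.split())
    parts = [d for name, d in wrapper_dirs if name in enabled]
    if base_path:
        parts.append(base_path)
    return ":".join(parts)


print(wrapper_path("ccache distcc sandbox", "/usr/bin:/bin"))
```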
The ccache ebuild just changes where the wrapper symlinks are stored
to /usr/lib/ccache from /usr/bin/ccache -- you can omit installing
this if you want to manually tweak the ebuild.sh.patch to change
/usr/lib/ccache back into /usr/bin/ccache.
..wayne..