[distcc] Re: distributed compilation with gcc: distcc

Alexandre Oliva oliva at lsd.ic.unicamp.br
Tue Jul 23 05:46:01 GMT 2002


On Jul 22, 2002, Martin Pool <mbp at samba.org> wrote:

> I suspect the easiest thing will be to share the builddir over NFS.

I agree.  I've been playing with distcc (along with ccache) and found
it to significantly speed up builds of gcc and gdb using 3-6 machines
at home (one of my favorite examples was that an all-gcc all-gdb build
went down from 7 to just short of 2 minutes, using 4 build machines,
and that without ccache!)

There are two problems with extending this to typical builds of gcc:

- when we build target libraries, we use the just-built xgcc as the
  compiler (except in the case of Canadian crosses), but the
  definition of CC_FOR_TARGET does not contain distcc in it, and it's
  not easy to put it in.  This means all of libgcc, libstdc++,
  libjava, etc, get compiled on the build machine.  If you have a
  distcc farm and tell make -j to use that many machines, your build
  machine will thrash when it gets to the point of building target
  libraries, because it won't be able to share the work with other
  machines.  In the case of Canadian crosses, the problem may get more
  complicated, because the PATH (to which one has presumably added the
  directory containing the cross tools necessary to build the Canadian
  cross) is not passed to the remote distccd, so it is likely to fail
  to find the cross tool.

- when we bootstrap gcc natively, we use stage1/xgcc and stage2/xgcc
  (with relative pathnames) to build the next stage, so distcc can't
  be used as it is now, since it expects compiler pathnames to be
  found in the PATH; it does not `cd' to a directory remotely before
  starting the compiler.  Also, the CC passed to every stage's build
  does not contain distcc (*), and there's no easy way to do it right
  now.  So, the same thrashing problem occurs when we bootstrap gcc
  with make -jN.


My suggestion to alleviate this problem involves 3 tasks:

- add a flag to distcc (and modify its protocol accordingly) to enable
  it to send a (partial?) PATH over to the daemon, such that one
  doesn't have to restart the daemon remotely to find cross compilers
  for Canadian crosses.  e.g., distcc -P /my/toolchain/bin would
  make sure the intended compiler is used for the entire build.

- add a flag to distcc (and modify its protocol accordingly) to tell
  distccd to chdir to given directory before running the command.
  Ideally, this flag should take an argument that specifies a
  transform pattern to be applied to the CWD, such that say a local
  pathname can be turned into a network-visible automounted pathname
  (e.g., distcc -C s:^:/net/`uname -n`: causes /local/tmp/mybuild/gcc
  to be referenced as /net/<hostname>/local/tmp/mybuild/gcc, assuming
  /net is where automount or amd does host mounts).  In some cases,
  such as pathnames that already are network-uniform (a shared /home),
  this would not be necessary, so we may want to decouple the request
  to chdir from pathname transforms, but I don't think it's worth it.

- introduce hooks in the GCC Makefiles to let one set prefixes for
  CC_FOR_TARGET and CC for stages, such that distcc can be easily
  prepended, along with the additional flags introduced above.  The
  chdir flag would be used for native builds and bootstraps, as well
  as non-Canadian crosses (though it wouldn't hurt Canadian crosses
  either), whereas the PATH flag would be mostly useful for Canadian
  crosses, but it would also help all other builds by ensuring that
  the intended toolchain is used for the initial build.
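
To make the pathname transform in the second task concrete, here is a
rough sketch in Python of how a sed-style expression could be applied
to the CWD before it is sent to the daemon.  Nothing here exists in
distcc: the -C flag is only the proposal above, and the function
names and the host name "buildhost" are made up for illustration.
Escaped delimiters inside the expression are not handled.

```python
import re


def parse_sed_substitution(expr):
    """Split 's<delim>PATTERN<delim>REPLACEMENT<delim>' into its two parts.

    Hypothetical helper: only the plain form with a trailing delimiter and
    no flags is accepted, and escaped delimiters are not supported.
    """
    if len(expr) < 4 or expr[0] != "s":
        raise ValueError("expected s<delim>PATTERN<delim>REPLACEMENT<delim>")
    delim = expr[1]
    parts = expr[2:].split(delim)
    if len(parts) != 3 or parts[2] != "":
        raise ValueError("malformed substitution: %r" % expr)
    return parts[0], parts[1]


def transform_cwd(cwd, expr):
    """Apply the substitution once to a local directory name.

    The replacement is inserted literally (via a callable), so backslashes
    or group references in it are not interpreted.
    """
    pattern, replacement = parse_sed_substitution(expr)
    return re.sub(pattern, lambda match: replacement, cwd, count=1)


if __name__ == "__main__":
    # "buildhost" stands in for the output of `uname -n`.
    print(transform_cwd("/local/tmp/mybuild/gcc", "s:^:/net/buildhost:"))
```

With a network-uniform pathname, an empty replacement such as s:^::
leaves the directory unchanged, which is one way to request the chdir
without any real transform.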


* nor ccache, but that would be pointless: the timestamp change in the
  compiler driver invalidates the cache; but then, when using ccache
  and distcc, ccache thinks distcc is the compiler driver, so I think
  we may actually end up with false matches.  Perhaps there should be
  a way to compute a checksum of the compiler file and give ccache
  that checksum on the command line, to be used instead of taking
  info from the compiler driver.
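
  As a rough illustration of the checksum idea (not how ccache works
  today, and not an existing ccache option), the sketch below hashes
  the contents of the compiler file and mixes that into a cache key.
  The function names and the choice of SHA-256 are my own assumptions.

```python
import hashlib


def compiler_checksum(path, chunk_size=65536):
    """Hash the contents of the compiler file, ignoring its timestamp."""
    digest = hashlib.sha256()
    with open(path, "rb") as compiler:
        for chunk in iter(lambda: compiler.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(compiler_sum, args, preprocessed_source):
    """Combine the compiler checksum, the arguments and the preprocessed
    source (bytes) into one key.  Hypothetical layout, for illustration."""
    digest = hashlib.sha256()
    digest.update(compiler_sum.encode())
    digest.update(b"\0")
    for arg in args:
        digest.update(arg.encode())
        digest.update(b"\0")
    digest.update(preprocessed_source)
    return digest.hexdigest()
```

  A key built this way changes when the compiler contents change and
  stays the same when only the timestamp does, whatever wrapper
  (distcc or otherwise) ends up being invoked as the driver.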

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                 aoliva@{redhat.com, gcc.gnu.org}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist                Professional serial bug killer



