[distcc] Re: distributed compilation with gcc: distcc
Alexandre Oliva
oliva at lsd.ic.unicamp.br
Tue Jul 23 05:46:01 GMT 2002
On Jul 22, 2002, Martin Pool <mbp at samba.org> wrote:
> I suspect the easiest thing will be to share the builddir over NFS.
I agree. I've been playing with distcc (along with ccache) and found
it to significantly speed up builds of gcc and gdb using 3-6 machines
at home (one of my favorite examples was that an all-gcc all-gdb build
went down from 7 to just short of 2 minutes, using 4 build machines,
and that without ccache!)
There are two problems with extending this to typical builds of gcc:
- when we build target libraries, we use the just-built xgcc as the
compiler (except in the case of Canadian crosses), but the
definition of CC_FOR_TARGET does not contain distcc in it, and it's
not easy to put it in. This means all of libgcc, libstdc++,
libjava, etc, get compiled on the build machine. If you have a
distcc farm and tell make -j to use that many machines, your build
machine will thrash when it gets to the point of building target
libraries, because it won't be able to share the work with other
machines. In the case of Canadian crosses, the problem may get more
complicated, because the PATH (to which one has presumably added the
directory containing the cross tools necessary to build the Canadian
cross) is not passed to the remote distccd, so it is likely to fail
to find the cross tool.
- when we bootstrap gcc natively, we use stage1/xgcc and stage2/xgcc
(with relative pathnames) to build the next stage, so distcc can't
be used as it is now, since it expects compiler pathnames to be
found in the PATH; it does not `cd' to a directory remotely before
starting the compiler. Also, the CC passed to every stage's build
does not contain distcc (*), and there's no easy way to do it right
now. So, the same thrashing problem occurs when we bootstrap gcc
with make -jN.
My suggestion to alleviate this problem involves 3 tasks:
- add a flag to distcc (and modify its protocol accordingly) to enable
it to send a (partial?) PATH over to the daemon, such that one
doesn't have to restart the daemon remotely to find cross compilers
for Canadian crosses. E.g., distcc -P /my/toolchain/bin would
make sure the intended compiler is used for the entire build.
- add a flag to distcc (and modify its protocol accordingly) to tell
distccd to chdir to a given directory before running the command.
Ideally, this flag should take an argument that specifies a
transform pattern to be applied to the CWD, such that say a local
pathname can be turned into a network-visible automounted pathname
(e.g., distcc -C s:^:/net/`uname -n`: causes /local/tmp/mybuild/gcc
to be referenced as /net/<hostname>/local/tmp/mybuild/gcc, assuming
/net is where automount or amd does host mounts). In some cases,
such as pathnames that already are network-uniform (a shared /home),
this would not be necessary, so we may want to decouple the request
to chdir from pathname transforms, but I don't think it's worth it.
- introduce hooks in the GCC Makefiles to let one set prefixes for
CC_FOR_TARGET and CC for stages, such that distcc can be easily
prepended, along with the additional flags introduced above. The
latter flag (chdir) would be used for native builds and bootstraps,
as well as non-Canadian crosses (even though it wouldn't hurt
Canadian crosses either), whereas the former (PATH) would be mostly
useful for Canadian crosses, but it would also help all other builds
by ensuring that the intended toolchain is used for the initial build.
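
To make the CWD transform in the second task concrete, here is a minimal sketch of how a sed-style pattern such as s:^:/net/<hostname>: could be applied to the local working directory before it is sent to the daemon. It is only an illustration: the -C flag is a proposal and does not exist in distcc, the function names are hypothetical, "buildhost" stands in for the output of `uname -n`, and Python's regex syntax is used in place of sed's.

```python
import re


def parse_transform(spec):
    """Split a sed-style 's<delim>PATTERN<delim>REPLACEMENT<delim>' spec.

    Hypothetical helper: the proposed -C flag is not an existing distcc
    option, so this argument format is an assumption.
    """
    if len(spec) < 4 or spec[0] != "s":
        raise ValueError("expected a spec of the form s:PATTERN:REPLACEMENT:")
    delim = spec[1]
    parts = spec[2:].split(delim)
    if len(parts) != 3 or parts[2] != "":
        raise ValueError("malformed transform: %r" % spec)
    return parts[0], parts[1]


def remote_cwd(local_cwd, spec):
    """Return the directory the daemon would chdir to for local_cwd."""
    pattern, replacement = parse_transform(spec)
    # Replace only the first match, as sed does without the 'g' flag.
    # The lambda keeps the replacement literal (no backslash expansion).
    return re.sub(pattern, lambda m: replacement, local_cwd, count=1)


if __name__ == "__main__":
    # "buildhost" is a placeholder for the local host name.
    print(remote_cwd("/local/tmp/mybuild/gcc", "s:^:/net/buildhost:"))
```

With a network-uniform directory such as a shared /home, the same call with an identity transform (for example s:^::) would leave the pathname unchanged, which is why coupling chdir and the transform costs little.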
* nor ccache, but that would be pointless: the timestamp change in the
compiler driver invalidates the cache; but then, when using ccache
and distcc, ccache thinks distcc is the compiler driver, so I think
we may actually end up with false matches. Perhaps there should be
a way to tell ccache to compute a checksum of a compiler file, and
to give it that checksum on the command line, to be used instead of
taking info from the compiler driver.
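
As a rough sketch of the checksum idea in the footnote: hashing the contents of the compiler binary gives an identity that does not change when only the timestamp changes, and that does change when the compiler itself does. The function name and the choice of SHA-256 are assumptions for illustration; how such a value would be handed to ccache is left open, since the option is only being proposed here.

```python
import hashlib


def compiler_checksum(path, chunk_size=65536):
    """Return a hex digest of the file contents at path.

    Hypothetical helper: the digest depends only on the bytes of the
    compiler binary, not on its timestamp or on which wrapper runs it.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large compiler binary is not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For a bootstrap one would call it on something like stage1/xgcc after each stage is built, so that a rebuilt but identical compiler still yields the same value.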
--
Alexandre Oliva Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer aoliva@{redhat.com, gcc.gnu.org}
CS PhD student at IC-Unicamp oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist Professional serial bug killer