[distcc] High parallelism and local FS mounts

Dan Kegel dank at kegel.com
Wed Feb 8 17:34:23 GMT 2006


On 2/8/06, Syed U. Aqeel <aqeesye at iit.edu> wrote:
> I'd like to try distributing a large build over about 80 nodes. Each
> node has an identical build and I have mounted the file system with my
> source code (and where I'd like build products to be delivered) on each
> machine.
>
> Is it possible to get distcc to access the source locally instead of
> zipping it across with SSH? It seems to me that the latency involved in
> shipping so many tiny source files across the network and then putting
> together the build products would be substantial, particularly as the
> number of nodes available grows.

When you say "access the source locally", I take it you really mean
"access the source over the network file system".  Network file
systems, it turns out, do not perform especially well for tasks
like parallel builds.

When you say "zipping it across with SSH" and "shipping so many tiny
source files across the network", I take it you really mean
"streaming the preprocessed source over a TCP connection".
In other words, distcc doesn't use SSH (unless you tell it to),
and it never sends individual tiny source files.
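
To make the TCP-versus-SSH point concrete, here is a rough Python sketch
of how a host list for DISTCC_HOSTS might be put together.  It assumes
the host syntax described in the distcc man page: a bare host name means
a plain TCP connection, a leading "@" asks for SSH, and an optional
"/N" suffix limits the number of jobs sent to that host.  The node
names and the helper functions are made up for illustration.

```python
def host_spec(host, use_ssh=False, limit=None):
    """Format one DISTCC_HOSTS entry (hypothetical helper).

    A bare name selects plain TCP; a leading '@' selects SSH.
    An optional '/N' suffix caps the jobs sent to that host.
    """
    spec = ("@" + host) if use_ssh else host
    if limit is not None:
        spec += "/" + str(limit)
    return spec


def distcc_hosts(hosts, use_ssh=False, limit=None):
    """Join the entries into one space-separated DISTCC_HOSTS value."""
    return " ".join(host_spec(h, use_ssh, limit) for h in hosts)


# Made-up node names, purely for illustration.
nodes = ["node%02d" % i for i in range(1, 4)]

print(distcc_hosts(nodes))                         # plain TCP entries
print(distcc_hosts(nodes, use_ssh=True, limit=4))  # SSH entries, 4 jobs each
```

The first value is the default case described above (no SSH involved);
the second is what "telling it to" use SSH would look like.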

I use a similarly sized cluster.  The primary advantage of having
more than 20 nodes is that it allows multiple users to use the
cluster at the same time without overloading it.  For best results,
though, you need to apply one of the randomize or load-balancing
patches to distcc.
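
As a rough illustration of why randomizing helps (this is not one of
the patches themselves, just a client-side approximation): distcc tends
to prefer hosts that appear earlier in DISTCC_HOSTS, so if every user
has the same list, the first few nodes get most of the work.  A small
Python sketch that gives each user a differently ordered list might
look like this.  The node names and the function name are hypothetical.

```python
import random


def shuffled_hosts(hosts, seed=None):
    """Return a space-separated host list in random order.

    Passing a seed makes the order reproducible; leaving it as None
    gives a different order on each run.
    """
    rng = random.Random(seed)
    order = list(hosts)   # copy, so the caller's list is untouched
    rng.shuffle(order)
    return " ".join(order)


# 80 made-up node names, matching the cluster size in the question.
nodes = ["node%02d" % i for i in range(1, 81)]

value = shuffled_hosts(nodes)
print("export DISTCC_HOSTS='%s'" % value)
```

Each user would run something like this once per build and use the
printed line in their shell before invoking make.  It only spreads the
starting point around; it does not measure load the way a real
load-balancing patch would.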

You can't tell distcc to read the files locally.  For that, you'd
need to use a different distributed build system, e.g. dmake.
http://tools.openoffice.org/dmake/
I don't know how compatible dmake is with normal make.

- Dan

--
Wine for Windows ISVs: http://kegel.com/wine/isv

