[distcc] Re: keeping localhost busy

Scott Lystig Fritchie nospam at snookles.com
Mon Sep 29 17:49:26 GMT 2003


>>>>> "mp" == Martin Pool  <mbp at samba.org> writes:

>> On 26 Sep 2003, Jeff <rustysawdust at yahoo.com> wrote:
>> Last month there was an interesting thread entitled "distcc over
>> slow net links". I have a similar problem in that two of the
>> "animals" on my farm have very old CPUs (although they are on a
>> local 100baseT network).  [...]

I have the same situation: my distcc servers are heterogeneous hardware,
some much slower than others.  AFAIK, the best thing you can do
is to put the slowest machines at the end of your DISTCC_HOSTS list.

Many months ago I wrote a TCP load balancer specifically for use with
distcc.  It gives jobs to the fastest distcc server that is currently
idle.  Hosts are configured in fastest-to-slowest order.  Using the
balancer, several developers in the same office get fair access to
the fastest distcc servers available at that instant, rather than all
developers using the same "DISTCC_HOSTS='box1:4 box2:4 box3:2 ...'"
and grinding box1 into dust while leaving the others idle.  It
also avoids naive round-robin assignment where a job is given to slow
"box8" when a faster "box2" is currently idle.

See http://www.snookles.com/erlang/tcpbalance/ for details.
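
As a rough illustration of the selection policy described above (not
the tcpbalance code itself; see the URL for the real thing), here is a
minimal Python sketch.  The Backend class, the per-host "slots" limit,
and the function names are all hypothetical, chosen only to show
"first idle host in fastest-to-slowest order" versus round-robin.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    slots: int      # assumed per-host job limit, like the ":4" in DISTCC_HOSTS
    busy: int = 0   # jobs currently running on this host


def pick_backend(backends):
    """Return the first backend with a free slot, or None if all are full.

    `backends` must be listed fastest-to-slowest, so the first match is
    the fastest host that is available right now.
    """
    for b in backends:
        if b.busy < b.slots:
            return b
    return None


def assign(backends):
    """Reserve a slot on the fastest available backend and return it."""
    b = pick_backend(backends)
    if b is not None:
        b.busy += 1
    return b


def release(backend):
    """Give a slot back when a job finishes."""
    backend.busy -= 1
```

With three single-slot hosts listed as box1, box2, box8, two jobs go
to box1 and box2; if box1 finishes first, the next job goes back to
box1 rather than to the slow box8, which is where this differs from
round-robin.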

mp> Timothee suggested killing the job on B and re-running it on
mp> localhost, but for at least this case it would be wasteful because
mp> B is as fast as localhost.  For C++ code, transit time is
mp> relatively small.

This is the path to madness ... or the path to a great deal of
complexity: e.g. keeping track of past compilation times for certain
files to know whether it would be more profitable to abort the remote
compilation and restart it locally.  Ick.

mp> I think the real problem here is that recursive Make is harmful.
mp> The correct fix would be for Make to start additional jobs while
mp> it is waiting for B.

I agree: recursive make is harmful.  That is expressly the reason why
I have been using SCons, http://www.scons.org/, to replace a *very*
large recursive Make build scheme (many millions of lines of C & C++
code).  SCons builds a single dependency tree and can walk it in
parallel and keep multiple CPUs/distcc backends busy regardless of how
the source is laid out.  I understand (but have not verified) that
Boost Jam's syntax is Make-like but creates a single dependency tree.
You can even create a single dependency tree using plain Make (though
it's difficult to do well).  IMO, if you don't have a global view of
dependencies, you won't be able to fully exploit the parallelism of
"make -j", or whatever tool you're using.
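
To make the "global view" point concrete, here is a small Python
sketch (it needs Python 3.9+ for graphlib).  It is not how SCons or
Make are implemented; the target names and the parallel_waves helper
are made up for illustration.  Given one dependency graph for the
whole tree, it lists which targets could be built at the same time.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps):
    """Group targets into waves that could be built concurrently.

    `deps` maps each target to the list of targets it depends on.
    Every target in a wave has all of its dependencies in earlier waves.
    """
    ts = TopologicalSorter(deps)
    ts.prepare()
    waves = []
    while ts.is_active():
        ready = sorted(ts.get_ready())  # everything buildable right now
        waves.append(ready)
        ts.done(*ready)                 # pretend the whole wave finished
    return waves


# Hypothetical two-directory project: lib/ and app/.
deps = {
    "lib/a.o": [],
    "lib/b.o": [],
    "app/main.o": [],
    "lib/libfoo.a": ["lib/a.o", "lib/b.o"],
    "app/prog": ["app/main.o", "lib/libfoo.a"],
}

for wave in parallel_waves(deps):
    print(wave)
```

The first wave contains app/main.o alongside lib/a.o and lib/b.o.  A
recursive build that finishes lib/ before entering app/ cannot overlap
those jobs, because neither sub-make knows about the other's targets.
A real scheduler would start each job as soon as its own dependencies
finish instead of waiting for a whole wave; the waves just keep the
sketch short.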

-Scott


