[distcc] keeping localhost busy

Martin Pool mbp at samba.org
Mon Sep 29 00:51:15 GMT 2003


On 26 Sep 2003, Jeff <rustysawdust at yahoo.com> wrote:
> Last month there was an interesting thread entitled "distcc over slow net 
> links". I have a similar problem in that two of the "animals" on my farm 
> have very old CPUs (although they are on a local 100baseT network). On a 
> real farm these animals would probably be turned into glue or hamburger, 
> but on a distcc farm I think that's animal cruelty! :)

In fact the same problem can happen even if all of the machines are
the same speed.  I can see the problem pretty clearly when building
OpenPegasus, which uses a recursive Make through many directories and
has some C++ files which are very slow to build.  It spends a lot of
time blocked waiting for large files to complete.
 
> Here's a quote from the original thread posted by Timothee:
> 
> >What I'm thinking, is that once local hosts are starved, distcc should
> >find out that there is stuff running on slow hosts, and dupe the compile
> >work on the local hosts, sending back whatever finishes first.
> 
> The thread went into some interesting discussions on how to choose a faster 
> machine, and how distcc might be able to keep track of the speeds of each 
> server. In the case of this email, I believe "local hosts" meant the hosts 
> on his fast local network. But what I'd really like to do is to simply keep 
> "localhost" busy. That wouldn't require any of the additional complexity of 
> tracking the power of each server, and it also would be a little bit more 
> friendly if more than one person was trying to use the farm at a time.
> 
> The project I'm working on consists of over 80 libraries, and some of them 
> are quite large. As make gets towards the end of each library, I often see 
> those two slow machines drag out the end of the build for 20 or 30 seconds. 
> With this many libraries, each one of those pauses really starts to hurt. 
> Using distcc's excellent new graph tool, it becomes especially obvious when 
> the fast hosts have all "scrolled off to white" and you see the two green 
> bars remaining.

Maybe you should build more than one library at a time, so that even
when reaching the end of any one build, there are several jobs that
can be run?

> I haven't yet looked at the distcc source, so I'm not sure how complex it 
> might be to implement a solution to keep the localhost busy. Because I'm 
> not familiar with the architecture I'm probably not the best person to 
> design the solution, but I'd be more than happy to help try to implement it 
> if there's interest.

You don't need to look at the source.  Just tell me if you can think
of a better algorithm.  Once you can describe it in English or
pseudocode, translating it into C is fairly simple.

At least in the example I'm looking at: a little while before the end
of the build, five distcc processes get started by Make to build five
C++ files.  So we start two locally, two on machine A, one on machine
B.  It turns out that the file sent to machine B takes a very long
time to build because the code is more complex.  Until all five tasks
complete, Make doesn't start any more jobs, so localhost and A sit
idle.

Timothee suggested killing the job on B and re-running it on
localhost, but for at least this case it would be wasteful because B
is as fast as localhost.  For C++ code, transit time is relatively
small.
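
The arithmetic behind that can be written down in a few lines.  This
is a back-of-the-envelope sketch, not a proposed distcc feature: the
function name and parameters are hypothetical, the total compile times
are assumed to be known in advance (which in practice they are not),
and transit time is ignored because it is small for C++ code.

```python
def restart_helps(elapsed, remote_total, local_total):
    """Return True if killing the remote job now and re-running it
    locally would finish sooner than leaving it alone.

    elapsed:      time the job has already spent on the remote machine
    remote_total: total time the job needs on the remote machine
    local_total:  total time the job would need on localhost

    All three values are assumed to be known; transit time is ignored.
    """
    # Left alone, the job finishes at remote_total.  Restarted now, it
    # throws away the work done so far and finishes at
    # elapsed + local_total.
    return elapsed + local_total < remote_total


# B is as fast as localhost: restarting can never win.
print(restart_helps(elapsed=5, remote_total=10, local_total=10))  # False
# A much slower remote machine: restarting early can pay off.
print(restart_helps(elapsed=2, remote_total=30, local_total=10))  # True
```

When the two machines are equally fast, elapsed + local_total is never
smaller than remote_total, so the restart only discards work.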

I think the real problem here is that recursive Make is harmful.  The
correct fix would be for Make to start additional jobs while it is
waiting for B.

-- 
Martin 


