[distcc] keeping localhost busy
Martin Pool
mbp at samba.org
Mon Sep 29 00:51:15 GMT 2003
On 26 Sep 2003, Jeff <rustysawdust at yahoo.com> wrote:
> Last month there was an interesting thread entitled "distcc over slow net
> links". I have a similar problem in that two of the "animals" on my farm
> have very old CPUs (although they are on a local 100baseT network). On a
> real farm these animals would probably be turned into glue or hamburger,
> but on a distcc farm I think that's animal cruelty! :)
In fact the same problem can happen even if all of the machines are
the same speed. I can see the problem pretty clearly when building
OpenPegasus, which uses a recursive Make through many directories and
has some C++ files which are very slow to build. It spends a lot of
time blocked waiting for large files to complete.
> Here's a quote from the original thread posted by Timothee:
>
> >What I'm thinking, is that once local hosts are starved, distcc should
> >find out that there is stuff running on slow hosts, and dupe the compile
> >work on the local hosts, sending back whatever finishes first.
>
> The thread went into some interesting discussions on how to choose a faster
> machine, and how distcc might be able to keep track of the speeds of each
> server. In the case of this email, I believe "local hosts" meant the hosts
> on his fast local network. But what I'd really like to do is to simply keep
> "localhost" busy. That wouldn't require any of the additional complexity of
> tracking the power of each server, and it also would be a little bit more
> friendly if more than one person was trying to use the farm at a time.
>
> The project I'm working on consists of over 80 libraries, and some of them
> are quite large. As make gets towards the end of each library, I often see
> those two slow machines drag out the end of the build for 20 or 30 seconds.
> With this many libraries, each one of those pauses really starts to hurt.
> Using distcc's excellent new graph tool, it becomes especially obvious when
> the fast hosts have all "scrolled off to white" and you see the two green
> bars remaining.
Maybe you should build more than one library at a time, so that even
when reaching the end of any one build, there are several jobs that
can be run?
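A rough back-of-the-envelope model of why those tail stalls matter (all numbers hypothetical, loosely based on the figures in the mail above: 80-odd libraries, 20-30 seconds of stall each):

```python
# Toy model: each library's build ends with a ~25 s stall while the
# fast hosts wait for the two slow machines. If the libraries are
# built strictly one after another, the stalls simply add up;
# overlapping two library builds would let one library's jobs fill
# the other's idle tail.
libraries = 80          # "over 80 libraries" from the mail
stall_per_library = 25  # hypothetical midpoint of the 20-30 s stalls
serial_overhead = libraries * stall_per_library
print(serial_overhead)  # seconds of idle time across the whole build
```

With these (made-up) numbers the serial stalls alone cost over half an hour of wall-clock time, which is why overlapping library builds helps even without touching distcc itself.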
> I haven't yet looked at the distcc source, so I'm not sure how complex it
> might be to implement a solution to keep the localhost busy. Because I'm
> not familiar with the architecture I'm probably not the best person to
> design the solution, but I'd be more than happy to help try to implement it
> if there's interest.
You don't need to look at the source. Just tell me if you can think
of a better algorithm. Once you can describe it in English or
pseudocode, translating it into C is fairly simple.
At least in the example I'm looking at: a little while before the end
of the build, five distcc processes get started by Make to build five
C++ files. So we start two locally, two on machine A, one on machine
B. It turns out that the file sent to machine B takes a very long
time to build because the code is more complex. Until all five tasks
complete, Make doesn't start any more jobs, so localhost and A sit
idle.
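The idle time in that scenario is easy to quantify with a toy model (the per-job durations below are made up): Make launches five compiles and will not start another until all five return, so each slot sits idle from the moment its own job ends until the slowest job completes.

```python
# Hypothetical durations for the five-job wave described above.
durations = {             # seconds per compile job
    "localhost-1": 10,
    "localhost-2": 12,
    "machineA-1": 11,
    "machineA-2": 9,
    "machineB-1": 60,     # the complex C++ file sent to machine B
}
wave_end = max(durations.values())   # Make blocks until the last job
idle = {job: wave_end - d for job, d in durations.items()}
total_idle = sum(idle.values())
print(total_idle)   # capacity wasted, in slot-seconds, waiting on B
```

Here nearly 200 slot-seconds go to waste, even though localhost and A could have been chewing through the next wave of jobs the whole time.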
Timothee suggested killing the job on B and re-running it on
localhost, but for at least this case it would be wasteful because B
is as fast as localhost. For C++ code, transit time is relatively
small.
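For reference, Timothee's "dupe the compile, take whichever finishes first" idea sketches out roughly like this (distcc itself is written in C; this Python, and the helper names in it, are purely illustrative):

```python
# Sketch of speculative duplication: run the same compile on two
# hosts and accept the first result, ignoring the loser. The sleep
# is a stand-in for the actual compile time.
import concurrent.futures
import time

def compile_on(host, delay, result):
    time.sleep(delay)       # pretend to compile for `delay` seconds
    return (host, result)

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [
        pool.submit(compile_on, "machineB", 0.2, "foo.o"),    # slow
        pool.submit(compile_on, "localhost", 0.05, "foo.o"),  # fast
    ]
    first = next(concurrent.futures.as_completed(futures))
    host, obj = first.result()
    print(host)   # whichever host finished first
```

As noted above, when both hosts are equally fast this just burns one host's worth of work for no gain, which is the objection to it here.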
I think the real problem here is that recursive Make is harmful. The
correct fix would be for Make to start additional jobs while it is
waiting for B.
--
Martin