[distcc] Loadbalanced distcc

Victor Norman vtnpgh at yahoo.com
Thu Jan 5 15:44:53 GMT 2006

Dan, et al., 

Your simpler implementation has clear advantages and one disadvantage
compared with dmucs, which I'll enumerate.

Advantages:

o much simpler than dmucs
o requires less administrator configuration than dmucs: in dmucs the
  administrator has to somehow figure out which of his/her machines are
  the most "powerful" -- i.e., which can compile a file the fastest.
  This is not easy to compute.

Disadvantage:

o when doing large builds, the load average on a machine can change
  significantly, especially when multiple users are sharing the
  compilation farm.  Dmucs adapts to this situation automatically,
  whereas your solution won't, as I understand it, because it checks
  the servers only when the make job starts.

I would like to incorporate your approach into dmucs somehow, so that
the power and number of CPUs of a compilation server can be computed
automatically.  I'll put it on my to-do list.


Dan Kegel <dank at kegel.com> wrote:
On 1/5/06, Victor Norman wrote:
> The nice things about dmucs are:
> o if other engineers are using the compile machines for other tasks, the
> loadavg daemons will communicate the loads to the dmucs server, and those
> machines will be used less often for compiles -- so my system takes into
> account not just loadavg based on number of compiles on a machine, but on
> other factors as well.

This is a very good thing.
I have an alternative approach which achieves
part of this benefit without requiring a central server.
The idea is, the user invokes 'make' via a wrapper
that runs a client-side program which queries all
the servers in parallel to see how quickly they respond to a compile
request for 'hello, world'.  Servers that respond promptly
are added to the host list for the current make job.
This lets you avoid machines which are loaded down or otherwise having problems.
It's not as good as dmucs, but it's quite useful.
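
The probing idea above could be sketched roughly as follows.  This is
not Dan's actual wrapper, only an illustration under stated assumptions:
the function names, the 2-second threshold, and the 5-second timeout are
hypothetical, and it assumes a distcc client that honors the
DISTCC_HOSTS and DISTCC_FALLBACK environment variables.

```python
import concurrent.futures
import os
import subprocess
import tempfile
import time

HELLO_C = '#include <stdio.h>\nint main(void) { puts("hello, world"); return 0; }\n'


def probe_host(host, timeout=5.0):
    """Time one 'hello, world' compile sent to a single host.

    Returns elapsed seconds, or None if the compile failed or timed out.
    Assumption: DISTCC_FALLBACK=0 stops distcc from compiling locally.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "hello.c")
        with open(src, "w") as f:
            f.write(HELLO_C)
        env = dict(os.environ, DISTCC_HOSTS=host, DISTCC_FALLBACK="0")
        start = time.monotonic()
        try:
            result = subprocess.run(
                ["distcc", "gcc", "-c", src, "-o", os.path.join(tmp, "hello.o")],
                env=env,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except (OSError, subprocess.TimeoutExpired):
            return None
        if result.returncode != 0:
            return None
        return time.monotonic() - start


def pick_hosts(hosts, probe=probe_host, max_seconds=2.0):
    """Probe all hosts in parallel; return the prompt ones, fastest first."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        timings = list(pool.map(probe, hosts))
    fast = [(t, h) for h, t in zip(hosts, timings) if t is not None and t <= max_seconds]
    return [h for t, h in sorted(fast)]
```

A wrapper around make could then set DISTCC_HOSTS to
" ".join(pick_hosts(all_hosts)) before starting the build.  The host
list is fixed at that point, so load changes during a long build go
unnoticed.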

> o machines can be added and deleted from the compilation farm dynamically.
> For example, a powerful desktop machine on an engineer's desk can be added
> to the compilation farm when it goes into screen saver mode.  When the
> engineer comes back to his/her desk, the machine can be immediately removed
> from the farm.  I've tried this, and it works well.

I think you can get the same effect with my approach.
I also have a patch for distcc which rejects jobs when the load is
too high; I could extend it to reject jobs when a suitable screensaver
isn't running.
(Many screensavers use too much CPU power!)
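
The accept/reject decision described above might look something like
the sketch below.  It is not the distcc patch itself (that would live in
the server); the function names, the per-CPU load limit of 1.5, and the
screensaver_active flag are all assumptions made for illustration.

```python
import os


def should_accept_job(load1, ncpus, max_load_per_cpu=1.5, screensaver_active=True):
    """Decide whether a server should take another compile job.

    load1 is the 1-minute load average.  The job is rejected when the
    screensaver condition is not met or the load per CPU is too high.
    """
    if not screensaver_active:
        return False
    return load1 <= max_load_per_cpu * ncpus


def current_decision(max_load_per_cpu=1.5):
    """Apply the rule to this machine (os.getloadavg is Unix-only)."""
    load1, _, _ = os.getloadavg()
    return should_accept_job(load1, os.cpu_count() or 1, max_load_per_cpu)
```

How to detect a "suitable" screensaver is left out on purpose, since it
depends on the desktop environment.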

> o machines with differing "power" can all contribute to the compilation
> farm.  Each machine is assigned a "power index", and the most powerful,
> available, and not-overloaded machines will be given out for compiles.

That's something I haven't looked at.  (My servers are all equally powerful.)
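
For readers unfamiliar with the "power index" idea quoted above, here
is one way such a selection could work.  This is only a guess at the
general shape, not the dmucs implementation: the Host fields, the
per-CPU load limit, and the tie-breaking rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    power: int          # administrator-assigned power index (higher is faster)
    cpus: int
    load: float         # most recently reported load average
    available: bool = True


def pick_host(hosts, max_load_per_cpu=1.0):
    """Return the most powerful host that is available and not overloaded.

    Ties on power go to the less loaded host.  Returns None if no host
    qualifies.
    """
    candidates = [
        h for h in hosts
        if h.available and h.load <= max_load_per_cpu * h.cpus
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda h: (h.power, -h.load))
```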

> o the system scales really well.  I've tried it with 6 compilations going
> simultaneously, and the time to compile our code base went up from 44
> minutes to 56 minutes -- an increase of 27%.

Can I get you to run the benchmark included in distcc
(the one that I posted results for recently) with one and six users?
(If it helps, we're writing a script to make this easier; all you
have to do is give it the six client hostnames to use.)

> o ideally, it would be very nice to be able to know how long different files
> take to compile

I have a statistics patch for distccd that (among other things) gathers info
about which files take the longest to build, but I use it mostly to
give feedback to developers.  The stats also let me know when it's
time to add servers to a cluster.

- Dan

Wine for Windows ISVs: http://kegel.com/wine/isv
