[distcc] My big plans for distcc
Victor Norman
vtnpgh at yahoo.com
Fri Sep 3 18:15:26 GMT 2004
All,
I have been experimenting with distcc and pcons (a
derivative of cons that does parallel compilations),
and have so far achieved really exciting results: my
compilation time for our code has gone from 3 hr. 57
min. to 39 min. -- about 1/6th of the time. I think
it is pretty exciting.
But, I'm the only one that has been using this new
setup so far, and we have 70 software engineers that
would like to start using it.
In order for my setup to work with that many
engineers, of whom up to, say, 20 could be compiling
simultaneously, I think I need to make some extensions
to distcc.
Our setup is this: we have some fast Solaris machines,
mostly multiprocessors, that many people use for
compiling, etc. We also have a few multiprocessor
Linux boxes, which nobody uses right now. And, we
have a lot of underused, not-so-fast Solaris boxes and
desktop Linux machines which could be used for
compiling, I think. We are all within Marconi, and
all machines use NFS, so compilers, home directories,
etc. are all available everywhere.
What I'd like to make available to the engineers is an
easy-to-use, fast, fair, and reliable compilation
farm.
I thought I'd share my plans with you all. I would
love to get feedback on my ideas.
o Goals for the system:
  o to pick the best servers available, whether 1 engineer is
    compiling or 20 are.
  o to degrade gracefully under heavy loads.
  o to be fair when heavily loaded.
  o to work with compiles dispatched from multiple machines
    simultaneously (i.e., multiple "clients").
  o to work with other programs that automatically monitor load
    average and CPU availability.
  o to work with our heterogeneous network of servers -- some very
    fast, some multiprocessor, some very slow, some conditionally
    available.
o Plan:
  o use the existing 'hosts' file, which indicates the current server
    list, how many compilations can be run on the machine, and how to
    communicate with the server.
  o a new config file will be read by distcc. This file contains the
    list of all servers that *may* be available as compilation
    servers. Each line in the file contains the hostname/ipaddress,
    number of processors, the machine's current load-average, and a
    "power_index", which indicates how fast the server is at compiling
    programs. All information in this file is static, except the
    load-average information, which is updated periodically by a
    separate daemon (see below). The compilation farm administrator
    is the only one to add servers to this file, when new servers are
    put on the network.
  o distcc reads this file to figure out how powerful a CPU is.
  o distcc groups machines into tiers, where each tier contains
    machines with similar compiling power. The "top" tier is the
    group of fastest machines, and the "bottom" tier is the group of
    slowest machines.
  o server selection is as follows:
      def pick_server:
        for each tier, from top to bottom:
          randomize the list of servers in the tier
          for each server in the list:
            if the server is available (i.e., it is not marked by
                distcc as blocked or on probation, etc.):
              return the server.
  o multiple compilations on dispatchers will all share the same
    hosts file, and use the same directory for lock files, etc.
  o distcc groups servers into tiers based on the server's:
    o power_index
    o load-average
    o status (blocked, on probation, etc.)
  o I don't know the exact algorithm for this computation, but it
    will be simple and fast.
  o an "availability server daemon" will run on a single machine and
    will write to the hosts file, adding machines that become
    available and removing machines that are no longer available. It
    will also write the load-averages into the new hosts config file.
    It gets this information by periodically receiving messages from
    each machine that is configured to participate in the compilation
    system.
  o all machines in the compilation farm will run a small client
    daemon to communicate with this server daemon, so that the
    machine's load-average and availability status are updated
    periodically. A machine's availability may be determined by its
    load-average, by whether its screensaver is running, etc.
  o we have a prototype for this already -- it is basic socket
    programming, mostly.
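To make the tiered selection in the plan above more concrete, here is a rough Python sketch of how it might look. Everything specific in it is an assumption for illustration only: the whitespace-separated file layout, the sample hostnames, the effective_power heuristic (power_index scaled by the idle fraction of the CPUs), the tier_width bucket size, and the status strings. The plan leaves the real tiering algorithm open, so this is one possible reading, not the intended implementation.

```python
import random
from dataclasses import dataclass

# Hypothetical farm file: host, cpus, load_avg, power_index per line.
FARM = """
# host      cpus  load_avg  power_index
sol-fast1   4     0.5       10
sol-fast2   4     3.8       10
linux-smp   2     0.1       8
desk-old    1     0.0       3
"""

@dataclass
class Server:
    host: str
    cpus: int
    load_avg: float
    power_index: float
    status: str = "ok"  # assumed values: "ok", "blocked", "probation"

def parse_farm_file(text):
    """Parse the assumed 4-column layout, skipping blanks and # comments."""
    servers = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        host, cpus, load, power = line.split()
        servers.append(Server(host, int(cpus), float(load), float(power)))
    return servers

def effective_power(server):
    """Assumed heuristic: power_index scaled by the fraction of idle CPUs."""
    idle = max(server.cpus - server.load_avg, 0.0) / server.cpus
    return server.power_index * idle

def group_into_tiers(servers, tier_width=2.0):
    """Bucket servers by effective power; return tiers, fastest first."""
    buckets = {}
    for s in servers:
        buckets.setdefault(int(effective_power(s) // tier_width), []).append(s)
    return [buckets[key] for key in sorted(buckets, reverse=True)]

def pick_server(tiers, rng=random):
    """Walk tiers top to bottom; return a random available server, or None."""
    for tier in tiers:
        candidates = list(tier)
        rng.shuffle(candidates)  # spread load across equals within a tier
        for s in candidates:
            if s.status == "ok":
                return s
    return None

tiers = group_into_tiers(parse_farm_file(FARM))
chosen = pick_server(tiers)
```

With the sample data, the heavily loaded sol-fast2 drops to the bottom tier even though its power_index matches sol-fast1, which is the kind of graceful degradation the goals ask for. Shuffling within a tier is what keeps 20 simultaneous clients from all piling onto the same machine.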
o Separate thought:
  o I think we may need a distcc config file (.distccrc) to allow the
    user to set DISTCC_DIR, host selection algorithm, default
    arguments, etc. As more features become available, it would make
    maintaining the system easier, I think.
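As a sketch of what reading such a .distccrc might involve, here is a small Python example. The key = value syntax and the host_selection and default_args keys are hypothetical, since no file format has been defined; only DISTCC_DIR is an existing distcc setting. Letting the environment variable override the file is also just an assumption about precedence.

```python
import os
import shlex

# Defaults; host_selection and default_args are hypothetical keys.
DEFAULTS = {
    "DISTCC_DIR": os.path.expanduser("~/.distcc"),
    "host_selection": "tiered",
    "default_args": "",
}

def parse_distccrc(text, env=None):
    """Parse assumed 'key = value' lines; '#' starts a comment."""
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value
    # Assumed precedence: an exported DISTCC_DIR wins over the file.
    if "DISTCC_DIR" in env:
        settings["DISTCC_DIR"] = env["DISTCC_DIR"]
    # Split the argument string the way a shell would.
    settings["default_args"] = shlex.split(settings["default_args"])
    return settings

SAMPLE = """
DISTCC_DIR = /tmp/farm          # shared lock-file directory
host_selection = tiered
default_args = -O2 -Wall
"""
settings = parse_distccrc(SAMPLE, env={})
```

Keeping the format this simple would let the farm administrator hand out one shared file, while individual engineers could still override a setting from their own environment.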
I relish your feedback.
Vic Norman