[distcc] Working on several distcc enhancements (take 2)
vtnpgh at yahoo.com
Mon Nov 28 14:40:39 GMT 2005
My DMUCS setup, which uses distcc, handles three problems: overloading of a compilation host, randomization of host selection, and optimization (choosing the best available host). The administrator assigns each compilation host a "power index" (an integer representing the speed of the CPUs on that host). A single host-server process knows the power indices of all the compilation hosts and randomizes the host selection. If you use the system the way I do, you give roughly equivalent compilation hosts the same power index, so that compilations are distributed randomly across those hosts. The host-server takes the load averages of the compilation hosts into account so that they don't get overloaded, and it won't give a compilation host more compile jobs than it can handle, as configured by the administrator.
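The selection policy described above could be sketched roughly as follows. This is only an illustration of the idea, not the DMUCS code: the `Host` fields, the `pick_host` function, and the `max_load` threshold are all hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    power_index: int       # administrator-assigned speed rating
    max_jobs: int          # administrator-configured job limit
    active_jobs: int = 0   # jobs currently handed out by the host-server
    load_avg: float = 0.0  # most recently reported load average


def pick_host(hosts, max_load=4.0, rng=random):
    """Return the chosen Host, or None if no host can take a job.

    max_load is an assumed cutoff; the real threshold is not
    stated in the post.
    """
    # Skip hosts that are at their job limit or already too loaded.
    available = [h for h in hosts
                 if h.active_jobs < h.max_jobs and h.load_avg < max_load]
    if not available:
        return None
    # Keep only the hosts that share the highest power index ...
    best = max(h.power_index for h in available)
    tier = [h for h in available if h.power_index == best]
    # ... and choose randomly among those equivalent hosts.
    chosen = rng.choice(tier)
    chosen.active_jobs += 1  # reserve the slot before answering the client
    return chosen
```

Because a single process both chooses the host and records the reservation, two clients asking at the same moment cannot both be handed the last free slot on a host.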
What my system doesn't have is a way to compute the power index of a compilation host automatically. That is what I would like to add still.
If you are interested in more detail, I have submitted the code to this forum and it is waiting for moderator approval to get posted (hint, hint Martin :-).
Laurent Calburtin <laurent.calburtin at free.fr> wrote:
> If your network is 100MB, and you've got a reasonable performance
> switch and high performance file server, network overhead on using
> a network cache this way isn't particularly noticeable compared to the
> gains in speed by grabbing files from the cache that were compiled
> by other people.
You mean having the file server accessed by both clients for reading
and servers for writing? I was thinking about accessing the file
server from the distccd servers only but you're right one can take
advantage of it directly from clients. I'll give it a try. I hope
that the first time penalty (double network transfer of the output)
will not overcome the benefits.
> As for server selection schemes, I suspect it will be worthwhile to
> ensure that your load balancing avoids race conditions - it seems
> likely that the server with the highest score at any given moment
> might find itself swamped by several eager clients, especially as
> you get more programmers and continuous integration/build-bot
If a server gets swamped, it won't keep the highest score for long.
But I agree there may be contention if many clients broadcast to ask
for the best host at the same time and then all select the same one.
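The contention described here can be illustrated with a toy model. This is a sketch under stated assumptions, not distcc behaviour: the score values, the `cost` deducted per assigned job, and both function names are hypothetical.

```python
from collections import Counter


def stale_snapshot_picks(scores, n_clients):
    """Every client sees the same scores at once and picks the top host."""
    best = max(scores, key=scores.get)
    return Counter({best: n_clients})


def reserved_picks(scores, n_clients, cost=1):
    """One coordinator assigns jobs in turn, lowering a host's
    score by `cost` (an assumed penalty) for each job it hands out."""
    live = dict(scores)
    picks = Counter()
    for _ in range(n_clients):
        best = max(live, key=live.get)
        picks[best] += 1
        live[best] -= cost
    return picks
```

With scores such as `{"a": 10, "b": 9, "c": 8}` and six clients, the first function sends all six jobs to host `a`, while the second spreads them over all three hosts.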
As far as I know, when the distccd daemon is busy, it waits for
child processes to exit, and one can't know how many clients are
blocked attempting to connect to the daemon. So when all servers are
busy, it's difficult to say which one has the fewest pending
connections. That information would be useful for load balancing.