[distcc] Working on several distcc enhancements (take 2)

Victor Norman vtnpgh at yahoo.com
Mon Nov 28 14:40:39 GMT 2005

  My DMUCS setup, which uses distcc, takes care of the problems of overloading a compilation host, randomization of host selection, and optimization (choosing the best available host). The administrator of the system assigns a "power index" (i.e., an integer representing the speed of the CPUs on the host) to each compilation host. The system uses a single host-server process, which knows the power indices of the compilation hosts and randomizes the host selection. If you use the system the way I do, you give multiple roughly equivalent compilation hosts the same power index, so that compilations are distributed randomly across those hosts. The host-server takes the load averages on the compilation hosts into account so that they don't get too overloaded. The host-server also won't give a compilation host more compile jobs than it can handle, as configured by the administrator.
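
  The selection policy described above can be sketched roughly as follows. This is not the DMUCS code, only a minimal illustration of the idea. The names Host, pick_host, and max_load are hypothetical, and the fixed load-average cutoff is an assumption: the description above only says that load averages are taken into account.

```python
import random
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    power_index: int        # administrator-assigned speed rating
    max_jobs: int           # administrator-configured job limit
    load_avg: float = 0.0   # most recently reported load average
    active_jobs: int = 0    # jobs currently assigned by the host-server


def pick_host(hosts, max_load=4.0, rng=random):
    """Pick an available host with the highest power index.

    Hosts that are at their job limit, or whose load average is at or
    above max_load (an assumed cutoff), are skipped. Ties on power
    index are broken at random. Returns None if no host is available.
    """
    available = [
        h for h in hosts
        if h.active_jobs < h.max_jobs and h.load_avg < max_load
    ]
    if not available:
        return None
    best = max(h.power_index for h in available)
    candidates = [h for h in available if h.power_index == best]
    chosen = rng.choice(candidates)
    chosen.active_jobs += 1  # the host-server tracks the assignment
    return chosen
```

  With two hosts at power index 10 and one at power index 5, the first two requests go to the two faster hosts in random order, and later requests fall back to the slower host until it reaches its own job limit.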
  What my system doesn't have is a way to compute the power index of a compilation host automatically. That is what I would still like to add.
  If you are interested in more detail, I have submitted the code to this forum and it is waiting for moderator approval to get posted (hint, hint Martin :-).

Laurent Calburtin <laurent.calburtin at free.fr> wrote:
> If your network is 100MB, and you've got a reasonable performance
> switch and high performance file server, network overhead on using
> a network cache this way isn't particularly noticeable compared to the
> gains in speed by grabbing files from the cache that were compiled
> by other people.

You mean having the file server accessed by both clients for reading
and servers for writing? I was thinking about accessing the file
server from the distccd servers only, but you're right, one can take
advantage of it directly from clients. I'll give it a try. I hope
that the first-time penalty (double network transfer of the output)
will not outweigh the benefits.

> As for server selection schemes, I suspect it will be worthwhile to  
> ensure that your load balancing avoids race conditions - it seems  
> likely that the server with the highest score at any given moment  
> might find itself swamped by several eager clients, especially as  
> you get more programmers and continuous integration/build-bot  
> processes.

If a server gets swamped, it won't keep the highest score for long.
But I agree there may be contention if many clients broadcast to ask
for the best host at the same time and then all select the same one.
As far as I know, when the distccd daemon is busy, it waits for
child processes to exit, and one can't know how many clients are
blocked attempting to connect to the daemon. So when all servers are
busy, it's difficult to say which one has the fewest pending
connections. That information would be useful for load balancing.
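
The contention case can be illustrated with a small simulation. This is only a sketch under assumptions: the scores, the function names, and the score-weighted random choice are hypothetical and are not something distcc or this thread implements. The weighted choice is included purely for contrast with every client picking the top score.

```python
import random
from collections import Counter


def greedy_pick(scores):
    """Every client picks the host with the highest advertised score."""
    return max(scores, key=scores.get)


def weighted_pick(scores, rng):
    """Pick a host at random, with probability proportional to its score."""
    names = list(scores)
    weights = [scores[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(scores, clients, pick):
    """Count how many clients land on each host when all of them see
    the same snapshot of scores at the same moment."""
    return Counter(pick(scores) for _ in range(clients))
```

With identical scores visible to all clients, simulate(scores, n, greedy_pick) puts all n clients on a single host, while passing a function built on weighted_pick spreads them across hosts roughly in proportion to the scores.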


distcc mailing list            http://distcc.samba.org/
