[distcc] Working on several distcc enhancements

Dan Kegel daniel.r.kegel at gmail.com
Fri Nov 18 03:15:59 GMT 2005


I'm working on the following enhancements to distcc,
all motivated by observing shortcomings in real use
in a demanding environment:

1. gcc-2.95.3 sometimes spins on invalid input.  The user eventually
    aborts the build, but distccd does not then kill the compile job.
    Distccd should kill the compile job on timeout (say, 20 minutes)
    or if the client disconnects.

2. There is no supported way to autodiscover distcc servers.
    There should be a simple command to find the active distcc
    servers at a local site via DNS (and possibly other protocols,
    but I'm happy enough with DNS).  (I posted an early version
    of lsdistcc on the list already, but have made more progress since then.)

3. There is no ready-made way to monitor a distcc cluster's health.
    There should be a simple way to measure compile latency
    of all machines in a cluster, and an example crontab script
    showing how to use it to trigger email alerts if a machine goes bad.
    Likewise, distccd should keep statistics of its own health and activity,
    and make them available via HTTP for easy remote access.

4. When a distccd server is full up on active jobs, and other nearby
    servers are not, it's a shame that clients which connect to the
    wrong server have to wait.   Perhaps the server should actively
    turn away compile requests, so the client could do a local compilation
    or try another server.  Or perhaps a (set of redundant) load balancers
    would be appropriate.

5. If Alice has already compiled everything on client A, and Bob starts a job
    to compile the same everything on client B, it's a shame that Bob
    has to wait; perhaps distccd (or a load balancer!) should (carefully)
    cache results.

6. distccd is a known insecure service.  Even with the IP address
    access control list, Bad Guys could potentially use it to subvert a
    network.  A tighter access control scheme might be appropriate for
    some sites, e.g. using Kerberos to restrict access to just the people
    allowed to submit code to the revision control system (who can
    subvert everything anyhow).
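
To make item 1 concrete, here is a rough sketch of the kill-on-timeout-or-
disconnect behaviour. It is an illustration in Python, not distccd's actual
code (distccd is written in C); the names run_with_deadline and client_gone
are made up for this example, and client_gone stands in for whatever check
the server would use to notice that the client socket has closed.

```python
import subprocess
import time


def run_with_deadline(argv, timeout_s, client_gone=lambda: False, poll_s=0.1):
    """Run argv and kill it on timeout or when client_gone() returns True.

    Returns (status, returncode), where status is one of
    "done", "timeout", or "disconnected".
    All names here are hypothetical, for illustration only.
    """
    proc = subprocess.Popen(argv)
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Wait a short slice so we can re-check the kill conditions.
            rc = proc.wait(timeout=poll_s)
            return "done", rc
        except subprocess.TimeoutExpired:
            pass
        if client_gone():
            reason = "disconnected"
        elif time.monotonic() >= deadline:
            reason = "timeout"
        else:
            continue
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return reason, proc.returncode
```

With the 20 minute limit suggested above, a call would look like
run_with_deadline(compile_argv, 20 * 60, client_gone=...). Note that this
only kills the direct child; a real server would probably need to kill the
whole process group, since the compiler driver spawns its own children.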

I have preliminary code for the first three, haven't started on the
load-balancing cache yet, and only have a little demo code for
Kerberos access control.
I'm being helped on and off by a number of folks, including Thomas Kho,
Jeff Evarts, and Dongmin Zhang.
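
For the result cache in item 5, one possible starting point is a cache key
derived from everything that determines the compiler's output. The sketch
below is only an assumption about how that could look, not existing distcc
code; cache_key and its parameters are hypothetical names. It hashes the
compiler version, the argument list, and the preprocessed source that the
client sends to the server.

```python
import hashlib


def cache_key(preprocessed_source: bytes, compiler_argv, compiler_version: str) -> str:
    """Return a hex digest identifying one compile job (hypothetical scheme)."""
    h = hashlib.sha256()
    parts = [compiler_version.encode()]
    parts += [arg.encode() for arg in compiler_argv]
    parts.append(preprocessed_source)
    for part in parts:
        # Length-prefix each part so ["-O", "2"] and ["-O2"] hash differently.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Two jobs with the same key could share an object file. The "carefully" part
is deciding what else belongs in the key, for example the target
architecture or anything in the environment that changes code generation.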

Just thought I'd post to see if anyone else was using distcc heavily
and was interested in testing any of the above (or even helping code it).
- Dan

