[distcc] Working on several distcc enhancements (take 2)

Martin Pool mbp at sourcefrog.net
Thu Apr 28 17:18:30 MDT 2011


Those all sound like great improvements.

Apple did some work to discover distcc servers over Rendezvous/mDNS;
perhaps that can be used elsewhere or it can be a modular alternative
to having them in regular DNS entries.

Perhaps caching can sit on top of ccache; at any rate it would be good
to explain why it makes sense to do it differently.

Having each client choose which servers to use in ignorance of how
they're being used by other concurrent clients (as I think they now
do?) will give inefficient queueing, so it would be good to improve that.
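
To illustrate the queueing point, here is a toy Python sketch (not
distcc code; the function names and the random-choice model are
assumptions made up for this example) comparing clients that pick a
server blindly with clients that can see the current queue lengths:

```python
import random

def assign_blind(num_jobs, num_servers, rng):
    """Each job picks a server at random, knowing nothing about other clients."""
    queues = [0] * num_servers
    for _ in range(num_jobs):
        queues[rng.randrange(num_servers)] += 1
    return queues

def assign_informed(num_jobs, num_servers):
    """Each job goes to the currently shortest queue (needs shared load info)."""
    queues = [0] * num_servers
    for _ in range(num_jobs):
        target = queues.index(min(queues))
        queues[target] += 1
    return queues

if __name__ == "__main__":
    print("blind:   ", assign_blind(40, 8, random.Random(0)))
    print("informed:", assign_informed(40, 8))
```

The informed version always ends with evenly filled queues, while the
blind one typically leaves some servers overloaded and others idle.
The model ignores job duration and server speed, so it only shows the
direction of the effect.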

I would like to see a really simple shared-secret or asymmetric-key
authentication system, which may be easier to bring up than Kerberos.
On the other hand, perhaps it is worth looking again at whether the
performance hit from SSH is bearable, especially now that it has
built-in connection reuse.  (I don't think it did when I started, and
encryption has probably got faster relative to compilation.)
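
As a rough illustration of the shared-secret idea (nothing distcc
implements; every name below is hypothetical), a challenge-response
check built on HMAC from the Python standard library could look
something like this:

```python
import hashlib
import hmac
import secrets

def make_challenge():
    """Server side: a fresh random nonce for each connection."""
    return secrets.token_bytes(32)

def sign_challenge(shared_secret, challenge):
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()

def verify_response(shared_secret, challenge, response):
    """Server side: recompute the MAC and compare in constant time."""
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

This only authenticates the client at connection setup. It does not
encrypt or integrity-protect the source and object files that follow,
which is the part SSH would add.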

Martin




On 19 November 2005 09:39, Daniel Kegel <dank at kegel.com> wrote:
> I'm working on the following enhancements to distcc,
> all motivated by observing shortcomings in real use
> in a demanding environment:
>
> 1. gcc-2.95.3 sometimes spins on invalid input.  The user eventually
>   aborts the build, but distccd does not then kill the compile job.
>   Distccd should kill the compile job on timeout (say, 20 minutes)
>   or if the client disconnects.
>
> 2. Hung servers make users very, very unhappy,
>   and unfortunately, distcc servers tend to hang (or appear to hang)
>   much more often than one would like, but not often enough to
>   be easy to debug.
>   To insulate users from hung servers, there should be
>   a simple way to prequalify distcc servers before a build
>   run.  I am extending the lsdistcc program I posted
>   earlier to actually run a trivial compilation on each
>   server; it will only list servers which complete the trivial
>   compilation by a deadline (say, 1 second).
>   (And, of course, lsdistcc lets you autodiscover
>   distcc servers listed in DNS, which makes deploying at
>   large sites much easier.)
>
> 3. There is no ready-made way to monitor a distcc cluster's health.
>   There should be a simple way to measure compile latency
>   of all machines in a cluster, and an example crontab script
>   showing how to use it to trigger email alerts if a machine goes bad.
>   Likewise, distccd should keep statistics of its own health and activity,
>   and make them available via HTTP for easy remote access.
>
> 4. When a distccd server is full up on active jobs, and other nearby
>   servers are not, it's a shame that clients which connect to the
>   wrong server have to wait.   Perhaps the server should actively
>   turn away compile requests, so the client could do a local compilation
>   or try another server.  Or perhaps a (set of redundant) load balancers
>   would be appropriate.
>
> 5. If Alice has already compiled everything on client A, and Bob starts a
>   job to compile the same everything on client B, it's a shame that Bob
>   has to wait; perhaps distccd (or a load balancer!) should (carefully)
>   cache results.
>
> 6. distccd is a known insecure service.  Even with the IP address access
>   control list, Bad Guys could potentially use it to subvert a network.
>   A tighter access control scheme might be appropriate for some sites,
>   e.g. using Kerberos to restrict access to just the people allowed to
>   submit code to the revision control system (who can subvert everything
>   anyhow).
>
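
For item 1 above, the timeout half can be sketched in a few lines of
Python. This is only an illustration of the behaviour being asked for,
not distccd's actual code, and `run_with_deadline` is a made-up name:

```python
import subprocess
import sys

def run_with_deadline(argv, deadline_seconds):
    """Run argv; return its exit code, or None if it was killed at the deadline."""
    try:
        completed = subprocess.run(argv, timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills and reaps the child before raising.
        return None
    return completed.returncode

if __name__ == "__main__":
    # A stand-in for a compiler that spins forever on bad input.
    spinning = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(spinning, 0.5))
```

A real daemon would also have to kill any processes the compiler
driver spawned itself (for example by signalling the whole process
group) and watch for the client disconnecting, neither of which this
sketch attempts.
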
> I have preliminary code for the first three, haven't started on the
> load-balancing cache yet, and only have a little demo code for
> Kerberos access control.
> I'm being helped on and off by a number of folks, including Thomas
> Kho, Jeff Evarts, and Dongmin Zhang.
>
> If I do decide to do a load-balancing cache, I'll probably
> start by writing nonblocking versions of the dcc_* networking functions.
> Ideally I'd end up with a library that would let you plug
> in caching on the client, in the proxy, or on the server.
>
> Just thought I'd post to see if anyone else was using distcc heavily
> and was interested in testing any of the above (or even helping code it).