[distcc] Re: Bug#214621: distcc: better handling for round-robin DNS

Zygo Blaxell zblaxell at genki.hungrycats.org
Wed Oct 8 21:12:50 GMT 2003


On Wed, Oct 08, 2003 at 12:32:42PM +1000, Martin Pool wrote:
> On  7 Oct 2003, Zygo Blaxell <zblaxell at genki.hungrycats.org> wrote:
> > 2.  Locking, backoff, and scheduling should be done by IP address.
> > Currently if any single volunteer fails, distcc will drop all of the
> > IP addresses associated with the name.

Actually, I now realize the above is wrong: only the backoff should use
the IP address.  Scheduling and locking should be done using just
the names...

> A fair number of machines have multiple A entries because they have
> multiple network interfaces.  (In particular, I use one such in my
> build farm.)  We wouldn't want them to get twice as many jobs, or to
> interfere with sending the jobs over the best interface.

...and this is why.  Well, that, and the fact that I actually want
only a subset of each rr dns record to be used by any given client, as
this tends to result in better load balancing when there are very fast
machines at the head of the list.

Certainly the error-reporting wish still holds even (especially)
in this case.  One of the interfaces may be down, or a user might
be unintentionally using a multi-homed host's name instead of an
interface-specific alias, and wondering why it only works once or twice
until it hits an unreachable IP address and puts the DNS name into
backoff mode.
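
To make the distinction concrete, here is a minimal sketch of per-address
backoff, assuming the client has already resolved the name to a list of
(ip, port) pairs.  It is not distcc's actual code: BackoffTable, try_name,
the 60-second period, and the injected connect and log callables are all
hypothetical names chosen for illustration.  Scheduling would still key on
the DNS name; only the failure record is kept per IP, and the log line names
the address that needs fixing.

```python
import time


class BackoffTable:
    """Remember failed volunteers per IP address, not per DNS name."""

    def __init__(self, period=60.0, clock=time.monotonic):
        self.period = period      # seconds an address stays in backoff (assumed value)
        self.clock = clock        # injectable so the behaviour can be tested
        self._until = {}          # ip -> time at which the backoff expires

    def mark_failed(self, ip):
        self._until[ip] = self.clock() + self.period

    def usable(self, ip):
        return self.clock() >= self._until.get(ip, float("-inf"))


def try_name(name, addresses, connect, backoff, log=print):
    """Try each address behind `name` that is not in backoff.

    `addresses` is a list of (ip, port) pairs; `connect(ip, port)` is any
    callable that raises OSError on failure.  Returns the first pair that
    connects, or None if every address failed or is backed off.
    """
    for ip, port in addresses:
        if not backoff.usable(ip):
            continue
        try:
            connect(ip, port)
        except OSError as exc:
            # Only this address is penalised; the other addresses stay usable.
            backoff.mark_failed(ip)
            log(f"{name}: volunteer {ip}:{port} failed ({exc}); "
                "backing off this address only")
            continue
        return (ip, port)
    return None
```

With this shape, one unreachable interface costs a single failed connection
attempt per backoff period instead of disabling every address behind the name.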

> Using multiple names with different options for each is an intriguing
> idea, but once you start putting configuration like that on each
> client you've lost much of the benefit.

I encode the relative speeds of the machines into the DNS round robin
names.  I have a small office LAN with several identical machines where
there is a single 'distcc' rr dns record.  I also have a larger office
with three CPU architectures, three gcc versions, and CPU speeds ranging
from 0.2 to 3.0 GHz, with a number of records based on what features the
client needs and what resources it is entitled to.

So for example any "distcc3" machine has less than half the FSB speed
of any "distcc1" machine.  When faster machines are added to the pool,
the "distcc1" machines move to "distcc2", "distcc2" machines move to
"distcc3", and so on.  Client configuration doesn't have to change
every time some machine is upgraded or broken.
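
The tiering above can be sketched as follows.  This is only one possible
bucketing rule, assumed for illustration (the exact rule used to place
machines is not given here): each tier covers a factor of two in speed
relative to the fastest machine, which guarantees that any "distcc3" machine
is slower than half of any "distcc1" machine.  The function name
assign_tiers and the host names and speeds in the usage note are
hypothetical.

```python
from collections import defaultdict


def assign_tiers(speeds, prefix="distcc"):
    """Map {host: speed} to {rr_name: [hosts]} in factor-of-two speed bands.

    Tier 1 holds hosts faster than half the fastest speed, tier 2 those
    faster than a quarter, and so on.  Speeds may be in any unit as long
    as it is the same for every host.
    """
    if not speeds:
        return {}
    if any(s <= 0 for s in speeds.values()):
        raise ValueError("speeds must be positive")

    fastest = max(speeds.values())
    tiers = defaultdict(list)
    for host in sorted(speeds):
        tier, threshold = 1, fastest / 2
        # Step down one tier for every halving needed to get below the speed.
        while speeds[host] <= threshold:
            tier += 1
            threshold /= 2
        tiers[f"{prefix}{tier}"].append(host)
    return dict(tiers)
```

For example, with hosts a=3000, b=2000, c=1000 and d=600, the result puts a
and b in distcc1, c in distcc2 and d in distcc3.  Adding a host at 6000
pushes every existing host down one tier, and only the DNS records change.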

> If this is going to be stored in DNS, then using something like the
> SRV record would be a better solution.  I don't see any way in which
> SRV is worse than A, and it avoids overloading the mechanism.

I don't see any way in which SRV is better than A, or for that matter
_different_ than A.  In either case I'm putting a bunch of rr aliases
in DNS, but with SRV I need to add client- and server-side support that
I didn't need with A.  I'd also still need diagnostics from distcc that
tell me which machine (by IP:port) I have to fix if a distcc volunteer
breaks.

> Alternatively I can just be lazy and suggest that you fetch
> /etc/distcc/hosts over HTTP, rsync or some similar mechanism.

The volunteers list is a function of project (pre- or post- gcc-2.95)
and architecture (x86 or not) and priority of the current work (urgent
or not--does it take up the few fast machines or the many slow ones).
I wouldn't be using RR DNS if I didn't think it was the best approach.
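
As a rough illustration of that last point, the client side reduces to
picking one name from three yes/no questions.  The naming scheme below is
hypothetical (the real record names are not spelled out here beyond
"distcc" and "distcc1" through "distcc3"); the sketch only shows that the
client needs to know which name to ask for, not which machines are behind it.

```python
def volunteer_name(post_gcc_295, x86, urgent):
    """Pick the round-robin DNS name for a build (hypothetical naming scheme).

    post_gcc_295: True if the project needs a compiler newer than gcc-2.95.
    x86:          True if the target architecture is x86.
    urgent:       True if the work may take up the few fast machines.
    """
    compiler = "new" if post_gcc_295 else "old"
    arch = "x86" if x86 else "other"
    pool = "fast" if urgent else "slow"
    return f"distcc-{compiler}-{arch}-{pool}"
```

The returned name would then go into the client's host list (for instance
the DISTCC_HOSTS environment variable), and the membership of each name is
maintained in DNS alone.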