[distcc] Re: Bug#214621: distcc: better handling for round-robin DNS

Martin Pool mbp at samba.org
Tue Dec 2 02:27:08 GMT 2003


On 15 Oct 2003, Zygo Blaxell <zblaxell at genki.hungrycats.org> wrote:
> On Mon, Oct 13, 2003 at 05:23:07PM +1000, Martin Pool wrote:
> > What happens when localhost's real name is listed in the alias?  That
> > would cause us to send jobs over the loopback interface, which is a
> > bit inefficient.
> 
> That's more or less exactly what happens:  connections to <IP of local
> machine's eth0 interface>:distcc.  Fixing this gets complicated,
> especially if you consider stuff like IP aliases, dummy interfaces,
> and IP-takeover systems.  

Yes.
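
One rough way to ask "is this address one of ours?" without enumerating
interfaces is to try binding a socket to it. The sketch below is only an
illustration: the function name is_local_address is made up, it is not
something distcc does, and it assumes the kernel refuses to bind to
non-local addresses (on Linux, setting ip_nonlocal_bind breaks that
assumption). Wildcard addresses such as 0.0.0.0 also bind successfully,
and IP-takeover setups can change the answer from one moment to the next.

```python
import socket


def is_local_address(ip):
    """Return True if this host can bind a socket to `ip`.

    Hypothetical helper: a successful bind is treated as evidence that
    the address is configured on some local interface.
    """
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    try:
        # Port 0 lets the kernel pick any free port; a UDP socket is
        # enough because we never send anything.
        with socket.socket(family, socket.SOCK_DGRAM) as s:
            s.bind((ip, 0))
    except OSError:
        return False
    return True
```

A client could run each resolved address through a check like this and
treat matches as "localhost", but that still leaves the corner cases
mentioned above.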

> Modern loopback "hardware" runs at some huge number of gigabits, so
> in practice the extra overhead is buried firmly in the noise.

It ain't necessarily so.  The issue here is not the bandwidth of local
TCP connections, which I'm prepared to assume is semi-infinite.  The
problem is that running gcc as two separate processes (cpp and cc1) is
less efficient than running just one.  That is partly because we pay
the overhead of the gcc frontend twice over, but more importantly
because a single gcc run can use its integrated cpp.

Also, the .i file does not go straight from cpp to cc1.  It goes from
cpp to a temporary file, to distcc, across a socket to distccd, to
another temporary file, and only then to the compiler.  There is a
little extra cost there too.
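
To make the two paths concrete, here is a rough sketch of the command
lines involved. The helper names and file names are made up for
illustration, and this is not how distcc builds its commands
internally; only the gcc options -E, -c and -o are real.

```python
def local_compile_cmd(source, obj, cc="gcc"):
    """One gcc invocation: preprocessing happens inside the compile."""
    return [cc, "-c", source, "-o", obj]


def split_compile_cmds(source, obj, cc="gcc"):
    """Two gcc invocations, as when a job is handed to a distcc server.

    The first writes preprocessed source to a .i file; the second
    compiles that .i file. In the distributed case the second command
    runs on the server side after the .i file has crossed the socket.
    """
    stem = source.rsplit(".", 1)[0]
    preprocessed = stem + ".i"
    return [
        [cc, "-E", source, "-o", preprocessed],
        [cc, "-c", preprocessed, "-o", obj],
    ]
```

Timing the single command against the pair on the same machine would
give an idea of how much the split costs before any network transfer is
counted.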

Having said that, the difference may not be big enough to worry about,
but it's not necessarily negligible.  If you have plenty of machines
it probably is negligible, though.

> There could be theoretically twice the number of jobs
> scheduled on the local machine, if both "localhost" and
> "some.name.that.may.resolve.to.a.loopback.ip.address" are listed in
> DISTCC_HOSTS; however, that could already happen if two people are running
> distcc jobs with the same DISTCC_HOSTS list on the same network, whether
> aliases are used or not.

distccd limits the number of concurrent jobs, independent of
client-side limits.

There is an upside to connecting over loopback, though: a loopback
connection to the local distccd means that local jobs will be included
in distccd's count.
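
As a toy model of why that matters, consider a server-side counter that
every incoming job has to pass through, whichever client it comes from.
The class and method names below are invented for this sketch and are
not taken from distccd's source; the point is only that a job arriving
over loopback is counted like any other, while a job run directly by the
client would bypass the counter.

```python
import threading


class JobSlots:
    """Hypothetical server-side job limiter (not distccd's actual code)."""

    def __init__(self, max_jobs):
        self.max_jobs = max_jobs
        self._lock = threading.Lock()
        self._running = 0

    def try_start(self):
        """Claim a slot; return False if the server is already full."""
        with self._lock:
            if self._running >= self.max_jobs:
                return False
            self._running += 1
            return True

    def finish(self):
        """Release a slot when a job completes."""
        with self._lock:
            self._running -= 1

    @property
    def running(self):
        with self._lock:
            return self._running


slots = JobSlots(max_jobs=2)
remote_ok = slots.try_start()    # job from another machine
loopback_ok = slots.try_start()  # job from the local client via loopback
third_ok = slots.try_start()     # refused: both slots are taken
```

With the limit enforced in one place, two clients sharing the same host
list cannot push the machine past max_jobs between them.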

-- 
Martin 


