[distcc] "failed to write" with 9 identical machines, but works with only 5

Martin Pool mbp at canonical.com
Thu Oct 5 07:00:30 GMT 2006


On  4 Oct 2006, "Springer, Doug" <Doug.Springer at astronics.com> wrote:
> Hi,
> 
> With 5 machines, -j30 works just fine.  I am running Fedora Core 4 on a
> Pentium 4, running at 2.4 GHz.  The "servers" are Compaq tc1100 Pentium M
> machines at 1 GHz, all with Fedora Core 5.  The servers are set up with the
> same script and files using the same NFS mount, so they should be pretty
> much identical, except for machine name and logins.
> 
> The path into all 9 machines is on the same wire, but they are on a
> switched network, if that matters.
> 
> With 9 machines, running -j10 causes these errors:
> 
> distcc[526] (dcc_writex) ERROR: failed to write: No route to host
> distcc[526] (dcc_writex) ERROR: failed to write: Broken pipe
> distcc[526] Warning: failed to distribute net/ipv4/tcp_input.c to
> tc1100-053, running locally instead
> 
> When I do DISTCC_VERBOSE=1 DISTCC_LOG=/tmp/distcc.log, naturally I don't
> get any errors, even with -j30.  Classic.  Turn on debug and it slows
> things down just enough for the problem to disappear.

I haven't heard of that before.  It would be interesting to see what
happens in the log of tc1100-053 at the time this occurs.  Did it also
see the connection break?

As far as I know, there's nothing distcc can do that could provoke 'no
route to host'; that should come entirely from the kernel networking level.
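
To make that concrete: both messages in the quoted output look like the
standard strerror() text for errno values that the kernel hands back from
a failed write.  Here is a minimal sketch that maps such a line back to the
symbolic errno name.  It assumes the text after "failed to write: " is the
C library's strerror() message on the machine where the script runs, and
the function name errno_name_for is only an illustration, not part of
distcc.

```python
import errno
import os
import re

# Reverse table: strerror() text -> symbolic errno name on this platform.
MESSAGE_TO_ERRNO = {
    os.strerror(code): name for code, name in errno.errorcode.items()
}

# Matches lines shaped like the ones quoted above, e.g.
#   distcc[526] (dcc_writex) ERROR: failed to write: No route to host
LOG_LINE = re.compile(
    r"^distcc\[(\d+)\] \((\w+)\) ERROR: failed to write: (.+)$"
)


def errno_name_for(line):
    """Return the errno name behind a 'failed to write' line, or None.

    Hypothetical helper: it only recognises the line shape shown in the
    original report and relies on the local strerror() wording.
    """
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return MESSAGE_TO_ERRNO.get(match.group(3))


if __name__ == "__main__":
    for sample in (
        "distcc[526] (dcc_writex) ERROR: failed to write: No route to host",
        "distcc[526] (dcc_writex) ERROR: failed to write: Broken pipe",
    ):
        print(sample.rsplit(": ", 1)[1], "->", errno_name_for(sample))
```

If the first line maps to EHOSTUNREACH, that is an error the networking
stack reports to the application; a user-space program like distcc only
relays it.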

My suspicion would be that there's something in your switch, kernel,
network driver or network cards that can't handle the load and so is
copping out with 'no route to host'.
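
One way to test that suspicion without distcc in the picture is to open a
burst of plain TCP connections to the servers and see whether the same
error shows up.  The following is only a rough sketch: probe and hammer are
made-up names, the attempt and worker counts are arbitrary, and it only
exercises connection setup, not a sustained write like the one that failed
in the report.

```python
import errno
import socket
from concurrent.futures import ThreadPoolExecutor

DISTCC_PORT = 3632  # default port shown by `distcc --version` in the report


def probe(host, port=DISTCC_PORT, timeout=5.0):
    """Try one TCP connect; return 'ok' or a name describing the failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Fall back to the exception text when there is no known errno.
        return errno.errorcode.get(exc.errno, str(exc))


def hammer(hosts, attempts_per_host=30, workers=30, port=DISTCC_PORT):
    """Open many connections in parallel and count outcomes per host."""
    targets = [host for host in hosts for _ in range(attempts_per_host)]
    results = {host: {} for host in hosts}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes = pool.map(lambda host: probe(host, port), targets)
        for host, outcome in zip(targets, outcomes):
            results[host][outcome] = results[host].get(outcome, 0) + 1
    return results


if __name__ == "__main__":
    # tc1100-053 is the host named in the error above; add the others.
    for host, counts in hammer(["tc1100-053"]).items():
        print(host, counts)
```

If EHOSTUNREACH appears here as well, that points at the switch, driver
or kernel rather than at distcc.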

You could try looking at a network dump with Ethereal to see what
packets went across before the failure: was there an RST, or did the
network just stop?  You might also look at the kernel log messages on
both ends.

> 
> Here is the version:
> 
> distcc --version
> distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632)
>   built Sep 12 2006 11:38:3
> 
> Any ideas?  I know, just run with fewer machines :)
> 
> Thanks in advance!
> Doug

-- 
Martin

