[distcc] Re: More debugging info for FIN_WAIT1 bug with RH 6

Hien D. Ngo hien at moses.xp.com
Thu Sep 5 21:46:09 GMT 2002

Turning off tcp_cork seems to make the problems under RH 6 go away; no more 
FIN_WAIT1 connections, netstat is completely clean.  Turning tcp_cork off also works 
for me on the RH 7 machines, so I can happily build with distcc now on all my 
machines :)  BTW, the FIN_WAIT1 connections never die off, they just keep 
accumulating as far as I can tell.
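
For anyone unfamiliar with the option itself: at the socket level, "turning off tcp_cork" amounts to clearing the Linux TCP_CORK option (or never setting it). A minimal sketch in Python, assuming a Linux host; the helper name set_cork is made up for illustration and is not part of distcc:

```python
import socket


def set_cork(sock, enabled):
    """Set or clear TCP_CORK on a TCP socket (hypothetical helper).

    Returns False when the platform does not expose TCP_CORK, in which
    case the socket is left untouched.
    """
    cork = getattr(socket, "TCP_CORK", None)  # Linux-only constant
    if cork is None:
        return False
    sock.setsockopt(socket.IPPROTO_TCP, cork, 1 if enabled else 0)
    return True
```

With the cork cleared, the kernel no longer holds back partial frames waiting for more data, so queued data goes out without waiting for the cork to be released.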

No weird firewalling in between the various machines (though there may be several 
routers in between.)

Thanks much for the fixes.


---- Original Message ----
From:		Martin Pool
Date:		Wed 9/4/02 22:29
To:		Hien D. Ngo
Cc:		distcc at lists.samba.org
Subject:	Re: More debugging info for FIN_WAIT1 bug with RH 6

On  4 Sep 2002, "Hien D. Ngo" <hien at moses.xp.com> wrote:
> I found the snippet of the distcc log for a FIN_WAIT1 connection.  The child 
> that is spawned exits so it tries to compile locally.  Both the remote and local 
> compiles exit with the same error code.  The file being compiled actually shows 
> a real compile error in the log output (usually foo.cpp is trying to #include a 
> header that doesn't exist or some such error.)

OK, that would explain why the client process goes away.  Just
dropping the socket halfway through is the expected behaviour.  

To get a small speed boost we start opening the socket before the
preprocessor has completed, and in the relatively rare case where the
preprocessor fails, we just drop the socket.  This ought to cause the
server to complain a little but cope.
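
The pattern described above can be sketched roughly as follows. This is an illustration in Python rather than distcc's actual C code, and the function name, arguments, and return convention are all assumptions made for the example:

```python
import socket
import subprocess
import sys


def preprocess_and_connect(cpp_cmd, host, port):
    """Start the preprocessor, connect while it runs, drop the socket on failure.

    Returns (sock, preprocessed_bytes) on success, or (None, exit_status)
    if the preprocessor failed and the connection was abandoned.
    """
    # Launch the preprocessor first so it runs while the connection is set up.
    proc = subprocess.Popen(cpp_cmd, stdout=subprocess.PIPE)
    # Open the connection before the preprocessor has finished.
    sock = socket.create_connection((host, port))
    out, _ = proc.communicate()
    if proc.returncode != 0:
        # Rare case: preprocessing failed, so just drop the socket.
        sock.close()
        return None, proc.returncode
    return sock, out
```

From the server's side, the failure case looks like a client that connected and then closed without sending a request.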

I am pretty sure that getting stuck in FIN_WAIT1 with no timer
indicates a client kernel bug.  Perhaps not using TCP_CORK would avoid it.

You don't have any weird firewalling or routing stuff between the
machines, do you?

Aside from that there is probably not much more that we can do.  I
guess the sockets and server processes will go away eventually.

