HAVE_SENDFILE change not effective (Re: [distcc] Problems with distcc hanging on large compiles)

Hien D. Ngo hien at moses.xp.com
Fri Aug 30 11:47:01 GMT 2002

I recompiled after commenting out '#define HAVE_SENDFILE 1' in config.h, but no
luck.  The hangs are the same as before; here's the requested additional info.
Also, I noticed there's a HAVE_SYS_SENDFILE_H section in config.h.  Does this
need to be commented out as well?


ngoh at bldmaster.foo.com $ netstat -to | grep 4200
tcp        0  27255 bldmaster.foo.com:38772  build01.ny.ficc.gs.:4200 ESTABLISHED unkn-4 (101.85/0/0)
tcp        0  16811 bldmaster.foo.com:38783  build01.ny.ficc.gs.:4200 ESTABLISHED unkn-4 (64.14/0/0)
ngoh at bldmaster.foo.com $ cat /tmp/distcc.trace
read(5,  <unfinished ...>

ngoh at build01.foo.com $ netstat -to | grep 4200
tcp    71707      0 build01.ny.ficc.gs.:4200 bldmaster.foo.com:38772  ESTABLISHED off (0.00/0/0)
tcp    71707      0 build01.ny.ficc.gs.:4200 bldmaster.foo.com:38783  ESTABLISHED off (0.00/0/0)
ngoh at build01.foo.com $ cat distccd.trace
open("/tmp/distcc_00002493/server_0011494.i", O_WRONLY|O_CREAT|O_TRUNC, 0600 <unfinished ...>
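
For anyone comparing the two netstat listings above: in each tcp row the
second and third columns are Recv-Q and Send-Q, so the client shows unsent
data queued (27255 and 16811 bytes) while the server shows 71707 bytes
received but not yet read.  A minimal sketch for pulling those numbers out of
"netstat -to" output follows.  It assumes each connection is on a single line
(wrapped timer fields joined back on), and the names Conn, parse_netstat and
backlogged are made up for illustration; none of this is part of distcc.

```python
from typing import List, NamedTuple


class Conn(NamedTuple):
    """One TCP row from `netstat -to` (hypothetical helper type)."""
    proto: str
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse tcp rows; header lines and anything malformed are skipped."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # A row needs at least: proto, Recv-Q, Send-Q, local, remote, state.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        conns.append(Conn(fields[0], recv_q, send_q,
                          fields[3], fields[4], fields[5]))
    return conns


def backlogged(conns: List[Conn], threshold: int = 0) -> List[Conn]:
    """Return connections with more than `threshold` bytes in either queue."""
    return [c for c in conns if c.recv_q > threshold or c.send_q > threshold]


# Two rows copied from the listings above, one from each host.
SAMPLE = """\
tcp        0  27255 bldmaster.foo.com:38772  build01.ny.ficc.gs.:4200 ESTABLISHED unkn-4 (101.85/0/0)
tcp    71707      0 build01.ny.ficc.gs.:4200 bldmaster.foo.com:38772  ESTABLISHED off (0.00/0/0)
"""

if __name__ == "__main__":
    for c in backlogged(parse_netstat(SAMPLE)):
        print(f"{c.local} -> {c.remote}: Recv-Q={c.recv_q} Send-Q={c.send_q}")
```

Running it periodically on both hosts and comparing the queue sizes over time
would show whether either side is draining at all.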

---- Original Message ----
From:		Martin Pool
Date:		Thu 8/29/02 22:43
To:		Hien D. Ngo
Cc:		distcc at lists.samba.org
Subject:	Re: Different problem (Re: [distcc] Problems with distcc hanging on large compiles (Patch not effective))

On 29 Aug 2002, "Hien D. Ngo" <hien at moses.xp.com> wrote:
> As luck would have it, two compile sessions got hung up.  The backtrace info
> for the other side is not very useful, though.
> I'll try the HAVE_SENDFILE recompile and report back.

It looks like this one is the other way around.  I think on your first
mail it was transmission of the .o file from server to client that
hung, whereas here it is the transmission of the .i file from client
to server.

Thanks for reproducing it.

In the original bug report, there was data in the sender's transmit
queue and nothing in the receiver's receive queue.  That's pretty
strange -- it almost looks as if there is a kernel bug that is
stopping it from being pushed across.  It would help if you could
include the output of "netstat -to" to show what the kernel thinks it's
doing, and also perhaps a tcpdump on that socket for one minute or so
after you notice it's got into the stuck state.  For example, if the
client is on port 54522, do "tcpdump tcp port 54522".  I noticed that
in the 

Perhaps the strange stack shown by gdb indicates that 

The other thing that might help describe it, since gdb isn't
cooperating, is an strace of the two processes leading up to the
hang.  Do something like "strace -o /tmp/distccd.trace -ff `pidof

