[distcc] IO timeouts

Morrell, Michael michael.morrell at intel.com
Tue May 5 16:39:54 MDT 2015


Here is some log output from the client related to the timeout and the file descriptors:

distcc[35805] [Tue May  5 15:28:36 2015] (dcc_lock_host) got cpu lock on machine2/8,cpp,lzo slot 0 as fd7
distcc[35805] [Tue May  5 15:28:36 2015] (dcc_lock_host) got cpu lock on localhost slot 1 as fd8
distcc[35805] [Tue May  5 15:28:36 2015] (dcc_connect_by_addr) created socket on fd9
distcc[35805] [Tue May  5 15:28:36 2015] (dcc_select_for_write) select for write on fd9 for 20s
distcc[35805] [Tue May  5 15:28:36 2015] (dcc_get_io_timeout) Using IO timeout value: 600
distcc[35805] [Tue May  5 15:28:36 2015] (dcc_select_for_read) select for read on fd9 for 600s
distcc[35805] [Tue May  5 15:28:37 2015] (dcc_unlock) release lock fd8
distcc[35805] [Tue May  5 15:28:37 2015] (dcc_connect_by_addr) created socket on fd8
distcc[35805] [Tue May  5 15:28:37 2015] (dcc_select_for_write) select for write on fd8 for 20s
distcc[35805] [Tue May  5 15:28:57 2015] (dcc_select_for_write) ERROR: IO timeout
distcc[35805] [Tue May  5 15:28:57 2015] ERROR: timeout while connecting to 172.xx.xx.xx:3632
distcc[35805] [Tue May  5 15:28:57 2015] (dcc_unlock) release lock fd7

I added an extra rs_trace call to print the “created socket on” message so I could see where the fd for the timed-out select was coming from.
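
The connect sequence in the log above (create a socket, select for write with a 20s limit, then report either success or "IO timeout") can be reproduced outside distcc to check whether the server port is reachable at all. The following is a minimal sketch in Python, not distcc's actual code: the function name `probe_connect` and its return strings are hypothetical, and the defaults of port 3632 and 20 seconds are simply taken from the log lines. It assumes a Unix-like client such as OS X or Linux.

```python
import errno
import select
import socket


def probe_connect(host, port=3632, timeout=20.0):
    """Try a non-blocking TCP connect and wait for it with select().

    Returns "ok", "timeout", or an errno name such as "ECONNREFUSED".
    Hypothetical helper for illustration; not part of distcc.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        # A non-blocking connect normally returns EINPROGRESS right away.
        rc = sock.connect_ex((host, port))
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            return errno.errorcode.get(rc, str(rc))
        # Wait until the socket becomes writable, i.e. the connect finished.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return "timeout"
        # Writable does not imply success; SO_ERROR holds the real outcome.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        return "ok" if err == 0 else errno.errorcode.get(err, str(err))
    finally:
        sock.close()


if __name__ == "__main__":
    import sys

    # Usage: python probe.py <server-address>
    print(probe_connect(sys.argv[1]))
```

As a rough guide, a "timeout" result usually means the connection attempt got no answer at all (for example, packets being dropped on the way), whereas "ECONNREFUSED" means the host answered but nothing accepted the connection on that port.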

   Michael

On May 5, 2015, at 3:10 PM, Martin Pool <mbp at sourcefrog.net> wrote:

I wonder if you have a firewall on the server.

On Tue, May 5, 2015 at 3:08 PM, Morrell, Michael <michael.morrell at intel.com> wrote:
Nothing is in the server’s logs.  I started it with:

  distccd --daemon -a xxx.xx.xx.xx --log-file ~/distccd.log --verbose

The last line is “(dcc_create_kids) up to 10 children” from its initialization.

   Michael

On May 5, 2015, at 2:40 PM, Martin Pool <mbp at sourcefrog.net> wrote:

What's in the server's logs?

On Tue, May 5, 2015 at 2:31 PM Morrell, Michael <michael.morrell at intel.com> wrote:
I’m new to distcc.  I downloaded 3.2rc1 and compiled it on OS X recently and began experimenting.

I’m getting a lot of “IO timeout” errors and I don’t know what could be causing them.

I am using two servers running distccd, each with 8 processors.

My DISTCC_HOSTS is “machine1/8,cpp,lzo machine2/8,cpp,lzo”.

I have DISTCC_FALLBACK set to 0.

I’m running the command “pump make -j16”.

Usually, I get no compilation done at all.  Every one that is attempted gets a timeout.

I even increased dcc_connect_timeout from 4 to 20 seconds and it still occurs.

From the logs, I see 31 object files being tried, and 17 get the error 107 (EXIT_IO_ERROR) and another 14 get error 116 (EXIT_NO_HOSTS).

I’d appreciate any ideas on what to do here.

Thanks,

   Michael


__
distcc mailing list            http://distcc.samba.org/
To unsubscribe or change options:
https://lists.samba.org/mailman/listinfo/distcc




--
Martin