[distcc] homogeneous environments

Robert W. Anderson anderson110 at poptop.llnl.gov
Wed Apr 29 17:39:19 GMT 2009


Fergus Henderson wrote:
> On Tue, Apr 28, 2009 at 4:28 PM, Robert W. Anderson 
> <anderson110 at poptop.llnl.gov> wrote:
> 
> 
>     I have an environment where we have many nodes potentially available
>     for compilation, and all of them see the same file spaces via NFS.
>     We are seeing decent performance out of distcc 3.1 using pump mode,
>     but from reading the docs there may be big performance gains left to
>     wring out in this special(?) case.
> 
>     If I understand correctly, distcc's pump mode finds a set of header
>     files necessary to send along with the source file to enable
>     compilation on a remote node.  In a homogeneous environment, it
>     seems both steps here are unnecessary if the master and slave nodes
>     are more or less indistinguishable in terms of compiler, sources,
>     and headers.
> 
>     I think we could really achieve some screaming compile times (over
>     thousands of source files) if these steps could be bypassed with the
>     user's explicit acknowledgement that he is making assumptions about
>     the homogeneity of his build server machines.
> 
>     How extensive would the modifications be to support such an
>     optimization?  It was not clear to me after a few minutes of poking
>     around in the source, and thought I'd seek an expert opinion first.
> 
> 
> Typically NFS is a lot slower than local file access.
> So it's not clear that this approach would actually improve overall 
> performance.
> 
> Distcc can work faster than NFS, because it sends all of the source 
> files at once, requiring only one round-trip between the client and the 
> distcc server for each compilation.  With NFS, you need a round-trip 
> between the distcc server and the NFS server for each header file that 
> is included (directly or indirectly) from the source file being compiled.
> 
> Of course with distcc, if your source files are on NFS, the client needs 
> to do the same round-trips to the NFS server to fetch the files, but 
> this is not as bad as having the distcc servers do that, because the 
> distcc client need only fetch each file once for the whole build, not 
> once for each compilation in which it is referenced, and after that the 
> file will probably be cached.  In addition, the client machine is more 
> likely to have source files cached from previous builds, since on the 
> client machine you're probably compiling the same sources that you 
> compiled last time, whereas on the distcc server machines they are 
> serving lots of different users who may be compiling very different 
> programs.
> 
> Another issue with this approach is that there may also be additional 
> security considerations.  Currently distcc servers normally run as user 
> "distcc", which may not have access to the user's NFS files, so this 
> approach would not work if the source files are not world-readable.  Of 
> course it would be possible to address this issue by having the distcc 
> server authenticate the user, and then access the user's files on NFS as 
> that user, but that would require additional authentication, which would 
> have a performance impact.  For example one way to do it would be to use 
> distcc's ssh mode, but that mode has a major performance impact. (The 
> recently posted patches for GSSAPI support have less performance impact, 
> but there is still a significant impact.)
> 
> For the approach that you are considering, you may not need to use 
> distcc at all; a simple script using ssh may be sufficient, though the 
> overheads of ssh may be prohibitive (ssh connection sharing may help 
> with that, although that has security concerns of its own).
> If you do want to modify distcc, I'd guess that the modifications needed 
> would be moderate in scope.
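
The round-trip argument in the quoted reply can be made concrete with a 
little arithmetic.  What follows is only a rough sketch: the function 
names and the numbers are made up for illustration, it assumes nothing is 
cached on the distcc servers (the worst case), and it treats every round 
trip as costing the same.

```python
def server_side_header_fetches(num_compiles, headers_per_compile):
    """NFS round trips if each distcc server reads headers itself.

    Worst case assumed: nothing is cached on the servers, so every
    compilation fetches every header it includes.
    """
    return num_compiles * headers_per_compile


def client_side_header_fetches(unique_headers):
    """NFS round trips if only the client reads headers.

    Assumes each distinct header is fetched once for the whole build
    and served from the client's cache afterwards.
    """
    return unique_headers


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements.
    compiles, per_compile, unique = 1000, 200, 500
    print("server-side fetches:", server_side_header_fetches(compiles, per_compile))
    print("client-side fetches:", client_side_header_fetches(unique))
    print("distcc round trips: ", compiles)  # one per compilation
```

Under these assumptions the server-side count grows with the product of 
compilations and headers per compilation, while the client-side count 
grows only with the number of distinct headers.  Real caching on the 
servers would narrow the gap.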

Fergus,

Thanks for the clear and detailed reply.  First, I should note that I am 
already using ssh mode (via rsh) because I was unable to make TCP mode 
work.  I don't know whether this is some kind of port-blocking 
restriction on my machine or something else:

distcc[1663] (dcc_pump_sendfile) ERROR: sendfile failed: Connection reset by peer
distcc[1663] (dcc_writex) ERROR: failed to write: Broken pipe
distcc[1663] Warning: failed to distribute source.c to host16,cpp,lzo, running locally instead

Perhaps getting TCP mode running should be my first performance priority.
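
One quick way to rule plain port blocking in or out is to try opening a 
TCP connection to each build host.  The sketch below is a minimal check, 
not a diagnosis: can_connect is a hypothetical helper name, and 3632 is 
assumed to be the port in use (it is distccd's default, but a site may 
configure another).  A successful connect only shows that the port is 
reachable; it says nothing about whether the server then accepts the job.

```python
import socket

DISTCCD_DEFAULT_PORT = 3632  # distccd's default; assumed to be unchanged here


def can_connect(host, port=DISTCCD_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

For example, can_connect("host16") would report whether the host named 
in the log above accepts connections on that port from this machine.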

I just tried what you suggested in your last paragraph, manually 
distributing compiles via rsh, and found that it is, as you suspected, 
a little slower than distcc in pump mode.  Rather than pursue that any 
further, and based on your comments, I would like to see if I can get 
TCP pump mode working first.
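
For reference, manual distribution of this kind can be sketched as 
follows.  This is an illustration rather than the script actually used: 
the function names, the round-robin assignment, and the gcc command line 
are all assumptions, and it presumes that every host sees the same 
working directory over NFS and allows rsh without a password.

```python
import itertools
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def plan_remote_compiles(sources, hosts, workdir, compiler="gcc",
                         remote_shell="rsh"):
    """Assign sources to hosts round-robin; return one argv list per compile."""
    if not hosts:
        raise ValueError("at least one host is required")
    plan = []
    for src, host in zip(sources, itertools.cycle(hosts)):
        obj = src.rsplit(".", 1)[0] + ".o"
        # The remote side runs in the shared directory, so no files are sent.
        remote_cmd = "cd {} && {} -c {} -o {}".format(
            shlex.quote(workdir), shlex.quote(compiler),
            shlex.quote(src), shlex.quote(obj))
        plan.append([remote_shell, host, remote_cmd])
    return plan


def run_plan(plan, jobs=8):
    """Run the planned commands concurrently; return exit codes in plan order.

    Caveat: classic rsh may not pass back the remote command's exit status,
    so a zero here does not necessarily mean the compile succeeded.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(subprocess.call, plan))
```

Each compile pays for a fresh remote-shell connection, which is one 
plausible source of the overhead mentioned above.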

Thanks,
-- 
Robert W. Anderson
Center for Applied Scientific Computing
Email: anderson110 at llnl.gov
Tel: 925-424-2858  Fax: 925-423-8704

