[distcc] homogeneous environments
Robert W. Anderson
anderson110 at poptop.llnl.gov
Wed Apr 29 17:39:19 GMT 2009
Fergus Henderson wrote:
> On Tue, Apr 28, 2009 at 4:28 PM, Robert W. Anderson
> <anderson110 at poptop.llnl.gov> wrote:
>
>
> I have an environment where we have many nodes potentially available
> for compilation, and all of them see the same file spaces via NFS.
> We are seeing decent performance out of distcc 3.1 using pump mode,
> but from reading the docs there may be big performance gains left to
> wring out in this special(?) case.
>
> If I understand correctly, distcc's pump mode finds a set of header
> files necessary to send along with the source file to enable
> compilation on a remote node. In a homogeneous environment, it
> seems both steps here are unnecessary if the master and slave nodes
> are more or less indistinguishable in terms of compiler, sources,
> and headers.
>
> I think we could really achieve some screaming compile times (over
> thousands of source files) if these steps could be bypassed with the
> user's explicit acknowledgement that he is making assumptions about
> the homogeneity of his build server machines.
>
> How extensive would the modifications be to support such an
> optimization? It was not clear to me after a few minutes of poking
> around in the source, so I thought I'd seek an expert opinion first.
>
>
> Typically NFS is a lot slower than local file access.
> So it's not clear that this approach would actually improve overall
> performance.
>
> Distcc can work faster than NFS, because it sends all of the source
> files at once, requiring only one round-trip between the client and the
> distcc server for each compilation. With NFS, you need a round-trip
> between the distcc server and the NFS server for each header file that
> is included (directly or indirectly) from the source file being compiled.
>
> Of course with distcc, if your source files are on NFS, the client needs
> to do the same round-trips to the NFS server to fetch the files, but
> this is not as bad as having the distcc servers do that, because the
> distcc client need only fetch each file once for the whole build, not
> once for each compilation in which it is referenced, and after that the
> file will probably be cached. In addition, the client machine is more
> likely to have source files cached from previous builds, since on the
> client machine you're probably compiling the same sources that you
> compiled last time, whereas on the distcc server machines they are
> serving lots of different users who may be compiling very different
> programs.
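
The round-trip argument above can be made concrete with a rough count. This is only a sketch under simplifying assumptions that are not from the thread: every source file includes the same set of shared headers, the distcc servers have nothing cached, and each file fetch or compile request costs exactly one round-trip. The function names and the numbers are hypothetical.

```python
def nfs_server_round_trips(num_sources, shared_headers):
    """Servers read over NFS: each compilation fetches its source file
    plus every header it includes (worst case, nothing cached)."""
    return num_sources * (1 + shared_headers)


def distcc_client_round_trips(num_sources, shared_headers):
    """Client reads over NFS once per distinct file for the whole build,
    then makes one client/server round-trip per compilation."""
    nfs_fetches = num_sources + shared_headers  # each file fetched once
    compile_requests = num_sources              # one trip per compilation
    return nfs_fetches + compile_requests


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              nfs_server_round_trips(n, 20),
              distcc_client_round_trips(n, 20))
```

Real builds will land somewhere between these extremes, since NFS clients cache and header sets differ per source file, but the count shows why the per-compilation header fetches dominate as the build grows.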
>
> Another issue with this approach is that there may also be additional
> security considerations. Currently distcc servers normally run as user
> "distcc", which may not have access to the user's NFS files, so this
> approach would not work if the source files are not world-readable. Of
> course it would be possible to address this issue by having the distcc
> server authenticate the user, and then access the user's files on NFS as
> that user, but that would require additional authentication, which would
> have a performance impact. For example one way to do it would be to use
> distcc's ssh mode, but that mode has a major performance impact. (The
> recently posted patches for GSSAPI support have less performance impact,
> but there is still a significant impact.)
>
> For the approach that you are considering, you may not need to use
> distcc at all; a simple script using ssh may be sufficient, though the
> overheads of ssh may be prohibitive (ssh connection sharing may help
> with that, although that has security concerns of its own).
>
> If you do want to modify distcc, I'd guess that the modifications needed
> would be moderate in scope.
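
A minimal sketch of the "simple script using ssh" idea, assuming all hosts see the same NFS path and the same compiler. The host names, the working directory, and the helper names are hypothetical. The `ControlMaster`, `ControlPath`, and `ControlPersist` settings are OpenSSH's connection-sharing options, with the security caveat already noted above. The code only builds the command lines; running them is left to the caller.

```python
import itertools
import shlex


def ssh_compile_command(host, workdir, compile_argv, control_dir="~/.ssh"):
    """Build an ssh argv that runs one compile on `host` inside `workdir`.

    Assumes `workdir` is the same NFS path on every host.
    """
    remote = "cd {} && {}".format(
        shlex.quote(workdir),
        " ".join(shlex.quote(arg) for arg in compile_argv),
    )
    return [
        "ssh",
        "-o", "ControlMaster=auto",    # reuse one connection per host
        "-o", "ControlPath={}/cm-%r@%h:%p".format(control_dir),
        "-o", "ControlPersist=60",     # keep the master open for 60 s
        host,
        remote,
    ]


def assign_round_robin(hosts, sources):
    """Pair each source file with a host, cycling through the host list."""
    return list(zip(sources, itertools.cycle(hosts)))


if __name__ == "__main__":
    hosts = ["host15", "host16"]       # hypothetical build nodes
    for src, host in assign_round_robin(hosts, ["a.c", "b.c", "c.c"]):
        argv = ssh_compile_command(host, "/nfs/project/src",
                                   ["gcc", "-c", src])
        print(" ".join(argv))
```

Each argv could be passed to `subprocess.run(argv, check=True)`, ideally from a small worker pool so that several hosts compile at once. Round-robin ignores host load, which distcc itself accounts for.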
Fergus,

Thanks for the clear and detailed reply. First I should note that I am
already using ssh mode (via rsh) because I was unable to make TCP mode
work. I don't know whether this is a port-blocking restriction on my
machine or something else:

  distcc[1663] (dcc_pump_sendfile) ERROR: sendfile failed: Connection reset by peer
  distcc[1663] (dcc_writex) ERROR: failed to write: Broken pipe
  distcc[1663] Warning: failed to distribute source.c to host16,cpp,lzo, running locally instead

Perhaps getting TCP mode running should be my first performance priority.
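
One quick way to test the port-blocking guess is to see whether a plain TCP connection to the distccd port can be opened at all. The sketch below assumes the server listens on distcc's default port, 3632; the function name is hypothetical. A successful connect only rules out a blocked port. It says nothing about whether the server then accepts the client (for example, its `--allow` access list), and a reset during the transfer itself is still possible.

```python
import socket

DISTCCD_DEFAULT_PORT = 3632  # distcc's default; adjust if distccd uses --port


def tcp_port_reachable(host, port=DISTCCD_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(tcp_port_reachable("host16"))  # host name taken from the log above
```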
I just tried what you suggested in your last paragraph, manually
distributing compiles via rsh, and am finding that it is, as you
suspected, a little slower than distcc using pump mode. Rather than
pursue that any further, based on your comments, I would like to see if
I can get TCP pump mode working first.
Thanks,
--
Robert W. Anderson
Center for Applied Scientific Computing
Email: anderson110 at llnl.gov
Tel: 925-424-2858 Fax: 925-423-8704