[distcc] PCH Headers and distcc (again!)

Wed Jun 11 15:32:19 GMT 2008

Sascha wrote:

> The idea of sending the header files to the server sounds interesting,
> but this raises some questions. (I also played with the idea of
> transferring all required files to the server, but never implemented
> anything except for the PCH patch.)
>
> For debug builds, the compiler needs to see the correct file path as-is
> on the client, which may be difficult/impossible to reproduce on the
> server. One way to fix this is to run the preprocessor (on the server)
> separately and then patch the preprocessed file (convert '# 123
> "/var/distcc/file-cache/a6df3d6.h"' back to '# 123
> "E:/someSDK/whatever.h"').

What we've done in distcc 3.0 is to patch the debug sections of the object
files. The client path is reproduced on the server under a prefix, e.g.
/tmp/distcc/a3b4x7, and the compiler is invoked there to produce an object
file. Finally, distcc mmaps the object file, parses it to find the debug
sections, and edits those sections, replacing the longer path names from the
server with shorter path names by removing the /tmp/distcc/a3b4x7 prefix.
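
As a rough illustration of the prefix-stripping step only (this is not
distcc's actual code), the sketch below edits a byte buffer standing in for
a debug section. The function name, the NUL-terminated string layout, and
the choice to pad with NUL bytes so the buffer keeps its size are all
assumptions made for the example; locating the debug sections in a real ELF
file is not shown.

```python
def strip_server_prefix(section: bytearray, prefix: bytes) -> int:
    """Remove `prefix` from every path in `section`, in place.

    Assumption: paths are NUL-terminated strings, and the buffer must keep
    its size, so each shortened path is padded with NUL bytes.
    Returns the number of paths rewritten.
    """
    count = 0
    pos = section.find(prefix)
    while pos != -1:
        # Find the end of the string that contains this occurrence.
        end = section.find(b"\0", pos)
        if end == -1:
            end = len(section)
        # Everything after the prefix is the original client path.
        tail = bytes(section[pos + len(prefix):end])
        # Shift the client path left and pad, keeping the length unchanged.
        section[pos:end] = tail + b"\0" * len(prefix)
        count += 1
        pos = section.find(prefix, pos + len(tail))
    return count


if __name__ == "__main__":
    buf = bytearray(b"/tmp/distcc/a3b4x7/home/u/foo.c\0other\0")
    strip_server_prefix(buf, b"/tmp/distcc/a3b4x7")
    print(bytes(buf))
```

Keeping the buffer the same size means nothing else in the file has to
move, which is why the example pads rather than truncates.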

This is of course platform-specific, and our current implementation only
works for ELF files. But on other platforms it should be pretty easy to work
around the problem using the "directory" command to gdb.

> Another thing: what happens with all the preprocessor related options
> (e.g. -MF). Running the preprocessor on the client machine solves this
> nicely, but when the preprocessor is run on the server, then these files
> must be patched and sent back to the client. Or the client (which scans
> the files anyway) interprets these options and generates the dependency
> files?

In distcc 3.0, the .d file is produced on the server and sent back to the
client. Preprocessor options are passed to the compiler when it is invoked
on the server.

> Another (I think more elegant) way to deal with this is to run the
> compiler with an LD_PRELOAD that wraps
> open()/read()/access()/stat()/opendir()/readdir()/closedir(). Any source
> files opened read-only by the compiler could then be transferred
> on-demand to a local file cache. Write-only files are sent back to the
> client. (Some magic may be required to figure out which files to
> transmit and which files to use from the local compiler installation,
> but I am sure this can be sorted out.)

Google experimented with this approach before I joined Google.
We had an implementation of it, although I think the implementation
techniques were different from using LD_PRELOAD.

The big drawback of this approach is that you have a lot more network
round trips. You need a separate round trip for every file that the server
fetches from the client. If you have high latency, or if you need to fetch a
lot of files, this is very costly for the overall latency of the build.
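
To make the round-trip cost concrete, here is a back-of-the-envelope sketch.
The model is deliberately crude (it ignores bandwidth and any pipelining),
and the function names and all of the numbers are hypothetical, not
measurements.

```python
def on_demand_latency(n_files: int, rtt_s: float, compile_s: float) -> float:
    """One network round trip per file fetched from the client."""
    return n_files * rtt_s + compile_s


def upfront_latency(rtt_s: float, compile_s: float) -> float:
    """All needed files are sent with the request: a single round trip."""
    return rtt_s + compile_s


if __name__ == "__main__":
    # Hypothetical figures: 200 headers, 50 ms round trip, 1 s of compilation.
    print(on_demand_latency(200, 0.05, 1.0))
    print(upfront_latency(0.05, 1.0))
```

With these made-up figures, fetching on demand takes about 11 seconds for a
single file, against about 1.05 seconds when everything is sent up front.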

For full builds, throughput is more important than latency. But for
incremental builds, latency is very important. And the user shouldn't have
to guess whether the next build is going to be long-running or not. So we
want an approach that has low latency for incremental builds.

-- 
Fergus Henderson <fergus at google.com>