[distcc] PCH Headers and distcc (again!)

Wed Jun 11 15:37:08 GMT 2008

More private email copied back to the public list.

Sascha wrote:

> Is it possible to make your LD_PRELOAD based implementation available?

I suspect that making that code publicly available would be a significant
amount of work.

> I am pretty sure that most of the network round-trips can be eliminated
> in an LD_PRELOAD based approach. Here's how:
>
> The server maintains a file hash based cache of files transferred from
> the client to the server. So a typical (slow path) client server chat
> for querying a file looks like this:
>
>   1. Server: What's the hash of file "/some/file/on/the/client".
>   2. Client: Hash is ABCD
>   3. Server: Send me the file with hash ABCD
>   4. Client: Here's the file.
>
> Steps 3. and 4. will be skipped in case of a file cache hit. This will
> work, but obviously it's slow.
>
> To speed things up, the client pre-scans the sources to determine the
> required header files and the directories that will probably be
> read/scanned by the compiler. The initial request then contains a
> file/directory to hash mapping. The server will then be able to serve
> most of the files from its local cache without querying the client.
>
> The client would keep a file+mtime to file-hash mapping to avoid
> recomputing file hashes and to translate hashes back to local file names.
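
A minimal sketch of the client-side mapping described in the paragraph
above, assuming SHA-256 as the hash and nanosecond mtimes as the staleness
check. The class and method names (ClientHashCache, hash_of, path_for) are
hypothetical and are not taken from distcc or from any existing
implementation.

```python
import hashlib
import os


class ClientHashCache:
    """Hypothetical client-side cache: file+mtime -> hash, and hash -> file."""

    def __init__(self):
        self._by_path = {}  # path -> (mtime_ns, digest)
        self._by_hash = {}  # digest -> path

    def hash_of(self, path):
        """Return the content hash of path, rehashing only if mtime changed."""
        mtime = os.stat(path).st_mtime_ns
        cached = self._by_path.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # mtime unchanged: reuse the stored hash
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if cached is not None:
            # Drop the reverse entry for the old contents of this file.
            self._by_hash.pop(cached[1], None)
        self._by_path[path] = (mtime, digest)
        self._by_hash[digest] = path
        return digest

    def path_for(self, digest):
        """Translate a hash requested by the server back to a local file name."""
        return self._by_hash.get(digest)
```

Relying on mtime alone can miss an edit that lands within the filesystem's
timestamp granularity; a real implementation might also compare file size.
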
>
> To further speed things up, the server may as well pre-scan the source
> and header files and query multiple files in one request. This way the
> number of round trips will be tolerable when the server's file cache
> does not contain any of the required files.
>
> The slow path query/response chat could be kept as a fall back to catch
> things that were missed in the client pre-scan (things like '#include
> SOME_MACRO(args)').
>
> I could imagine that this works reasonably fast. It also has benefits
> for the case where multiple clients are compiling the same (or similar)
> code bases, because many of the required files will already be on the
> server.
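
To make the proposal above concrete, here is a rough sketch of the
server-side lookup: use the hash from the client's pre-scan mapping when
there is one, fall back to the slow-path chat (steps 1-2) when there is
not, and fetch the contents (steps 3-4) only on a cache miss. Everything
here is an assumption for illustration: ServerFileCache, FakeClient, the
method names, and the use of SHA-256 are hypothetical, and the in-memory
FakeClient stands in for a real network connection.

```python
import hashlib


def sha(data):
    return hashlib.sha256(data).hexdigest()


class FakeClient:
    """Stand-in for the remote client; counts simulated round trips."""

    def __init__(self, files):
        self.files = files  # path -> bytes
        self.round_trips = 0

    def manifest(self):
        # Pre-scan result sent with the initial request: path -> hash.
        return {path: sha(data) for path, data in self.files.items()}

    def hash_of(self, path):
        # Steps 1-2: server asks for the hash, client answers.
        self.round_trips += 1
        return sha(self.files[path])

    def fetch(self, digest):
        # Steps 3-4: server asks for the contents, client sends them.
        self.round_trips += 1
        for data in self.files.values():
            if sha(data) == digest:
                return data
        raise KeyError(digest)


class ServerFileCache:
    """Hash-keyed cache of file contents received from clients."""

    def __init__(self):
        self._blobs = {}  # digest -> bytes

    def resolve(self, path, manifest, client):
        digest = manifest.get(path)
        if digest is None:
            digest = client.hash_of(path)  # missed by the pre-scan
        data = self._blobs.get(digest)
        if data is None:
            data = client.fetch(digest)    # cache miss
            self._blobs[digest] = data
        return data


client = FakeClient({"/src/a.h": b"#define A 1\n"})
server = ServerFileCache()
server.resolve("/src/a.h", {}, client)                 # cold, no manifest
cold_trips = client.round_trips
server.resolve("/src/a.h", client.manifest(), client)  # warm, with manifest
warm_trips = client.round_trips - cold_trips
```

Because the cache is keyed by content hash rather than by path, a second
client compiling the same headers would hit the same entries, which is the
multi-client benefit mentioned above.
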

Ah, but then now you have all the complexity of include scanning, plus all
the complexity of LD_PRELOAD.  At this point I think this solution stops
looking so elegant.  Since include scanning alone is sufficient to give good
performance, why add that extra complexity?  Increased generality, I suppose,
but since the performance is going to be poor for cases which include
scanning doesn't catch, it's not clear that this is a significant win over
just falling back to local compilation for those cases, and so it's not clear
that the increased complexity is worthwhile.  IMHO.

-- 
Fergus Henderson <fergus at google.com>