[distcc] Re: re: distributed compiler cache

Martin Pool mbp at samba.org
Thu Sep 12 08:08:00 GMT 2002


On 12 Sep 2002, Joerg Beyer <j.beyer at web.de> wrote:

> The main difference is that my cache is a constantly running daemon. I would
> like to share the cached .o files among a working group of people.

I agree that is a common and useful scenario.

> My first attempt was to put cache locking into ccache, but that
> seems harder to me.  The scheduling of the compile jobs also looks
> simple to me with a central component (the master).

Everybody says that, but I really am not convinced that a centralized
scheduler can do any better at spreading load.

distcc already comes pretty close to linear scalability within certain
ranges, and I still have some ideas for better load balancing and
pipelining.  A central controller can't possibly make 3 machines more
than 3x faster, so the best possible improvement is pretty small, and
I am not sure that it justifies introducing another moving part.
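
To put a number on that bound, here is a small sketch.  The function
name and the 2.6x figure are made up for illustration; they are not
measurements from distcc.

```python
def scheduling_headroom(machines: int, measured_speedup: float) -> float:
    """Largest fraction of build time a perfect scheduler could still save.

    With N machines the best possible speedup is N.  If the current setup
    already reaches a speedup of S, the build takes T/S instead of the
    ideal T/N, so the most that can still be saved is 1 - S/N of the
    current build time.
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    if not 0 < measured_speedup <= machines:
        raise ValueError("speedup must be in (0, machines]")
    return 1.0 - measured_speedup / machines


if __name__ == "__main__":
    # Hypothetical example: 3 machines, 2.6x speedup already achieved.
    saving = scheduling_headroom(3, 2.6)
    print(f"a perfect scheduler could save at most {saving:.1%}")
```

The closer the measured speedup already is to the number of machines,
the less any scheduler, central or not, has left to win.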

> Also distcc and ccache implement some things like command-line
> parsing twice.

True, but to some extent that is necessary, because they need to get
different information out of the command line.
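
As a rough illustration of "different information", here is a sketch
of two views of the same command line.  This is a deliberately
simplified model with invented function names, not how ccache or
distcc actually parse arguments.

```python
SOURCE_SUFFIXES = (".c", ".cc", ".cpp")


def split_output(argv):
    """Return (arguments without '-o FILE', FILE or None).

    Only the separated '-o FILE' form is handled in this sketch.
    """
    rest, output = [], None
    it = iter(argv[1:])
    for arg in it:
        if arg == "-o":
            output = next(it, None)
        else:
            rest.append(arg)
    return rest, output


def cache_view(argv):
    """A cache wants everything that can change the object file's contents."""
    rest, _ = split_output(argv)
    return {"compiler": argv[0], "hash_inputs": rest}


def distribution_view(argv):
    """A distributor wants to know whether this is a plain compile and
    which files would have to travel to and from the remote machine."""
    rest, output = split_output(argv)
    return {
        "compile_only": "-c" in rest,
        "sources": [a for a in rest if a.endswith(SOURCE_SUFFIXES)],
        "output": output,
    }


if __name__ == "__main__":
    cmd = ["gcc", "-O2", "-c", "foo.c", "-o", "foo.o"]
    print(cache_view(cmd))
    print(distribution_view(cmd))
```

Both functions walk the same arguments, but the questions they answer
barely overlap, which is why sharing one parser would not save much.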

> I am not sure that my concept fits better, but I will try.

> Do you see any chance to integrate our solutions?

I think that should be quite possible.

I think I would approach it this way:

 1. Make sure ccache really works well with distcc as separate
    processes; write test cases for this.

 2. Also, test ccache being called by the server to provide a
    server-side cache. 

    If while doing this you discover some way in which ccache and
    distcc don't interact well, then fix that detail.  For example
    there may be some argument which is parsed inconsistently, or it
    might be possible to make them default to calling each other.

 3. Now, try to measure how much performance is lost by having ccache
    and distcc in separate programs.  If it is less than a couple of
    percent, don't worry about it.  

    I kind of doubt that parsing the arguments twice and forking an
    extra process is really a significant cost, but if it is, perhaps
    it can be addressed by merging the existing codebases.
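
The measurement in step 3 could be sketched roughly like this.  The
helper names and the 2% threshold (my reading of "a couple of
percent") are assumptions, and the commands you pass in are up to
you, for example a compile run through both tools versus whatever
baseline you want to compare against.

```python
import statistics
import subprocess
import sys
import time


def median_runtime(argv, runs=5):
    """Run argv several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def overhead_fraction(baseline_seconds, candidate_seconds):
    """Extra time of the candidate relative to the baseline (0.01 means 1%)."""
    return (candidate_seconds - baseline_seconds) / baseline_seconds


def worth_merging(baseline_seconds, candidate_seconds, threshold=0.02):
    """True only if the measured overhead exceeds the chosen threshold."""
    return overhead_fraction(baseline_seconds, candidate_seconds) > threshold


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace with real builds.
    t = median_runtime([sys.executable, "-c", "pass"], runs=3)
    print(f"median runtime of an empty interpreter start: {t:.4f}s")
```

Use a clean cache state for each run, or decide up front whether you
are measuring cache hits or misses; otherwise the numbers from the
two configurations are not comparable.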

Even compilers are built out of several interacting processes -- cpp,
cc1, as, ld, etc.  This modularity doesn't kill performance or smooth
operation, and indeed it allows unexpected flexibility such as distcc
and ccache!

-- 
Martin 