[distcc] Contributing to distCC

Martin Pool mbp at sourcefrog.net
Mon Dec 13 19:36:44 GMT 2004

Assaf Lavie wrote:

> Hello All.
> I'm a part of a group of 3 C.S. students (all experienced programmers)
> who are participating in an open-source grid-computing workshop. We
> are all interested in distributed compilation, and would like to
> contribute to distCC as part of our workshop.


One interesting year-long project might be to write a distributed 
compiler for Java or C#.

There are small features to do in distcc, but I don't know if anything
adds up to three man-years (or even one) of work.

One larger feature might be to generalize it from just C compilation
into general-purpose remote batch processing; this requires both some
technical changes and also finding and characterizing some tasks
amenable to this kind of distribution.

You could look at automatically transporting the compiler to the remote 
machine but that may not be very academically satisfying.

> One thing that struck us is that distCC is a very mature and stable
> project, and the truth is we're having trouble deciding how exactly we
> can make a significant contribution to it.
> Therefore I'd like to address the developer community of distCC and ask
> for suggestions on how to contribute to distCC. Our main goal with
> this year-long workshop is to improve on an existing algorithm by way
> of distributizing (is that a word?) it. Are there aspects of distCC
> that could be further separated and distributed among machines?

You could improve the scheduling; this would depend on having good 
access to a large number of machines for testing.  (I don't, at the 
moment, which holds me back from doing much here.)

> We gather that preprocessing is done on a local machine, and then
> compilation of translation units takes place on peer machines. Would
> it make any sense, would it be an improvement, to distribute the
> preprocessing stage itself? Could the compilation process itself be
> split up into smaller processes that could run in parallel on
> different machines? These are the sort of enhancements that we are
> required to implement.

Why don't you do some preliminary investigation into how those might be 
done and post your ideas?


