[distcc] Contributing to distCC / Massively parallel compilations

Assaf Lavie lavie at runbox.com
Tue Dec 14 20:04:35 GMT 2004


Victor Norman wrote:

>Assaf, et al.,
>
>Sometime back I posted a message about "My big plans for distcc".  I have
>implemented most of those plans and they are in use here at Marconi.  I have a
>compilation farm of linux and solaris boxes of varying CPU speeds and with
>varying numbers of cpus each.  I have also designated one solaris box as my
>"host server" and have a program running there that gives out hosts to
>compilations, keeps track of which machines ("hosts") in the compilation farm
>are available/accessible, gives out status information to other programs, etc.
>  
>
[snip]

>So: summary: Here are some items you might do/research:
>
>o rewrite the system in C/C++.
>o investigate how to handle hosts when their load averages go up: how should
>they be moved down in the tier system.  Or is there a better way than the tier
>system?  Remember: the goal is always to produce the fastest compile, which
>generally means using the fastest cpus that are available.
>o investigate how to compute a machine's power index.  How much difference does
>the OS make?  You could try compiling on a machine running solaris, and then
>the same machine running linux (gentoo for sparc, e.g.).  What difference does
>it make?
>o investigate how much network bandwidth is used in the system.  Does it affect
>a compilation's time?
>
>
>Vic
>  
>
Hi Victor,

When you say you've implemented your "big plans for distCC" - are those 
enhancements already a part of the latest distCC version?
I also want to remark that while porting the system to C/C++ may be a 
good idea, it's not quite the type of work we are expected to do. I'm 
sure we all can learn a lot about the subject from porting code, but the 
idea of this workshop is to develop something on our own (even as part 
of an existing system).

The idea of improving the scheduling sounds interesting. Developing a 
way to measure the performance of each host and optimize the 
distribution accordingly - is that what you have in mind?
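
To make the question concrete, here is a minimal sketch of what 
load-aware host selection could look like. It is only an illustration: 
the names (Host, power_index, effective_power, pick_host) and the 
scoring formula are assumptions made up for this example, not part of 
distcc or of Victor's host server.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    power_index: float  # hypothetical benchmark score per CPU; higher is faster
    cpus: int           # number of CPUs on the host
    load_avg: float     # e.g. the 1-minute load average reported by the host


def effective_power(host: Host) -> float:
    """Estimate the speed a newly placed compile job would get on this host."""
    # Assumed model: while the host has an idle CPU the new job runs at full
    # speed; once it is saturated, the job shares the CPUs with existing load.
    runnable = host.load_avg + 1.0  # existing load plus the job we would add
    share = min(1.0, host.cpus / runnable)
    return host.power_index * share


def pick_host(hosts: List[Host]) -> Host:
    """Return the host expected to finish the next job fastest."""
    return max(hosts, key=effective_power)
```

Under this assumed model a fast but heavily loaded host can score below 
a slower idle one, which is one possible answer to "how should hosts be 
moved down when their load averages go up" without fixed tiers.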

Can you please explain a bit more what you mean by "supporting many 
compilations simultaneously from multiple machines"?

Thanks.

One more thing: if any of my previous posts gave the mistaken 
impression that I am the instructor of the workshop (i.e. that I have 
students), I am not. I am just one of a group of three students who 
have chosen distributed compilation as the subject of their grid 
computing workshop.

Assaf