[distcc] Contributing to distCC / Massively parallel compilations

Victor Norman vtnpgh at yahoo.com
Tue Dec 14 20:54:53 GMT 2004

Assaf, et al., 

--- Assaf Lavie <lavie at runbox.com> wrote:

> Victor Norman wrote:
> >Assaf, et al.,
> >
> >Sometime back I posted a message about "My big plans for distcc".  I have
> >implemented most of those plans and they are in use here at Marconi.  I have a
> >compilation farm of Linux and Solaris boxes of varying CPU speeds and with
> >varying numbers of CPUs each.  I have also designated one Solaris box as my
> >"host server" and have a program running there that gives out hosts to
> >compilations, keeps track of which machines ("hosts") in the compilation farm
> >are available/accessible, gives out status information to other programs, etc.
> >  
> >
> [snip]
> >So: summary: Here are some items you might do/research:
> >
> >o rewrite the system in C/C++.
> >o investigate how to handle hosts when their load averages go up: how should
> >they be moved down in the tier system?  Or is there a better way than the tier
> >system?  Remember: the goal is always to produce the fastest compile, which
> >generally means using the fastest CPUs that are available.
> >o investigate how to compute a machine's power index.  How much difference does
> >the OS make?  You could try compiling on a machine running Solaris, and then
> >the same machine running Linux (Gentoo for SPARC, e.g.).  What difference does
> >it make?
> >o investigate how much network bandwidth is used in the system.  Does it affect
> >a compilation's time?
> >
> >
> >Vic
> >  
> >
> Hi Victor,
> When you say you've implemented your "big plans for distCC" - are those 
> enhancements already a part of the latest distCC version?
> I also want to remark that while porting the system to C/C++ may be a 
> good idea, it's not quite the type of work we are expected to do. I'm 
> sure we all can learn a lot about the subject from porting code, but the 
> idea of this workshop is to develop something on our own (even as part 
> of an existing system).

No, I have made no changes to distcc.  My system just uses distcc to create a
multi-user, multiprocessor, distributed, multi-OS, load-balancing environment
for compiling.  How much of my prototype code should actually be incorporated
into distcc is definitely up for debate.

> The idea of improving the scheduling sounds interesting. Developing a 
> way to measure the performance of each host and optimize the 
> distribution accordingly - is that what you have in mind?

Yes, that's right.

> Can you please explain a bit more what you mean by "supporting many 
> compilations simultaneously from multiple machines"?

I mean that multiple software engineers log in to their favorite build machines
(or use their own wimpy desktop Sun Ultra 5) and start builds.  Each build
contacts the single host-server machine to get a host for each compilation that
happens in the build.  So host requests arrive from multiple machines, and the
host server tries to manage the system so that all builds are fast and none of
the compilation hosts is totally overwhelmed with requests.
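
To make the host-server idea concrete, here is a minimal sketch in Python.  It
is not the prototype's code: the class and method names (Host, HostServer,
acquire, release), the numeric "power" value, and the per-host slot limit are
all assumptions made for illustration.  Each request gets the fastest host
that still has a free slot, which keeps any one host from being overwhelmed.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    power: float     # hypothetical power index: higher means a faster CPU
    slots: int       # assumed cap on simultaneous compilations (e.g. CPU count)
    in_use: int = 0  # compilations currently assigned to this host


class HostServer:
    """Hands out compilation hosts to builds running on many machines."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def acquire(self):
        """Return the name of the fastest host with a free slot, or None."""
        free = [h for h in self.hosts if h.in_use < h.slots]
        if not free:
            return None
        # Prefer the higher power index; break ties by more free slots.
        best = max(free, key=lambda h: (h.power, h.slots - h.in_use))
        best.in_use += 1
        return best.name

    def release(self, name):
        """Give a slot back once the compilation on that host has finished."""
        for h in self.hosts:
            if h.name == name and h.in_use > 0:
                h.in_use -= 1
                return
        raise ValueError(f"host {name!r} has no compilation to release")
```

A build would call acquire() before each compilation and release() after it.
A real server would also need networking, locking, and a policy for what a
build does when acquire() returns None (wait, or compile locally).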

(Note: you could probably show that we cannot find a "best" solution: the
problem is probably NP-complete, like the "bin packing" problem...)
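
If an optimal assignment is out of reach, a greedy heuristic is the usual
fallback.  The sketch below is one such heuristic (largest job first, each job
placed on the host that would finish it earliest), not a description of the
prototype.  The function name greedy_assign, the job "cost" numbers, and the
host speeds are hypothetical, and it assumes all jobs are known up front and
at least one host exists.

```python
def greedy_assign(job_costs, host_speeds):
    """Greedy schedule: largest job first, onto the earliest-finishing host.

    job_costs   -- list of work amounts, one per compilation (assumed known)
    host_speeds -- dict mapping host name to work units per second (assumed)
    Returns (assignment, makespan): assignment maps each host name to the
    list of job indices it received; makespan is the latest finish time.
    """
    finish = {h: 0.0 for h in host_speeds}
    assignment = {h: [] for h in host_speeds}
    # Placing the big jobs first usually leaves the small ones to fill gaps.
    order = sorted(range(len(job_costs)), key=lambda i: -job_costs[i])
    for idx in order:
        cost = job_costs[idx]
        best = min(host_speeds,
                   key=lambda h: finish[h] + cost / host_speeds[h])
        finish[best] += cost / host_speeds[best]
        assignment[best].append(idx)
    return assignment, max(finish.values(), default=0.0)
```

This gives a reasonable schedule quickly but carries no guarantee of being
optimal; in a live system jobs arrive over time and their costs are not known
in advance, so it only illustrates the trade-off.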


> Thanks.
> I just want to add: if any of my previous posts gave the mistaken 
> impression that I am the instructor of the workshop (i.e. that I have 
> students) I just want to clarify that I'm not. Just one in a group of 
> three students that have chosen distributed compilation as the subject 
> of their grid computing workshop.
> Assaf
