[distcc] Preprocessing limit
Martin Pool
mbp at samba.org
Sun Feb 22 23:44:14 GMT 2004
On 20 Feb 2004, john moser <bluefoxicy at linux.net> wrote:
> try reading it again, from the beginning.
>
> Let me try it like this.
>
> DISTRIBUTED COMPUTING NETWORK: 3700 NODES
> NODE X: 1.5 GHz Processor
> TIME TO PROCESS 5 PARALLEL PREPROCESSINGS: 45 SECONDS
> NODES USED: 5
> TIME TO PROCESS 10 PARALLEL PREPROCESSINGS: 130 SECONDS
> NODES USED: 10
> TIME TO PROCESS 3700 PARALLEL PREPROCESSINGS: 49 DAYS, 6 HOURS, 15 MINUTES, 27 SECONDS
> NODES USED: about 1-2 at a time, as the preprocessings slowly finish on those last 3 days and get sent out at random times.
>
> You can NOT do as many preprocesses in parallel as you have nodes sometimes.
> To MAXIMIZE efficiency, you need to specify -j$NUMBEROFNODES and LOCK the
> number of parallel preprocessing operations to a lower number. Then, WHILE
> one node is compiling a complex source file, you can preprocess AND send out
> another job, possibly BEFORE that one finishes.
>
> Simple enough? The idea is to get the job OFF the box ASAP so it can come
> back FINISHED ASAP.
>
> Now, THNK this time, before you incur my wrath again.
Very funny.
Remember kids, THNK first!
>
> On 19 Feb 2004, john moser <bluefoxicy at linux.net> wrote:
>
> > distcc needs a way to limit how many preprocessing jobs it can run at once.
> > It may be advantageous to have, say, 150,000 make jobs (if you have a 100000
> > node computing network, for example; let's say HP decides it wants to wait
> > 2 minutes to compile a new operating system and all its tools). Running
> > 150,000 parallel preprocessings will take hours. After maybe 80% of that time,
> > a few jobs will trickle out to the compiling nodes.
> >
> > Instead, one could limit how many preprocessings can occur. The distcc would
> > sleep until there's a free local preprocessing slot, and then run that
> > preprocessing, then ship out to a free node. In this way, the actual
> > efficiency will more effectively approach the theoretical efficiency.
> >
> > Think about it. Need you wait 10 minutes with 50 jobs before shoving them out
> > the network? Are you always going to have enough processing power to get close
> > to theoretical values? What's the best way to get off the box ASAP?
>
> That's what the -j level is for.
>
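For readers following the thread, the proposal quoted above (keep the make -j level high, but make each job sleep until a local preprocessing slot is free before it preprocesses and ships out) can be sketched roughly as below. This is only an illustration under stated assumptions, not distcc's code: PreprocessLimiter, preprocess, compile_remotely, build_one and build_all are hypothetical names, threads stand in for make jobs, and the preprocessing and remote compile steps are stubs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PreprocessLimiter:
    """Caps how many local preprocessing steps run at once (hypothetical helper)."""

    def __init__(self, slots):
        self._sem = threading.BoundedSemaphore(slots)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous preprocessings observed

    def __enter__(self):
        self._sem.acquire()  # sleep until a local preprocessing slot is free
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._active -= 1
        self._sem.release()  # free the slot for the next waiting job
        return False


def preprocess(source):
    # Stub standing in for running the C preprocessor on the local box.
    return "preprocessed:" + source


def compile_remotely(unit):
    # Stub standing in for shipping the preprocessed unit to a free node.
    return unit.replace("preprocessed:", "object:")


def build_one(source, limiter):
    # Only the preprocessing step holds a slot; the remote compile does not.
    with limiter:
        unit = preprocess(source)
    return compile_remotely(unit)


def build_all(sources, jobs, preprocess_slots):
    # `jobs` plays the role of the -j level; `preprocess_slots` is the lower cap.
    limiter = PreprocessLimiter(preprocess_slots)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(lambda s: build_one(s, limiter), sources))
    return results, limiter.peak
```

For example, build_all(sources, jobs=20, preprocess_slots=3) lets up to 20 jobs be in flight while never running more than 3 preprocessings at the same moment. Whether that beats simply lowering -j depends on how slow the local preprocessing really is compared with the remote compiles.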
--
Martin