[distcc] distcc over slow net links

Martin Pool mbp at sourcefrog.net
Tue Aug 26 01:32:45 GMT 2003


On Mon, 25 Aug 2003 15:00:09 +0200 (CEST)
Dag Wieers <dag at wieers.com> wrote:

> On Mon, 25 Aug 2003, Martin Pool wrote:
> 
> > On Mon 2003-08-25, Dag Wieers wrote:
> > 
> > > Well, at the end of compiling the set of objects (either the end of
> > > that parallelization or when -j5 was given, after the 5th job).
> > 
> > What set of objects?  Make does not tell distcc the list of targets that
> > will be built or the -j level.  
> 
> I agree; what I was talking about would have meant some changes so that 
> distcc was a bit more clever (and not just a separate process).

Do you have any details in mind?

> No, I wouldn't requeue it to a nearby machine; I would rebuild the 
> left-overs (that we're waiting on) on _any_ idle machine and kill them 
> off as soon as I get a result from one of them.  A waste of resources 
> sometimes (though if they're not used for anything else, hardly a waste).

Also, in some cases, the network may itself be a limiting factor.  (It
probably will be for 10Mbps or slower.)  Flooding the network with
several copies of the job might make the situation worse.
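
To put rough numbers on it (these are illustrative guesses, not
measurements): a preprocessed source file of around a megabyte needs most
of a second on a 10 Mbps link before the remote compiler can even start,
and every duplicate copy adds the same again.  Something like:

/* Back-of-envelope arithmetic only; the sizes and times are assumptions,
 * not distcc measurements. */
#include <stdio.h>

int main(void)
{
    const double link_bits_per_sec = 10.0e6;  /* assumed 10 Mbps link */
    const double job_bytes = 1.0e6;    /* assumed ~1 MB preprocessed source */
    const double compile_sec = 1.5;    /* assumed remote compile time */
    int copies;

    for (copies = 1; copies <= 4; copies++) {
        double wire_sec = copies * job_bytes * 8.0 / link_bits_per_sec;
        printf("%d %s: %.2f s on the wire vs %.2f s compiling\n",
               copies, copies == 1 ? "copy" : "copies",
               wire_sec, compile_sec);
    }
    return 0;
}

If the compile itself only takes a second or two, the duplicates spend
most of their time fighting for the wire rather than saving anything.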

> > I think I'd like it to gain information about the speed of machines as
> > it goes along.  Perhaps that would allow it to sort the machines into a
> > different order, or perhaps have stronger preferences between machines
> > than it does at the moment.
> 
> Yeah, but you cannot learn this information from the files you send it, 
> because they're not all equal. 

I suspect that for any given project, it may be reasonable to assume
that all files are of roughly equal difficulty to compile.
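
If that assumption holds, the client could estimate host speed purely from
how quickly jobs come back, without knowing anything about the individual
files.  A rough sketch of the bookkeeping I have in mind (hypothetical
code, not anything in distcc; the names are made up):

/* Hypothetical per-host speed tracking -- not part of distcc.  Assumes
 * every job costs roughly the same, so jobs/second is a fair proxy for
 * host speed. */
#include <stdio.h>

#define ALPHA 0.2            /* weight given to the most recent sample */

struct host_stats {
    const char *name;
    double jobs_per_sec;     /* smoothed estimate; 0.0 means "unknown" */
};

/* Fold one completed job (which took elapsed_sec) into the estimate. */
static void host_record_job(struct host_stats *h, double elapsed_sec)
{
    double sample = 1.0 / elapsed_sec;
    if (h->jobs_per_sec == 0.0)
        h->jobs_per_sec = sample;    /* first sample, take it as-is */
    else
        h->jobs_per_sec = ALPHA * sample + (1.0 - ALPHA) * h->jobs_per_sec;
}

int main(void)
{
    struct host_stats fast = { "fast-box", 0.0 };
    struct host_stats slow = { "slow-box", 0.0 };
    int i;

    for (i = 0; i < 10; i++) {
        host_record_job(&fast, 1.0);    /* pretend ~1 s per job */
        host_record_job(&slow, 3.0);    /* pretend ~3 s per job */
    }
    printf("%s: %.2f jobs/s, %s: %.2f jobs/s\n",
           fast.name, fast.jobs_per_sec, slow.name, slow.jobs_per_sec);
    return 0;
}

The smoothing constant is a guess; the point is only that the estimate
needs nothing beyond noticing when a job was sent and when it came back.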

> Maybe sending over BogoMIPS (or processor info) would be a simple 
> implementation that doesn't have a lot of overhead.  The scheduling 
> logic could make some sense of that information.

BogoMIPS measures the time to execute a small busy loop.  I can't see that
it would correlate with compiler performance in any very precise way.
Compilers depend heavily on cache and memory-bus performance, which I
don't think BogoMIPS reflects.

Across different architectures, not only does compiler performance per
cycle vary, but the BogoMIPS figures themselves are not comparable.

And of course non-Linux systems don't have BogoMIPS at all.
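
To make the distinction concrete, here is a toy comparison (my own
throwaway test, nothing to do with the kernel's calibration code) between
a register-only loop, which is roughly what BogoMIPS calibrates, and a
walk over a buffer much larger than any cache, which is closer to the
kind of memory pressure a compiler generates:

/* Toy contrast between a register-only loop (the kind of thing BogoMIPS
 * calibrates) and a cache-hostile walk over a big buffer (closer to the
 * load a compiler puts on memory).  Purely illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define LOOPS   200000000L
#define BUFSIZE (32 * 1024 * 1024)    /* well beyond any cache */
#define PASSES  50

int main(void)
{
    volatile long sink = 0;
    unsigned char *buf = calloc(BUFSIZE, 1);
    clock_t t0, t1, t2;
    long i, j;
    int p;

    if (buf == NULL)
        return 1;

    t0 = clock();
    for (i = 0; i < LOOPS; i++)        /* stays entirely in registers */
        sink++;
    t1 = clock();
    for (p = 0; p < PASSES; p++)       /* several passes over the buffer */
        for (j = 0; j < BUFSIZE; j += 64)   /* one cache line per step */
            sink += buf[j];
    t2 = clock();

    printf("tight loop:  %.2f s\nmemory walk: %.2f s\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC);
    free(buf);
    return 0;
}

The absolute numbers don't matter; the point is that two machines with
similar results on the first test can differ a lot on the second.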

> Or you could have some test file that you send the first time to a 
> machine and cache the result for later use.  (And maybe do a new test 
> every 100 runs to make sure that a server hasn't been replaced in the 
> meantime.)
> 
> And the latency could be measured with ping/tcptraceroute, or by timing 
> how long the server takes to answer?
> 
> Just brainstorming...

Sure.  Don't let me discourage you.
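
The test-file idea sketches out easily enough.  Something along these
lines (hypothetical code, not distcc; run_test_compile() stands in for
shipping a canned source file to the host and timing the round trip):

/* Hypothetical cached-benchmark bookkeeping, roughly as described above:
 * measure a host with a canned test compile on first contact, cache the
 * result, and re-measure every RETEST_EVERY jobs in case the hardware
 * behind the name has changed. */
#include <stdio.h>

#define RETEST_EVERY 100

struct host_bench {
    const char *name;
    double test_seconds;     /* cached result; < 0 means "never measured" */
    long jobs_since_test;
};

/* Stub: a real client would ship a fixed source file to the host, wait
 * for the object file, and return the elapsed wall time. */
static double run_test_compile(const char *name)
{
    (void)name;
    return 2.0;              /* pretend the test compile took 2 seconds */
}

static double host_speed(struct host_bench *h)
{
    if (h->test_seconds < 0.0 || h->jobs_since_test >= RETEST_EVERY) {
        h->test_seconds = run_test_compile(h->name);
        h->jobs_since_test = 0;
    }
    h->jobs_since_test++;
    return h->test_seconds;
}

int main(void)
{
    struct host_bench h = { "buildhost", -1.0, 0 };
    printf("%s: test compile takes %.1f s\n", h.name, host_speed(&h));
    return 0;
}

The latency side could probably be folded into the same probe just by
making the test file trivially small.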

-- 
Martin 

GNU does not eliminate all the world's problems, only some of them.
		-- The GNU Manifesto


