[distcc] distcc over slow net links

Mon Aug 25 13:00:09 GMT 2003

On Mon, 25 Aug 2003, Martin Pool wrote:

> On Mon 2003-08-25, Dag Wieers wrote:
> 
> > Well, at the end of compiling the set of objects (either the end of
> > that parallelization or when -j5 was given, after the 5th job).
> 
> What set of objects?  Make does not tell distcc the list of targets that
> will be built or the -j level.  

I agree. What I was talking about would require some changes to make 
distcc a bit more clever (and not just a separate process).

> > > One can imagine a naive algorithm wasting a lot of time.
> > > 
> > > To turn the question around: why don't we just schedule the job on
> > > the nearby machine in the first place?
> > 
> > Because it may make configuring complex (define which machines are
> > nearby and which are not) and a configuration may be static while the
> > environment is changing a lot (read: I could be moving between offices).
> 
> But re-queueing the job on a nearby machine seems to require this same
> knowledge too.

No, I wouldn't requeue it to a nearby machine. I would resend the 
leftover jobs (the ones we're waiting on) to _any_ idle machine and kill 
the duplicates as soon as I get a result from one of them. Sometimes a 
waste of resources (though hardly a waste if those machines aren't being 
used for anything else).
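
To make that concrete, here is a rough sketch in Python (not distcc code; 
compile_on(), race() and the simulated delays are made-up stand-ins for a 
real remote compile). The same job is started on every idle host, the 
first result wins and the rest are cancelled:

```python
import asyncio

async def compile_on(host, job, delay):
    # Hypothetical stand-in for a remote compile; `delay` simulates
    # how long this host takes to return the object file.
    await asyncio.sleep(delay)
    return host, f"{job}.o"

async def race(job, hosts):
    # hosts: mapping of host name -> simulated delay in seconds.
    tasks = [asyncio.create_task(compile_on(h, job, d))
             for h, d in hosts.items()]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    # Kill off the duplicates as soon as one host has answered.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()

def run_race(job, hosts):
    return asyncio.run(race(job, hosts))
```

Calling run_race("foo", {"slow": 0.5, "fast": 0.01}) returns the result 
from "fast" and cancels the copy still running on "slow".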

> > If we could test the latency of the network and the 'power' of each
> > system before compiling and have some clever logic for deciding
> > which systems to use in what order, it would be even better. Because
> > there's no additional configuring involved and it would work with
> > dynamic set-ups.
> 
> I think I'd like it to gain information about the speed of machines as
> it goes along.  Perhaps that would allow it to sort the machines into a
> different order, or perhaps have stronger preferences between machines
> than it does at the moment.

Yeah, but you cannot learn this information from the files you send it, 
because they're not all equal. Sending over BogoMIPS (or processor info) 
might be a simple implementation that doesn't have a lot of overhead. 
The scheduling logic could then make some sense of that information.

Or you could have some test file that you send to a machine the first 
time, and cache the result for later use. (And maybe run a new test every 
100 runs to make sure that a server hasn't been replaced in the meantime.)
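
A minimal sketch of that caching idea, in Python. The class name and the 
benchmark callable are hypothetical; the callable is assumed to send the 
test file to a host and return how many seconds it took:

```python
class HostSpeedCache:
    """Cache one benchmark result per host, re-testing periodically."""

    def __init__(self, benchmark, retest_every=100):
        self.benchmark = benchmark        # assumed callable: host -> seconds
        self.retest_every = retest_every  # runs between two benchmarks
        self._entries = {}                # host -> [seconds, runs since test]

    def speed(self, host):
        entry = self._entries.get(host)
        # Benchmark on first use, and again once the cached value has
        # been used for `retest_every` runs.
        if entry is None or entry[1] >= self.retest_every:
            entry = [self.benchmark(host), 0]
            self._entries[host] = entry
        entry[1] += 1
        return entry[0]
```

With the default setting the benchmark runs on the 1st, 101st, 201st 
(and so on) request for a given host; everything in between is answered 
from the cache.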

And the latency could be measured by doing a ping/tcptraceroute and 
timing how long the server takes to answer?
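
One cheap approximation, sketched in Python: time how long it takes to 
open a TCP connection to the server. This is an assumption on my part, 
not a real ping or tcptraceroute, and connect_latency() is a made-up 
helper name:

```python
import socket
import time

def connect_latency(host, port, timeout=2.0):
    """Return the seconds taken to open a TCP connection, or None."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Unreachable, refused or timed out: no usable measurement.
        return None
```

A client could call this once per host (for example on the port the 
distcc daemon listens on) and prefer hosts with the smallest value. A 
single connect is a noisy measurement, so averaging a few samples would 
probably be wise.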

Just brainstorming...
--   dag wieers,  dag at wieers.com,  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]