[distcc] Re: keeping localhost busy

Martin Pool mbp at samba.org
Tue Sep 30 03:30:55 GMT 2003

On 29 Sep 2003, Jeff <rustysawdust at yahoo.com> wrote:
> Martin Pool wrote:
> >Maybe you should build more than one library at a time, so that even
> >when reaching the end of any one build, there are several jobs that
> >can be run?
> Due to library dependencies, this isn't always easy. For instance, you 
> might have a library which contains Qt ".ui" files that need to have "uic" 
> run on them to generate headers that other libraries need to include. I 
> suppose you could change your make system to do a "make uic" run before the 
> "make install". 

Yes, of course you need to build the libraries in a way that respects
their dependencies.  If there is a strictly linear chain of
dependencies then you are out of luck.

I don't know about KDE, but for GNOME there are several underlying
libraries which could be built in parallel.  For example (from memory)
libxml and libgtk both depend on glib, but they do not depend on each
other.
> However I think that part of the beauty of distcc is that 
> with most source code you don't have to change a thing.

Well, this also would not require a change other than running make in
more than one directory at once.  It should be reasonably easy to add
to something like GARNOME, GAR or Portage.
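As a sketch of that idea (the library names and the dependency graph
here are illustrative, not GNOME's real one; building is simulated
rather than invoking make):

```python
import concurrent.futures

# Hypothetical dependency graph: libxml and libgtk both need glib,
# but are independent of each other, so they can build concurrently.
DEPS = {"glib": [], "libxml": ["glib"], "libgtk": ["glib"]}

def build(lib):
    # Stand-in for "make -C <lib>"; a real driver would run make
    # (with CC=distcc and -j) in the library's directory here.
    return f"built {lib}"

def build_all(deps):
    done, log = set(), []
    while len(done) < len(deps):
        # Everything whose dependencies are all satisfied is ready,
        # and the ready set can be built in parallel as one wave.
        ready = [l for l in deps if l not in done
                 and all(d in done for d in deps[l])]
        with concurrent.futures.ThreadPoolExecutor() as pool:
            log.extend(pool.map(build, ready))
        done.update(ready)
    return log

print(build_all(DEPS))  # glib first, then libxml and libgtk together
```

A port wrapper like GARNOME already knows this graph, so it would only
need to launch the ready waves concurrently instead of one at a time.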

> >You don't need to look at the source.  Just tell me if you can
> >think of a better algorithm.  Once you can describe it in English
> >or pseudocode translating into C is fairly simple.
> Since this problem only happens at the end of a build, I was thinking that 
> everything would run normally until all the jobs had been assigned to a 
> machine. After that point, as soon as localhost becomes idle give it one of 
> the jobs already assigned to one of the remote machines, and then take the 
> work of whoever completes it first.
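That race-the-stragglers idea can be sketched like this (fake timed
jobs stand in for real compiles, and the losing duplicate is simply
abandoned rather than killed):

```python
import concurrent.futures
import time

def compile_job(host, delay):
    # Stand-in for running the compiler; delay models the host's speed.
    time.sleep(delay)
    return f"{host} finished"

def race(local_delay, remote_delay):
    # Start the same job on the busy remote host and on the idle
    # localhost, and take whichever completes first; the slower
    # duplicate's result is just discarded.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(compile_job, "remote", remote_delay),
                   pool.submit(compile_job, "localhost", local_delay)]
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        return next(iter(done)).result()

print(race(local_delay=0.01, remote_delay=0.5))
```

Here localhost wins the race because it is faster, which matches the
"slow farm, fast desktop" case described above.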

It's not a bad idea.  

How would you know when localhost was idle?  You might go off the
number of distcc processes running, which is how we assign jobs at the
moment.

But how do you know that it is not e.g. generating documentation or
linking something, both of which are pretty common activities at the
end of a build?
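One rough way to approximate "localhost is idle", assuming a Unix
host, is the load average; but as just noted, this heuristic cannot
tell true idleness from a pause before a link or a documentation run:

```python
import os

def localhost_looks_idle(threshold=1.0):
    # Treat a 1-minute load average below the threshold as "probably
    # idle".  This is crude: it lags real activity by tens of seconds
    # and says nothing about what make is about to do next.
    load1, _, _ = os.getloadavg()
    return load1 < threshold
```

The threshold would need tuning per machine (e.g. per CPU count), and
even then a low load only means nothing is running *right now*.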

> >Timothee suggested killing the job on B and re-running it on
> >localhost, but for at least this case it would be wasteful because B
> >is as fast as localhost.  For C++ code, transit time is relatively
> >small.
> My suggestion is different in that you wouldn't kill the remote job. I 
> think the other thread already established that estimating how long it will 
> take machine B to finish its job is very complex. I think there are 
> probably many people using distcc who have a localhost which is relatively 
> fast, but a farm which has obsolete, hand-me-down, or donated machines. In 
> that case, I think it's likely you'll end up waiting for one or two of your 
> slowest machines at the end of a build. In the algorithm I described above, 
> the worst that might happen is that you startup a job on localhost, the 
> remote host finishes first, and then you have to cancel the local job. But 
> considering the localhost was idle anyway (probably just sitting around 
> waiting for the last few objects to create a library), this isn't a big 
> deal.

*If* you know it's idle.

And it's not just CPU time; it's possible that starting another job
locally will use up VM and cause e.g. Make or some distcc clients to
be swapped out when they otherwise would not be.

> The only case I can think of that this might be an issue is when your 
> localhost is part of the farm, and other people are actively using
> it.

Or you're actually using the machine yourself.

