[distcc] distcc scalability with # of users?
Daniel Kegel
dkegel at google.com
Sat Apr 17 01:42:13 GMT 2004
Dan Kegel wrote:
> I'm interested in using distcc in a group of 16 programmers
> where all 16 workstations are equally fast, all 16
> workstations are both compile servers and compile clients,
> and all access a shared copy of distcc over NFS or SMB.
> Several potential problems come to mind when thinking about this:
>
> 1. Since the list of hosts read from $prefix/etc/distcc/hosts is
> the same for all workstations, every workstation will
> issue large compile jobs to itself sometimes even though it'd be better
> off only handling preprocessing and linking (right?)
We couldn't demonstrate this with our little synthetic benchmark.
(Maybe it happens in the real world; no idea.)
> 2. Distcc won't currently check the load average of each compile server,
> so workstations busy with non-distcc jobs will get slammed with
> distcc jobs, negatively impacting normal use of the workstations.
>
> 3. If more than one user is issuing distcc jobs, their distcc's
> will sometimes issue jobs to the same machine by chance
> (fairly often, if distcc assigns jobs in order of the etc/distcc/hosts
> file).
We did verify these two just now using a trivial synthetic benchmark.
It'd probably be 'easy' to make the distcc server check the load
average, and drop the connection if it was over some configurable threshold.
As dparent.c says,
* @todo Quite soon we need load management. Basically when we think
* we're "too busy" we should stop accepting connections. This could
* be because of the load average, or because too many jobs are
* running, or perhaps just because of a signal from the administrator
* of this machine.
So maybe we'll try to implement this todo note.
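For what it's worth, here is a rough sketch of the kind of check the todo note describes. It is not distcc code: the function names load_exceeds() and too_busy() and the max_load parameter are made up for illustration, and it assumes a platform that provides getloadavg() in <stdlib.h> (glibc and the BSDs do). The daemon would call too_busy() right after accept() and simply close the socket if it returns nonzero.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare getloadavg() */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: 1 if the given 1-minute load average is
 * strictly above the configured threshold, 0 otherwise. */
int load_exceeds(double load1, double max_load)
{
    return load1 > max_load;
}

/* Hypothetical helper: 1 if the server should refuse new jobs.
 * max_load stands in for the "configurable threshold". */
int too_busy(double max_load)
{
    double avg[1];

    assert(max_load > 0.0);   /* a non-positive threshold is a caller bug */

    /* getloadavg() returns the number of samples it filled in, or -1.
     * If the load is unavailable, err on the side of accepting work. */
    if (getloadavg(avg, 1) < 1)
        return 0;

    return load_exceeds(avg[0], max_load);
}
```

A client whose connection is dropped this way would have to fall back to another host or to a local compile, so the threshold probably wants to sit a bit above the number of CPUs.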
- Dan