[distcc] Suggestion about host selection

Thomas Schürger thomas at schuerger.com
Thu Jun 19 18:57:23 GMT 2008


> If this was the way it was done, it'll lead to poor utilization of 
> servers in some situations: the number of concurrent jobs accepted at 
> the servers is 2 greater than their number of CPUs. So, the client would 
> fill the first server with more jobs than it can handle at the same time 
> before even considering the second server. (Remember that the slot 
> mechanism on the client does not take into account which servers other 
> clients have reserved.)

It would be fine with me if the current slot selection remained the
default, but it should also be possible to choose the other slot
selection if the user wants it.

> On the other hand, the statement 'prefers hosts towards the start of the 
> list' is very much true in the aggregate when you have multiple 
> concurrent clients using the servers!  Then you should consider using 
> the --randomize flag, which probably should have been the default 
> setting anyway.

Where is that flag? Randomized selection sounds good. What about
using an exponential distribution, which prefers slots towards the
start of the list? That would be easy to implement.

> The major omission in the current code, in my opinion, is that 
> randomization does not take into account the specified host slots.

OK, that would be something to change then.

It would be fine if one could list a host multiple times (which would
emulate the behavior I was looking for). This is currently not possible.

For example, I could choose to use

host1/1 host1/1 host1/1 host1/1 host1/1 host1/1 host2/1 host2/1 host3/1 
host3/1 host3/1

which would give me what I wanted. But with the current selection
algorithm, every one of these entries would get the same slot number
(all 1), so when host1/1 is locked, distcc would try the second
host1/1 entry, which of course is also locked (same lockfile name).
So in practice this is really the same as "host1/1 host2/1 host3/1".
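
To illustrate the collision, here is a minimal Python sketch. It is not
distcc code: lock_name() and its naming scheme are made up for the
example, and the parser assumes only plain "host/count" entries. The
only property it relies on is the one described above, namely that the
lock is determined by the host name and the slot number alone.

```python
def lock_name(host, slot):
    # Hypothetical naming scheme; the real distcc lock file names may differ.
    return f"cpu_{host}_{slot}"


def distinct_locks(hostlist):
    """Return the distinct lock names a 'host/count' list maps to.

    Every entry numbers its slots from 1 again, so a repeated entry
    produces lock names that already exist.
    """
    locks = []
    for entry in hostlist.split():
        host, count = entry.split("/")
        for slot in range(1, int(count) + 1):
            name = lock_name(host, slot)
            if name not in locks:
                locks.append(name)
    return locks


repeated = ("host1/1 host1/1 host1/1 host1/1 host1/1 host1/1 "
            "host2/1 host2/1 host3/1 host3/1 host3/1")
short = "host1/1 host2/1 host3/1"
```

Under these assumptions, distinct_locks(repeated) and
distinct_locks(short) return the same three names, which is why
repeating a host does not currently give it more weight.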

The easiest way to get a better selection implementation would be to
first expand the host/slotcount list into a list of host/slotnumber
pairs and then select

a) linearly from the front
b) with exponential random distribution
c) with uniform random distribution
d) ... whatever else may seem appropriate
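
The expansion and options a) to c) could be sketched roughly as follows.
This is Python for brevity, not a patch against distcc; all function
names, the is_free callback and the mean parameter are assumptions made
for the example. The parser handles only plain "host/count" entries, and
a per-host counter keeps numbering across repeated entries so that
listing a host twice yields distinct slots.

```python
import random


def expand(hostlist):
    """Turn 'host/count' entries into (host, slot_number) pairs."""
    next_slot = {}          # last slot number handed out per host
    pairs = []
    for entry in hostlist.split():
        host, count = entry.split("/")
        for _ in range(int(count)):
            next_slot[host] = next_slot.get(host, 0) + 1
            pairs.append((host, next_slot[host]))
    return pairs


def pick_linear(n, rng):
    # a) always take the first candidate
    return 0


def pick_exponential(n, rng, mean=2.0):
    # b) exponentially distributed index, so early entries are preferred;
    #    draws that fall past the end of the list are simply retried
    while True:
        i = int(rng.expovariate(1.0 / mean))
        if i < n:
            return i


def pick_uniform(n, rng):
    # c) every candidate is equally likely
    return rng.randrange(n)


def select(pairs, is_free, pick, rng):
    """Pick one free (host, slot) pair, or None if all are taken."""
    candidates = [p for p in pairs if is_free(p)]
    if not candidates:
        return None
    return candidates[pick(len(candidates), rng)]
```

With the example list from above, expand() yields host1 with slots 1 to
6, host2 with slots 1 to 2 and host3 with slots 1 to 3. A caller would
then do something like select(pairs, is_free, pick_exponential,
random.Random()), where is_free stands in for whatever lock check the
client performs.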


Greetings,
Thomas.


