[distcc] distcc with LVS and stuff

Wed Aug 28 03:14:01 GMT 2002

Hi,

I'm about to try deploying distcc in the build environment where I work. 
A description of the system we use at the moment may be of some use, so 
I'll start with that:

A central server (big, reasonably quick disks, backups etc) hosts all of 
the developer home directories, and serves them via NFS.  gcc is wrapped 
by a perl script, which makes an ssh connection (on a specific port) to 
a machine which runs LVS and load balances these ssh connections across 
various boxes - mainly developer desktop machines.  LVS load 
balances the TCP connections (with low latency), and farms them out to 
machines on a weighted round-robin basis (e.g. a dual-CPU Athlon has a 
higher weight than a P3-700).  The home directories are shared to all 
the build boxes via NFS, and the clocks are kept in sync with NTP.  If 
machines fall ill, this is noticed by the load balancer (using 'mon', 
and a couple more perl scripts), and the machine is temporarily 
withdrawn from the load balanced pool.
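
For anyone unfamiliar with weighted round-robin, here is a rough sketch 
of the idea in Python.  It is modelled on the interleaved scheme I 
understand the LVS project to describe, but it is not LVS's actual 
code.  The server names and weights are made up for illustration, and 
weights are assumed to be positive integers.

```python
from functools import reduce
from itertools import islice
from math import gcd


def weighted_round_robin(servers):
    """Yield server names forever, in proportion to their weights.

    servers is a list of (name, weight) pairs; weights are assumed to be
    positive integers (hypothetical values, not taken from a real pool).
    """
    weights = [weight for _, weight in servers]
    step = reduce(gcd, weights)   # how far the threshold drops per pass
    max_weight = max(weights)
    index = -1
    threshold = 0
    while True:
        index = (index + 1) % len(servers)
        if index == 0:
            # Start of a new pass over the list: lower the threshold,
            # wrapping back to the highest weight once it reaches zero.
            threshold -= step
            if threshold <= 0:
                threshold = max_weight
        name, weight = servers[index]
        if weight >= threshold:
            yield name


def first_picks(servers, count):
    """Return the first count scheduling decisions as a list."""
    return list(islice(weighted_round_robin(servers), count))
```

With a pool such as [("athlon", 2), ("p3", 1)], the heavier box should 
be handed roughly twice as many connections as the lighter one over 
time.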

http://www.linuxvirtualserver.org/

http://www.kernel.org/software/mon/

Bad points with the current setup:

. Central server has become a bottleneck, as the number of developers 
has increased
. NFS makes it a bit brittle (but then mon takes care of this)
. SSH connection overhead is not insignificant (could be swapped for rsh 
with little reduction in security, as NFS sucks anyway); however, it is 
much better with blowfish than with the default 3des!
. Have to make sure all boxes have the same headers etc. installed 
(actually pretty easy, we use Debian, and our own devel package in our 
apt repository pulls in all necessary stuff automagically)

The big plan:

. Alter our build scheme to put all object files on developers' local 
disks, reading the source files over NFS, thus taking the make (we use 
our own perl based make scheme which parallelises very well, but can 
chew CPU a bit), link, and dep steps off the central box
. Use distcc with LVS to gain global (i.e. from many client boxes, and 
different users) load balancing, and weighting to servers (most servers 
are dual CPU, and the range of speeds varies quite a lot) - I've tested 
this briefly, and it seems to work OK
. Whack all insecure traffic (including NFS and distcc) on a separate 
VLAN to keep it away from less trusted hosts
. Buy one of these http://www.tyan.com/products/html/thundergche.html 
for the central server if it all still sucks ;o)

Here are a couple of things I'm uncertain about - any feedback would be 
cool:

. Are there likely to be any problems with the use of LVS?
. We also use ccache, and I'm concerned about leaving the cache 
directory on NFS - is this likely to cause problems with locking?  I'm 
also wondering how soon this will become the bottleneck - any 
thoughts on a way of having a multi-level cache with additional caches 
on the disks of the local build machines?
. Is there any easy way to get the distcc server to block, rather than 
taking on extra jobs over a certain number?  We compile some C++ source 
files which make g++ chew RAM quite heavily.  All the build boxes have at 
least 512M, but they can start grinding when simultaneously compiling 4 
big files and running X etc - maybe there are inetd replacements that 
have this functionality already?
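
To make the last question concrete, this is the sort of behaviour I 
mean, sketched in Python.  It is not an existing distcc or inetd 
feature as far as I know; JobGate and peak_concurrency are hypothetical 
names, and the sleep stands in for a real compile.  Callers beyond the 
limit simply wait for a free slot rather than starting another job.

```python
import threading
import time


class JobGate:
    """Let at most max_jobs callables run at once; extra callers block."""

    def __init__(self, max_jobs):
        self._slots = threading.BoundedSemaphore(max_jobs)

    def run(self, job, *args):
        # Blocks here until a slot is free, instead of piling on more work.
        with self._slots:
            return job(*args)


def peak_concurrency(max_jobs, n_jobs):
    """Push n_jobs fake compiles through a gate; return the peak seen running."""
    gate = JobGate(max_jobs)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def fake_compile():
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for a memory-hungry g++ run
        with lock:
            state["running"] -= 1

    threads = [
        threading.Thread(target=gate.run, args=(fake_compile,))
        for _ in range(n_jobs)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]
```

For a 512M box, a limit of 2 or 3 would probably be the sort of number 
to try, though that is a guess rather than something I've measured.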

As an aside, it seems to me that it wouldn't be very much work to get 
distcc working with "fsh" http://www.lysator.liu.se/fsh/ to make it a 
lot more secure without much overhead?  Of course, this isn't of any use 
for us as it would pretty much break the LVS load balancing.

Phew,

Tim.