[distcc] interesting survey results (fwd)

Tue Mar 11 06:49:59 GMT 2003

> 0. What version of distcc are you using?

distcc-1.1

> 1. Your name and email address:

Tobias Klausmann (klausman-distcc<snail>schwarzvogel<spot>de)

> 2. OK to publish this?  Yes/no/yes, but anonymously:

Yes, just make the Email-Address spambot-unfriendly (if you need
to quote it at all).

> 3. Your codebase: lines of code (by wc -l), and language:
Linux 2.4.21-pre4
240139 lines of code total in about 10000 files; the code is
rarely compiled in its entirety, as it is portable across more
than a dozen targets. Compiler is gcc 3.2.

> 4. Your machines: number, OS, processor, memory, network connectivity:
Two machines, one 1.3GHz Athlon (512M RAM), one dual-P2/400 (768M
of RAM), network is switched 100MBit FDX.

> 5. Time to compile, with and without distcc:
Measurements were taken with a hot cache, with the source hosted
on the Athlon and both machines largely idle (apart from distcc).
Each run is a full build; everything is remade.

Or, in shell words:

make bzImage;make clean # to fill the cache
for i in $(seq 1 12); do
	date +%s>>d
	make -j$i bzImage
	date +%s>>d
	make clean
done
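
The loop only records raw timestamps in the file d. A minimal
sketch of how they could be turned into per-run times, assuming d
holds exactly two lines per run (start, then end); the function
name durations is made up for illustration and is not part of the
original script:

```shell
# Pair up the start/end timestamps written by the benchmark loop and
# print "<run>: <seconds>" for each run.
# Assumption: the file has exactly two lines per run, start then end.
durations() {
	awk 'NR % 2 == 1 { start = $1; next }
	     { printf "%d: %d\n", NR / 2, $1 - start }' "$1"
}
```

Calling durations d after the loop has finished would print one
line per -j value, in the same "N: seconds" layout as the results.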

Here are the results:
1: 237
2: 189
3: 189
4: 160
5: 162
6: 156
7: 156
8: 156
9: 156
10: 156
11: 156
12: 157

The consistency down to the second is amazing. But the general
rule of using 2*N_CPU jobs seems to hold here, too. If I find the
time (and my boss lets me), I'll test it with a larger set of
dual-P3s at work (we host ~600 Compaq DL-360s, of which at least
some might be usable for this; or maybe I'll use spare machines).

> 6. Any other observations:
The steepest speedup is (unsurprisingly) going from -j1 to -j2.
After that, the overhead (both from distcc itself and from context
switches because n_jobs > cpus) seems to eat up more and more of
the gain. The sweet spot seems to be around 6, depending on how
parallelizable the compile is. Of course, partial builds gain
next to nothing, as changes are mostly in one spot, where -jN
hardly parallelizes at all. For comparison, here are the values
for n=1..4 *without* distcc on the Athlon:

1: 236
2: 237
3: 237
4: 236
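
To put the two tables side by side, the speedup is simply the
local time divided by the distcc time. A small sketch (the helper
name speedup is hypothetical, chosen here for illustration):

```shell
# speedup BASELINE_SECONDS DISTCC_SECONDS
# Prints the ratio baseline/distcc with two decimals.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
speedup() {
	LC_ALL=C awk -v base="$1" -v t="$2" \
		'BEGIN { printf "%.2f\n", base / t }'
}
```

With the numbers above, speedup 236 156 prints 1.51, i.e. the
best distcc run is roughly one and a half times as fast as the
Athlon alone.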

Usually, I'd expect -j2 to reduce the effect of I/O latency.
With hot caches, that latency is virtually nonexistent. I guess
warming the caches with a tailored find -name '*.[ch]'... might be
worth the hassle if one wants to compile a kernel, as the find
approximates a sequential read. But that is stuff for another
benchmarking session ;)
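
The find command above is left unfinished in the original. One
possible way to complete the idea, purely as an assumption, is to
read every source and header file once and throw the output away;
the function name warm_cache and the directory argument are
invented for this sketch:

```shell
# Read every .c and .h file below a directory once so that later
# compiler reads are served from the page cache.
# Assumption: this is only one plausible completion of the elided
# find command; the directory defaults to the current one.
warm_cache() {
	find "${1:-.}" -type f -name '*.[ch]' -exec cat {} + > /dev/null
}
```

Run as warm_cache /path/to/kernel/source before the first timed
build. Whether this actually beats a throwaway build for filling
the cache is untested here.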

> Thanks!
> Martin

I have to thank you for distcc :)
Tobias

----- End forwarded message -----
-- 
Martin