[distcc] SSH encryption overhead

Martin Pool mbp at samba.org
Thu Sep 19 05:06:00 GMT 2002

On 18 Sep 2002, Aaron Lehmann <aaronl at vitelus.com> wrote:

> Speaking of connection overhead, are you using an optimized OpenSSL?

No, I'm using the stock Debian build.

Eventually I would like to try building large chunks of Debian using
distcc -- mostly because it might throw up some interesting problems,
but also because it might actually be useful to the project.  The
variety of weird things people do in makefiles is remarkable.  I
haven't ever tried building it before.

It would be nice if Debian could ship -mpentium packages for
some crucial parts, as I'm sure you realize.

> That sounds sane. I was looking for similar functionality in ssh
> itself the other day and was disappointed not to find it.

There is a program called fsh that tries to do connection hoarding,
for CVS in particular.  When I tried it (months ago), it was slow and
fragile.  If we put in a bit of support on both ends, so that a new
distccd doesn't need to be forked every time, then it should be easy
to do it internally.

> > Using lzo compression of the data first may (or may not) be a
> > performance win, by reducing the amount that needs to be compressed
>                                                            ^^^^^^^^^^ encrypted?
> > and hashed.

Sorry, yes, "encrypted".

> This would probably be a win on the receiving side, at least.
> http://www.oberhumer.com/opensource/lzo/ claims ~20 MB/sec
> decompression on a P133, and 40MB/sec would be enough to break even on
> my 1100MHz Celeron according to my benchmarks of openssl's blowfish
> cipher (assuming a 2:1 compression ratio, which I don't think is
> unreasonable on preprocessed C source).

There are some numbers for lzo in CVS.  Compression is less than half
as expensive as running cpp (to say nothing of cc), and it gets a bit
better than 2:1.

> I just tested it on said machine. It compresses about 42MB/sec and
> decompresses 120MB/sec. That's a huge win on the receiving side, but I
> need 80MB/sec to break even when sending (I'm continuing to assume a
> 2:1 ratio). So it wouldn't be a win in terms of CPU time, but
> considering the fact that my network transport can not handle
> 42MB/sec, LZO compression would probably be a good idea anyway (it
> would get data across the network much faster).
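
The break-even figures quoted above can be reproduced with a little
arithmetic.  This is only a sketch: the 40MB/sec cipher rate is an
assumption chosen because it yields the 80MB/sec quoted for sending at
2:1 (the actual Blowfish benchmark result is not given in this
thread), and break_even_rate is a made-up name.  Per megabyte of
uncompressed data, encrypting alone costs 1/cipher seconds, while
compressing first costs 1/comp + 1/(ratio * cipher).

```python
def break_even_rate(cipher_mb_s: float, ratio: float) -> float:
    """Compressor speed (MB/s of uncompressed data) at which
    compress-then-encrypt costs the same CPU time as encrypting raw data."""
    if ratio <= 1.0:
        raise ValueError("compression must shrink the data to ever break even")
    # Solve 1/cipher = 1/comp + 1/(ratio * cipher) for comp.
    return cipher_mb_s * ratio / (ratio - 1.0)


CIPHER_MB_S = 40.0  # assumed cipher throughput, not a measured figure
RATIO = 2.0         # the 2:1 ratio assumed in the thread

print(break_even_rate(CIPHER_MB_S, RATIO))  # 80.0
```

Under the same assumptions, measuring against the compressed stream
instead (half as many bytes at 2:1) gives a threshold of 40MB/sec,
which may be how the receive-side figure was counted.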

As you said, SSH will be slightly more expensive than just Blowfish,
also because the data has to pass from distcc to the

LZO is most interesting for slow networks, even with TCP.  Two
indicative scenarios are:

  slow laptop, slow (wireless?) network, fast server

  many remote CPUs

In both these cases, it is possible that the network will be the
limiting factor.  It would possibly be better for the client to
schedule no local compilations and do more work on compression,
because it will allow it to shift more of the work elsewhere.
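
A rough model of the network-bound case, as a sketch only: it assumes
compression and sending overlap perfectly, so the slower stage sets
the pace.  The 1MB/sec link speed is a hypothetical figure for a slow
wireless network, and pipeline_rate is an invented name; the 42MB/sec
and 2:1 figures are the ones quoted earlier in this message.

```python
def pipeline_rate(network_mb_s: float, compress_mb_s: float, ratio: float) -> float:
    """Uncompressed MB/s delivered when compressing and sending overlap.

    The link carries network_mb_s of compressed data, which is worth
    network_mb_s * ratio of original data; the compressor caps the rest.
    """
    return min(compress_mb_s, network_mb_s * ratio)


NETWORK_MB_S = 1.0    # hypothetical slow wireless link
COMPRESS_MB_S = 42.0  # LZO compression rate quoted in the thread
RATIO = 2.0           # assumed 2:1 on preprocessed C

# With a slow link the network is still the bottleneck, but it now
# moves twice as much source per second.
print(pipeline_rate(NETWORK_MB_S, COMPRESS_MB_S, RATIO))  # 2.0
```

Once the link is faster than half the compressor's rate, the
compressor becomes the limit and the extra CPU time buys nothing.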

If you consider an infinite supply of volunteer machines, then the
challenge is for the client to balance up its own CPU usage (for cpp
and lzo) and network bandwidth to dispatch as many jobs as it can.  By
changing the compression level depending on cpu load, it might keep
both close to 100% (though actually doing that would probably be

It may be that the cost of LZO is not so much the compression itself,
but rather that it will prevent us from using sendfile(), and we'll
therefore take more context switches while sending.  (But they're
pretty cheap, so who can tell?)
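
To illustrate the difference between the two send paths, here is a
sketch in Python rather than distcc's own C.  It is not distcc code:
send_raw and send_compressed are invented names, and zlib stands in
for LZO only because it ships with Python.  The raw path hands the
file to the kernel in one call; the compressed path has to read each
block into user space, compress it, and write it out again.

```python
import socket
import zlib


def send_raw(sock: socket.socket, path: str) -> int:
    """Send a file unmodified; socket.sendfile uses os.sendfile where it can."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def send_compressed(sock: socket.socket, path: str, chunk: int = 64 * 1024) -> int:
    """Send a file compressed; every block passes through user space."""
    comp = zlib.compressobj(1)  # fastest level; zlib is a stand-in for LZO
    sent = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            out = comp.compress(block)
            if out:
                sock.sendall(out)
                sent += len(out)
    tail = comp.flush()  # emit whatever the compressor is still holding
    sock.sendall(tail)
    sent += len(tail)
    return sent
```

Both functions return the number of bytes put on the wire, so the two
paths can be compared on the same input file.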

> I think blowfish would be a better choice of cipher. AES is arguably
> weaker in light of recent cryptanalysis, and OpenSSL provides an
> assembly-optimized implementation of blowfish (which I believe is used
> by OpenSSH). AES is implemented by rijndael.c in the OpenSSH
> distribution and does not seem to be optimized below the C level.

I'm just using the Debian default.


This debate is about more than choosing a method of developing software.
	-- Microsoft Corporation, FAQ Regarding Shared Source
