[distcc] SSH encryption overhead

Martin Pool mbp at samba.org
Thu Sep 19 03:33:00 GMT 2002


On 18 Sep 2002, Aaron Lehmann <aaronl at vitelus.com> wrote:
> [distcc faq]
> > However, ssh encryption is pretty expensive. Rough measurements show
> > that transferring a file across ssh takes CPU time comparable to
> > compiling it, which would make it fairly impractical for distcc.

Here's what I did.  I took a typical .i file, from the compilation of
Mozilla 1.1 (which works, by the way).

  % ls -l cppout_0000003401.i
  -rw-r--r--    1 mbp      mbp        705544 2002-09-19 13:06 cppout_0000003401.i
  % time ssh nevada 'cat >/dev/null' < cppout_0000003401.i 
  ssh nevada 'cat >/dev/null' < cppout_0000003401.i  0.32s user 0.01s system 39% cpu 0.827 total
  % time gcc -c cppout_0000003401.i
  gcc -c cppout_0000003401.i  0.84s user 0.02s system 43% cpu 1.971 total

So CPU usage just to ship the file across the network is 38% of
compiling it (0.33s vs 0.86s user+system).  If the same amount of
effort were expended on both sides (probably not), plus some more to
send back the .o, then this would be pretty marginal.

However, there is presumably a nearly constant cost to set up an SSH
connection.  This can be estimated with

  % time ssh nevada /bin/true
  ssh nevada /bin/true  0.20s user 0.00s system 34% cpu 0.582 total

If we hold the connection open, we might avoid that cost, which makes
things a bit better.

I realized that I have "Compression yes" in my .ssh/config, because
most of the machines are remote, though nevada is not.  With
compression off, things are rather better:

  % time ssh  -o 'Compression no' nevada 'cat >/dev/null' < cppout_0000003401.i
  ssh -o 'Compression no' nevada 'cat >/dev/null' < cppout_0000003401.i  0.25s user 0.01s system 18% cpu 1.418 total

Naively subtracting the connection overhead from this suggests that
we can send the file in about 0.05 user CPU seconds (0.25s - 0.20s),
which is pretty reasonable.
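
For what it's worth, compression can also be turned off for just this
host in ~/.ssh/config rather than with -o on each invocation:

  Host nevada
      Compression no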

As somebody pointed out recently, port forwarding is not secure (not
least because distccd still accepts direct, unencrypted connections),
but perhaps it will do for measuring the overhead.

  % ssh  -f -L 4201:localhost:4200 -o 'Compression no' nevada sleep 3600 

  % DISTCC_HOSTS=nevada ; time ./src/distcc -c ./cppout_0000003401.i
  ./src/distcc -c ./cppout_0000003401.i  0.00s user 0.00s system 0% cpu 1.331 total

  % DISTCC_HOSTS=127.0.0.1:4201 ; time ./src/distcc -c ./cppout_0000003401.i
  ./src/distcc -c ./cppout_0000003401.i  0.00s user 0.00s system 0% cpu 1.369 total
  % DISTCC_HOSTS=127.0.0.1:4201 ; time ./src/distcc -c ./cppout_0000003401.i
  ./src/distcc -c ./cppout_0000003401.i  0.00s user 0.01s system 0% cpu 1.318 total
  % DISTCC_HOSTS=127.0.0.1:4201 ; time ./src/distcc -c ./cppout_0000003401.i
  ./src/distcc -c ./cppout_0000003401.i  0.00s user 0.01s system 0% cpu 1.290 total

So it might seem that there is no significant impact on elapsed time
on the client from using an SSH tunnel.

If we hold open a connection using a long-running client-side
process, then performance might be comparable to port forwarding, but
it will be more secure.

What I have in mind is that SSH connections will go via a little
process that listens on a Unix-domain socket in the distcc directory.
Unix permissions will make sure that the socket can only be accessed
by the relevant user.  It accepts connections one at a time and
forwards them across an ssh channel to a server on the other end.

This process is automatically created by distcc if it's not there the
first time it is needed, by forking off and going into a special
mode.  (It needs protection against two clients trying to create it
at the same time.)  At the other end of the SSH connection, it needs
to run something like "distccd -i", but in a mode that will accept
multiple jobs over a single connection.
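
To make that concrete, here is a rough, untested sketch of the
listening side of that little process.  The socket path is made up
and the relay step is elided; this is not real distcc code.

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <sys/stat.h>
  #include <sys/un.h>

  int main(void)
  {
      /* hypothetical path; real code would build it per-user */
      const char *path = "/home/mbp/.distcc/ssh-socket";
      struct sockaddr_un sa;
      int fd, conn;

      fd = socket(AF_UNIX, SOCK_STREAM, 0);
      if (fd == -1) { perror("socket"); return 1; }

      memset(&sa, 0, sizeof sa);
      sa.sun_family = AF_UNIX;
      strncpy(sa.sun_path, path, sizeof sa.sun_path - 1);
      unlink(path);               /* remove any stale socket */
      umask(077);                 /* socket is created owner-only */

      if (bind(fd, (struct sockaddr *) &sa, sizeof sa) == -1
          || listen(fd, 1) == -1) {
          perror(path);
          return 1;
      }

      for (;;) {
          conn = accept(fd, NULL, NULL);
          if (conn == -1) { perror("accept"); continue; }
          /* ... copy bytes between conn and the ssh child's
             stdin/stdout until EOF, then wait for the next job ... */
          close(conn);
      }
  }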

Using lzo compression on the data first may (or may not) be a
performance win, by reducing the amount that ssh needs to encrypt and
hash.
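
For reference, a minimal compression call with the miniLZO subset of
lzo would look something like this (assuming minilzo.h from the LZO
distribution; the input here is just dummy data):

  #include <stdio.h>
  #include <string.h>
  #include "minilzo.h"

  int main(void)
  {
      static unsigned char wrkmem[LZO1X_1_MEM_COMPRESS];
      static unsigned char in[65536];
      /* worst-case output size documented by LZO */
      static unsigned char out[65536 + 65536 / 16 + 64 + 3];
      lzo_uint out_len;

      memset(in, 'x', sizeof in);      /* stand-in for .i data */

      if (lzo_init() != LZO_E_OK)
          return 1;
      if (lzo1x_1_compress(in, sizeof in, out, &out_len,
                           wrkmem) != LZO_E_OK)
          return 1;
      printf("%lu -> %lu bytes\n", (unsigned long) sizeof in,
             (unsigned long) out_len);
      return 0;
  }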
 
ssh -v for these machines seems to show that it is using

  debug1: kex: server->client aes128-cbc hmac-md5 none
  debug1: kex: client->server aes128-cbc hmac-md5 none

Thanks for prompting me to look at it again.  I'll update the FAQ.

Having said all that, I think I would like to fix some of the little
argument parsing nits, finish the test case, add load limiting, and
then make a 1.0 release.

> I was pondering recommending a compromise where distcc would use ssh to
> transmit commands, but actual transfer of data would occur over
> unsecured sockets. The benefit is that this system removes the "any
> command can be run through a simple network connection" issue. I've come
> to think that this is a bad idea, because if you're on an untrusted
> network you can't trust unsecured data transfer anyway. Source code
> and the resulting binary aren't usually very private, but SSH also
> protects the integrity of data that travels over it. Without some
> cryptographic integrity verification, a malicious user could modify
> either the code on its way to compilation or the object code on the way
> back to the client. This would be a great way to break into any computer
> which runs the binary that distcc orchestrated the compilation of.

I don't want to get into doing half-hearted security.  If nothing
else it will greatly increase the distcc maintenance burden, by
making it a possible point of security compromise.  If instead it
depends on the transport being secure (either an isolated LAN, or
ssh), then things are much simpler.

There is still the possibility of security problems such as tmpfile
races allowing local attacks, but it's a smaller exposure.
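
The usual defence there is to create temporaries with mkstemp(),
which picks an unpredictable name and opens the file exclusively.  A
minimal illustration:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
      /* mkstemp() fills in the XXXXXX and creates the file O_EXCL,
         so a local attacker can't pre-create or symlink it */
      char path[] = "/tmp/distcc_XXXXXX";
      int fd = mkstemp(path);

      if (fd == -1) {
          perror("mkstemp");
          return 1;
      }
      printf("using %s\n", path);
      close(fd);
      unlink(path);
      return 0;
  }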

If the source goes across a hostile network without integrity checks,
then people might try tricks like

 #include "/etc/passwd"

Cheers,
-- 
Martin 

Start your own revolution and cut out the middle man.
		-- Billy Bragg


