[distcc] LZO compression

Martin Pool mbp at samba.org
Mon Dec 16 00:54:01 GMT 2002


On 13 Dec 2002, Stephen White <swhite at decisionsoft.com> wrote:
> My company does a fair amount of work in C++, preprocessed C++ source
> files can be quite large and bits of our internal network are not
> terribly fast .. so I thought I'd look at implementing the LZO
> compression @todo in the source code.

Well done!

Some comments: (You're not obliged to do all these things of course.
This is a good start.)

Compression obviously needs to be exercised by the test cases.

Rather than a DISTCC_USE_LZO variable, I think I'd prefer a
DISTCC_COMPRESS, so that it can be used to name different schemes in
the future.  (Perhaps sub-algorithms of LZO to get different
speed/space tradeoffs, perhaps gzip in extreme cases...)
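
To make the idea concrete, here is a minimal sketch of how a DISTCC_COMPRESS
variable might select a scheme by name.  The enum, the function names
dcc_parse_compress and dcc_compress_from_env, and the accepted strings
("none", "lzo", "gzip") are all assumptions for illustration; none of them
come from distcc or from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical scheme identifiers; not taken from distcc. */
enum dcc_compress {
    DCC_COMPRESS_UNKNOWN = -1,
    DCC_COMPRESS_NONE = 0,
    DCC_COMPRESS_LZO,
    DCC_COMPRESS_GZIP
};

/* Map a scheme name to an identifier.  NULL or "" means no compression;
 * an unrecognised name is reported rather than silently ignored. */
enum dcc_compress dcc_parse_compress(const char *name)
{
    if (name == NULL || *name == '\0' || strcmp(name, "none") == 0)
        return DCC_COMPRESS_NONE;
    if (strcmp(name, "lzo") == 0)
        return DCC_COMPRESS_LZO;
    if (strcmp(name, "gzip") == 0)
        return DCC_COMPRESS_GZIP;
    return DCC_COMPRESS_UNKNOWN;
}

/* Read the (hypothetical) DISTCC_COMPRESS variable from the environment. */
enum dcc_compress dcc_compress_from_env(void)
{
    return dcc_parse_compress(getenv("DISTCC_COMPRESS"));
}
```

Naming the scheme, rather than using an on/off switch, means a later
sub-algorithm only needs one more string in the table.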

This might be a good time to add a message digest check to the
protocol to protect against transmission errors.  That might allow use
of the fast/dangerous LZO decompressor, although doing so would depend
on calculating and knowing the digest before sending the body, which
conflicts with the idea of not loading it all into memory.
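
The tension can be seen in a small sketch of a digest that is updated
chunk by chunk.  FNV-1a is used here purely as a stand-in for a real
message digest, and the dcc_digest_* names are hypothetical.  Because the
state is carried between calls, the sender never needs the whole body in
memory, but the final value is only known after the last chunk, so it
could only be sent after the body and not ahead of it.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a constants; a simple stand-in, not a real message digest. */
#define DCC_FNV_OFFSET 2166136261u
#define DCC_FNV_PRIME  16777619u

/* Start a new running digest. */
uint32_t dcc_digest_init(void)
{
    return DCC_FNV_OFFSET;
}

/* Fold one chunk into the running digest and return the new state.
 * Feeding the chunks in order gives the same result as feeding the
 * whole body at once. */
uint32_t dcc_digest_update(uint32_t state, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    for (i = 0; i < len; i++) {
        state ^= p[i];
        state *= DCC_FNV_PRIME;
    }
    return state;
}
```

A receiver using this shape can only verify the body once it has
consumed all of it, which is too late to guard an unchecked decompressor.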

I'd like to be able to link against the system's liblzo if it has one,
to save (a trivial amount of) memory and disk, but mostly because it's
just cleaner.  There ought to be a --with-included-lzo as for popt.

The protocol flag is the right place to indicate use of LZO.  I had
imagined just saying "protocol 2 is the same but with LZO".  I'm not
sure the flags are really orthogonal to protocol version.
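
As a rough sketch of the "protocol 2 is the same but with LZO" idea,
compression could be derived from the version number alone, with no
separate flag.  The constants and function names below are hypothetical
and only illustrate the shape of that choice.

```c
/* Hypothetical protocol versions: 1 is uncompressed, 2 uses the same
 * framing but carries LZO-compressed bodies. */
enum {
    DCC_PROTO_PLAIN = 1,
    DCC_PROTO_LZO = 2
};

/* Return nonzero if this side understands the given version. */
int dcc_proto_is_known(int version)
{
    return version == DCC_PROTO_PLAIN || version == DCC_PROTO_LZO;
}

/* Return nonzero if bodies sent under this version are LZO-compressed. */
int dcc_proto_uses_lzo(int version)
{
    return version == DCC_PROTO_LZO;
}
```

With this shape each new scheme costs a new version number, which is the
price of not treating flags and version as independent.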

As you say, some of the functions are a bit hefty.

I'd probably put the LZO distribution in a separate directory.

Please use consistent indentation.  Every project would like this,
even if they disagree on what the standard should be.  Everything in
distcc should be K&R with 4-space indents (same as standard Java
style) except perhaps for some old files.

I'd like to measure the performance hit for LZO and see if it can't be
on by default.

I'm going to commit this to patches/ so that people can easily find
it.

-- 
Martin 

"Perhaps the truth is less interesting than the facts?"
	-- http://www.theregister.co.uk/content/6/28574.html


