[distcc] LZO compression

Alexandre Oliva oliva at lsd.ic.unicamp.br
Tue Dec 17 20:43:00 GMT 2002


On Dec 16, 2002, Martin Pool <mbp at samba.org> wrote:

> We might also consider whether it's worth compressing data going in to
> SSH, or whether we should rely on its optional compression.

/me thinks we might be better off introducing plug-ins for
transmission of files, and possibly supporting different transmission
mechanisms for uploading the source file and downloading the object
file (one is mostly text, the other is mostly binary, so they
compress significantly differently).
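
To make the plug-in idea concrete, here is a rough sketch in Python
(for brevity, not a proposal for distcc's actual code).  It only
shows a per-direction choice of codec.  The names Codec, CODECS,
TRANSPORT, encode and decode are hypothetical, and zlib stands in for
whatever compressor a real plug-in would wrap.

```python
import zlib
from typing import Callable, NamedTuple


class Codec(NamedTuple):
    """A hypothetical transmission plug-in: a pair of inverse transforms."""
    encode: Callable[[bytes], bytes]
    decode: Callable[[bytes], bytes]


# Registry of available plug-ins (illustrative names only).
CODECS = {
    "none": Codec(lambda b: b, lambda b: b),
    "zlib-fast": Codec(lambda b: zlib.compress(b, 1), zlib.decompress),
    "zlib-best": Codec(lambda b: zlib.compress(b, 9), zlib.decompress),
}

# Assumed policy: preprocessed source (text) goes up, object code
# (binary) comes down, and each direction picks its own plug-in.
TRANSPORT = {"upload": "zlib-best", "download": "zlib-fast"}


def encode(direction: str, payload: bytes) -> bytes:
    """Encode a payload with the plug-in configured for this direction."""
    return CODECS[TRANSPORT[direction]].encode(payload)


def decode(direction: str, payload: bytes) -> bytes:
    """Invert encode() for the same direction."""
    return CODECS[TRANSPORT[direction]].decode(payload)
```

Both ends would have to agree on the TRANSPORT table, e.g. by
negotiating it when the connection is set up.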


One idea for a plug-in would be to "compress" transmitted data by
borrowing some ideas from rsync.  A distccd plug-in could keep a local
cache of hashed rsync blocks, and the corresponding distcc plug-in
would compute hashes on the preprocessed file just like rsync does and
send the hashes.  The distccd plug-in could quickly check which blocks
are present in the local cache and ask the client to send only
those that are not (like rsync does).  The difference is that we'd
keep pre-computed hashes on the server in (semi?-)persistent memory,
instead of going over a large collection of files trying to find a
good fit.
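
A minimal sketch of that exchange, in Python for brevity.  It makes
several assumptions that are not part of the proposal: fixed-size
blocks with SHA-1 digests instead of rsync's rolling checksum, an
in-memory dictionary as the cache, and the hypothetical names
BLOCK_SIZE, BlockCache and send_file.  Client and server are collapsed
into one process so the logic is easy to follow.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed; the proposal does not fix a block size


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the preprocessed file into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(block: bytes) -> bytes:
    """Hash one block (SHA-1 chosen only for illustration)."""
    return hashlib.sha1(block).digest()


class BlockCache:
    """Server side (distccd): maps block hash -> block contents."""

    def __init__(self) -> None:
        self._blocks: dict[bytes, bytes] = {}

    def missing(self, hashes: list[bytes]) -> list[int]:
        """Indices of the blocks the client still has to send."""
        return [i for i, h in enumerate(hashes) if h not in self._blocks]

    def store(self, block: bytes) -> None:
        """Remember a block that the client has just sent."""
        self._blocks[digest(block)] = block

    def assemble(self, hashes: list[bytes]) -> bytes:
        """Rebuild the file from cached blocks, in hash order."""
        return b"".join(self._blocks[h] for h in hashes)


def send_file(data: bytes, cache: BlockCache) -> tuple[bytes, int]:
    """Client side (distcc): send hashes first, then only missing blocks.

    Returns the file as rebuilt by the server and the number of block
    bytes that actually had to be transmitted.
    """
    blocks = split_blocks(data)
    hashes = [digest(b) for b in blocks]
    sent = 0
    for i in cache.missing(hashes):
        cache.store(blocks[i])
        sent += len(blocks[i])
    return cache.assemble(hashes), sent
```

With fixed-size blocks a single inserted byte shifts every later
block and defeats the cache.  rsync's rolling checksum avoids that,
and a real plug-in would want something along those lines.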

This could save a lot of bandwidth not only while rebuilding projects
(somewhat like ccache, but with a finer granularity, if you're lucky
enough to have the file submitted for compilation to the same server),
but also when building large projects with many source files that all
include the same large header files (C++ STL comes to mind).

-- 
Alexandre Oliva   Enjoy Guarana', see http://www.ic.unicamp.br/~oliva/
Red Hat GCC Developer                 aoliva@{redhat.com, gcc.gnu.org}
CS PhD student at IC-Unicamp        oliva@{lsd.ic.unicamp.br, gnu.org}
Free Software Evangelist                Professional serial bug killer


