[distcc] Re: Using distcc for other tasks (distributed "filtering")
Christian Leber
christian at leber.de
Sun Feb 15 21:45:29 GMT 2004
On Mon, Feb 16, 2004 at 07:33:11AM +1100, Ben Elliston wrote:
> gzip has a nice property that:
> cat A B | gzip > foo.gz
>
> is functionally equivalent to:
> (gzip < A && gzip < B) > bar.gz
>
> The best way to parallelise your compression work would be to divide
> your workload into N pieces, where N is the number of machines you
> have. Use split(1) to break the input into N pieces and use each host
> to gzip one chunk. At the end, "cat" the result together again.
Exactly. I have about 30000 pieces; my problem is that I don't know how
to get distcc to pipe them through the gzip(*) on the remote boxes.
Christian Leber
(*) In fact it is the compressor from 7z, which takes about 15x as long;
for plain gzip this would not be worth the effort. Its output can still
be decompressed with normal gzip.
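
A minimal sketch of the split-and-concatenate idea quoted above, assuming
Python and its standard gzip module rather than distcc or the 7z-derived
compressor. Local worker threads stand in for the remote boxes, and the
names compress_piece and compress_pieces are hypothetical. It only
illustrates that independently compressed pieces, joined in order, form
one stream that decompresses to the original data.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_piece(piece: bytes) -> bytes:
    """Compress one piece into a complete, standalone gzip member."""
    return gzip.compress(piece)


def compress_pieces(pieces, workers=4):
    """Compress every piece independently and concatenate the results.

    Assumption for illustration: local threads play the role of the
    remote hosts; a real setup would ship each piece to another machine.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the members are
        # joined in the same order as the original pieces.
        members = list(pool.map(compress_piece, pieces))
    return b"".join(members)


if __name__ == "__main__":
    pieces = [f"piece {i}\n".encode() * 100 for i in range(8)]
    combined = compress_pieces(pieces)
    # A multi-member gzip stream decompresses to the concatenated input.
    assert gzip.decompress(combined) == b"".join(pieces)
```

The order of concatenation matters: the members must be joined in the
same order as the pieces were split, which is why the sketch relies on
an order-preserving map.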
--
"Omnis enim res, quae dando non deficit, dum habetur et non datur,
nondum habetur, quomodo habenda est." (Aurelius Augustinus)
Translation: <http://gnuhh.org/work/fsf-europe/augustinus.html>