[distcc] Re: Using distcc for other tasks (distributed "filtering")
Ben Elliston
bje at wasabisystems.com
Sun Feb 15 20:33:11 GMT 2004
Christian Leber <christian at leber.de> writes:
> I really enjoy using distcc for compilations. Now I'm searching for
> a simple way to distribute data compression over a network
> (compressing Knoppix takes about 100 minutes every time). I just
> need to get 64 kB blocks to the other boxes, compress them and get
> them back (that may be, for example, 30000 64 kB blocks), so it's
> basically the same as compressing.
gzip has a nice property: concatenated gzip streams form a valid gzip
file. So the output of:
    cat A B | gzip > foo.gz
decompresses to the same data as the output of:
    (gzip < A && gzip < B) > bar.gz
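A quick way to check this property is sketched below. The temporary directory and the contents of A and B are made up for the demonstration; only the two gzip pipelines come from the text above.

```shell
# Hypothetical sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'first part\n'  > "$tmp/A"
printf 'second part\n' > "$tmp/B"

# One gzip stream over the concatenated input.
cat "$tmp/A" "$tmp/B" | gzip > "$tmp/foo.gz"

# Two gzip streams written back to back.
(gzip < "$tmp/A" && gzip < "$tmp/B") > "$tmp/bar.gz"

# The archives themselves generally differ byte for byte,
# but they decompress to the same data.
gzip -dc "$tmp/foo.gz" > "$tmp/foo.out"
gzip -dc "$tmp/bar.gz" > "$tmp/bar.out"
cmp "$tmp/foo.out" "$tmp/bar.out" && echo "same contents"
```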
The best way to parallelise your compression work would be to divide
your workload into N pieces, where N is the number of machines you
have. Use split(1) to break the input into N pieces and have each host
gzip one chunk. At the end, "cat" the compressed chunks together in
their original order; the result is a single valid gzip file.
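The split/gzip/cat steps might look roughly like the sketch below. It is only an illustration: the input file, the chunk prefix and N=4 are invented, and local background jobs stand in for the remote hosts.

```shell
# Hypothetical input and host count; adjust to the real workload.
work=$(mktemp -d)
input="$work/input.dat"
n=4

# Stand-in data so the sketch runs on its own.
awk 'BEGIN { for (i = 1; i <= 200000; i++) print i }' > "$input"

# Bytes per chunk, rounded up so that at most n chunks are produced.
size=$(wc -c < "$input")
chunk=$(( (size + n - 1) / n ))

# split names the pieces chunk.aa, chunk.ab, ... in input order.
split -b "$chunk" "$input" "$work/chunk."

# Compress every chunk in parallel. Local background jobs are used
# here as a stand-in for sending each chunk to a different host.
for f in "$work"/chunk.*; do
    gzip < "$f" > "$f.gz" &
done
wait

# The shell expands the glob in sorted order, which matches the
# order in which split wrote the pieces.
cat "$work"/chunk.*.gz > "$work/output.gz"

# Check that the round trip reproduces the input.
gzip -dc "$work/output.gz" | cmp - "$input" && echo "round trip ok"
```

On a real set of machines, each gzip call in the loop could be replaced by something like `ssh "$host" gzip < "$f" > "$f.gz"`, where `$host` is a placeholder for one of your boxes. Note that chunks compressed independently may give a slightly larger total than a single gzip run over the whole input.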
Ben