compressed archives

Christopher Vance vance at aurema.com
Wed Mar 5 18:12:50 EST 2003


Suppose I have a particular version of a largish compressed archive,
most likely a .tgz or .tbz2, and a remote machine has a newer, only
slightly different version of the same archive, in which most of the
content is unchanged.  I might try to obtain the newer archive by
first copying my older local copy to the newer name, so that rsync
has a file to update from.

My understanding is that a small change in a file before compression
can result in a large difference afterwards.
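
A rough way to check this claim, as a sketch only: the Python below
builds some made-up repetitive text (a hypothetical stand-in for a
tarball's contents), flips a single bit, and compresses both versions
with zlib, which uses the same DEFLATE algorithm as gzip.  The sample
data, sizes, and helper names are all assumptions for illustration.
The exact counts will depend on the input, so none are quoted here.

```python
import random
import zlib


def differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any length difference."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def make_sample(size: int = 200_000, seed: int = 0) -> bytes:
    """Build repetitive, compressible text (an invented stand-in for a tarball)."""
    rng = random.Random(seed)
    words = [b"kernel", b"install", b"package", b"config", b"driver", b"README"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(words) + b" "
    return bytes(out[:size])


old_plain = make_sample()
changed = bytearray(old_plain)
changed[100] ^= 0x01            # flip one bit in one byte near the start
new_plain = bytes(changed)

old_packed = zlib.compress(old_plain)
new_packed = zlib.compress(new_plain)

plain_diff = differing_bytes(old_plain, new_plain)
packed_diff = differing_bytes(old_packed, new_packed)

print("uncompressed:", plain_diff, "of", len(new_plain), "bytes differ")
print("compressed:  ", packed_diff, "of", len(new_packed), "bytes differ")
```

Comparing byte positions is a crude measure, but it is enough to see
whether the one-byte change stays local after compression or spreads
through the rest of the stream.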

If rsync were to do its file stat and content comparisons on the
uncompressed copy of both archives, might this not result in less
network traffic (sending only the small changes) than just looking at
the compressed copies?  (Yes, I realize there are additional
non-network costs: decompressing at both ends, and probably
recompressing at the destination.)
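
To make the question concrete, here is a simplified estimate of how
many bytes a block-matching transfer would have to send literally,
first for the uncompressed pair and then for the compressed pair.
This is not rsync's actual algorithm (rsync uses a rolling weak
checksum plus a stronger hash to confirm matches); the sketch just
compares whole blocks directly.  The function name, block size, and
sample data are assumptions made up for illustration.

```python
import random
import zlib


def literal_bytes_needed(old: bytes, new: bytes, block_size: int = 256) -> int:
    """Count bytes of `new` not covered by any aligned block of `old`.

    Blocks of `old` are taken at multiples of block_size; `new` is
    scanned at every offset, so insertions that shift data still match.
    """
    known = {old[i:i + block_size]
             for i in range(0, len(old) - block_size + 1, block_size)}
    pos = 0
    literal = 0
    while pos < len(new):
        if new[pos:pos + block_size] in known:
            pos += block_size       # receiver already has this block
        else:
            literal += 1            # this byte would have to be sent
            pos += 1
    return literal


# Invented sample: repetitive text with a small insertion in the middle.
rng = random.Random(1)
words = [b"kernel", b"install", b"package", b"config", b"driver", b"README"]
old_plain = b" ".join(rng.choice(words) for _ in range(20_000))
cut = len(old_plain) // 2
new_plain = old_plain[:cut] + b"PATCHED" + old_plain[cut:]

new_packed = zlib.compress(new_plain)
plain_cost = literal_bytes_needed(old_plain, new_plain)
packed_cost = literal_bytes_needed(zlib.compress(old_plain), new_packed)

print("uncompressed: send", plain_cost, "of", len(new_plain), "bytes")
print("compressed:   send", packed_cost, "of", len(new_packed), "bytes")
```

Whether working on the uncompressed data wins overall depends on how
the literal bytes for the uncompressed pair compare with the size of
the whole compressed file, since the latter is the worst case when no
compressed blocks match.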

My particular application is OS installation tarballs, but a number of
bloated or huge software products out there have sourceballs where
there might also be real savings.

Have I chosen the wrong tree to bark up?

-- 
Christopher Vance

