GZIP, ZIP, ISO, RPM files and rsync, tar, cpio

jw schultz jw at pegasys.ws
Thu Aug 28 19:30:36 EST 2003


On Thu, Aug 28, 2003 at 10:31:08AM +0200, Jacobus Erasmus wrote:
> I noticed with rsync and compressed files or package files the transfer
> efficiency drops considerably. Eg. rsync an ISO image of a distribution
> will give you between 30% and 60% of the original transfer although from
> Beta1-Beta2 the change could not have been that great. The same thing
> happens with ZIP files for obvious reasons.

There are several things that may be contributing to that.

One is the truncated blocksum collision problem, common with
files larger than around 300 or 400MB, which results in a
retransmission with untruncated blocksums.  Upgrading to
a version >= 2.5.7 (when available) will address that.
If either end of the sync is running rsync protocol 26 or
earlier, a larger block size may help.
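
To see why a larger block size reduces collisions, here is a
rough back-of-envelope sketch.  The 48-bit truncated checksum
width and the 700 byte block size are assumptions chosen for
illustration, not values taken from this post, and the model
(every byte offset compared against every block sum) is a
worst-case simplification.

```python
def expected_false_matches(file_size, block_size, checksum_bits):
    """Rough estimate of accidental checksum matches in one pass.

    Assumes every byte offset of the file is tested against every
    block checksum, and that checksums behave like uniform random
    values of the given width.  Both are simplifications.
    """
    n_blocks = -(-file_size // block_size)  # ceiling division
    n_offsets = file_size
    return n_offsets * n_blocks / 2 ** checksum_bits


FILE_SIZE = 400 * 2 ** 20   # a 400MB file, as in the discussion above
ASSUMED_BITS = 48           # assumed truncated checksum width

small_blocks = expected_false_matches(FILE_SIZE, 700, ASSUMED_BITS)
large_blocks = expected_false_matches(FILE_SIZE, 16384, ASSUMED_BITS)
print(f"block size   700: ~{small_blocks:.2f} expected false matches")
print(f"block size 16384: ~{large_blocks:.2f} expected false matches")
```

The estimate scales with the number of blocks, so under these
assumptions raising the block size lowers it proportionally.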

Another is that the default rsync block size will not take
advantage of the natural data alignment present in an ISO
image.  An ISO image is (to the best of my knowledge) a
filesystem with each file starting on a 2048 byte boundary.
Using --block-size=2048 may help there.
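
The alignment argument can be illustrated with a toy model.
This is only a sketch: the "image" is just files concatenated
on 2048 byte boundaries, the change between versions is a
reordering of the files, and 700 stands in for any block size
that does not divide 2048.  None of the names below come from
rsync itself.

```python
import random

SECTOR = 2048  # boundary on which each file in the toy image starts


def pad(data, boundary=SECTOR):
    """Zero-pad data up to the next multiple of boundary."""
    return data + b"\0" * (-len(data) % boundary)


def make_image(files):
    """Toy image: every file starts on a SECTOR boundary."""
    return b"".join(pad(f) for f in files)


def matched_fraction(old, new, block_size):
    """Fraction of fixed-offset blocks of old found anywhere in new."""
    blocks = [old[i:i + block_size] for i in range(0, len(old), block_size)]
    hits = sum(1 for block in blocks if block in new)
    return hits / len(blocks)


rng = random.Random(0)
sectors_per_file = [1, 3, 2, 1, 2, 3, 1, 2]
files = [bytes(rng.getrandbits(8) for _ in range(n * SECTOR))
         for n in sectors_per_file]

old_image = make_image(files)
new_image = make_image(list(reversed(files)))  # same files, new layout

aligned = matched_fraction(old_image, new_image, 2048)
unaligned = matched_fraction(old_image, new_image, 700)
print(f"block size 2048: {aligned:.2f} of old blocks found in new image")
print(f"block size  700: {unaligned:.2f} of old blocks found in new image")
```

With 2048 byte blocks every block lies inside a single file, so
it can still be found after the files move.  Blocks of 700
bytes sometimes straddle two files and are lost when the
neighbours change.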

Finally, most distribution ISOs use package formats, such as
RPM, that compress the package contents.  Even if the
installed fileset is unchanged, these compressed packages may
contain bits of meta-data that have been updated, impacting
the rsyncability of the package file.  In any case, changing
even one internal file of a compressed package can disrupt
rsyncing the entire package file.  The only possible
amelioration of this would be the use of the gzip
--rsyncable option (which requires a patched gzip) by the
package builders--assuming they use gzip for package
compression.  Given the effect of improving rsyncability and
thereby reducing bandwidth requirements, such a change to
their package build scripts could well be to their
advantage.
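
The effect of compression on block matching can be sketched
with Python's zlib module, which uses the same deflate
algorithm as gzip.  The word list, the 700 byte block size and
the position of the changed byte are arbitrary choices for
illustration, and the exact fractions depend on the data.

```python
import random
import zlib


def matched_fraction(old, new, block_size=700):
    """Fraction of fixed-offset blocks of old found anywhere in new."""
    blocks = [old[i:i + block_size] for i in range(0, len(old), block_size)]
    hits = sum(1 for block in blocks if block in new)
    return hits / len(blocks)


rng = random.Random(1)
vocabulary = [b"alpha", b"beta", b"gamma", b"delta",
              b"epsilon", b"zeta", b"eta", b"theta"]

# Two versions of the same compressible data, differing in one byte.
plain_old = b" ".join(rng.choice(vocabulary) for _ in range(20000))
plain_new = plain_old[:1000] + b"#" + plain_old[1001:]

packed_old = zlib.compress(plain_old)
packed_new = zlib.compress(plain_new)

plain_fraction = matched_fraction(plain_old, plain_new)
packed_fraction = matched_fraction(packed_old, packed_new)
print(f"uncompressed: {plain_fraction:.2f} of old blocks reusable")
print(f"compressed:   {packed_fraction:.2f} of old blocks reusable")
```

In the uncompressed data only the block containing the changed
byte is lost.  In the compressed stream the change can alter
the encoding of the data that follows it, so fewer of the old
blocks can be reused.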

> My question or feature request if you want to call it is. Is it possible
> to modify or add a feature to rsync that would allow it to for example
> (decompress a gz file, unpackage an RPM file) through say a config file
> specifying the necessary steps and repackage and compress the file on
> the other side. This would make it possible to transfer eg. ISO images
> considerably faster but would make the complexity of setup obviously
> pretty high. 
> 
> Just a question.

In theory it is possible, but it is beyond the scope of rsync.
One could wrap rsync in a script that might unpackage an RPM
file or do other pre- and post-processing, but in most cases
it isn't really worthwhile.
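
A minimal sketch of such a wrapper for gzip files follows.
The function names, the host and the paths are all
hypothetical, and it assumes ssh access and a gzip binary on
the receiving side.  The uncompressed copy is kept on the
remote end so that rsync has something to compare against next
time.

```python
import gzip
import shlex
import shutil
import subprocess
from pathlib import Path


def unpack_gzip(src, workdir):
    """Decompress src (e.g. pkg.tar.gz) into workdir; return the new path."""
    out = Path(workdir) / Path(src).stem  # "pkg.tar.gz" -> "pkg.tar"
    with gzip.open(src, "rb") as fin, open(out, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return out


def plan_commands(plain, host, dest_dir):
    """Build the rsync and remote recompress commands without running them."""
    remote_plain = f"{dest_dir}/{Path(plain).name}"
    rsync_cmd = ["rsync", "--partial", str(plain), f"{host}:{remote_plain}"]
    # gzip -c keeps the uncompressed file for the next rsync run.
    remote_shell = "gzip -c -- {} > {}".format(
        shlex.quote(remote_plain), shlex.quote(remote_plain + ".gz"))
    regzip_cmd = ["ssh", host, remote_shell]
    return rsync_cmd, regzip_cmd


def sync_gzip(src, host, dest_dir, workdir):
    """Unpack locally, rsync the plain file, recompress remotely."""
    plain = unpack_gzip(src, workdir)
    for cmd in plan_commands(plain, host, dest_dir):
        subprocess.run(cmd, check=True)
```

Note that the recompressed file is not guaranteed to be byte
for byte identical to the original .gz, since the gzip header
and compression settings may differ.  That is one reason this
approach is rarely worth the trouble.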

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


