Fw: Re: GZIP, ZIP, ISO, RPM files and rsync, tar, cpio

jw schultz jw at pegasys.ws
Thu Aug 28 20:06:22 EST 2003


On Thu, Aug 28, 2003 at 12:51:16PM +0300, Sviatoslav Sviridov/Lintec Project wrote:
> 
> Sorry for direct reply, but mail server at samba.org blocks my messages.

Postmasters, Martin: for your consideration.

> Begin forwarded message:
> 
> Date: Thu, 28 Aug 2003 12:43:54 +0300
> From: Sviatoslav Sviridov/Lintec Project <svd at lintec.minsk.by>
> To: rsync at lists.samba.org
> Subject: Re: GZIP, ZIP, ISO, RPM files and rsync, tar, cpio
> 
> 
> On Thu, 28 Aug 2003 02:30:36 -0700
> jw schultz <jw at pegasys.ws> wrote:
> 
> > Finally, most distribution ISOs use package formats, such as
> > RPM, that compress the package contents.  These compressed
> > packages may, even if the installed fileset is unchanged,
> > contain bits of meta-data that have been updated, impacting
> > the rsyncability of the package file.  In any case changing
> > even one internal file of a compressed package can disrupt
> > rsyncing the entire package file.  The only possible
> > amelioration of this would be the use of the gzip
> > --rsyncable option (which requires a patched gzip) by the
> > package builders--assuming they use gzip for package
> > compression.  Given the effect of improving rsyncability and
> > thereby reducing bandwidth requirements such a change to
> > their package build scripts could well be to their
> > advantage.
> 
> BTW, is there a patch for bzip2 that adds a --rsyncable option? Or maybe
> someone is working on it?

I don't expect so.

The --rsyncable patch for gzip uses file content patterns to
reset the compression algorithm so that even if you insert
or delete data early in the file rsync can still find
matching blocks.  Look at the patch for further details.
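
To illustrate the idea (this is a rough sketch, not the
patch's code): the Python below picks reset points from a
rolling sum over the last few bytes of input, so the points
depend on content rather than on file offset.  WINDOW,
MODULUS and the function names are my own illustrative
choices, and compressing each chunk as a separate zlib
stream is only a stand-in for resetting one gzip stream.

```python
import random
import zlib

WINDOW = 32    # bytes covered by the rolling sum (illustrative value)
MODULUS = 64   # reset when the window sum is a multiple of this (illustrative)

def split_on_content(data):
    """Split data wherever the sum of the last WINDOW bytes hits a multiple of MODULUS."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # drop the byte that left the window
        if rolling % MODULUS == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing bytes after the last reset
    return chunks

def compress_resettable(data):
    """Compress every chunk on its own, modelling a compressor reset per chunk."""
    return [zlib.compress(chunk) for chunk in split_on_content(data)]

# Deterministic pseudo-text, then the same text with an early insertion.
rng = random.Random(0)
words = [b"rsync", b"gzip", b"block", b"patch", b"delta", b"file", b"tar"]
old = b" ".join(rng.choice(words) for _ in range(5000))
new = old[:100] + b"INSERTED " + old[100:]

old_chunks = compress_resettable(old)
new_chunks = compress_resettable(new)
shared = len(set(old_chunks) & set(new_chunks))
print(len(set(old_chunks)), "distinct compressed chunks,", shared, "survive the insertion")
```

Once the window has moved past the inserted bytes the reset
points fall on the same content as before, so the compressed
chunks after that point come out byte-identical and rsync
can match them.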

As far as I can tell from the manpage, bzip2 compresses
data in fixed-size blocks with a reset on block boundaries.
This means that it is moderately rsyncable as long as you
never insert or delete data.  You can change early data
without affecting later blocks, but only if the offsets of
later blocks remain the same.  This does not lend itself to
an rsyncable patch.  It does mean that bzip2 is good for
block-oriented data such as database tablespace files and
for files that are appended to, but bzip2 would be
undesirable for text, word processor, tar and other less
structured files.
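
A small model of that behaviour, under stated assumptions:
the sketch cuts the input into fixed-size blocks and runs
bz2.compress on each one separately.  It is not how the
bzip2 file format is laid out (real blocks are 100k to 900k
and are packed into one stream); BLOCK and the helper names
are hypothetical.  It only shows how an overwrite, an append
and an insertion differ once block offsets are fixed.

```python
import bz2
import random

BLOCK = 4096  # illustrative only; real bzip2 blocks are 100k-900k

def compress_fixed_blocks(data):
    """Compress each fixed-size input block on its own (models the per-block reset)."""
    return [bz2.compress(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

def unmatched(old_blocks, new_blocks):
    """Count compressed blocks in new_blocks with no identical block in old_blocks."""
    known = set(old_blocks)
    return sum(1 for block in new_blocks if block not in known)

# Deterministic pseudo-text and three edited variants of it.
rng = random.Random(1)
words = [b"rsync", b"bzip2", b"block", b"patch", b"delta", b"file", b"tar"]
old = b" ".join(rng.choice(words) for _ in range(5000))

overwritten = old[:10] + bytes([old[10] ^ 0xFF]) + old[11:]  # same length
appended = old + b" more data at the end"                    # grows at the end
inserted = old[:10] + b"X" + old[10:]                        # shifts everything

old_blocks = compress_fixed_blocks(old)
n_over = unmatched(old_blocks, compress_fixed_blocks(overwritten))
n_app = unmatched(old_blocks, compress_fixed_blocks(appended))
n_ins = unmatched(old_blocks, compress_fixed_blocks(inserted))
print("blocks to resend: overwrite", n_over, "append", n_app, "insert", n_ins)
```

The overwrite and the append disturb only the block they
touch (plus possibly one new block at the end), while the
one-byte insertion shifts the contents of every block and
leaves nothing for rsync to match.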

In terms of rsync, bzip2 may gain you considerably better
compression, but the bandwidth cost of the lost
rsyncability may not be worthwhile.

My comments regarding the merits of bzip2 relate only to
rsync and should not be taken as reflecting on any other
qualities.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


