Fix for batch mode (was Re: batch mode maintainability)
Dave Dykstra
dwd at bell-labs.com
Sat Feb 2 02:49:07 EST 2002
On Thu, Jan 31, 2002 at 02:42:47PM -0801, Jos Backus wrote:
...
> +Batch mode can be used to apply the same set of updates to many identical
> +systems\&. Suppose one has a directory tree which is replicated on a number of
> +hosts\&. Now suppose some changes have to be made to this source directory tree
> +and those changes need to be propagated to the other hosts\&. In order to do
> +this using batch mode, the first step is to make a copy of the source
> +directory tree before the changes are applied; this is called the original
> +source directory tree\&. The changes are then applied to the new source
> +directory tree\&.
It doesn't have to be an extra copy; it can be on any of the hosts. Also,
I think that calling the two directories the "original source directory"
and "source directory" is confusing. The usual terminology in the man page
is "source" and "destination". How about calling it something like the
"primary destination tree" to indicate that it is first, and saying that
commonly it is on the same host as the source tree? I'd drop
"directory" and just use "source tree" and "primary destination tree".
> Next, rsync is run with the write-batch option to apply the
> +changes made to the new source directory tree to the original source directory
> +tree\&. The write-batch option causes the information needed to repeat this
> +operation against another original source directory tree to be stored in a
> +batch update fileset (see below) by the rsync client\&. The filename of each
> +file in the fileset starts with a prefix specified by the user as an argument
> +to the write-batch option\&. This fileset is then copied to each remote host,
> +where rsync is run with the read-batch option, again specifying the same
> +prefix, and the source directory tree\&.
There's another "source directory tree". That should be "destination tree".
> Rsync updates the source directory
> +tree using the information stored in the batch update fileset\&.
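To make the workflow in the quoted paragraph concrete, here is a rough sketch using the "source tree" / "primary destination tree" wording suggested above. It only builds the command lines and the fileset names; it does not run rsync. The option spelling (--write-batch=PREFIX, --read-batch=PREFIX), the use of -a, and all paths and function names are assumptions for illustration. The four suffixes are the ones listed in the patch.

```python
# Sketch only: builds the rsync invocations for batch mode, does not execute them.
# Assumptions: options are spelled --write-batch=PREFIX / --read-batch=PREFIX,
# and -a (archive mode) is what one would normally pass alongside them.

# Suffixes of the four fileset members, as listed in the quoted patch.
FILESET_SUFFIXES = (".rsync_argvs", ".rsync_flist", ".rsync_csums", ".rsync_delta")


def fileset_names(prefix):
    """Return the names of the files that make up the batch update fileset."""
    return [prefix + suffix for suffix in FILESET_SUFFIXES]


def write_batch_command(prefix, source_tree, primary_dest_tree):
    """Command that updates the primary destination tree and records the batch."""
    return ["rsync", "-a", "--write-batch=" + prefix, source_tree, primary_dest_tree]


def read_batch_command(prefix, dest_tree):
    """Command run on each remaining host to replay the batch on its destination tree."""
    return ["rsync", "-a", "--read-batch=" + prefix, dest_tree]


if __name__ == "__main__":
    # Hypothetical prefix and paths, chosen only for the demonstration.
    print(" ".join(write_batch_command("pfx", "/source/tree/", "/primary/dest/tree/")))
    print("copy to each host:", " ".join(fileset_names("pfx")))
    print(" ".join(read_batch_command("pfx", "/dest/tree/")))
```

The point of the sketch is the asymmetry: the write-batch run names both a source tree and the primary destination tree, while the read-batch run names only the destination tree on that host.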
> +.PP
> +The fileset consists of 4 files:
> +.IP o
> +<prefix>\fB.rsync_argvs\fP command-line arguments
> +.IP o
> +<prefix>\fB.rsync_flist\fP rsync internal file metadata
> +.IP o
> +<prefix>\fB.rsync_csums\fP rsync checksums
> +.IP o
> +<prefix>\fB.rsync_delta\fP data blocks for file update & change
> +.PP
> +The .rsync_argvs file contains a command-line suitable for updating a source
> +directory tree using that batch update fileset\&. It can be executed using a
> +Bourne(-like) shell, optionally passing in an alternate source directory tree
> +pathname\& which is then used instead of the original path\&. This is useful
> +when the source directory tree path differs from the original source directory
> +tree path\&.
> +.PP
> +Generating the batch update fileset once saves having to perform the file
> +status, checksum and data block generation more than once when updating
> +multiple source directory trees\&. Multicast transport protocols can be used
> +to transfer the batch update files in parallel to many hosts at once, instead
> +of sending the same data to every host individually\&.
> +.PP
> +Example:
> +.PP
> +\fBCaveats\fP:
> +.IP o
> +The read-batch option expects the source directory tree it is meant to update
> +to be identical to the source directory tree that was used to create the batch
> +update fileset\&. When a difference between the source directory trees is
> +encountered the update will fail at that point, leaving the source directory
> +tree in a partially updated state\&. In that case, rsync can be used in its
> +regular (non-batch) mode of operation to fix up the source directory tree\&.
Add that the rsync version used on all destinations should be identical
to the one used on the original destination.
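One way to check that before replaying a batch might look like the sketch below. It assumes the first line of "rsync --version" output has the form "rsync  version X.Y.Z  protocol version N"; how that output is collected from each host is left out, and the function names and sample strings are made up for illustration.

```python
import re


def parse_rsync_version(version_output):
    """Extract the version string from the first line of `rsync --version` output.

    Assumption: the first line looks like
    "rsync  version 2.5.2  protocol version 26".
    Returns None if no version can be found.
    """
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"version\s+(\S+)", first_line)
    return match.group(1) if match else None


def mismatched_hosts(primary_output, outputs_by_host):
    """Return the hosts whose rsync version differs from the primary destination's."""
    expected = parse_rsync_version(primary_output)
    return sorted(
        host
        for host, output in outputs_by_host.items()
        if parse_rsync_version(output) != expected
    )


if __name__ == "__main__":
    # Illustrative strings, not captured from real hosts.
    primary = "rsync  version 2.5.2  protocol version 26\n"
    others = {
        "hostA": "rsync  version 2.5.2  protocol version 26\n",
        "hostB": "rsync  version 2.5.1  protocol version 25\n",
    }
    print(mismatched_hosts(primary, others))
```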
> +.IP o
> +The -z/--compress option does not work in batch mode and yields a usage
> +error\&.
Add that people can instead compress the files with a separate compression
tool for transport to the destination.
Hmm, I wonder if it would be easy to use rsync's compression library to
compress the whole flist, csum, and delta files on the fly. That would
certainly be more convenient.
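Until something like that exists, the separate-compression workaround could be sketched as follows. This uses gzip from the Python standard library purely as an example of a "separate compression tool"; the function names are hypothetical, and the suffixes are the ones from the fileset list in the patch.

```python
import gzip
import shutil
from pathlib import Path

# Suffixes of the four fileset members, as listed in the quoted patch.
FILESET_SUFFIXES = (".rsync_argvs", ".rsync_flist", ".rsync_csums", ".rsync_delta")


def compress_fileset(prefix):
    """Gzip every existing fileset member; return the paths of the .gz files."""
    compressed = []
    for suffix in FILESET_SUFFIXES:
        src = Path(prefix + suffix)
        if not src.exists():
            continue  # skip members that are not present
        dst = Path(str(src) + ".gz")
        with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        compressed.append(dst)
    return compressed


def decompress_fileset(prefix):
    """Restore fileset members from their .gz copies; return the restored paths."""
    restored = []
    for suffix in FILESET_SUFFIXES:
        gz_path = Path(prefix + suffix + ".gz")
        if not gz_path.exists():
            continue
        dst = Path(prefix + suffix)
        with gzip.open(gz_path, "rb") as fin, open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        restored.append(dst)
    return restored
```

The idea would be to run compress_fileset() where the batch was written, transport the .gz files, and run decompress_fileset() on each destination before invoking rsync with the read-batch option.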
> +.IP o
> +The -n/--dryrun option does not work in batch mode and yields a runtime
> +error\&.
- Dave Dykstra