Rsync signatures and incremental tape backup
c.shoemaker at cox.net
Fri Feb 25 01:49:58 GMT 2005
On Thu, Feb 24, 2005 at 04:01:52PM -0800, Richard Patterson wrote:
> Andrew Tridgell's original rsync paper (actually
> thesis) discussed the idea of using rsync to make
> incremental tape backups based not on whole files but
> rather parts of files. Sadly, this functionality is
> not actually present in the rsync program. I'd like to
> explore adding this ability.
> There are basically three things that need to be done
> to enable efficient (partial-file) incremental tape
> backups, namely:
> 1) Generation of rsync signatures from existing files.
> 2) Generation of a binary patch from existing files +
> an rsync signature.
> 3) Application of a binary patch to existing files.
> Rsync already has the facility to generate and apply
> binary patches -- namely, batch-mode operation. Thus
> all that remains to be added is read and write support
> for signatures.
> It seems to me the best way is through two new
> options: --read-signature and --write-signature.
> --read-signature would operate similarly to
> --compare-dest. --write-signature would generate
> signature files from the destination.
> Finally, one would need to be able to run rsync
> without an actual destination parameter when used with
> --write-batch. Thus we would have the following:
> For a full backup:
> --write-batch --write-signature
> For a differential backup:
> --write-batch --read-signature --write-signature
> For a leaf incremental backup:
> --write-batch --read-signature
> For a restore, one would specify --read-batch with no
> I'm willing to implement this, but wanted to get some
> feedback first. Comments?
Perhaps this would also be a useful optimization for server mode.
During a period when the server's repository is not changing, it could
run from the signature file, which would effectively be a cache of the
checksums (as long as the client didn't request a different checksum seed).
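
The seed caveat can be sketched as a cache keyed by both file and checksum
seed. This is a minimal Python sketch, not rsync code: SignatureCache, its
block size, and the use of MD5 with the seed bytes prepended are all
assumptions made for illustration, and rsync's real checksum and seeding
differ. It only shows why a cached signature is reusable for the same seed
and must be recomputed for a different one or after the file changes.

```python
import hashlib
import os


class SignatureCache:
    """Hypothetical cache of per-block checksums, keyed by (path, seed)."""

    def __init__(self, block_size=2048):
        self.block_size = block_size
        self._entries = {}  # (path, seed) -> (mtime_ns, size, digests)
        self.misses = 0     # number of times checksums had to be recomputed

    def _compute(self, path, seed):
        # Assumption: mix the seed in by prepending it to each block.
        seed_bytes = (seed & 0xFFFFFFFF).to_bytes(4, "little")
        digests = []
        with open(path, "rb") as f:
            while True:
                block = f.read(self.block_size)
                if not block:
                    break
                digests.append(hashlib.md5(seed_bytes + block).hexdigest())
        return digests

    def get(self, path, seed):
        """Return block checksums, reusing the cache if the file is unchanged."""
        st = os.stat(path)
        key = (path, seed)
        cached = self._entries.get(key)
        if (cached is not None
                and cached[0] == st.st_mtime_ns
                and cached[1] == st.st_size):
            return cached[2]
        self.misses += 1
        digests = self._compute(path, seed)
        self._entries[key] = (st.st_mtime_ns, st.st_size, digests)
        return digests
```

Two requests with the same seed hit the cache; a request with a different
seed forces a fresh pass over the file, which is the limitation noted above.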
Also, I think this might not be too hard to implement. Presumably,
the signature file contains exactly what a "write-batch file" to an
empty destination directory would contain, minus the actual file
blocks. Is that what you're thinking?
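
To make the question concrete, here is a minimal Python sketch of the three
steps from the quoted proposal: build a signature (checksums only, no file
data), build a patch from new data plus a signature, and apply the patch.
It is not rsync's implementation. The function names, the tiny block size,
the ("copy", n) / ("data", bytes) operation format, and the choice of
Adler-32 plus MD5 are assumptions for illustration, and the weak checksum is
recomputed at every offset rather than rolled.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size so the example is easy to follow


def make_signature(data, block=BLOCK):
    """Return one (weak, strong) checksum pair per block; no file data."""
    return [
        (zlib.adler32(data[i:i + block]),
         hashlib.md5(data[i:i + block]).digest())
        for i in range(0, len(data), block)
    ]


def make_delta(new, signature, block=BLOCK):
    """Describe `new` as ("copy", block_index) and ("data", bytes) ops."""
    lookup = {}
    for idx, (weak, strong) in enumerate(signature):
        lookup.setdefault(weak, []).append((strong, idx))
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + block]
        match = None
        # Cheap weak check first, then confirm with the strong checksum.
        for strong, idx in lookup.get(zlib.adler32(window), []):
            if hashlib.md5(window).digest() == strong:
                match = idx
                break
        if match is None:
            literal.append(new[pos])  # no match: emit one literal byte
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += len(window)
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(old, ops, block=BLOCK):
    """Rebuild the new data from the old data plus the patch operations."""
    out = bytearray()
    for kind, arg in ops:
        if kind == "copy":
            out += old[arg * block:(arg + 1) * block]
        else:
            out += arg
    return bytes(out)
```

With an empty signature, make_delta() returns the whole input as literal
data, which corresponds to the full-backup case; with a signature from an
earlier run, only the unmatched bytes end up in the patch.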