Use rsync's checksums to deduplicate across backups
cs at zip.com.au
Sun Nov 6 13:47:37 MST 2011
On 04Nov2011 10:27, Chris Dunlop <chris at onthe.net.au> wrote:
| On Thu, Nov 03, 2011 at 09:34:53AM -0500, Alex Waite wrote:
| >> Not a direct answer, but this may do what you want:
| >> http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=link-by-hash.diff
| >> This patch adds the --link-by-hash=DIR option, which hard links received
| >> files in a link farm arranged by MD4 file hash. The result is that the system
| >> will only store one copy of the unique contents of each file, regardless of
| >> the file's name.
| > This does look like what I was describing, though it seems it was
| > never included into rsync. Is that correct?
| Yes, rsync-patches is stuff that is deemed to be not yet ready
| (i.e. it may go in after it's been polished), or not at all
| suitable (e.g. it's too esoteric for general usage), for rsync
Regarding "esoteric": I also have this kind of backup scheme. I would welcome
that functionality, probably.
BTW, how far does the --link-dest option go in this direction? I use it
a fair bit (backing up multiple hosts with the same dataset on them,
using --link-dest to refer to the parallel snapshots).
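
To make the link-farm idea quoted above concrete, here is a minimal Python
sketch of hashing each file and hard linking it into a directory keyed by
digest. It is not the patch's code: the names file_digest and link_by_hash,
the two-level farm layout, the ".linktmp" suffix, and the use of SHA-256 in
place of MD4 are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_by_hash(path, farm_dir):
    """Hard link `path` into a farm keyed by content digest.

    Hypothetical helper: if the farm already holds this content, `path`
    is replaced by a hard link to the farm entry, so only one copy of the
    data stays on disk. Both must live on the same filesystem.
    """
    digest = file_digest(path)
    # Assumed layout: farm_dir/<first 2 hex chars>/<remaining hex chars>
    farm_path = os.path.join(farm_dir, digest[:2], digest[2:])
    os.makedirs(os.path.dirname(farm_path), exist_ok=True)

    if not os.path.exists(farm_path):
        # First time this content is seen: the farm entry is a new link.
        os.link(path, farm_path)
    elif not os.path.samefile(path, farm_path):
        # Duplicate content: link to the farm entry under a temporary
        # name, then rename over the original so the swap is atomic.
        tmp = path + ".linktmp"  # assumed not to exist already
        os.link(farm_path, tmp)
        os.replace(tmp, path)
    return farm_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("one.txt", "two.txt"):
            with open(os.path.join(d, name), "wb") as f:
                f.write(b"identical contents\n")
            link_by_hash(os.path.join(d, name), os.path.join(d, "farm"))
        print(os.path.samefile(os.path.join(d, "one.txt"),
                               os.path.join(d, "two.txt")))
```

By contrast, --link-dest only checks for an unchanged file at the same
relative path in the reference directory, so identical content stored under
a different name is not linked.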
Cameron Simpson <cs at zip.com.au> DoD#743
Being on a Beemer and not having a wave returned by a Sportster is like
having a clipper ship's hailing not returned by an orphaned New Jersey
solid waste barge. - OTL