Use rsync's checksums to deduplicate across backups

Chris Dunlop chris at onthe.net.au
Thu Nov 3 17:27:20 MDT 2011


On Thu, Nov 03, 2011 at 09:34:53AM -0500, Alex Waite wrote:
>> Not a direct answer, but this may do what you want:
>>
>>  http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=link-by-hash.diff
>>
>>  This patch adds the --link-by-hash=DIR option, which hard links
>>  received files in a link farm arranged by MD4 file hash.  The
>>  result is that the system will only store one copy of the unique
>>  contents of each file, regardless of the file's name.
>
> This does look like what I was describing, though it seems it was
> never included into rsync.  Is that correct?

Yes, rsync-patches holds patches that are deemed either not yet
ready (i.e. they may go in after they've been polished) or not at
all suitable (e.g. too esoteric for general usage) for rsync
proper.
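
The link-farm idea described in the patch summary above can be
approximated after the fact on an unpatched rsync. What follows is
only a minimal sketch in Python, not the patch's implementation: the
function name link_by_hash and the flat farm layout are hypothetical,
MD5 is swapped in for MD4 (MD4 is not reliably available in Python's
hashlib), and it assumes the backups and the farm directory live on
the same filesystem.

```python
import hashlib
import os


def link_by_hash(backup_dir, farm_dir):
    """Hard-link identical files through a hash-named link farm.

    Hypothetical helper, not part of rsync or of link-by-hash.diff.
    Assumes backup_dir and farm_dir are on the same filesystem.
    Returns the number of files replaced by hard links.
    """
    os.makedirs(farm_dir, exist_ok=True)
    relinked = 0
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # leave symlinks alone

            # Hash the contents in chunks so large files fit in memory.
            # MD5 is an assumption here; the patch text mentions MD4.
            digest = hashlib.md5()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            farm_path = os.path.join(farm_dir, digest.hexdigest())

            if not os.path.exists(farm_path):
                # First file seen with these contents: register it.
                os.link(path, farm_path)
            elif not os.path.samefile(path, farm_path):
                # Duplicate contents: swap in a link to the farm entry.
                # Link to a temporary name, then rename over the file,
                # so the path never goes missing.
                tmp = path + ".lbh-tmp"
                os.link(farm_path, tmp)
                os.replace(tmp, path)
                relinked += 1
    return relinked
```

Calling link_by_hash("/backups", "/backups-farm") (both paths are
placeholders) would leave one stored copy per unique file content.
Note that hard-linked files share a single inode, so ownership,
permissions and timestamps are shared too, and the sketch trusts the
hash without comparing file contents byte for byte.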

Chris

