PROPOSAL: --link-hash-dest, additional linking of files to their HASH values

Wayne Davison wayned at samba.org
Fri Dec 3 16:25:46 GMT 2004


On Fri, Dec 03, 2004 at 12:29:34PM +0100, Helge Jensen wrote:
> In order to reduce the space used for each backup, I was thinking about 
> linking all files in the backups to their hash in a separate directory, 
> allowing only one storage of each file-value, reducing my backup-needs 
> by ~5Gb per machine.

One possibility is to use BackupPC, which provides a Perl script that
can talk the rsync protocol (among other file-acquisition methods) with
the machines that you wish to back up.  It pools all identical files
together (regardless of each file's attributes, which are kept separate
from the file's data) and even compresses the data.  We link to the
BackupPC project on our resources webpage:

    http://rsync.samba.org/resources.html

Another first step toward your goal is the link-by-hash.diff file in the
patches directory.  This patch modifies rsync to link multiple files
together in the specified directory based on their hash, but only the
newest file's attributes are saved (thus, if you need to restore a file,
you may need to set the permissions, ownership, etc. manually).  The
patch needs some work (see the mailing list for the prior discussion on
the subject), so it is not yet ready for use in a production
environment.
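
To make the link-by-hash idea concrete, here is a rough Python sketch of
the general technique.  It is not the code in link-by-hash.diff: the use
of SHA-256, the flat pool layout, and the names file_digest() and
link_tree_to_pool() are all assumptions made for this example.  It also
assumes the pool directory lives outside the backup tree and on the same
filesystem, since hard links cannot cross filesystems.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's contents (assumed hash)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_tree_to_pool(tree, pool):
    """Hard-link every regular file under `tree` to pool/<digest>.

    Returns the number of files that were replaced by a link to an
    already-pooled copy.  `pool` must not be located inside `tree`.
    """
    os.makedirs(pool, exist_ok=True)
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            pooled = os.path.join(pool, file_digest(path))
            if not os.path.exists(pooled):
                # First copy of this content: add it to the pool.
                os.link(path, pooled)
            elif not os.path.samefile(path, pooled):
                # Duplicate content: link to the pooled copy under a
                # temporary name, then rename it over the original.
                tmp = path + ".linktmp"
                os.link(pooled, tmp)
                os.replace(tmp, path)
                replaced += 1
    return replaced
```

Because all links to a pooled file share one inode, they also share one
set of permissions, ownership, and timestamps, which is the same kind of
attribute limitation described above.  The sketch also trusts the hash
alone; a more careful version might compare file contents before
replacing a file.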

..wayne..

