[clug] Deduping backups from more than one workstation?

Daniel Pittman daniel at rimspace.net
Sat Nov 12 01:27:05 MST 2011


On Fri, Nov 11, 2011 at 22:08, Michael Still <mikal at stillhq.com> wrote:
>
> I have a backup scheme which involves my workstations rsyncing
> themselves to a server. At the moment there is one directory per
> workstation. However, this is kind of painful because the workstations
> have many duplicate files (source repositories, OS files, etc).
>
> So, how do other people solve this problem? Is there some fancy pants
> filesystem which can handle having duplicates of many files while only
> storing them once?

I used BackupPC, which addresses exactly this: it uses rsync as the
transport for fetching files, and keeps an internal pool that
deduplicates by content in userspace. It worked fairly well, and
provided good tools for scripting and automation.
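
To make the pooling idea concrete, here is a minimal sketch in Python.
It is not BackupPC's actual code or on-disk layout; the names
file_digest and store_in_pool, the SHA-256 choice, and the pool
directory structure are all assumptions made for illustration. Each
incoming file is stored once under its content hash, and every
workstation's copy is a hard link to that pooled file.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_in_pool(src, pool_dir, dest):
    """Copy src into the pool once, then hard-link it at dest.

    Hypothetical helper: pool_dir and dest must live on the same
    filesystem, and dest must not already exist.
    """
    digest = file_digest(src)
    pooled = os.path.join(pool_dir, digest[:2], digest)
    if not os.path.exists(pooled):
        # First time this content is seen: keep one real copy.
        os.makedirs(os.path.dirname(pooled), exist_ok=True)
        shutil.copy2(src, pooled)
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    os.link(pooled, dest)  # later copies cost only a directory entry
    return pooled
```

Two workstations that send the same file end up sharing one inode, so
the data is stored once no matter how many backups reference it.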

Otherwise, you might look at dupmerge
(http://www.ka9q.net/code/dupmerge/, or other versions elsewhere) to
deduplicate after the fact, by hard-linking identical files in the
existing backup directories.
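
The after-the-fact approach can be sketched roughly as follows. This
is not dupmerge's code, just an illustration of the general technique
under my own assumptions: the names sha256_of and dedupe_tree are
made up, and files are grouped by size first so that only plausible
duplicates get hashed.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_tree(root):
    """Hard-link identical files under root; return how many were relinked."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # leave symlinks and special files alone
            by_size[os.path.getsize(path)].append(path)

    relinked = 0
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        first_seen = {}
        for path in paths:
            keeper = first_seen.setdefault(sha256_of(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already linked together
            tmp = path + ".dedupe-tmp"  # hypothetical temp-name suffix
            os.link(keeper, tmp)
            os.replace(tmp, path)  # atomically swap in the hard link
            relinked += 1
    return relinked
```

Note that hard-linked files share one inode, so they also share
ownership, permissions and timestamps; if those differ between
workstations, the duplicate's metadata is lost once it is linked.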

Daniel
-- 
♲ Made with 100 percent post-consumer electrons

