[clug] Deduping backups from more than one workstation?

Robert Brockway robert at timetraveller.org
Sun Nov 13 21:08:56 MST 2011


On Sat, 12 Nov 2011, Michael Still wrote:

> I have a backup scheme which involves my workstations rsyncing
> themselves to a server. At the moment there is one directory per
> workstation. However, this is kind of painful because the workstations
> have many duplicate files (source repositories, OS files, etc).
>
> So, how do other people solve this problem? Is there some fancy pants
> filesystem which can handle having duplicates of many files while only
> storing them once?

I've recently written a custom backup script[1] that backs up to external 
disks.  During the rsync copy it dedupes against the most recent backup of 
the same system.  It then scans all of the backups and dedupes any file 
over 100MB (configurable) using the CLI tool 'findup', which is part of 
fslint.
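
For illustration only, the two passes might look roughly like the Python 
sketch below.  It is not the script described above and it does not call 
findup.  It assumes the per-copy dedupe is done with rsync's --link-dest 
option and that the second pass replaces duplicates with hard links; both 
are assumptions.  The function names (rsync_command, file_digest, 
hardlink_duplicates) and the example paths are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def rsync_command(source, dest, previous=None):
    """Build an rsync command line (assumed layout: one directory per run)."""
    cmd = ["rsync", "-a"]
    if previous is not None:
        # --link-dest makes rsync hard-link files that are unchanged in
        # `previous` instead of storing a second copy under `dest`.
        # A relative DIR is resolved against the destination, so pass
        # an absolute path to avoid surprises.
        cmd.append("--link-dest=" + os.path.abspath(previous))
    cmd += [source.rstrip("/") + "/", dest]
    return cmd


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(roots, min_size=100 * 1024 * 1024):
    """Replace duplicate files of at least `min_size` bytes with hard links.

    Returns the number of files replaced.  All roots must live on one
    filesystem, because hard links cannot cross filesystems.
    """
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.path.getsize(path)
                if size >= min_size:
                    by_size[size].append(path)

    replaced = 0
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        first_seen = {}
        for path in sorted(paths):
            keeper = first_seen.setdefault(file_digest(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy seen, or already linked to it
            # Link under a temporary name, then rename over the duplicate,
            # so the path never goes missing if something fails midway.
            tmp = path + ".dedupe-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, path)
            replaced += 1
    return replaced
```

The command list from rsync_command("/home/me", "/backup/ws1/new", 
"/backup/ws1/old") can be handed to subprocess.run(), and 
hardlink_duplicates(["/backup/ws1", "/backup/ws2"]) would then cover the 
cross-workstation case.  Note that hard-linked files share one inode, so 
owner, permissions and mtime are shared as well; that trade-off is 
inherent to this approach.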

[1] This is my second custom system.  The previous one was used at various 
sites for about a decade.  In my experience, off-the-shelf systems were 
quite often not the best choice.

Cheers,

Rob

-- 
Email: robert at timetraveller.org		Linux counter ID #16440
IRC: Solver (OFTC & Freenode)
Web: http://www.practicalsysadmin.com
Director, Software in the Public Interest (http://spi-inc.org/)
Free & Open Source: The revolution that quietly changed the world
"One ought not to believe anything, save that which can be proven by nature and the force of reason" -- Frederick II (26 December 1194 – 13 December 1250)

