Backup scripts - recycling old backup directories

Kevin Korb kmk at
Fri Sep 12 12:04:24 MDT 2014

I did consider that but rejected it for 2 reasons...

1. Backup run time.  We have a 4 hour window to run backups at night.
 Using recycled directories significantly extended the backup run
time.  The deletion time is eliminated but frankly, we have the other
20 hours of the day to do deletions.  We had to give up using
--link-dest when the deletions started to actually take that long even
though the backups still ran in under 4 hours.

2. Metadata history.  If there is an existing file in the target dir
that differs only by metadata (permissions, ownership, timestamp) then
rsync will simply change that metadata.  Because the file is hard
linked, that change affects all instances of that file in the other
backups.  Of course this is better for storage space as the
alternative is storing another copy of the file with the different
metadata but we decided it was better to have that information saved.
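
To illustrate point 2, here is a minimal Python sketch, assuming a
POSIX filesystem that supports hard links.  The directory and file
names are made up for the example and it does not run rsync at all.
It only shows that two hard-linked names share one inode, so a
permission change made through one name is seen through the other.

```python
import os
import stat
import tempfile


def shared_metadata_demo():
    """Show that a chmod through one hard link is visible through the other."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical backup directories, named only for this example.
        old = os.path.join(root, "backup.0", "file.txt")
        new = os.path.join(root, "backup.1", "file.txt")
        os.makedirs(os.path.dirname(old))
        os.makedirs(os.path.dirname(new))

        with open(old, "w") as fh:
            fh.write("unchanged content\n")
        os.chmod(old, 0o644)

        # An unchanged file ends up as a second name for the same inode.
        os.link(old, new)

        # A metadata-only change applied in place through the newer name.
        os.chmod(new, 0o600)

        old_mode = stat.S_IMODE(os.stat(old).st_mode)
        new_mode = stat.S_IMODE(os.stat(new).st_mode)
        same_inode = os.stat(old).st_ino == os.stat(new).st_ino
        return old_mode, new_mode, same_inode
```

After the call, both names report the new mode, so the 0644 that the
older backup originally recorded is gone.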

Switching to ZFS with subvolume snapshots solved all of these issues.
 The backups still run fast.  The deletions are almost instant.  Disk
usage is less because ZFS only stores the differences at the block
level instead of the file level.  The metadata history is there
without using the additional storage of a second copy of the file.
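
As a rough sketch of what a snapshot-based rotation could look like,
the Python below only builds the command lines and does not execute
anything.  The dataset name, the date-based snapshot names and the
retention period are assumptions for illustration, not a description
of our actual scripts.  It assumes the nightly rsync has already
written into the live dataset.

```python
import datetime


def snapshot_commands(dataset, today, keep_days):
    """Return zfs commands to snapshot today's backup and drop an expired one.

    dataset   -- hypothetical dataset name, e.g. "tank/backups/host1"
    today     -- datetime.date of the backup that just finished
    keep_days -- assumed retention period in days
    """
    new_snap = f"{dataset}@{today.isoformat()}"
    expired = today - datetime.timedelta(days=keep_days)
    old_snap = f"{dataset}@{expired.isoformat()}"
    return [
        ["zfs", "snapshot", new_snap],  # freeze the state after tonight's run
        ["zfs", "destroy", old_snap],   # drop the snapshot that aged out
    ]
```

Each returned list could be handed to subprocess.run() on the backup
server.  Destroying a snapshot replaces the recursive delete of a
whole directory tree.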

We use dedicated backup servers so changing the filesystem and even OS
on them was no big deal.  We have been using TrueOS (FreeBSD with
PC-BSD's installer) and it has worked quite well.  The only issue was
that ZFS is very RAM heavy and we had to upgrade some hardware.  I
would say that 8GB of RAM is the minimum for this kind of work and
16GB is the minimum if you turn on the dedup feature.

On 09/12/2014 12:31 AM, Robert Bell wrote:
> Folks,
> Kevin Korb wrote:
>> Have you considered more advanced methods such as subvolume
>> snapshots provided by ZFS and BTRFS?  At work we were forced to
>> abandon rsync --link-dest because of the amount of time it
>> takes to delete old backups when the data is primarily many
>> millions of small files (shared web hosting company).
> We don't have more advanced methods like subvolume snapshots
> available to us.
> However, we can recycle backup directories.
> When we started using rsync with --link-dest back in about 2007,
> we deleted old backups, but realised soon after that we could
> recycle old backups.
> With daily backups, we find about 1.5% of the data and 0.5% of the
> files change from one day to the next, so a directory from about 5
> days ago will typically be only 5-10% wrong and can be recycled to
> be the target of the latest directory - that's a lot better than
> recreating the whole directory tree for a new backup, and then
> deleting a whole old directory tree.
> We use --delete of course.
> Hope this helps someone.
> Rob.
> Dr Robert C. Bell HPC National Partnerships | Scientific Computing 
> Information Management and Technology CSIRO T +61 3 9669 8102 Alt
> +61 3 8601 3810 Mob +61 428 108 333 
> Robert.Bell at
> Street: CSIRO ASC Level 11, 700 Collins
> Street, Docklands Vic 3008, Australia Postal: CSIRO ASC Level 11,
> GPO Box 1289, Melbourne Vic 3001, Australia
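
The recycling approach quoted above could be planned roughly as in
the Python sketch below.  It assumes the oldest directory is the one
recycled, that the most recent backup is passed to --link-dest, and
that backups live under a single root.  The function name and all
paths are hypothetical, and nothing is executed: the function only
returns the rename to perform and the rsync command line to run
afterwards.

```python
import os


def plan_recycled_backup(root, existing, new_name, source):
    """Plan a backup that reuses the oldest directory instead of deleting it.

    root     -- directory holding all backups (hypothetical layout)
    existing -- backup directory names, oldest first
    new_name -- name for tonight's backup directory
    source   -- tree to back up
    Returns (rename_from, rename_to, rsync_argv).
    """
    if len(existing) < 2:
        raise ValueError("need one directory to recycle and one to link against")
    oldest, newest = existing[0], existing[-1]
    target = os.path.join(root, new_name)
    rsync_argv = [
        "rsync", "-a", "--delete",
        # Unchanged files missing from the target are hard linked from here.
        "--link-dest=" + os.path.join(root, newest),
        source.rstrip("/") + "/",
        target + "/",
    ]
    return os.path.join(root, oldest), target, rsync_argv
```

The caller would rename the first path to the second and then run the
returned command, so that --delete removes whatever in the recycled
tree no longer exists in the source.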

-- 
	Kevin Korb			Phone:    (407) 252-6853
	Systems Administrator		Internet:
	FutureQuest, Inc.		Kevin at  (work)
	Orlando, Florida		kmk at (personal)
	Web page:
	PGP public key available on web site.