rsync - using a --files-from list to cut out scanning. How to handle deletions? (fwd)

Robert Bell Robert.Bell at csiro.au
Thu Jan 17 19:55:37 MST 2013


Kevin,

Thanks for your response.

Some observations are interleaved below.

Rob.


Regards
Rob. Bell              e-mail: Robert.Bell at csiro.au
--
Dr Robert C. Bell, BSc (Hons) PhD
Technical Services Manager
Advanced Scientific Computing
CSIRO IM&T

Phone: +61 3 9669 8102 | Mobile: +61 428 108 333 | CSIRO 93 3810
Robert.Bell at csiro.au | http://www.csiro.au/ | http://www.hpsc.csiro.au/
Addresses:
Street: CSIRO ASC Level 11, 700 Collins Street, Docklands Vic 3008, Australia
Postal: CSIRO ASC Level 11, GPO Box 1289, Melbourne Vic 3001, Australia


---------- Forwarded message ----------
Date: Tue, 15 Jan 2013 09:25:05 -0500
From: Kevin Korb <kmk at sanitarium.net>
To: rsync at lists.samba.org
Subject: Re: rsync - using a --files-from list to cut out scanning. How to
     handle deletions?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> If you are going to do it this way please be aware of:
> https://bugzilla.samba.org/show_bug.cgi?id=8712 and
> https://bugzilla.samba.org/show_bug.cgi?id=5644

> 
> If a file exists in the target directory when using --link-dest, rsync
> modifies the link rather than replacing it, which means you don't have
> history for files that have been replaced rather than added or deleted.
Thanks for your astute observation about updating hard-linked
files: you had me worried for a while.

Fortunately, we are using the --whole-file option in our production
backups, since the target of our backups is an HSM system (SGI's DMF),
and we don't want rsync to start comparing files (and thus trigger a
recall).  With this option, if a file differs between the source and a
target that holds a hard-linked copy of it, the rsync update replaces
the file in the target rather than overwriting it in place along with
all its hard-linked cousins.
Whew!
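
For anyone wanting to see why the distinction matters, here is a
minimal Python sketch of the filesystem behaviour involved. It is not
rsync's code, and the file names and helper functions are made up for
illustration. It only shows that writing into an existing hard-linked
file changes every link, while writing a new file and renaming it over
the old name leaves the other links alone.

```python
import os
import tempfile


def overwrite_in_place(path: str, data: bytes) -> None:
    """Rewrite the existing inode; every hard link to it sees the new data."""
    with open(path, "wb") as f:
        f.write(data)


def replace_by_rename(path: str, data: bytes) -> None:
    """Write a new file beside `path`, then rename it over `path`.

    The old inode is untouched, so other hard links keep the old data.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp, path)


def demo() -> dict:
    """Return what the older link contains after each kind of update."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        for name, update in (("in_place", overwrite_in_place),
                             ("rename", replace_by_rename)):
            old = os.path.join(d, name + ".old")  # stands in for the previous backup
            new = os.path.join(d, name + ".new")  # stands in for the current backup
            with open(old, "wb") as f:
                f.write(b"v1")
            os.link(old, new)  # both names now share one inode
            update(new, b"v2")
            with open(old, "rb") as f:
                results[name] = f.read()
    return results


if __name__ == "__main__":
    print(demo())
```

On a filesystem that supports hard links, demo() reports b"v2" for the
in-place case (the history is lost) and b"v1" for the rename case (the
history survives).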

> 
> If you are dealing with backing up many millions of files then I
> suggest looking into a more advanced filesystem that can handle this
> functionality internally rather than using --link-dest.  Currently
> that is limited to ZFS or BTRFS (if you are brave).
> 
> Both of these filesystems have subvolumes and subvolume snapshot
> capabilities.  This means you can do something similar to an lvm2
> snapshot at the directory level instead of the whole filesystem.  You
> can rsync with the same target directory each run and do a snapshot of
> that target between runs.  The recycling concept is not needed because
> deleting an old snapshot is much faster than doing an rm -rf on a huge
> tree of hard links.  This is especially true on ZFS which usually does
> the job in <1 second regardless of size.  Unfortunately BTRFS usually
> completes the command quickly but the space is then slowly reclaimed
> by a kernel thread in the background.
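
For readers unfamiliar with the snapshot approach Kevin describes, the
sequence can be sketched as below. This is only an illustration under
assumptions: the paths, the date-based snapshot naming, and the helper
functions are hypothetical, and the code builds the command lines
without running them. The rsync and btrfs options shown are the
standard archive/delete and read-only snapshot options; a ZFS setup
would use its own snapshot commands instead.

```python
from datetime import date


def snapshot_backup_commands(source: str, target_subvol: str,
                             snapshot_dir: str, day: date) -> list:
    """Build the two commands for one backup run (nothing is executed).

    1. rsync into the same target subvolume every run.
    2. Take a read-only snapshot of that subvolume, named by date.
    """
    snapshot = f"{snapshot_dir.rstrip('/')}/{day.isoformat()}"
    return [
        ["rsync", "-a", "--delete",
         source.rstrip("/") + "/", target_subvol.rstrip("/") + "/"],
        ["btrfs", "subvolume", "snapshot", "-r", target_subvol, snapshot],
    ]


def expire_command(snapshot_dir: str, day: date) -> list:
    """Build the command that drops one old snapshot (replaces rm -rf)."""
    snapshot = f"{snapshot_dir.rstrip('/')}/{day.isoformat()}"
    return ["btrfs", "subvolume", "delete", snapshot]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for cmd in snapshot_backup_commands("/home", "/backup/current",
                                        "/backup/snapshots",
                                        date(2013, 1, 15)):
        print(" ".join(cmd))
```

The resulting lists could be passed to subprocess.run() once the paths
have been checked against the real layout.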
We are restricted in our use of filesystems to what is chosen for
particular hosts, so smarter backups using advanced filesystems are a
long way off.

> 
> Here is something I wrote up about it a while back:
> http://sanitarium.net/golug/rsync+btrfs_backups_2011.html
Thanks - good stuff.  It parallels some of the work we have done - I
should have looked up your papers earlier.

Our recycling of old backup directories gets around the performance
issue of having to delete old backups.  Deletes can certainly take a
long time, so we do them only for old systems, progressively over a
year or so, until we finally remove the last backups.

We have added Tower of Hanoi management of the backups - great for
automatically deciding which backups to keep in a rational way, and not
having to mess with dates.
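
As an illustration of the idea, here is a minimal Python sketch of the
common textbook Tower of Hanoi rotation. It is not our production
script; the function names and the notion of numbered "slots" (one
backup directory per slot, recycled when its turn comes round) are
assumptions made for the example. The slot for a run is the number of
trailing zero bits in the run number, capped at the last slot.

```python
def hanoi_slot(run: int, slots: int) -> int:
    """Return the slot to recycle for backup run number `run` (from 1).

    Slot 0 is reused every 2nd run, slot 1 every 4th, slot 2 every 8th,
    and so on; the last slot is reused every 2**(slots - 1) runs.
    """
    if run < 1 or slots < 1:
        raise ValueError("run and slots must be positive")
    trailing_zeros = (run & -run).bit_length() - 1
    return min(trailing_zeros, slots - 1)


def retained_runs(latest_run: int, slots: int) -> dict:
    """Map each slot to the most recent run stored in it."""
    kept = {}
    for run in range(1, latest_run + 1):
        kept[hanoi_slot(run, slots)] = run
    return kept


if __name__ == "__main__":
    print([hanoi_slot(n, 4) for n in range(1, 9)])
    print(retained_runs(8, 4))
```

With 4 slots, the first eight runs go to slots 0, 1, 0, 2, 0, 1, 0, 3,
and after run 8 the retained runs are 7, 6, 4 and 8. The schedule
depends only on the run counter, which is why no date arithmetic is
needed.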

Rob.


> 
> It is a little out of date now and since I wrote it for a LUG it only
> covers BTRFS.  A FreeBSD 9 system with at least 8GB of RAM running ZFS
> will outperform pretty much any Linux system running BTRFS (currently)
> which will outperform any Linux system running ext4 and --link-dest.
> 
> - --
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
> 	Kevin Korb			Phone:    (407) 252-6853
> 	Systems Administrator		Internet:
> 	FutureQuest, Inc.		Kevin at FutureQuest.net  (work)
> 	Orlando, Florida		kmk at sanitarium.net (personal)
> 	Web page:			http://www.sanitarium.net/
> 	PGP public key available on web site.
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)

iEYEARECAAYFAlD1ZsEACgkQVKC1jlbQAQcqBwCg7AEnzQQj9vFV9WWnpIYfQS2W
EvoAoIFjtx8/CBpejNZ6jH7QYtvL+b8U
=+YcS
-----END PGP SIGNATURE-----


