rsync --link-dest won't link even if existing file is out of date
Kevin Korb
kmk at sanitarium.net
Mon Apr 6 10:29:36 MDT 2015
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
It is actually pretty simple...
Instead of mkdir you run: zfs create -o mountpoint=/path/to/directory zfspath
When the rsync run finishes you would do: zfs snapshot zfspath@date
When you want to delete an old backup you do: zfs destroy zfspath@date
To list the subvolumes: zfs list [-t snapshot]
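Put together, one backup cycle could look roughly like the sketch below. This is only an illustration, not something I have tuned for your setup: the dataset name tank/backups/host1, the mount point /backups/host1, the source host1:/ and the function names are all made-up placeholders, and the rsync options are just a plausible minimum. By default it only prints the commands (DRY_RUN=1) so it can be read and tried safely.

```shell
#!/bin/sh
# Sketch of a ZFS-snapshot backup cycle. All names below are hypothetical
# placeholders, not taken from any real configuration.
DATASET="${DATASET:-tank/backups/host1}"   # assumed ZFS dataset name
DEST="${DEST:-/backups/host1}"             # assumed mount point of that dataset
SRC="${SRC:-host1:/}"                      # assumed rsync source

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One-time setup: create the dataset instead of using mkdir.
backup_init() {
    run zfs create -o mountpoint="$DEST" "$DATASET"
}

# Each run: rsync into the same directory, then snapshot it under a date tag.
backup_run() {
    snap="$1"
    run rsync -a --delete "$SRC" "$DEST/" &&
        run zfs snapshot "$DATASET@$snap"
}

# Expire an old backup by destroying its snapshot.
backup_expire() {
    run zfs destroy "$DATASET@$1"
}
```

For example, `backup_run "$(date +%F)"` after sourcing the file shows the rsync and snapshot commands for today; set DRY_RUN=0 only once the names match your pool. Note that rsync never sees --link-dest here: unchanged files simply stay shared between snapshots at the block level.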
On 04/06/2015 12:12 PM, Ken Chase wrote:
> This has been a consideration. But it pains me that a tiny
> change/addition to the rsync option set would save much time and
> space for other legit use cases.
>
> We know rsync very well, we don't know ZFS very well (licensing kept
> the tech out of our Linux-centric operations). We've been using it
> but we're not experts yet.
>
> Thanks for the suggestion.
>
> /kc
>
> On Mon, Apr 06, 2015 at 12:07:05PM -0400, Kevin Korb said:
>
> Since you are in an environment with millions of files I highly
> recommend that you move to ZFS storage and use ZFS's subvolume
> snapshots instead of --link-dest. It is much more space efficient,
> rsync run time efficient, and the old backups can be deleted in
> seconds. Rsync doesn't have to understand anything about ZFS. You
> just rsync to the same directory every time and have ZFS do a
> snapshot on that directory between runs.
>
> On 04/06/2015 01:51 AM, Ken Chase wrote:
>> Feature request: allow --link-dest dir to be linked to even if
>> file exists in target.
>
>> This statement from the man page is adhered to too strongly
>> IMHO:
>
>> "This option works best when copying into an empty destination
>> hierarchy, as rsync treats existing files as definitive (so it
>> never looks in the link-dest dirs when a destination file
>> already exists)".
>
>> I was surprised by this behaviour as generally the scheme is to
>> be efficient/save space with rsync.
>
>> When the file is out of date but exists in the --l-d target, it
>> would be great if it could be removed and linked. If an option
>> was supplied to request this behaviour, I'd actually throw some
>> money at making it happen. (And a further option to retain a
>> copy if inode permissions/ownership would otherwise be changed.)
>
>> Reasoning:
>
>> I back up many servers with --link-dest that have filesystems of
>> 10+M files on them. I do not delete old backups - which take
>> 60min per tree or more just so rsync can recreate them all in an
>> empty target dir when <1% of files change per day (takes 3-5 hrs
>> per backup!).
>
>> Instead, I cycle them in with mv $olddate $today then rsync
>> --del --link-dest over them - takes 30-60 min depending. (Yes,
>> some malleability of permissions risk there, mostly interested
>> in contents tho). Problem is, if a file exists AT ALL, even out
>> of date, a new copy is put overtop of it per the above man page
>> decree.
>
>> Thus much more disk space is used. Running this scheme, where old
>> backups are moved into place and written over, accumulates many
>> copies of the exact same file over time. Running pax -rpl over
>> the copies before rsyncing to them works (and saves much space!),
>> but takes a very long time as it traverses and compares 2 large
>> backup trees thrashing the same device (in the order of 3-5x the
>> rsync's time, 3-5 hrs for pax - hardlink(1) is far worse, I
>> suspect some non-linear algorithm therein - it ran 3-5x slower
>> than pax again).
>
>> I have detailed an example of this scenario at
>
>> http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists
>
>> which also indicates --delete-before and --whole-file do not
>> help at all.
>
>> /kc
>
>
>> -- Please use reply-all for most replies to avoid omitting the
>> mailing list. To unsubscribe or change options:
>> https://lists.samba.org/mailman/listinfo/rsync Before posting,
>> read: http://www.catb.org/~esr/faqs/smart-questions.html
>
- --
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
Kevin Korb Phone: (407) 252-6853
Systems Administrator Internet:
FutureQuest, Inc. Kevin at FutureQuest.net (work)
Orlando, Florida kmk at sanitarium.net (personal)
Web page: http://www.sanitarium.net/
PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEARECAAYFAlUitHAACgkQVKC1jlbQAQeLYQCghRS26weHdBuYDAGBtM0mSB22
OvMAnjmLti7BqNiD9bCfjdewQQ/x2jts
=kFFB
-----END PGP SIGNATURE-----