rsync --link-dest won't link even if existing file is out of date

Clint Olsen clint.olsen at gmail.com
Mon Apr 6 10:25:09 MDT 2015


Not to mention the fact that ZFS requires considerable hardware resources
(CPU & memory) to perform well. It also requires you to learn a whole new
terminology to wrap your head around it.

It's certainly not a trivial swap to say the least...

Thanks,

-Clint

On Mon, Apr 6, 2015 at 9:12 AM, Ken Chase <rsync-list-m829 at sizone.org>
wrote:

> This has been a consideration. But it pains me that a tiny change/addition
> to the rsync option set would save much time and space for other legit use
> cases.
>
> We know rsync very well; we don't know ZFS very well (licensing kept the
> tech out of our Linux-centric operations). We've been using it but we're
> not experts yet.
>
> Thanks for the suggestion.
>
> /kc
>
> On Mon, Apr 06, 2015 at 12:07:05PM -0400, Kevin Korb said:
>   >-----BEGIN PGP SIGNED MESSAGE-----
>   >Hash: SHA1
>   >
>   >Since you are in an environment with millions of files I highly
>   >recommend that you move to ZFS storage and use ZFS's subvolume
>   >snapshots instead of --link-dest.  It is much more space efficient,
>   >rsync run time efficient, and the old backups can be deleted in
>   >seconds.  Rsync doesn't have to understand anything about ZFS.  You
>   >just rsync to the same directory every time and have ZFS do a snapshot
>   >on that directory between runs.
>   >
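The snapshot cycle suggested above might look roughly like this. It is a
minimal sketch, not a tested backup script: the dataset name
tank/backups/host1, its mountpoint, the source host1:/ and the DRY_RUN
switch are all assumptions made up for illustration. Only the
"rsync -a --delete", "zfs snapshot" and "zfs destroy" invocations
themselves are standard.

```shell
#!/bin/sh
# Sketch: rsync into the same directory every run, then snapshot it.
# DATASET, DEST and SRC are hypothetical; adjust to the real pool layout.
DATASET=${DATASET:-tank/backups/host1}
DEST=${DEST:-/tank/backups/host1}
SRC=${SRC:-host1:/}

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sync into the fixed destination, then freeze the result as a snapshot.
backup_and_snapshot() {
    stamp=$1    # snapshot name, e.g. 2015-04-06
    run rsync -a --delete "$SRC" "$DEST/" &&
    run zfs snapshot "$DATASET@$stamp"
}

# Drop an old backup by destroying its snapshot.
expire_snapshot() {
    run zfs destroy "$DATASET@$1"
}
```

Running "DRY_RUN=1; backup_and_snapshot 2015-04-06" only prints the two
commands, which is a cheap way to check the names before touching a pool.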
>   >On 04/06/2015 01:51 AM, Ken Chase wrote:
>   >> Feature request: allow --link-dest dir to be linked to even if file
>   >> exists in target.
>   >>
>   >> This statement from the man page is adhered to too strongly IMHO:
>   >>
>   >> "This option works best when copying into an empty destination
>   >> hierarchy, as rsync treats existing files as definitive (so it
>   >> never looks in the link-dest dirs when a destination file already
>   >> exists)".
>   >>
>   >> I was surprised by this behaviour as generally the scheme is to be
>   >> efficient/save space with rsync.
>   >>
>   >> When the destination file is out of date but a current copy exists
>   >> in the --link-dest dir, it would be great if the stale file could be
>   >> removed and replaced with a link. If an option was
>   >> supplied to request this behaviour, I'd actually throw some money
>   >> at making it happen.  (And a further option to retain a copy if
>   >> inode permissions/ownership would otherwise be changed.)
>   >>
>   >> Reasoning:
>   >>
>   >> I back up many servers with --link-dest that have filesystems of
>   >> 10+M files on them.  I do not delete old backups (which takes 60 min
>   >> per tree or more) just so rsync can recreate them all in an empty
>   >> target dir when <1% of files change per day (that takes 3-5 hrs per
>   >> backup!).
>   >>
>   >> Instead, I cycle them in with mv $olddate $today then rsync --del
>   >> --link-dest over them - takes 30-60 min depending. (Yes, some
>   >> malleability of permissions risk there, mostly interested in
>   >> contents tho).  Problem is, if a file exists AT ALL, even out of
>   >> date, a new copy is put overtop of it per the above man page
>   >> decree.
>   >>
>   >> Thus much more disk space is used. Cycling old backups in to be
>   >> overwritten this way accumulates many copies of the exact same file
>   >> over time.  Running pax -rpl over the copies before
>   >> rsyncing to them works (and saves much space!), but takes a very
>   >> long time as it traverses and compares 2 large backup trees
>   >> thrashing the same device (on the order of 3-5x the rsync's time,
>   >> 3-5 hrs for pax; hardlink(1) is far worse, I suspect some
>   >> non-linear algorithm therein, as it ran 3-5x slower than pax again).
>   >>
>   >> I have detailed an example of this scenario at
>   >>
>   >> http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists
>   >>
>   >>  which also indicates --delete-before and --whole-file do not help
>   >> at all.
>   >>
>   >> /kc
>   >>
>   >
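The behaviour Ken reports can be tried in a scratch directory. The sketch
below is based only on the man page passage quoted above and on his
description; it has not been checked against every rsync version. The
function name demo_link_dest and the src/prev/today directory names are
made up for the demo.

```shell
#!/bin/sh
# Reproduce the report: a stale file already sitting in the destination
# is updated in place, and the identical copy in --link-dest is not used.
demo_link_dest() {
    work=$(mktemp -d) || return 1
    mkdir "$work/src" "$work/prev" "$work/today"

    echo "current content" > "$work/src/file"
    cp -p "$work/src/file" "$work/prev/file"   # previous backup, same as source
    echo "stale" > "$work/today/file"          # recycled tree, out-of-date copy

    # With an empty today/, the unchanged file would be hard-linked from
    # prev/. Here today/file already exists, which is the case in question.
    rsync -a --del --link-dest="$work/prev" "$work/src/" "$work/today/"

    # Report whether today/file shares an inode with prev/file.
    if [ "$work/today/file" -ef "$work/prev/file" ]; then
        echo "hard-linked to previous backup"
    else
        echo "separate copy"
    fi
    cat "$work/today/file"
    rm -rf "$work"
}
```

If the man page wording holds, the first line printed should be "separate
copy" even though prev/file and src/file are identical.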
>   >- --
>   >~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
>   >     Kevin Korb                      Phone:    (407) 252-6853
>   >     Systems Administrator           Internet:
>   >     FutureQuest, Inc.               Kevin at FutureQuest.net  (work)
>   >     Orlando, Florida                kmk at sanitarium.net (personal)
>   >     Web page:                       http://www.sanitarium.net/
>   >     PGP public key available on web site.
>   >~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
>   >-----BEGIN PGP SIGNATURE-----
>   >Version: GnuPG v2
>   >
>   >iEYEARECAAYFAlUirykACgkQVKC1jlbQAQc83ACfa7lawkyPFyO9kDE/D8aztql0
>   >AkAAoIQ970yTCHB1ypScQ8ILIQR6zphl
>   >=ktEg
>   >-----END PGP SIGNATURE-----
>   >--
>   >Please use reply-all for most replies to avoid omitting the mailing list.
>   >To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
>   >Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
>
> --
> Ken Chase - ken att heavycomputing.ca Toronto Canada
> Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
>
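For reference, the per-file operation the requested option would perform
(replace a destination copy with a hard link when an identical file exists
in the previous backup) can be sketched as a separate pass over two trees.
This is an illustration in the spirit of the relinking pass mentioned in
the thread, not a replacement for it: relink_identical is a hypothetical
helper, it compares contents only (permissions, ownership and mtime are
ignored), both trees must be on the same filesystem, and it still has to
read and compare every file, so it does not avoid the traversal cost
described above.

```shell
#!/bin/sh
# relink_identical PREV CUR   (hypothetical helper, not part of rsync)
# For every regular file under CUR, if PREV holds a regular file at the
# same relative path with identical content but a different inode,
# replace the CUR copy with a hard link to the PREV copy.
relink_identical() {
    prev=$(cd "$1" && pwd) || return 1   # absolute path, since we cd below
    cur=$2
    (
        cd "$cur" || exit 1
        find . -type f -exec sh -c '
            prev=$1; shift
            for f in "$@"; do
                p="$prev/$f"
                [ -f "$p" ] || continue         # no counterpart in PREV
                [ -h "$p" ] && continue         # never link to a symlink
                [ "$f" -ef "$p" ] && continue   # already the same inode
                cmp -s "$f" "$p" || continue    # contents differ: keep the copy
                ln -f "$p" "$f"                 # swap the copy for a hard link
            done
        ' sh "$prev" {} +
    )
}
```

Usage would be along the lines of "relink_identical /backups/2015-04-05
/backups/2015-04-06", where both paths are placeholders.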