rsync --link-dest won't link even if existing file is out of date
Clint Olsen
clint.olsen at gmail.com
Mon Apr 6 10:25:09 MDT 2015
Not to mention the fact that ZFS requires considerable hardware resources
(CPU & memory) to perform well. It also requires you to learn a whole new
terminology to wrap your head around it.
It's certainly not a trivial swap to say the least...
Thanks,
-Clint
On Mon, Apr 6, 2015 at 9:12 AM, Ken Chase <rsync-list-m829 at sizone.org>
wrote:
> This has been a consideration. But it pains me that a tiny change/addition
> to the rsync option set would save much time and space for other legit use
> cases.
>
> We know rsync very well; we don't know ZFS very well (licensing kept the
> tech out of our Linux-centric operations). We've been using it but we're
> not experts yet.
>
> Thanks for the suggestion.
>
> /kc
>
> On Mon, Apr 06, 2015 at 12:07:05PM -0400, Kevin Korb said:
> >
> >Since you are in an environment with millions of files I highly
> >recommend that you move to ZFS storage and use ZFS's subvolume
> >snapshots instead of --link-dest. It is much more space efficient,
> >rsync run time efficient, and the old backups can be deleted in
> >seconds. Rsync doesn't have to understand anything about ZFS. You
> >just rsync to the same directory every time and have ZFS do a snapshot
> >on that directory between runs.
> >
> >On 04/06/2015 01:51 AM, Ken Chase wrote:
> >> Feature request: allow --link-dest dir to be linked to even if file
> >> exists in target.
> >>
> >> This statement from the man page is adhered to too strongly IMHO:
> >>
> >> "This option works best when copying into an empty destination
> >> hierarchy, as rsync treats existing files as definitive (so it
> >> never looks in the link-dest dirs when a destination file already
> >> exists)".
> >>
> >> I was surprised by this behaviour, as the general aim with rsync is
> >> to be efficient and save space.
> >>
> >> When the file is out of date but exists in the --l-d target, it
> >> would be great if it could be removed and linked. If an option was
> >> supplied to request this behaviour, I'd actually throw some money
> >> at making it happen. (And a further option to retain a copy if
> >> inode permissions/ownership would otherwise be changed.)
> >>
> >> Reasoning:
> >>
> >> I back up many servers with --link-dest that have filesystems of
> >> 10+M files on them. I do not delete old backups (which takes 60 min
> >> or more per tree) just so rsync can recreate them all in an empty
> >> target dir when <1% of files change per day (that takes 3-5 hrs per
> >> backup!).
> >>
> >> Instead, I cycle them in with mv $olddate $today, then rsync --del
> >> --link-dest over them, which takes 30-60 min depending. (Yes, there
> >> is some risk of permissions being altered there; I'm mostly
> >> interested in contents tho). The problem is, if a file exists AT
> >> ALL, even out of date, a new copy is put over top of it per the
> >> above man page decree.
> >>
> >> Thus much more disk space is used. Moving old backups into place and
> >> writing over top of them accumulates many copies of the exact same
> >> file over time. Running pax -rpl over the copies before rsyncing to
> >> them works (and saves much space!), but takes a very long time as it
> >> traverses and compares 2 large backup trees, thrashing the same
> >> device (on the order of 3-5x the rsync's time, 3-5 hrs for pax;
> >> hardlink(1) is far worse, I suspect some non-linear algorithm
> >> therein, as it ran 3-5x slower than pax again).
> >>
> >> I have detailed an example of this scenario at
> >>
> >> http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists
> >>
> >> which also indicates --delete-before and --whole-file do not help
> >> at all.
> >>
> >> /kc
> >>
> >
> >--
> >Kevin Korb, Systems Administrator, FutureQuest, Inc., Orlando, Florida
> >Phone: (407) 252-6853
> >Internet: Kevin at FutureQuest.net (work), kmk at sanitarium.net (personal)
> >Web page: http://www.sanitarium.net/
> >PGP public key available on web site.
> >--
> >Please use reply-all for most replies to avoid omitting the mailing list.
> >To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
> >Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
>
> --
> Ken Chase - ken att heavycomputing.ca Toronto Canada
> Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
>
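For reference, the relinking pass described in the original request (what pax -rpl or hardlink(1) end up doing between two backup trees) can be sketched in plain shell. This is only an illustration under stated assumptions, not an rsync option and not anyone's actual script: the function name relink_identical and the two tree arguments are made up here, only file contents are compared (ownership, permissions and mtimes are ignored), and file names are assumed to contain no newlines.

```sh
#!/bin/sh
# relink_identical PREV CUR   (hypothetical helper, for illustration only)
# For every regular file under CUR, if PREV holds a regular file at the same
# relative path with identical contents, replace the CUR copy with a hard
# link to the PREV file. Contents only: metadata is NOT compared.
relink_identical() {
    prev=$1
    cur=$2
    (cd "$cur" && find . -type f) | while IFS= read -r f; do
        p="$prev/$f"
        c="$cur/$f"
        [ -L "$p" ] && continue          # skip symlinks in the older tree
        [ -f "$p" ] || continue          # no counterpart in the older tree
        [ "$p" -ef "$c" ] && continue    # already the same inode
        if cmp -s "$p" "$c"; then
            ln -f "$p" "$c"              # swap the duplicate for a hard link
        fi
    done
}
```

It would be run as something like relink_identical backups/olddate backups/today (hypothetical paths). Note that it still has to read both trees in full, so it carries the same I/O cost complained about above; the point of the feature request is to have rsync do the linking during the transfer instead.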