A --exclude-checksum option?
Karl O. Pinc
kop at meme.com
Tue Feb 12 14:07:57 MST 2013
On 02/12/2013 02:48:35 PM, Kevin Korb wrote:
> My first thought is why are you backing up /tmp at all?
Because I put stuff in /tmp I might want, and whatever
I put there goes away by itself. It stays very handy for a while,
then it's on backup and less handy, then it's gone....
> My second thought is why are you using atime for anything? It can be
> touched by almost anything and running a filesystem with atime
> enabled is a huge performance detriment as it adds an inode write
> to every file read operation.
Not my box, not my choice. (I tend to like relatime....)
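
A minimal sketch of the atime-preserving idea from the original message
quoted below, written in Python rather than inside rsync: record the
timestamps, read the file to checksum it, then put the timestamps back.
The function name checksum_preserving_atime is made up for illustration,
and this is not a description of what rsync does. Restoring with
os.utime() updates ctime, and an access made by another process while
the file is being read would be overwritten.

```python
import hashlib
import os


def checksum_preserving_atime(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of path, then restore its atime/mtime.

    Hypothetical helper for illustration only.
    """
    # Remember the timestamps before reading touches atime.
    before = os.stat(path)

    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)

    # Put the recorded timestamps back. This changes ctime, and any
    # access by another process in the meantime is lost.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return digest.hexdigest()
```

With something like this a tmpwatch-style cleaner keyed on atime would
still see the file as untouched after the checksum pass.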
> My final thought is maybe you want a file verification tool (I like
> cfv) instead of rsync --checksum. Rsync's --checksum is kinda
> mindless in terms of performance. It checksums everything. This is
> rather pointless as a file that is a different size will obviously
> have a different checksum. Rsync even checksums files that exist only
> on one side of the transfer.
Wouldn't I have to do something with cfv as well so that checksumming
only happens on files of different sizes? Sounds like a complication
when the backup is on a box reachable only via ssh.
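
To make Kevin's point concrete, here is a rough sketch of a "size
first, checksum second" comparison: if the sizes differ the files
differ and nothing is hashed; only same-size files are read in full.
The names file_digest and files_differ are invented for illustration,
and the sketch assumes both copies are readable as local paths, which
is exactly what is not true when the backup is only reachable via ssh.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_differ(local_path, backup_path):
    """Compare two files, hashing only when the sizes match.

    Hypothetical helper; assumes both paths are locally readable.
    """
    # A size mismatch already proves the contents differ.
    if os.path.getsize(local_path) != os.path.getsize(backup_path):
        return True
    # Same size: only now pay for reading both files.
    return file_digest(local_path) != file_digest(backup_path)
```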
I'm lazy, it was easy to incorporate verification into the
rsync backup process. And checksumming everything means
that everything is verified -- may as well do it in rsync.
It's a backup. If it's corrupted then --checksum will fix it
and it won't be corrupted. Regardless of whether the backup
side fs is broken. (Presumably the backup side hardware/
fs will be fixed quickly.) And I don't care that --checksum means
that the rsync takes longer once a week.
Sounds like you're leaning toward "it's a niche feature
and let's not clutter up rsync (further)".
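
For comparison, the "two instances of rsync" alternative from the
original message could be sketched roughly as below: one pass with
--checksum that excludes the directory, and a second plain
size/timestamp pass for just that directory. The function name
two_pass_rsync_commands and its parameters are made up for
illustration; only -a, --checksum and --exclude are real rsync
options here. The sketch only builds the argument lists and leaves
out the hardlink handling (e.g. --link-dest) a real backup would need.

```python
def two_pass_rsync_commands(source_root, dest_root, skip_checksum_dir):
    """Return two rsync argument lists: a checksummed pass, then a plain one.

    Hypothetical helper; it builds the commands but does not run them.
    """
    rel = skip_checksum_dir.strip("/")
    src = source_root.rstrip("/") + "/"
    dst = dest_root.rstrip("/") + "/"

    # Pass 1: checksum everything except the excluded directory.
    # The leading slash anchors the pattern at the transfer root.
    checksummed = ["rsync", "-a", "--checksum", f"--exclude=/{rel}/", src, dst]

    # Pass 2: ordinary size/timestamp comparison for that directory only,
    # so its files are not read (and their atimes not touched) needlessly.
    plain = ["rsync", "-a", f"{src}{rel}/", f"{dst}{rel}/"]

    return checksummed, plain
```

Each list could be handed to subprocess.run() on the weekly checksum
run; on other days a single plain pass would do.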
> On 02/12/13 15:42, Karl O. Pinc wrote:
> > Hi,
> > I use rsync with hardlinks for backup, once a week doing checksums
> > to ensure there's no filesystem corruption in the backed-up data.
> > I also use tmpwatch, or something similar, to clean up /tmp; it
> > removes files that have not been accessed recently (atime older
> > than some configured limit). I back up /tmp because I throw stuff in
> > tmp that I might possibly need again but don't want to bother
> > having to remember to delete -- and if I'm expecting to have useful
> > data somewhere I want it backed up.
> > However, rsync's checksumming (naturally) updates the atimes of the
> > files in /tmp, and so tmpwatch never deletes them.
> > It occurs to me that a handy solution might be to have an rsync
> > option, similar to the --exclude option, which would allow
> > checksumming to happen throughout most of the backup process but
> > would do "regular" size/timestamp based backups on certain
> > directories.
> > What do people think of such an option? Is there a better design?
> > (E.g. an option that, er, preserves atime when checksumming?) Is
> > rsync just too overloaded with options already and it would be
> > better instead to run two instances of rsync? Is there a
> > bug/feature in process already that would address the use-case
> > above?
> > I'd like to have a sensible design before even thinking about
> > patching.
> > Thanks for the feedback.
> > Regards,
> > Karl <kop at meme.com> Free Software: "You don't pay back, you pay
> > forward." -- Robert A. Heinlein
> Kevin Korb Phone: (407) 252-6853
> Systems Administrator Internet:
> FutureQuest, Inc. Kevin at FutureQuest.net (work)
> Orlando, Florida kmk at sanitarium.net (personal)
> Web page: http://www.sanitarium.net/
> PGP public key available on web site.
Karl <kop at meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein