rsync --link-dest, --delete and hard-link count

grarpamp grarpamp at gmail.com
Sat Feb 6 12:58:22 MST 2010


>> Files that are present in the previous datedir/hier/run but will
>> not be present in the currently to be made datedir/hier/run,
>> [because they are no longer present in the source hier],
>> DO NOT show up in the output of rsync -i as deletions.
>
>  As they shouldn't, as they're not deleted.
>
>> Sure, these no-longer-present source files are not technically unlinked from
>> your previous archives in the current run, but it can be MASSIVELY confusing
>> and dangerous if you're a log watcher/reviewer looking for what has changed.
>
>  Only if you don't understand what's going on.

Since it's not a documented caveat of using --link-dest, and the user never
sees it in the -i output, I'm sure there are plenty of folks for whom this
realization would never occur until it's too late.

>> You'll never see the deletions from the source and if you nuke your old
>> snapshots thinking things are cool because of that, well, they're not.
>
>  The whole point of having a rotating system of images is that you can go
>  back to an old one if you need something that's deleted. Why would you
>  want to delete an old image manually just because you think nothing's
>  been deleted?

Any number of reasons. Perhaps it contains something the admin doesn't
want anymore... huge datasets, private data, aborted/corrupted runs, runs
that contained no 'deletions', etc, etc.

Users' rationales are often hard to grok; giving them tools to simplify
their lives is easier.

> It's not as if that old image is taking up a lot of valuable space
> (it's hardlinked, remember).

Yes, hardlinks save data block duplication... yet on filesystems with millions
of files / long pathnames, just the directory entries alone can take up gigs
per image. Multiply that out by frequency and it can quickly add up.
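
To put a rough number on that for one image, something like the following
Python sketch could be used. The function name entry_stats is made up for
illustration. It only counts entries and sums the bytes of their names, so
it is a lower bound: the real on-disk cost depends on the filesystem, which
is an assumption left unmeasured here.

```python
import os

def entry_stats(root):
    """Count directory entries under root and sum the bytes of their names.

    Every hard-linked file still needs its own directory entry in each
    image, so this cost is paid again per image even when no data blocks
    are duplicated.
    """
    entries = 0
    name_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries += 1
            # Names are stored as bytes on disk, so measure the encoded form.
            name_bytes += len(os.fsencode(name))
    return entries, name_bytes
```

Calling entry_stats on one datedir and multiplying by the number of images
kept gives a first estimate of the metadata overhead.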

>  dirvish can save the list of files in each image, then it becomes a
>  simple diff (or comm -23) to find what's been deleted.

It's simply that rsync _can_ be made to do all this in one invocation.
Since it has to look at and consider all three of source, prior and current
anyways, it makes sense to enhance it with this printing capability.

I don't have much use for user-friendly, bloated scripts like dirvish/etc.
Not to knock them, they're fine for those who use them. I just prefer
putting only what I need into my own along with adding other bits.

>> There may be cause to print something like
>  I don't think we need rsync to print that mailing list footer :-)

Haha, yeah, I left something off when it vacated my brain ;-) Thanks.

I think I was thinking that perhaps '^*deleting   ' wouldn't be quite
correct to print [regarding the first paragraph above]. But something
like '^*deref     ' would be more proper and convey that unique case.
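
Until rsync prints such a line itself, the same information has to be
computed outside of it. A minimal Python sketch, assuming two finished
snapshot directories: the function names and the '*deref' prefix are
illustrative only, and rsync does not emit that marker today.

```python
import os

def snapshot_paths(root):
    """Return the set of file and directory paths under root, relative to root."""
    paths = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            paths.add(os.path.relpath(full, root))
    return paths

def dereferenced(prev_root, cur_root):
    """Paths present in the previous snapshot but absent from the current one."""
    return sorted(snapshot_paths(prev_root) - snapshot_paths(cur_root))

def report(prev_root, cur_root):
    """Format each such path with the proposed (hypothetical) '*deref' marker."""
    return ["*deref     " + p for p in dereferenced(prev_root, cur_root)]
```

Given the previous and current datedirs, report(prev, cur) returns one line
per path that still exists only in the older image, which is exactly the
set a log reviewer would want to see before removing that image.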

Can't speak for anyone else here, or claim to know the needs of the
unheard users, but I'd certainly love to see it happen before too long.

Thanks for rsync!

