rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

Simon Hobson linux at thehobsons.co.uk
Mon Jul 13 21:19:27 UTC 2015


Andrew Gideon <c182driver1 at gideon.org> wrote:

> However, you've made me a little 
> apprehensive about storebackup.  I like the lack of a need for a "restore 
> tool".  This permits all the standard UNIX tools to be applied to 
> whatever I might want to do over the backup, which is often *very* 
> convenient.

Well, if you don't use the file splitting and compression options, you can still do that with storebackup. Just be aware that some files may have different timestamps (but not different contents) from the original. Specifically, consider this sequence:
- Create a file, perform a backup
- touch the file to change its modification timestamp, perform another backup
rsync will (I think) see the file with a different timestamp and create a new file rather than linking to the old one.
storebackup will link the files (so taking (almost) zero extra space), but the second backup will show the file with the timestamp from the first backup. If you just "cp -p" the file out of the backup then it'll have the earlier timestamp; if you restore it with the storebackup tools then it'll come out with the later timestamp.
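
The reason the linked copy can't carry its own timestamp is a general property of hard links rather than anything specific to storebackup: both names point at one inode, and the modification time lives in the inode. A minimal Python sketch of that, assuming a filesystem that supports hard links (the file names are made up for illustration):

```python
import os
import tempfile


def hardlink_shares_mtime():
    """Return (same_inode, mtime_seen_via_first_name, mtime_seen_via_second_name)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical names standing in for the same file in two backup runs.
        first = os.path.join(tmp, "backup1_file")
        second = os.path.join(tmp, "backup2_file")

        with open(first, "w") as f:
            f.write("unchanged contents\n")
        # Give the file a known "original" modification time.
        os.utime(first, (1_000_000_000, 1_000_000_000))

        # Hard-link instead of copying, as a linking backup does for
        # unchanged contents.
        os.link(first, second)

        # Try to give only the second name a later timestamp.
        os.utime(second, (1_500_000_000, 1_500_000_000))

        a, b = os.stat(first), os.stat(second)
        return a.st_ino == b.st_ino, a.st_mtime, b.st_mtime


if __name__ == "__main__":
    same_inode, mtime_first, mtime_second = hardlink_shares_mtime()
    print(same_inode, mtime_first, mtime_second)
```

Changing the timestamp through either name changes it for both, so a tool that wants to link the file and still remember the newer timestamp has to record that timestamp somewhere outside the file itself.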

> On the other hand, I do confess that I am sometimes miffed at the waste 
> involved in a small change to a very large file.  Rsync is smart about 
> moving minimal data, but it still stores an entire new copy of the file.

I'm not sure as I've not used it, but storebackup has the option of splitting large files (threshold user definable). You'd need to look and see if it compares file parts (hard-linking unchanged parts) or the whole file (creating all new parts).

> What's needed is a file system that can do what hard links do, but at the 
> file page level.  I imagine that this would work using the same Copy On 
> Write logic used in managing memory pages after a fork().

Well, some (all?) enterprise-grade storage boxes support de-dup, usually at the block level. So it does exist, at a price!
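
To illustrate the general idea of block-level de-dup, here is a toy Python sketch. It is not how any particular storage box or backup tool does it; the fixed 4096-byte block size, the in-memory dict used as the block pool, and the function names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def store(data, pool):
    """Split data into blocks, keep each distinct block once in pool,
    and return the list of block digests needed to rebuild the data."""
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        pool.setdefault(digest, block)  # only stored if not already present
        recipe.append(digest)
    return recipe


def restore(recipe, pool):
    """Rebuild the original bytes from a list of block digests."""
    return b"".join(pool[digest] for digest in recipe)


if __name__ == "__main__":
    pool = {}
    # An 8-block "large file" and a second version with one byte changed.
    v1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
    v2 = v1[:3 * BLOCK_SIZE] + b"\xff" + v1[3 * BLOCK_SIZE + 1:]

    r1 = store(v1, pool)
    r2 = store(v2, pool)
    print(len(pool), "blocks stored for", len(r1) + len(r2), "blocks referenced")
    assert restore(r1, pool) == v1 and restore(r2, pool) == v2
```

Here the two versions reference 16 blocks between them but only 9 are stored, because the 7 unchanged blocks are shared. With hard links at the whole-file level, the second version would cost a complete new copy.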



