rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

Simon Hobson linux at thehobsons.co.uk
Thu Jul 16 18:35:10 UTC 2015


Andrew Gideon <c182driver1 at gideon.org> wrote:

>> btrfs has support for this: you make a backup, then create a btrfs
>> snapshot of the filesystem (or directory), then the next time you make a
>> new backup with rsync, use --inplace so that just changed parts of the
>> file are written to the same blocks and btrfs will take care of the
>> copy-on-write part.
> 
> That's interesting.  I'd considered doing something similar with LVM 
> snapshots.  I chose not to do so because of a particular failure mode: if 
> the space allocated to a snapshot filled (as a result of changes to the 
> "live" data), the snapshot would fail.  For my purposes, I'd want the new 
> write to fail instead.  Destroying snapshots holding backup data didn't 
> seem a reasonable choice.
> 
> How does btrfs deal with such issues?

I'd have expected the live write to fail. The snapshot doesn't take any space (well, only a little for filesystem metadata) at the point it is made.

Once the snapshot is made, any further changes simply leave the snapshotted data alone. If you overwrite the file, new blocks are allocated to it from the free pool and the metadata is updated to point to them. I believe ZFS works in the same way.
The only difference, in fact, is that without the snapshot, once the new data has been written, the old blocks are freed and the space is returned to the free pool.


Andrew Gideon <c182driver1 at gideon.org> wrote:

> Is there a way to save cycles by offering zfs a hint as to where a 
> previous copy of a file's blocks may be found?

I would assume (and note that it is an assumption) that rsync will only write the blocks it needs to. It has checksummed the file chunk by chunk and transferred only the changed chunks, so I assume that if you use the --inplace option it shouldn't need to rewrite the whole file.
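
As a rough illustration of that assumption, here is a small Python sketch that compares an existing file with new contents block by block and rewrites only the blocks that differ. It is not rsync's actual code or algorithm; the function name update_in_place, the demo helper and the 4-byte BLOCK_SIZE are all made up for the example.

```python
import os
import tempfile

BLOCK_SIZE = 4  # hypothetical and deliberately tiny, to keep the example readable


def update_in_place(path, new_data, block_size=BLOCK_SIZE):
    """Overwrite only the blocks of `path` that differ from `new_data`.

    Returns the zero-based indices of the blocks that were rewritten.
    """
    rewritten = []
    with open(path, "r+b") as f:
        for index, offset in enumerate(range(0, len(new_data), block_size)):
            new_block = new_data[offset:offset + block_size]
            f.seek(offset)
            old_block = f.read(len(new_block))
            if old_block != new_block:
                # Only a differing block is written back to the same offset.
                f.seek(offset)
                f.write(new_block)
                rewritten.append(index)
        # Drop any old tail if the new contents are shorter.
        f.truncate(len(new_data))
    return rewritten


def demo():
    """Build a five-block file, change its third block, report what was written."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.bin")
        with open(path, "wb") as f:
            f.write(b"AAAABBBBCCCCDDDDEEEE")
        changed = update_in_place(path, b"AAAABBBBFFFFDDDDEEEE")
        with open(path, "rb") as f:
            return changed, f.read()
```

On a copy-on-write filesystem, the hope is that only the block written here would need new space after a snapshot; the untouched blocks would never be written at all.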

So say you have a file with 5 blocks, stored in blocks ABCDE on the disk. You snapshot the volume and update block 3 of the file. You should now have a snapshot file in blocks ABCDE and a live file in blocks ABFDE, with blocks ABDE shared.
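
The ABCDE example can be written out as a toy model in Python. This is only a sketch of the behaviour I'm describing, not how btrfs or ZFS are actually implemented; the CowVolume class and its methods are invented for illustration, and the "live write fails when the pool is empty" rule encodes my expectation rather than anything I've verified.

```python
class CowVolume:
    """Toy copy-on-write volume holding a single file (hypothetical model)."""

    def __init__(self, file_blocks, free_blocks):
        self.live = list(file_blocks)   # physical blocks of the live file
        self.free = list(free_blocks)   # the free pool
        self.snapshots = []             # each snapshot is a list of block pointers

    def snapshot(self):
        # A snapshot copies only the pointers, so it costs no data blocks.
        self.snapshots.append(list(self.live))

    def write(self, index):
        """Overwrite logical block `index` (zero-based) of the live file."""
        if not self.free:
            # Assumed behaviour: the live write fails, snapshots stay intact.
            raise OSError("no space left on volume")
        old = self.live[index]
        self.live[index] = self.free.pop(0)
        # The old block returns to the free pool only if no snapshot holds it.
        if not any(old in snap for snap in self.snapshots):
            self.free.append(old)

    def shared_with(self, snap_index):
        """Blocks the live file still shares with the given snapshot."""
        snap = self.snapshots[snap_index]
        return [b for b, s in zip(self.live, snap) if b == s]


vol = CowVolume("ABCDE", "FG")
vol.snapshot()
vol.write(2)                      # update block 3 of the file
print("".join(vol.snapshots[0]))  # snapshot file
print("".join(vol.live))          # live file
print("".join(vol.shared_with(0)))
```

Running the same write without the snapshot() call would put block C straight back into the free pool, which is the only difference mentioned above.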

That comes with the caveat that I've not really studied this, though I have read a little and listened to presentations. I would really hope that both filesystems work that way.



