LVM snapshots vs. --link-dest

Ed W lists at wildgooses.com
Mon Oct 5 04:34:07 MDT 2009


Andrew Gideon wrote:
> I currently do incremental backups using --link-dest.  Unchanged files 
> are hard links to the previous snapshot; changed files are new copies.
>
> Where this "fails" is for large files that have received small changes.  
> The directory containing my main IMAP account, for example, typically 
> generates between 1 and 2 G of daily backup data as I file messages in my 
> inbox.  Yesterday, though, I filed some old messages in my inbox.  This 
> touched files I don't typically touch.  They were minor changes in the 
> scheme of things, but the result was an 11G backup of that volume last 
> night.
>   

If you don't want to use a file format which has some kind of implicit 
"chunking" capability (e.g. maildir isn't perfect, but it goes in the 
opposite direction and chunks your mail into lots of smaller files, 
perhaps too many in the opinion of some), then you really need a 
backup system which can "chunk" your backup files and only store the 
differences.
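
To make the "chunk and only store the differences" idea concrete, here is 
a minimal sketch in Python.  It is not how any particular backup tool 
works; the names store_file and restore_file and the in-memory dict used 
as a chunk store are made up for illustration.  It cuts a file into 
fixed-size chunks, keys each chunk by its SHA-256 digest, and only adds 
chunks the store has not seen before.

```python
import hashlib


def store_file(data: bytes, store: dict, chunk_size: int = 1024 * 1024) -> list:
    """Split data into fixed-size chunks and add unseen chunks to store.

    store is a hypothetical chunk store: a dict mapping digest -> chunk bytes.
    Returns the manifest (ordered list of digests) needed to rebuild the file.
    """
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # Only chunks with a digest we have not stored yet cost new space.
        store.setdefault(digest, chunk)
        manifest.append(digest)
    return manifest


def restore_file(manifest: list, store: dict) -> bytes:
    """Rebuild a file from its manifest by concatenating the stored chunks."""
    return b"".join(store[digest] for digest in manifest)
```

Backing up the same large file twice with a small in-place change adds 
only the chunks that differ, instead of a whole new copy as --link-dest 
would.  Note that fixed-size chunks only help when bytes are overwritten 
in place; an insertion shifts every later chunk boundary, which is why 
real tools tend to use content-defined boundaries instead.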

There are a few that might suit your needs, such as rdiff-backup, which 
stores deltas between versions rather than whole files.  Or if you 
want something different then perhaps try brackup, which is kind of 
rsync-like, but has a "chunking" phase built into it which tries to be 
rather smart about dividing files up into the hopefully static bit and 
the variable bit.  E.g. in the case of MP3s it would create one chunk for 
the audio data and one chunk for the tags; that way you can retag your 
MP3 collection and it won't re-upload the whole lot.
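
As a rough illustration of the static bit / variable bit split for MP3s, 
the sketch below separates a leading ID3v2 tag from the audio that 
follows it.  This is not brackup's code; split_id3v2 is a made-up name, 
and the sketch assumes the tags live in an ID3v2 block at the start of 
the file (a trailing ID3v1 tag is simply left in the audio part).

```python
def split_id3v2(data: bytes):
    """Return (tag_bytes, audio_bytes) for an MP3 held in memory.

    If no well-formed ID3v2 header is found, the tag part is empty and
    the whole input is returned as audio.
    """
    # An ID3v2 header is 10 bytes: "ID3", 2 version bytes, 1 flags byte,
    # and a 4-byte "synchsafe" size (7 useful bits per byte).
    if len(data) < 10 or data[:3] != b"ID3":
        return b"", data
    flags = data[5]
    size_bytes = data[6:10]
    if any(b & 0x80 for b in size_bytes):
        return b"", data  # high bit set: not a valid synchsafe size
    size = (
        (size_bytes[0] << 21)
        | (size_bytes[1] << 14)
        | (size_bytes[2] << 7)
        | size_bytes[3]
    )
    # The size excludes the 10-byte header and the optional 10-byte
    # footer, which is signalled by flag bit 0x10.
    total = 10 + size + (10 if flags & 0x10 else 0)
    if total > len(data):
        return b"", data  # truncated tag: treat everything as audio
    return data[:total], data[total:]
```

If the two parts are stored as separate chunks, retagging a file changes 
only the small tag chunk, and the audio chunk hashes to the same value 
as before.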

Ideally I would like to see a kind of half-and-half maildir/mbox format 
emerge for email (perhaps dbox from Dovecot will get there?).  The idea 
is that it would use mbox for storage, but split each mailbox into 
several mbox files of a couple of MB each.  This way you would have 
much smaller files to defrag in the case of deletes, and a small change 
would only touch one small file, but you would still gain the packing 
efficiency of mbox (especially if you use compressed mailboxes, e.g. 
with Dovecot).
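
A minimal sketch of that segmented-mbox idea, under stated assumptions: 
segment_mbox is a hypothetical helper, not part of Dovecot or any 
existing format, and it assumes a well-formed mbox in which every 
message starts with a "From " line and body lines starting with "From " 
have been escaped.  It packs whole messages into segments of roughly 
target_size bytes.

```python
def segment_mbox(mbox: bytes, target_size: int = 2 * 1024 * 1024) -> list:
    """Split an mbox into segments of about target_size bytes each.

    Messages are never split: a segment is closed as soon as adding the
    next message would push it past target_size, so a single oversized
    message ends up in a segment of its own.
    """
    # First pass: cut the mbox into whole messages at "From " lines.
    messages = []
    current = []
    for line in mbox.splitlines(keepends=True):
        if line.startswith(b"From ") and current:
            messages.append(b"".join(current))
            current = []
        current.append(line)
    if current:
        messages.append(b"".join(current))

    # Second pass: pack consecutive messages into segments.
    segments = []
    buf = b""
    for msg in messages:
        if buf and len(buf) + len(msg) > target_size:
            segments.append(buf)
            buf = b""
        buf += msg
    if buf:
        segments.append(buf)
    return segments
```

Appending new mail only changes the last segment, so a hard-link based 
backup would copy a couple of MB rather than the whole mailbox.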

Good luck

Ed W


More information about the rsync mailing list