Incremental backups and batch mode.
mrubel at galcit.caltech.edu
Fri Mar 29 07:07:12 EST 2002
> Something similar:
> I would like to have a first snapshot (level 0) that is a complete copy,
> and then other incremental backups that are just delta files
> (just the differences from the level 0 snapshot).
The "normal" utilities for this job would be dump and tar, especially if
you're dumping to tape. You can also use rsync, but it's somewhat
indirect if you're dumping to tape! :)
> That should be done saving the checksums of the level 0 backup locally
> and then checking the current files against those checksums to calculate the
> delta files to be saved as level 1 backup, and so on.
Okay, one thing that takes a little getting used to here is that if you
use rsync, the backup order is reversed. Let's see if I can explain it.
Using dump or tar, the 0 backup is large; it contains the whole filesystem
at the time it was made. Then the 1 backup is smaller; it contains only
the changes made between t_0 and t_1. The 2 backup would also be small,
consisting only of changes between t_1 and t_2. And so on.
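
For comparison, here is a rough sketch of that forward scheme using GNU
tar's --listed-incremental option (GNU-specific; other tars differ). The
function name backup_level and the file names levelN.tar and snapshot.snar
are just placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch only: forward incrementals with GNU tar.
# backup_level, levelN.tar and snapshot.snar are illustrative names.

# backup_level SRC DEST N
#   N = 0 : full dump of SRC into DEST/level0.tar
#   N > 0 : only what changed since the previous run, into DEST/levelN.tar
backup_level() {
    src=$1
    dest=$2
    level=$3

    # A missing snapshot file makes GNU tar do a full dump, so a level 0
    # run starts by throwing the old one away.
    if [ "$level" -eq 0 ]; then
        rm -f "$dest/snapshot.snar"
    fi

    # tar updates the snapshot file on every run, so each level records
    # the changes since the run before it.
    tar --create \
        --file="$dest/level$level.tar" \
        --listed-incremental="$dest/snapshot.snar" \
        --directory="$src" .
}
```

You would call it as "backup_level /home /some/backup/dir 0" once, then
with 1, 2, ... on later runs. DEST should live outside SRC.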
Using rsync, the process is reversed. The *most recent* backup is the big
one, and *earlier* backups contain only the files that changed. So the 0
backup is the most recent one; the 1 backup contains only those files that
changed between t_1 and the most recent backup; the 2 backup contains only
those files that changed between t_2 and t_1, and so on back in time.
It's counterintuitive, but it's vastly more efficient for remote backups
since you only need to do a full dump once, then never again.
Now, how would you implement it?
For simplicity's sake, I'm going to say that you're backing up /home into
the directory /home-backup. Extending that to back up to a remote machine
is a separate (albeit easy) issue, so I won't cover that here.
Under /home-backup, you make folders like so (you'll probably find your
own names for these folders):

  /home-backup/current
  /home-backup/current-1day
  /home-backup/current-2day

The idea is that current/ would contain the current image (most of the
files), current-1day/ would contain only the files that changed since
yesterday, and current-2day/ would have anything that changed between two
days ago and yesterday.
You can have as many of these as you want, and they don't have to be
evenly spaced; this is just for example.
Now, to make it work, run something like this once a day:
# delete the oldest incremental backup
rm -rf /home-backup/current-2day
# shift the intermediate incremental backups back by one
mv /home-backup/current-1day /home-backup/current-2day
# rsync into /home-backup/current, copying any changed files into the
# folder current-1day first
rsync -vab --delete --backup-dir=/home-backup/current-1day /home/ \
    /home-backup/current
You can also use exclude lists and all that other stuff.
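
If you want more than two incremental folders, or an exclude list, the
same steps can be wrapped up roughly like this. Treat it as a sketch: the
function names rotate_backups and run_backup, the KEEP count, and the
exclude-file argument are my own inventions for illustration. The rsync
options used (-a, -b, --delete, --backup-dir, --exclude-from) are standard
ones.

```shell
#!/bin/sh
# Sketch only: the daily rotation generalized to KEEP incremental folders.
# rotate_backups and run_backup are hypothetical helper names.

# rotate_backups ROOT KEEP
#   Delete ROOT/current-KEEPday, then shift every current-Nday folder
#   back by one, oldest first, leaving current-1day free for rsync.
rotate_backups() {
    root=$1
    keep=$2

    rm -rf "$root/current-${keep}day"

    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -d "$root/current-${prev}day" ]; then
            mv "$root/current-${prev}day" "$root/current-${i}day"
        fi
        i=$prev
    done
}

# run_backup SRC ROOT KEEP [EXCLUDE_FILE]
#   Rotate, then rsync SRC into ROOT/current. Old versions of changed or
#   deleted files are moved into ROOT/current-1day by --backup-dir.
run_backup() {
    src=$1
    root=$2
    keep=$3
    exclude=${4:-}

    rotate_backups "$root" "$keep"

    if [ -n "$exclude" ]; then
        rsync -ab --delete --exclude-from="$exclude" \
            --backup-dir="$root/current-1day" "$src/" "$root/current/"
    else
        rsync -ab --delete \
            --backup-dir="$root/current-1day" "$src/" "$root/current/"
    fi
}
```

For example, "run_backup /home /home-backup 7" from a daily cron job would
keep a week of incrementals; pass a fourth argument naming a file of
exclude patterns, one per line, if you need one.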
Is this clear?