[PATCH] --omit-dir-changes, qsort<>mergesort issues
Antti Tapaninen
aet at cc.hut.fi
Thu Jun 8 10:48:41 GMT 2006
On Wed, 7 Jun 2006, Matt McCutchen wrote:
> If, as I suspect, I am completely missing the point, please explain your
> problem in such a way that I have a hope of understanding!
Heh, sorry. Here's a new description of the process that hopefully makes
my goals clearer.
The index file that the tool maintains between syncs is just a text file
with three fields per entry: the path/filename, the mtime and the type of
the path/filename (or an MD5 sum or symlink destination).
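A minimal sketch of reading such an index, for illustration only. The post does not say how the fields are separated or in which order they appear, so the tab separator, the path/mtime/type order, the integer mtime, and the names IndexEntry and parse_index are all assumptions.

```python
from typing import Dict, NamedTuple


class IndexEntry(NamedTuple):
    path: str
    mtime: int   # assumed to be stored as integer seconds
    kind: str    # type marker, MD5 sum or symlink destination, kept as-is


def parse_index(text: str) -> Dict[str, IndexEntry]:
    """Parse index text into a dict keyed by path (tab-separated, assumed)."""
    entries: Dict[str, IndexEntry] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first two tabs only, so the third field may contain tabs.
        path, mtime, kind = line.split("\t", 2)
        entries[path] = IndexEntry(path, int(mtime), kind)
    return entries
```

Keying the result by path makes the later index-to-index comparisons simple dictionary lookups.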
/alt/root is the only directory that gets synced between the central
file server and the clients, either pushed or pulled.
/alt/local is an optional directory; anything in it takes priority
over the corresponding files in /alt/root.
The /alt/backup/YYYYMMDD.HHMM hierarchy is primarily intended to save any
pre-existing vendor files on the system that we replace when
/alt/{local,root} gets synced against the real root directory of a host.
The actual 4-stage sync process:
- Sync /alt/{local,root} against the real root directory and back up
replaced files to /alt/backup.
But because I *don't* want to back up any changed files that
previously came from /alt/{local,root}, I do a 3-way diff
between the real root directory contents, the old index and the
current state of /alt/{local,root}. Based on the diff, the tool
generates an exclude file that prevents backing up our own
files that have recently changed "legally".
Because of this, usually the only files that get backed up are the
ones that have been overwritten by some recent OS package upgrade
operation. Occasionally some configuration files get backed up
because an administrator has touched the file *directly* without
using the /alt hierarchy, for example by making a temporary one-line
change to /etc/hosts.allow. If the change was that important, the
admin can probably dig it up from /alt/backup.
- Like above, but sync without the generated exclude file, just to
make sure some less important changes to files get there.
- Let the tool itself handle unlinking of files that we no longer
distribute, again based on an index diff as in stage #1.
- Check whether any /alt/backup directories exist and sort them in
reverse date order. Based on the index diffs, exclude all files in the
index except the ones that got removed at stage #3. Sync the backup
directories against the real root directory. Use --remove-sent-files to
make sure that a restored vendor file no longer sits in the backup
directory after the sync; it only removes the latest version, not all
of them.
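The index comparisons used in stages #1, #3 and #4 could be sketched roughly as follows. This is a guess at the logic, not the actual tool: each argument is assumed to be a dict mapping a path to some fingerprint (for example an MD5 sum), and the function names backup_excludes, removed_paths and newest_first are hypothetical.

```python
from typing import Dict, List


def backup_excludes(real_root: Dict[str, str],
                    old_index: Dict[str, str],
                    current: Dict[str, str]) -> List[str]:
    """Stage #1: paths that should NOT be backed up before being replaced.

    A path is excluded when the copy on the real root still matches what the
    old index says we installed (so it is our own file) and the copy under
    /alt has changed since then (a "legal" change).
    """
    excludes = []
    for path in sorted(current):
        previously_ours = (path in old_index
                           and real_root.get(path) == old_index[path])
        changed_in_alt = current[path] != old_index.get(path)
        if previously_ours and changed_in_alt:
            excludes.append(path)
    return excludes


def removed_paths(old_index: Dict[str, str],
                  current: Dict[str, str]) -> List[str]:
    """Stage #3: paths we used to distribute but no longer do."""
    return sorted(set(old_index) - set(current))


def newest_first(backup_dirs: List[str]) -> List[str]:
    """Stage #4: YYYYMMDD.HHMM names sort chronologically as plain strings."""
    return sorted(backup_dirs, reverse=True)
```

In this sketch a file edited directly on the host (its fingerprint no longer matches the old index) is never excluded, so it ends up in /alt/backup when it is overwritten.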
Overall, the time spent on these 4 stages is usually a matter of
seconds. Both between master<>client and in local sync operations, the
mergesorting rsync really helps to achieve this task easily without doing
much of the work outside of rsync.
Perhaps there could be some -M option to alter the sort algorithm used?
So far, I've used the mergesort() function from FreeBSD and it works
great, but rsync would probably need a GPL'd version instead?
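A small illustration of why a stable sort can matter here. It rests on an assumption the post does not spell out: that entries from /alt/local are listed before those from /alt/root, and that after sorting by name only the first of any duplicate names is kept. C's qsort() gives no ordering guarantee for equal elements, while a merge sort keeps them in input order. Python's sorted() is also stable, so it stands in for mergesort() below; merge_file_lists is a hypothetical name.

```python
from typing import List, Tuple

Entry = Tuple[str, str]  # (relative path, source directory)


def merge_file_lists(entries: List[Entry]) -> List[Entry]:
    """Sort by path with a stable sort, then keep the first of each duplicate.

    Because the sort is stable, entries with equal paths stay in their input
    order, so the source listed first wins.
    """
    ordered = sorted(entries, key=lambda e: e[0])  # stable sort on path only
    merged: List[Entry] = []
    for name, source in ordered:
        if merged and merged[-1][0] == name:
            continue  # duplicate path: the earlier source already won
        merged.append((name, source))
    return merged


entries = [
    ("etc/motd", "/alt/local"),
    ("etc/hosts.allow", "/alt/local"),
    ("etc/motd", "/alt/root"),
    ("etc/passwd", "/alt/root"),
]
result = merge_file_lists(entries)
```

With an unstable sort, which copy of etc/motd comes first after sorting would be unspecified, so the override could not be relied on.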
Cheers,
-Antti