Proposed enhancements to rsync filters
Matt McCutchen
hashproduct at verizon.net
Tue Dec 6 23:13:14 GMT 2005
I would like to propose some improvements to rsync's filters.
(1) Add a notation that makes a pattern match both a file and everything
under it if it happens to be a directory. One possibility: ending a
path in //. "+ mydir//" would be equivalent to the two rules "+ mydir"
and "+ mydir/**".
Just some syntactic sugar.
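To make the intended expansion in (1) concrete, here is a minimal sketch in Python. The function name expand_double_slash is hypothetical and nothing here is rsync code; it only shows how a rule ending in // could be rewritten into the two rules it stands for.

```python
from typing import List


def expand_double_slash(rule: str) -> List[str]:
    """Expand the proposed trailing '//' into the two equivalent rules.

    Hypothetical helper for illustration only; rsync has no such function.
    """
    # Split "PREFIX PATTERN" at the first space, e.g. "+ mydir//".
    prefix, _, pattern = rule.partition(" ")
    if not pattern.endswith("//"):
        return [rule]  # ordinary rule, left untouched
    base = pattern[:-2]
    # One rule for the path itself, one for everything under it.
    return [f"{prefix} {base}", f"{prefix} {base}/**"]


print(expand_double_slash("+ mydir//"))
```

Rules that do not end in // pass through unchanged, so the notation would stay purely additive.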
(2) Add a third sender behavior to "hide" and "show": "traverse" (T).
If a directory is to be "traversed", the sender scans it for files that
are "shown" and includes any such files in the file list. If
--no-implied-dirs is not given, the sender also sends their parent
directories.
With "traverse", it would be really easy to include certain trees, which
seems to be a very common desire:
S /foo/bar/wanted-tree-1//
S /foo/baz/wanted-tree-2//
T /**
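For comparison, the following sketch generates the rules that seem to be needed with the existing include/exclude syntax to get a similar result: every ancestor directory has to be listed explicitly before a final "- *". The helper name rules_without_traverse is made up for this illustration, and the rule set is the commonly used recipe rather than an exact equivalent of "traverse".

```python
from pathlib import PurePosixPath


def rules_without_traverse(trees):
    """Build the include/exclude rules needed today to send only `trees`.

    Hypothetical helper; it only formats filter rules as text.
    """
    rules = []
    for tree in trees:
        path = PurePosixPath(tree)
        # Each ancestor directory has to be included explicitly, top-down,
        # or the final "- *" would stop rsync from descending into it.
        for parent in list(path.parents)[::-1]:
            rule = f"+ {parent}/"
            if parent != PurePosixPath("/") and rule not in rules:
                rules.append(rule)
        rules.append(f"+ {path}")     # the tree's root itself
        rules.append(f"+ {path}/**")  # everything underneath it
    rules.append("- *")               # exclude whatever was not listed
    return rules


print("\n".join(rules_without_traverse(
    ["/foo/bar/wanted-tree-1", "/foo/baz/wanted-tree-2"])))
```

Two wanted trees already need eight rules this way, against three with the proposed "traverse" behavior.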
(3) The "protect" (P) receiver behavior is a misnomer, since the
receiver still allows a sender file to overwrite a protected file. Add
a new receiver behavior, "lock" (L). If the receiver is asked to change
a locked file in any way or create a file at a locked path (i.e. if a
file existed there it would be locked), the receiver complains and does
nothing. Essentially, the receiver behaves as if paths matching a
"lock" rule were illegal on its filesystem and no files existed at any
of those paths. (I think "protect" should really have been "keep" and
"lock" is the true sense of "protect", but it's too late now.)
(4) "Protect" and "lock" rules should accept a modifier specifying that
it's OK to delete a protected or locked file because its parent is being
deleted. When sending files to a Subversion working copy, one might
want to let a directory's Subversion metadata perish with the directory
itself using something like "Ld .svn//" (where "d" is the new modifier).
If this modifier is not specified, rsync simply leaves the parent
directories lying around as it does now.
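A rough model of the "d" modifier in (4), under the assumption that a directory can only be removed once nothing guarded is left inside it. The function plan_directory_delete and its arguments are hypothetical, and the children are treated as a flat list of names for brevity.

```python
def plan_directory_delete(children, guarded, perishable=frozenset()):
    """Plan the deletion of a directory on the receiver.

    `guarded` holds names matched by a protect or lock rule; `perishable`
    holds the guarded names whose rule carries the proposed "d" modifier.
    """
    kept = [c for c in children if c in guarded and c not in perishable]
    deleted = [c for c in children if c not in kept]
    # The directory itself can go only if nothing had to be kept.
    return {"deleted": deleted, "kept": kept, "directory_removed": not kept}


# Without "d", .svn survives and the parent directory is left behind.
print(plan_directory_delete([".svn", "a.c"], guarded={".svn"}))
# With "d" (as in "Ld .svn//"), the metadata perishes with the directory.
print(plan_directory_delete([".svn", "a.c"], guarded={".svn"},
                            perishable={".svn"}))
```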
It turns out that "--backup" interacts in two interesting ways with
receiver filters:
(1) If one runs "rsync --backup --del a/ b/" and b/ contains a file that
a/ doesn't, rsync will back the file up; if one runs the command again,
rsync will delete the backup file because it is not matched in a/. This
is not good! I feel that "--backup" should automatically generate a
filter "P *~" (or maybe "L *~"), where ~ stands for the backup suffix,
to stop this from happening.
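The two-run problem in (1) can be reproduced with a toy model that tracks only file names. The function sync_with_backup is a hypothetical stand-in for one "rsync --backup --del a/ b/" run, and its protect_backups flag models the proposed implicit "P *~" rule; file contents and directories are ignored.

```python
def sync_with_backup(src, dst, suffix="~", protect_backups=False):
    """Return the receiver's file names after one modeled run.

    `src` and `dst` are sets of file names. Toy model only.
    """
    result = set(src)  # everything the sender has ends up on the receiver
    for name in dst - src:  # files the sender does not have
        if name.endswith(suffix):
            if protect_backups:
                result.add(name)  # proposed implicit "P *~" keeps it
            # otherwise the backup is deleted outright, as described above
        else:
            result.add(name + suffix)  # extraneous file becomes a backup
    return result


a = {"kept.txt"}
b = {"kept.txt", "old.txt"}
first = sync_with_backup(a, b)
second = sync_with_backup(a, first)
print(sorted(first), sorted(second))
print(sorted(sync_with_backup(a, first, protect_backups=True)))
```

Until such a rule is generated automatically, passing --filter='P *~' by hand should give the same protection when the default suffix is in use.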
(2) Suppose "--backup" were changed so that, when rsync needs to delete
a directory, it backs up the entire directory as a unit (by renaming or
moving it) instead of backing up individual files inside it. I'm not
saying this behavior is a good idea. Incidentally, the way a NetWare
file server keeps "deleted" directories around in case the user asks to
salvage them corresponds to this behavior, not rsync's current behavior.
Before backing up a directory, should rsync scan it for locked or
protected files? If such a file is found, should it remain at its
original path (which would require splitting the directory in two), or
should it remain in place inside the directory as the directory is
moved? The second behavior is probably appropriate if the applicable
filter had the new "d" modifier, but I'm not sure what is best in other
cases.
--
Matt McCutchen, ``hashproduct''
hashproduct at verizon.net -- http://hashproduct.metaesthetics.net/