feedback on rsync-HEAD-20050125-1221GMT

Chris Shoemaker c.shoemaker at cox.net
Tue Feb 1 03:55:28 GMT 2005


On Mon, Jan 31, 2005 at 11:04:32AM -0500, Alberto Accomazzi wrote:
> 
> I agree that exclude/include patterns can be tricky, and you have a good 
> point about familiarity versus complexity.  I think what makes them hard 
> to handle is the fact that we are dealing with filename (and directory 
> name) matching and recursion.  So matching only a subset of a file tree, 
> while simple as a concept, is non-trivial once you sit down and realize 
> that you need a well-defined syntax for it.  Can you write a find 
> expression that is simpler or more familiar to the average user than 
> rsync's include/exclude?

	Simpler?  That depends on how complex the rules are.  For the
simplest case, `--exclude '*.bak'` is just as simple as (if not simpler
than) `find . -name '*.bak'`.  But for the complex cases, I think
find's syntax is well-designed for simply and powerfully expressing
file-selection.  But, isn't this all the subjective complexity I
wasn't going to argue?
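
	As a rough illustration of the comparison (a sketch, not a
recommendation): the function below uses find to select every regular
file except *.bak files and anything inside a directory named "cache".
The name select_files and the "cache" rule are made up for this
example.  I'd expect `--exclude='*.bak' --exclude='cache/'` to select
roughly the same set on the rsync side.

```shell
#!/bin/sh
# select_files DIR (hypothetical helper): print the regular files under
# DIR, relative to DIR, skipping *.bak files and pruning any directory
# named "cache".  File names containing newlines are not handled.
select_files() {
    (
        cd "$1" &&
        find . -type d -name cache -prune -o \
               -type f ! -name '*.bak' -print |
        sed 's|^\./||'      # strip the leading "./" that find prints
    )
}

# Intended use (not run here; needs rsync and real directories):
#   select_files src | rsync -a --files-from=- src/ dest/
```

The point is only that the selection logic lives entirely in find, and
rsync just consumes the resulting list on stdin.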

	More familiar?  I think so.  If the user has never seen any
syntax/tool for specifying file-selection, then they have to learn
_something_.  If they have, then they should be able to use whatever
they know.  Note: I'm not saying that we should force the use of
'find'.  The user knows what tool they want to use -- they just want
to be able to use it.

	I just thought of an example.  I have several machines with
FC3 installed.  Today, I screwed one up by installing some buggy
software that ruined my printer setup.  It overwrote some files it
shouldn't have and deleted some files it shouldn't have.  I ended up
fixing it by downloading rpms and force reinstalling them.  But I had
those rpms installed on the other machine, with different config
files.

	I could have gone:

rpm --query --list mypkg1 mypkg2 mypkg3 | rsync -a --files-from=- / remote:/

but I know that mypkg1 has config files unique to each machine, that I
don't want to copy.  What are those files?  I have no idea, but rpm
knows.  It would be nice to say:

rpm --query --list mypkg1 mypkg2 mypkg3 | rsync -a --files-from=- --filter='-q rpm -Vv mypkg1 |grep " c "' / remote:/

where -q means "exclude the filelist returned by the following command"

I know this could be accomplished in other ways, but I just wanted to
illustrate the point that "external command" doesn't always mean
"find".  User knows best.

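One of those "other ways", sketched with what exists today: subtract
the unwanted paths from the list before rsync ever sees it.  The helper
name subtract_list and the temporary file path are made up for this
example, and I'm assuming `rpm --query --configfiles` is an acceptable
stand-in for the `rpm -Vv ... | grep " c "` idea above, since it simply
lists the package's config files.

```shell
#!/bin/sh
# subtract_list EXCLUDE_FILE (hypothetical helper): read paths on stdin
# and print those that do not appear, as exact whole lines, in
# EXCLUDE_FILE.
subtract_list() {
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the strings to match from a file.
    # grep exits 1 when nothing is left to print; treat that as success.
    grep -vxF -f "$1" || [ $? -eq 1 ]
}

# Intended use (not run here; needs rpm, rsync and a remote host):
#   rpm --query --configfiles mypkg1 > /tmp/skip.list
#   rpm --query --list mypkg1 mypkg2 mypkg3 |
#       subtract_list /tmp/skip.list |
#       rsync -a --files-from=- / remote:/
```

It works, but the user has to assemble the pipeline by hand, which is
the part an external-command rule would take over.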

> 
> >(The allusion to GNU
> >tar's --exclude option which takes only a filename, not a pattern,
> >isn't really helpful in understanding rsyncs --exclude option.)
> 
> Uh?  Tar does take patterns for exclusion, and has its own quirky way of 
> dealing with wildcards, directory matching and filename anchoring:
> http://www.gnu.org/software/tar/manual/html_node/tar_100.html

I stand corrected.  I didn't know about that.  It seems it's not in
the man page.  Seeing this complexity in tar makes me wonder whether it
belongs there either.  But it raises an interesting question: Are
tar's and rsync's --exclude options solving the same task?  If so, why
do they need to differ?  If not, why are they similar?

> 
> >It's not that pattern matching for file selection isn't complex --
> >it's just that it's such a well-defined, conceptually simple, common
> >task that other tools (like 'find' and 'bash') handle better than
> >rsync ever will.  And that's the way it should be: it's the unix way.
> 
> I agree that this is something we should be striving for as much as 
> possible: pipeline and offload tasks rather than bloating applications.
> 
> >>If you really need 
> >>complete freedom maybe the way to go is to do your file selection first 
> >>and use --files-from.  
> >
> >
> >Yes, --files-from is nice, and honestly, almost completely sufficient.
> >But in some dynamic cases, you can't keep the list updated.
> 
> Well, maybe we should go back and see if the solution to all problems 
> isn't making --files-from sufficient.  What exactly is missing from it 
> right now?

Good question.

>  The capability to delete files which are not in the 
> files-from list?

There's --include-from and --exclude-from, too.  So, included files
that aren't matched could be deleted, right?

>  Or the remote execution of a command that can generate 
> the files-from list for an rsync server?  

The use of stdin works once at the top level, i.e.
rsync --files-from=- a/ b/ < myfilelist, but not on a per-directory
basis, with the rules changing for each directory.  That's why it
might be nice to have the parsed "rule file" allow the specification
of an external command.
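
To make the per-directory idea concrete, here is a sketch that
emulates it outside rsync by flattening everything into one list.  The
convention is entirely hypothetical: a directory may contain a script
named .filelist-cmd whose output (names relative to that directory)
selects the files there; directories without one contribute all of
their regular files.  Neither the file name nor the helper build_list
is anything rsync knows about.

```shell
#!/bin/sh
# build_list ROOT (hypothetical helper): print one path per line,
# relative to ROOT, suitable for --files-from.  For each directory:
#   - if it contains .filelist-cmd, run that script there and use its
#     output as the selection for that directory;
#   - otherwise select every regular file directly inside it.
# Uses find -maxdepth (GNU/BSD); names with newlines are not handled.
build_list() {
    root=$1
    find "$root" -type d | while IFS= read -r dir; do
        rel=${dir#"$root"}      # path of this directory below ROOT
        rel=${rel#/}
        if [ -f "$dir/.filelist-cmd" ]; then
            # </dev/null keeps the script from eating the outer loop's input
            (cd "$dir" && sh ./.filelist-cmd </dev/null)
        else
            (cd "$dir" &&
             find . -maxdepth 1 -type f ! -name .filelist-cmd |
             sed 's|^\./||')
        fi | while IFS= read -r name; do
            printf '%s\n' "${rel:+$rel/}$name"
        done
    done
}

# Intended use (not run here; needs rsync):
#   build_list a | rsync --files-from=- a/ b/
```

The catch is that the list is computed once, up front, on the sending
side, which is exactly the limitation described above.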


> Maybe we ought to really 
> figure out what things cannot be achieved with the current functionality 
> before coming up with something new.

Is there anything?  I'd like to see an example.

-chris

