Help with complicated hierarchy exclude syntax

Matt McCutchen matt at mattmccutchen.net
Sat Dec 22 20:45:06 GMT 2007


On Sat, 2007-12-22 at 10:09 -0600, reader at newsguy.com wrote:
> Further, how would I define files with numbers for names?

> As I understand it, exclude files do not understand regular
> expressions.

Correct, exclude patterns are globs, and there's no glob that matches
files whose names consist entirely of digits.  It would be nice if rsync
supported regular expressions for exclude patterns.  But in this case,
there is a glob for files whose names do not consist entirely of digits:
*[!0-9]*.  Because rsync acts on the first rule that matches a file, you
can include these files and then exclude all (other) files.  The options
to do this throughout the copy would be:

--include='*[!0-9]*' --exclude='*'
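
To see which names that glob picks up, here is a minimal sketch using the
shell's own case patterns, which share the [!0-9] syntax.  The function
name "classify" is made up for illustration, and the shell's * differs
from rsync's in that it can match slashes, so this is only a fair
approximation for plain file names, not for paths.

```shell
# Classify a plain file name roughly the way the two rules would:
# a name containing at least one non-digit matches the include glob,
# anything else falls through to the catch-all exclude.
classify() {
    case "$1" in
        *[!0-9]*) echo include ;;   # at least one non-digit character
        *)        echo exclude ;;   # digits only
    esac
}

for name in 12345 .overview .agentview 42.txt; do
    printf '%s -> %s\n' "$name" "$(classify "$name")"
done
```

Only the all-digit name falls through to the exclude branch; a name such
as 42.txt is kept because the dot and the letters are non-digits.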

> I want rsync to do this:
> Backup  all files under ~/, except in some cases.
> 
> When rsync gets to ~/News, which has an extensive, fluctuating and
> deep hierarchy of directories under it.  An example might be:
> 
>   News/agent/nntp/news.gmane.org/gmane/comp/lang/perl/beginners/
> 
> I want to skip any files that have all numbers for names which would
> be under the last directory above.  However there are at least 2 files
> in there (.overview .agentview) that I do want rsync to get.
> 
> But when rsync gets to ~/Mail, which has a much shallower hierarchy
> with files that have numbers for names, I want to backup all those and
> anywhere else rsync finds files with numbers for names under ~/.
> 
> It's a tall order.  I've been ducking it with various other schemes but
> would really like to get something like that to work instead of
> running several rsync backup schemes against ~/.
> 
> How can I tell rsync to go to the bottom of a hierarchy skipping any
> files with numbers for names but retrieving .[oa]*view?
> 
> Yet not skip files with numbers for names anywhere else.

To get the rules to apply only in ~/News, anchor the patterns with a
leading slash and use the ** syntax to match anything below ~/News.  The
patterns are matched against files' full paths from ~ (the root of the
transfer), but the first pattern accepts a non-numeric character only in
the basename because the final * cannot match slashes.

--include='/News/**[!0-9]*' --exclude='/News/**'
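
Put together as a full command, this might look like the sketch below.
The wrapper name "backup_home", the -a flag and the destination are
assumptions for illustration; only the two filter options come from the
advice above.  The source is given a trailing slash so that the anchored
/News patterns are rooted at the contents of the source directory.

```shell
# Hypothetical wrapper around rsync; name and extra flags are illustrative.
# $1 = source directory, $2 = destination, remaining args = extra rsync options.
backup_home() {
    src=$1
    dest=$2
    shift 2
    # Under /News, keep anything whose basename contains a non-digit,
    # then drop whatever else is left below /News.  Paths outside /News
    # match neither rule and are copied as usual.
    rsync -a "$@" \
        --include='/News/**[!0-9]*' \
        --exclude='/News/**' \
        "${src%/}/" "$dest"
}
```

Running it first as "backup_home ~ /mnt/backup/home -nv" (with
/mnt/backup/home standing in for the real destination) does a verbose
dry run, which lists what would be copied without transferring anything.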

Matt


