New wildmatch code in CVS
Wayne Davison
wayned at samba.org
Thu Jul 10 04:08:14 EST 2003
On Wed, Jul 09, 2003 at 10:02:16PM +1000, Donovan Baarda wrote:
> Why the name "wildmatch"?
The code is based on Rich Salz's wildmat.c, but I have extended it a
little, so I renamed it wildmatch.c (mainly because I reordered
the text/pattern args from the wildmat() order to the fnmatch() order).
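To illustrate the argument-order difference mentioned above, here is a
small Python sketch. The names wildmat_order() and wildmatch_order() are
hypothetical, and Python's fnmatch.fnmatchcase() is only a stand-in for
the real matching code (note that it takes the name first and the
pattern second, unlike C's fnmatch(), which takes the pattern first).

```python
import fnmatch


def wildmat_order(text: str, pattern: str) -> bool:
    """wildmat()-style argument order: text first, pattern second."""
    # fnmatchcase() is a stand-in matcher, not the wildmatch.c logic.
    return fnmatch.fnmatchcase(text, pattern)


def wildmatch_order(pattern: str, text: str) -> bool:
    """fnmatch()-style argument order: pattern first, text second."""
    # Same matcher underneath; only the argument order differs.
    return wildmat_order(text, pattern)
```

Swapping the two arguments by mistake usually still runs but gives the
wrong answer without any error, which is one reason a rename can be
helpful when the order changes.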
> I have used the name "efnmatch" (extended fnmatch) for it in my Python
> implementations.
Since the code doesn't try to do some of the fnmatch-style things (like
having different flag-specified matching behaviors), calling it efnmatch
doesn't seem like the right choice to me.
> How did you implement it (I know, I should just look in CVS, but while I'm
> typing...)? Does it use regexes or a modified implementation of fnmatch? How
> does it compare performance-wise with a regex based implementation?
It's roughly comparable to an fnmatch implementation I saw -- it
reparses the match string for each comparison. The only performance
tests I ran did not reveal any significant differences between the Linux
fnmatch library and my wildmatch code, but it was not a very extensive
test.
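As a rough illustration of what "reparses the match string for each
comparison" means, here is a minimal Python sketch of a matcher that
walks the raw pattern on every call, with no compiled or cached form.
It is not the wildmatch.c algorithm: glob_match() is a hypothetical
name, and the sketch handles only '*', '?' and literal characters
(no character classes, and no special treatment of '/').

```python
def glob_match(pattern: str, text: str) -> bool:
    """Match text against pattern, scanning the pattern on each call."""
    p = t = 0
    star_p = -1  # pattern index just after the most recent '*', if any
    star_t = 0   # text index where that '*' currently stops matching
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            # Remember the star; first try letting it match nothing.
            star_p = p + 1
            star_t = t
            p += 1
        elif p < len(pattern) and pattern[p] in ("?", text[t]):
            # '?' matches any one character; otherwise a literal match.
            p += 1
            t += 1
        elif star_p != -1:
            # Mismatch: let the last '*' absorb one more character.
            star_t += 1
            t = star_t
            p = star_p
        else:
            return False
    # Any leftover pattern must consist only of '*' characters.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)
```

Every call pays the cost of scanning the pattern again, which is cheap
for a handful of short patterns.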
There is no attempt to use regular expressions to implement exclude
handling at the moment. I discussed that option back when deciding what
to do, but decided against it for now. I think using cached regex
patterns would probably be a faster approach for those who have a
massive number of include/exclude items, but for those with a normal set
of excludes, I doubt it would make a difference (since the task is
primarily I/O bound), and I didn't want to add a regex dependency. This
is certainly something that could be considered as a future
improvement, though, and we'd be free to make the change at any time
because such an implementation would not change how the pattern matching
appears to the user (the recent user-visible changes have all been
bug fixes).
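For the curious, the cached-regex idea could look something like the
following Python sketch. It is only an illustration: ExcludeList is a
hypothetical name, and fnmatch.translate() implements Python's glob
rules, which are not the same as rsync's exclude rules.

```python
import fnmatch
import re


class ExcludeList:
    """Compile each glob to a regex once, then reuse it for every name."""

    def __init__(self, patterns):
        # The translation and compilation cost is paid once per pattern,
        # not once per (pattern, filename) comparison.
        self._compiled = [re.compile(fnmatch.translate(p)) for p in patterns]

    def is_excluded(self, name: str) -> bool:
        # translate() anchors the regex at the end, so match() is enough.
        return any(rx.match(name) for rx in self._compiled)
```

With a list such as ExcludeList(["*.o", "*~"]), the setup work happens
once, so any gain would show up mainly when the list is long and there
are many filenames to check.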
> I found by implementing efnmatch using regexes, it was painless to add
> the ability to use regexes in include/exclude lists.
Yeah, that was my primary reason for suggesting a possible regex
underpinning. No one seemed very interested in the idea, though, and
as you said yourself, the extra flexibility of regular expressions is
not all that necessary for filename matching.
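To show why a regex underpinning makes raw regexes nearly free to
support, here is a small Python sketch. The "re:" prefix is a
hypothetical marker invented for this example (it is not rsync syntax),
and fnmatch.translate() again stands in for a real glob-to-regex
translation.

```python
import fnmatch
import re

REGEX_PREFIX = "re:"  # hypothetical marker, not an rsync feature


def compile_rule(rule: str) -> re.Pattern:
    """Turn one include/exclude rule into a compiled regex."""
    if rule.startswith(REGEX_PREFIX):
        # The rule is already a regex; use it as written.
        return re.compile(rule[len(REGEX_PREFIX):])
    # Otherwise treat it as a glob and translate it to a regex.
    return re.compile(fnmatch.translate(rule))


def matches(rule: str, name: str) -> bool:
    """Return True if the whole name matches the rule."""
    return compile_rule(rule).fullmatch(name) is not None
```

Once everything ends up as a compiled regex, the matching code does not
need to know which kind of rule it started from.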
To me, the most important part of the exclude changes has been to fix
defects in how the matching works.
Thanks for the feedback!
..wayne..