Wildcard handling
Gerald Carter
gcarter at valinux.com
Sun Feb 11 17:10:58 GMT 2001
Andrew Tridgell wrote:
>
> > PS: Sorry about that Tridge. Will be more careful to fully
> > understand what's going on in the future
>
> no worries, I spent several weeks delving into
> the dark recesses of SMB wildcards so I know how hard
> it is to work them out.
This stuff is utterly awful!!! :-\ Wildcards are absolutely
awful! You weren't kidding, Andrew.
> the real trick with this stuff is to reproduce it
> using masktest. If masktest can't reproduce the
> problem then we need to know why (it should find
> any such problems).
OK, here is what I have found. DOS implements a different
wildcard matching algorithm than Win98/NT. I believe that
WfWg also falls in line with the DOS client, but I do not
have a 16-bit Windows box to test this right now.
In a nutshell, the difference is in how the ? and .
characters in the pattern should be handled. Under DOS,
???????.?? will match a file with 7 or fewer characters
in the name and 0-2 characters in the file extension. On
Win98/NT, this same pattern would only match files with
exactly 7 characters in the name and 2 characters in the extension.
We were implementing Win98/NT semantics (even for DOS clients),
hence all the problems being reported from WfWg/DOS clients.
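
To make the difference concrete, here is a minimal sketch of the two '?' behaviours described above. It is not the Samba code: the names qmark_mode and qmark_match are hypothetical, only '?' and literal characters are handled (no '*', '<', '>' or '"'), and a name with no '.' at all is not given any special treatment.

```c
/* Hypothetical names for illustration only; not part of Samba. */
enum qmark_mode { QMARK_WIN, QMARK_DOS };

/* Returns 1 if name matches pattern, 0 otherwise.
 * Only '?' and literal characters (including '.') are handled. */
static int qmark_match(const char *p, const char *n, enum qmark_mode mode)
{
    for (; *p; p++) {
        if (*p == '?') {
            if (*n != '\0' && (mode == QMARK_WIN || *n != '.')) {
                n++;          /* consume exactly one character */
            } else if (mode == QMARK_WIN) {
                return 0;     /* Win98/NT: no character left to consume */
            }
            /* DOS: at a '.' or at the end of the name, '?' matches nothing */
        } else {
            if (*p != *n)
                return 0;     /* literal (including '.') must match exactly */
            n++;
        }
    }
    return *n == '\0';        /* the whole name must have been consumed */
}
```

With the pattern ???????.?? this sketch accepts abc.d only in QMARK_DOS mode, while abcdefg.hi is accepted in both modes.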
The new ms_fnmatch() code I'm including here works under my tests
of DOS, Win98 & NT4. The '?' code contains an ugly hack
using get_remote_arch() to determine what semantics to implement.
If anyone can think of a cleaner way to implement this, please
let me know.
With regard to the problem reported by Richard Bollinger:
...With nt smb support = no, when a win98 client
types 'dir *.*', only filenames with a '.' in them
are returned by Samba.
The problem here is that when 'nt smb support' is disabled,
the Win98 client attempts to match *.* in the TRANS2FindFirst
command (as opposed to sending * when 'nt smb support = yes').
This is also fixed by the new matching code.
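
As a rough illustration of why *.* has to match names without a '.', here is a stripped-down matcher. It is only a sketch under my own assumptions: dot_match is a hypothetical name, only '*', '?', '.' and literals are handled, and '?' uses the Win98/NT rule (it must consume a character). The one special case is that a '.' in the pattern is allowed to match the end of the name, provided the rest of the pattern can match the empty string.

```c
/* Hypothetical helper for illustration only; not part of Samba.
 * Returns 1 if name matches pattern, 0 otherwise. */
static int dot_match(const char *p, const char *n)
{
    for (; *p; p++) {
        switch (*p) {
        case '*':
            /* try the rest of the pattern against every tail of the
               name, including the empty tail */
            for (;; n++) {
                if (dot_match(p + 1, n))
                    return 1;
                if (*n == '\0')
                    return 0;
            }
        case '?':
            if (*n == '\0')
                return 0;     /* '?' must consume one character */
            n++;
            break;
        case '.':
            /* name exhausted: accept only if the rest of the pattern
               can match the empty string (e.g. a trailing '*') */
            if (*n == '\0')
                return dot_match(p + 1, n);
            /* fall through */
        default:
            if (*p != *n)
                return 0;
            n++;
        }
    }
    return *n == '\0';
}
```

In this sketch dot_match("*.*", "README") succeeds because the '.' is matched against the end of the name, while dot_match("*.txt", "README") still fails.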
The problem with this code right now is that the test
if (*n == 0) ...
when dealing with .'s in the wildcard breaks bin/masktest.
However, this code is needed by DOS clients and
Win98.
Here are some failures...
--+ +++ 64 mask=[\masktest\?.**] file=[\masktest\h.] mfile=[]
--- ++- 284 mask=[\masktest\>".] file=[\masktest\fghgmjmmficeb.]
mfile=[]
--- ++- 319 mask=[\masktest\<..] file=[\masktest\mda.al.eemhi]
mfile=[]
++- +++ 400 mask=[\masktest\<.<] file=[\masktest\fdbahmdcfaefacf]
mfile=[]
If you remove the '.' case entirely, DOS clients break. If,
however, you exclude the test for all but DOS clients, then
the problem reported by Richard B. is still present on Win98.
According to section 3.4 of the expired CIFS spec...
If the client is using 8.3 names, each part of
the name ( base (8) or extension (3) ) is treated
separately. For long filenames the . in the
name is significant even though there is no longer
a restriction on the size of each of the components.
which seems to imply that the test is necessary. Like I
say, it appears to work for real clients, but breaks masktest.
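
One way to read "each part of the name is treated separately" is sketched below. This is my own interpretation of the spec text, not anything taken from the spec or from Samba: the names name_parts, split_name, component_match and match83 are hypothetical, the split is done at the last '.', and only '?' (with the DOS rule that it may match nothing once the component is exhausted) and literals are handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types and helpers for illustration only. */
struct name_parts {
    const char *base; size_t base_len;
    const char *ext;  size_t ext_len;
};

/* Split at the last '.'; a name with no '.' has an empty extension. */
static struct name_parts split_name(const char *s)
{
    struct name_parts r;
    const char *dot = strrchr(s, '.');

    r.base = s;
    if (dot != NULL) {
        r.base_len = (size_t)(dot - s);
        r.ext = dot + 1;
    } else {
        r.base_len = strlen(s);
        r.ext = s + r.base_len;
    }
    r.ext_len = strlen(r.ext);
    return r;
}

/* DOS-style match of one component: '?' consumes a character if one
 * is left and otherwise matches nothing; literals must match exactly. */
static int component_match(const char *pat, size_t plen,
                           const char *s, size_t slen)
{
    size_t i, j = 0;

    for (i = 0; i < plen; i++) {
        if (pat[i] == '?') {
            if (j < slen)
                j++;
        } else {
            if (j >= slen || pat[i] != s[j])
                return 0;
            j++;
        }
    }
    return j == slen;
}

/* Match base against base and extension against extension. */
static int match83(const char *pattern, const char *name)
{
    struct name_parts p = split_name(pattern);
    struct name_parts n = split_name(name);

    return component_match(p.base, p.base_len, n.base, n.base_len) &&
           component_match(p.ext, p.ext_len, n.ext, n.ext_len);
}
```

Under these assumptions ???????.?? matches abc.d and also abc (empty extension), but not abcdefgh.ij, since the base has 8 characters.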
Andrew, can you comment on this?
Cheers, jerry
----------------------------------------------------------------------
/\ Gerald (Jerry) Carter Professional Services
\/ http://www.valinux.com/ VA Linux Systems gcarter at valinux.com
http://www.samba.org/ SAMBA Team jerry at samba.org
http://www.plainjoe.org/ jerry at plainjoe.org
"...a hundred billion castaways looking for a home."
- Sting "Message in a Bottle" ( 1979 )
int ms_fnmatch(char *pattern, char *string)
{
    char *p = pattern, *n = string;
    char *n_ext;
    char c;

    while ((c = *p++)) {
        switch (c) {
        case '?':
            /* WARNING!!! Ugly hack for DOS clients. A '?' can
               match a NULL character. This is different
               from Windows 9x/NT. Why did MS have to match
               wildcards in 2 different ways? -- jerry */
            if (get_remote_arch() == RA_WFWG) {
                /* If we have matched up to this point and
                   now have an empty string, then match only if
                   all that remains in the pattern is wildcards */
                if ( *n == 0 ) {
                    for (; *p; p++) {
                        c = *p;
                        if ((c!='?') && (c!='*')) break;
                    }
                    if (*p == 0) return 0;
                }
                /* are we on to file extensions? If the pattern
                   only contains wildcards up to the extension
                   (if there is one), then chew up those characters */
                else if ( *n == '.' ) {
                    for (; *p; p++) {
                        c = *p;
                        if ((c!='?') && (c!='*')) break;
                    }
                    if (*p == 0) return -1;
                }
                /* standard case of chewing up characters */
                else {
                    n++;
                }
            }
            else {
                /* a '?' must match a character; do not
                   match a NULL */
                if (! *n) return -1;
                n++;
            }
            break;

        case '>':
            /* no change here... */

        case '*':
            /* WARNING!! ugly hack to prevent processing over a
               file extension we may need to match later. (e.g. *.) */
            n_ext = NULL;
            for (; *n; n++) {
                if (ms_fnmatch(p, n) == 0) return 0;
                /* save the file extension for later */
                if (*n == '.') n_ext = n;
            }
            /* reset to the file extension again. If we always reset
               to the extension, then 'dir *' will only
               match filenames with no extension. Only needed in
               the top level of the recursion I think --jerry */
            if (n_ext && *p == '.') n = n_ext;
            break;

        case '<':
            /* no change here... */

        case '"':
            /* no change here... */

        case '.':
            /* a pattern with all wildcards after the '.'
               should match filenames with no extension */
            if (*n == 0 && ms_fnmatch(p, n) == 0) return 0;
            /* match it as a character */
            if (c != *n) return -1;
            n++;
            break;

        default:
            if (c != *n) return -1;
            n++;
        }
    }

    if (! *n) return 0;

    return -1;
}