Wildcard handling

Gerald Carter gcarter at valinux.com
Sun Feb 11 17:10:58 GMT 2001


Andrew Tridgell wrote:
> 
> > PS:  Sorry about that Tridge.  Will be more careful to fully
> > understand what's going on in the future
> 
> no worries, I spent several weeks delving into 
> the dark recesses of SMB wildcards so I know how hard 
> it is to work them out.

This stuff is utterly awful!!! :-\  Wildcards are absolutely 
awful!  You weren't kidding, Andrew.

> the real trick with this stuff is to reproduce it 
> using masktest. If masktest can't reproduce the 
> problem then we need to know why (it should find 
> any such problems).

ok.  Here is what I have found.  DOS implements a different
wildcard matching algorithm than Win98/NT.  I believe that
WfWg also falls in line with the DOS client, but I do not
have a 16-bit Windows box to test this right now.

In a nutshell, the difference is in how the ? and . 
characters in the pattern should be handled.  Under DOS,
???????.??  will match a file with 1-7 characters 
in the name and 0-2 characters in the file extension.  On 
Win98/NT, this same pattern would only match files with 
exactly 7 characters in the name and 2 characters in the extension.
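
To make the difference concrete, here is a minimal sketch of the two
'?' behaviours.  It is not the Samba code: q_match() is a hypothetical
helper that handles only literal characters, '?' and '.', and its dos
flag stands in for whatever client detection is used.  The DOS branch
assumes a '?' may match nothing at the end of a name component, which
is my reading of the behaviour described above.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Samba.  Supports only literal
   characters, '?' and '.' in the pattern. */
static bool q_match(const char *p, const char *n, bool dos)
{
   for (; *p; p++) {
      if (*p == '?') {
         /* DOS (assumed): at the end of the base name or of the
            whole name, a '?' may match nothing */
         if (dos && (*n == '\0' || *n == '.'))
            continue;
         /* Win98/NT: a '?' must consume exactly one character */
         if (*n == '\0')
            return false;
         n++;
      } else if (*p == '.' && dos && *n == '\0') {
         /* DOS (assumed): a name with no extension still
            satisfies the '.' in the pattern */
         continue;
      } else {
         /* literal character, including a '.' that is present */
         if (*p != *n)
            return false;
         n++;
      }
   }
   return *n == '\0';
}
```

With this sketch, q_match("???????.??", "abc.h", true) succeeds while
the same call with dos set to false fails; "abcdefg.hi" matches under
both settings.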

We were implementing Win98/NT semantics (even for DOS clients),
hence all the problems being reported from WfWg/DOS clients.
The new ms_fnmatch() code I'm including here works under my tests
of DOS, Win98 & NT4.  The '?' code contains an ugly hack
using get_remote_arch() to determine what semantics to implement.
If anyone can think of a cleaner way to implement this, please
let me know.

With regard to the problem reported by Richard Bollinger:

  ...With nt smb support = no, when a win98 client 
   types 'dir *.*', only filenames with a '.' in them
   are returned by Samba.

The problem here is that when 'nt smb support' is disabled,
the Win98 client attempts to match *.* in the TRANS2FindFirst 
command (as opposed to sending * when 'nt smb support = yes').
This is also fixed by the new matching code.
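
A rough illustration of why *.* misses names without a '.', and of
the kind of special case that avoids it.  This is a simplified sketch,
not ms_fnmatch(): star_match() and only_wildcards() are hypothetical
names, only '*', '?' and literals are handled, and the dot_fix flag
exists purely to show the behaviour with and without the special case.

```c
#include <stdbool.h>
#include <string.h>

/* true if the rest of the pattern consists only of '*' and '?' */
static bool only_wildcards(const char *p)
{
   return p[strspn(p, "*?")] == '\0';
}

/* Hypothetical recursive matcher, not the Samba implementation. */
static bool star_match(const char *p, const char *n, bool dot_fix)
{
   if (*p == '\0')
      return *n == '\0';

   if (*p == '*') {
      /* try the rest of the pattern against every suffix of
         the name, including the empty one */
      for (;; n++) {
         if (star_match(p + 1, n, dot_fix))
            return true;
         if (*n == '\0')
            return false;
      }
   }

   /* special case: the name has ended at a '.' in the pattern
      and only wildcards follow, so a name with no extension
      is accepted */
   if (dot_fix && *p == '.' && *n == '\0')
      return only_wildcards(p + 1);

   if (*n == '\0')
      return false;
   if (*p != '?' && *p != *n)
      return false;
   return star_match(p + 1, n + 1, dot_fix);
}
```

Without the special case, star_match("*.*", "README", false) fails
because the '.' has nothing to match; with dot_fix set to true it
succeeds.  "a.txt" matches either way.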

The problem with this code right now is that the test 

	if (*n == 0) ...

when dealing with .'s in the wildcard breaks bin/masktest.
However, this code works and is needed by DOS clients and 
Win98.

Here are some failures...

--+ +++ 64 mask=[\masktest\?.**] file=[\masktest\h.] mfile=[]
--- ++- 284 mask=[\masktest\>".] file=[\masktest\fghgmjmmficeb.] mfile=[]
--- ++- 319 mask=[\masktest\<..] file=[\masktest\mda.al.eemhi] mfile=[]
++- +++ 400 mask=[\masktest\<.<] file=[\masktest\fdbahmdcfaefacf] mfile=[]

If you remove the '.' case entirely, DOS clients break.  If, 
however, you exclude the test for all but DOS clients, then
the problem reported by Richard B. is still present on Win98.

According to section 3.4 of the expired CIFS spec...

	If the client is using 8.3 names, each part of 
	the name ( base (8) or extension (3) ) is treated 
	separately.  For long filenames the . in the
	name is significant even though there is no longer 
	a restriction on the size of each of the components.

which seems to imply that the test is necessary.  Like I 
say, it appears to work for real clients, but breaks masktest.


Andrew, can you comment on this?







Cheers, jerry
----------------------------------------------------------------------
   /\  Gerald (Jerry) Carter                     Professional Services
 \/    http://www.valinux.com/  VA Linux Systems   gcarter at valinux.com
       http://www.samba.org/       SAMBA Team          jerry at samba.org
       http://www.plainjoe.org/                     jerry at plainjoe.org

       "...a hundred billion castaways looking for a home."
                                - Sting "Message in a Bottle" ( 1979 )



int ms_fnmatch(char *pattern, char *string)
{
   char *p = pattern, *n = string;
   char *n_ext;   /* last '.' seen in string; used by the '*' case */
   char c;

   while ((c = *p++)) {
      switch (c) {
      case '?':
         /* WARNING!!! Ugly hack for DOS clients.  A '?' can
            match a NUL character.  This is different
            from Windows 9x/NT.  Why did MS have to match
            wildcards in 2 different ways?   -- jerry */
         if (get_remote_arch() == RA_WFWG) {
            /* If we have matched up to this point and
               now have an empty string, then match only if
               all that remains in the pattern is wildcards */
            if (*n == 0) {
               for (; *p; p++) {
                  c = *p;
                  if ((c != '?') && (c != '*')) break;
               }
               if (*p == 0) return 0;
            }

            /* are we on to file extensions?  If the pattern
               only contains wildcards up to the extension
               (if there is one), then chew up those characters */
            else if (*n == '.') {
               for (; *p; p++) {
                  c = *p;
                  if ((c != '?') && (c != '*')) break;
               }
               if (*p == 0) return -1;
            }

            /* standard case of chewing up characters */
            else {
               n++;
            }
         }
         else {
            /* a '?' must match a character; do not
               match a NUL */
            if (!*n) return -1;
            n++;
         }

         break;

      case '>':
         /* no change here... */

      case '*':
         /* WARNING!!  ugly hack to prevent processing over a
            file extension we may need to match later.  (e.g. *.) */
         n_ext = NULL;
         for (; *n; n++) {
            if (ms_fnmatch(p, n) == 0) return 0;

            /* save the file extension for later */
            if (*n == '.') n_ext = n;
         }

         /* reset to the file extension again.  If we always reset
            to the extension, then 'dir *' will only
            match filenames with no extension.  Only needed in
            the top level of the recursion I think  --jerry */
         if (n_ext && *p == '.') n = n_ext;

         break;

      case '<':
         /* no change here... */

      case '"':
         /* no change here... */

      case '.':
         /* a pattern with all wildcards after the '.'
            should match filenames with no extension */
         if (*n == 0 && ms_fnmatch(p, n) == 0) return 0;

         /* match it as a character */
         if (c != *n) return -1;
         n++;

         break;

      default:
         if (c != *n) return -1;
         n++;
      }
   }
	
   if (! *n) return 0;
	
   return -1;
}




