unix_convert() and amount of stat() calls

Jeremy Allison jra at samba.org
Wed Sep 8 17:38:06 MDT 2010


On Wed, Sep 08, 2010 at 04:09:53PM -0700, Jeremy Allison wrote:
> On Wed, Sep 08, 2010 at 02:58:10PM -0700, Jeremy Allison wrote:
> > On Tue, Sep 07, 2010 at 09:40:17PM +0200, Stefan (metze) Metzmacher wrote:
> > > Hi Jeremy,
> > > 
> > > what's the reason that we loop over the components
> > > from the root to the leaf, when checking them?
> > > 
> > > Can't we start with the last component?
> > > At least if it's related to a wildcard search.
> > > 
> > > Maybe we only need to check the last 2 components,
> > > unless the name is mangled?
> > > 
> > > The problem I want to solve in this branch:
> > > http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/samba-3.4.7-ctdb-10
> > > is this:
> > > 
> > > FindFirst with 'path\to\some\dir\with\files\*'
> > > 
> > > triggers the following stat calls
> > > 
> > > path\to\some\dir\with\files\* => ENOENT (this should be already solved
> > > with my patches)
> > > path\
> > > path\to\
> > > path\to\some\
> > > path\to\some\dir\
> > > path\to\some\dir\with\
> > > path\to\some\dir\with\files\
> > > path\to\some\dir\with\files\* => ENOENT
> > 
> > Ok, I have a patch for this. Give me a min and I'll post it.
> 
> Ok - here's the patch that should fix it. It passes both
> RAW-CHKPATH and RAW-SEARCH. It adds an optimization into
> the wildcard lookup code such that it checks if the wildcard
> is in the last component of the lookup, and if so tries to
> directly do a stat on the parent directory. If this succeeds
> it short-circuits the pathname walk and starts the lookup at
> the last component.
> 
> This means an incoming pathname of :
> 
> FindFirst with 'path\to\some\dir\with\files\*'
> 
> should now do the following stat calls (presuming
> the directory exists of course):
> 
> path\to\some\dir\with\files\* => ENOENT
> path\to\some\dir\with\files = OK
> 
> It doesn't currently add "path\to\some\dir\with\files"
> into the stat cache (although that could be added) as
> I'd have to alloc and truncate the incoming path at
> the same position as the parent directory length, which
> might be a little tricky. Anyway that's not neccessary
> for correctness.
> 
> I'm going to do some more stream testing, and then commit
> if it looks good. Metze (or Volker) please comment.

Found some bugs myself :-). Here's version 2...

Jeremy.
-------------- next part --------------
diff --git a/source3/smbd/filename.c b/source3/smbd/filename.c
index 8f9b855..eadb977 100644
--- a/source3/smbd/filename.c
+++ b/source3/smbd/filename.c
@@ -382,10 +382,12 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
 				 * or a missing intermediate component ?
 				 */
 				struct smb_filename parent_fname;
+				const char *last_component = NULL;
+
 				ZERO_STRUCT(parent_fname);
 				if (!parent_dirname(ctx, smb_fname->base_name,
 							&parent_fname.base_name,
-							NULL)) {
+							&last_component)) {
 					status = NT_STATUS_NO_MEMORY;
 					goto fail;
 				}
@@ -401,7 +403,40 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
 						status = NT_STATUS_OBJECT_PATH_NOT_FOUND;
 						goto fail;
 					}
+				} else if (ret == 0) {
+					/*
+					 * stat() or lstat() of the parent dir
+					 * succeeded. So start the walk
+					 * at this point.
+					 */
+					status = check_for_dot_component(&parent_fname);
+					if (!NT_STATUS_IS_OK(status)) {
+						goto fail;
+					}
+
+		 			/*
+					 * If there was no parent component in
+					 * smb_fname->base_name then
+					 * don't do this optimization.
+					 */
+					if (smb_fname->base_name != last_component) {
+						/*
+						 * Safe to use CONST_DISCARD
+						 * here as last_component points
+						 * into our smb_fname->base_name.
+						 */
+						start = CONST_DISCARD(char *,
+							last_component);
+
+						DEBUG(5,("unix_convert optimize1: name "
+							"= %s, dirpath = %s, "
+							"start = %s\n",
+							smb_fname->base_name,
+							dirpath,
+							start));
+					}
 				}
+
 				/*
 				 * Missing last component is ok - new file.
 				 * Also deal with permission denied elsewhere.
@@ -410,6 +445,61 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
 				goto done;
 			}
 		}
+	} else {
+		/*
+		 * We have a wildcard in the pathname.
+		 *
+		 * Optimization for common case where the wildcard
+		 * is in the last component and the client already
+		 * sent the correct case.
+		 */
+		struct smb_filename parent_fname;
+		const char *last_component = NULL;
+
+		ZERO_STRUCT(parent_fname);
+		if (!parent_dirname(ctx, smb_fname->base_name,
+					&parent_fname.base_name,
+					&last_component)) {
+			status = NT_STATUS_NO_MEMORY;
+			goto fail;
+		}
+		/*
+		 * If there was no parent component in
+		 * smb_fname->base_name then
+		 * don't do this optimization.
+		 */
+		if ((smb_fname->base_name != last_component) &&
+				!ms_has_wild(parent_fname.base_name)) {
+			/*
+			 * Wildcard isn't in the parent, i.e.
+			 * it must be in the last component.
+			 */
+			if (posix_pathnames) {
+				ret = SMB_VFS_LSTAT(conn, &parent_fname);
+			} else {
+				ret = SMB_VFS_STAT(conn, &parent_fname);
+			}
+			if (ret == 0) {
+				status = check_for_dot_component(&parent_fname);
+				if (!NT_STATUS_IS_OK(status)) {
+					goto fail;
+				}
+
+				/*
+				 * Safe to use CONST_DISCARD
+				 * here as last_component points
+				 * into our smb_fname->base_name.
+				 */
+				start = CONST_DISCARD(char *,last_component);
+
+				DEBUG(5,("unix_convert optimize2: name "
+					"= %s, dirpath = %s, "
+					"start = %s\n",
+					smb_fname->base_name,
+					dirpath,
+					start));
+			}
+		}
 	}
 
 	/*
@@ -480,6 +570,12 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
 			goto fail;
 		}
 
+		/* Skip the stat call if it's a wildcard end. */
+		if (name_has_wildcard) {
+			DEBUG(5,("Wildcard %s\n",start));
+			goto done;
+		}
+
 		/*
 		 * Check if the name exists up to this point.
 		 */


More information about the samba-technical mailing list