unix_convert() and amount of stat() calls
Jeremy Allison
jra at samba.org
Wed Sep 8 17:38:06 MDT 2010
On Wed, Sep 08, 2010 at 04:09:53PM -0700, Jeremy Allison wrote:
> On Wed, Sep 08, 2010 at 02:58:10PM -0700, Jeremy Allison wrote:
> > On Tue, Sep 07, 2010 at 09:40:17PM +0200, Stefan (metze) Metzmacher wrote:
> > > Hi Jeremy,
> > >
> > > what's the reason that we loop over the components
> > > from the root to the leaf, when checking them?
> > >
> > > Can't we start with the last component?
> > > At least if it's related to a wildcard search.
> > >
> > > Maybe we only need to check the last 2 components,
> > > unless the name is mangled?
> > >
> > > The problem I want to solve in this branch:
> > > http://gitweb.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/samba-3.4.7-ctdb-10
> > > is this:
> > >
> > > FindFirst with 'path\to\some\dir\with\files\*'
> > >
> > > triggers the following stat calls
> > >
> > > path\to\some\dir\with\files\* => ENOENT (this should be already solved
> > > with my patches)
> > > path\
> > > path\to\
> > > path\to\some\
> > > path\to\some\dir\
> > > path\to\some\dir\with\
> > > path\to\some\dir\with\files\
> > > path\to\some\dir\with\files\* => ENOENT
> >
> > Ok, I have a patch for this. Give me a min and I'll post it.
>
> Ok - here's the patch that should fix it. It passes both
> RAW-CHKPATH and RAW-SEARCH. It adds an optimization into
> the wildcard lookup code such that it checks if the wildcard
> is in the last component of the lookup, and if so tries to
> directly do a stat on the parent directory. If this succeeds
> it short-circuits the pathname walk and starts the lookup at
> the last component.
>
> This means an incoming pathname of :
>
> FindFirst with 'path\to\some\dir\with\files\*'
>
> should now do the following stat calls (presuming
> the directory exists of course):
>
> path\to\some\dir\with\files\* => ENOENT
> path\to\some\dir\with\files = OK
>
> It doesn't currently add "path\to\some\dir\with\files"
> into the stat cache (although that could be added) as
> I'd have to alloc and truncate the incoming path at
> the same position as the parent directory length, which
> might be a little tricky. Anyway that's not neccessary
> for correctness.
>
> I'm going to do some more stream testing, and then commit
> if it looks good. Metze (or Volker) please comment.
Found some bugs myself :-). Here's version 2...
Jeremy.
-------------- next part --------------
diff --git a/source3/smbd/filename.c b/source3/smbd/filename.c
index 8f9b855..eadb977 100644
--- a/source3/smbd/filename.c
+++ b/source3/smbd/filename.c
@@ -382,10 +382,12 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
* or a missing intermediate component ?
*/
struct smb_filename parent_fname;
+ const char *last_component = NULL;
+
ZERO_STRUCT(parent_fname);
if (!parent_dirname(ctx, smb_fname->base_name,
&parent_fname.base_name,
- NULL)) {
+ &last_component)) {
status = NT_STATUS_NO_MEMORY;
goto fail;
}
@@ -401,7 +403,40 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
status = NT_STATUS_OBJECT_PATH_NOT_FOUND;
goto fail;
}
+ } else if (ret == 0) {
+ /*
+ * stat() or lstat() of the parent dir
+ * succeeded. So start the walk
+ * at this point.
+ */
+ status = check_for_dot_component(&parent_fname);
+ if (!NT_STATUS_IS_OK(status)) {
+ goto fail;
+ }
+
+ /*
+ * If there was no parent component in
+ * smb_fname->base_name then
+ * don't do this optimization.
+ */
+ if (smb_fname->base_name != last_component) {
+ /*
+ * Safe to use CONST_DISCARD
+ * here as last_component points
+ * into our smb_fname->base_name.
+ */
+ start = CONST_DISCARD(char *,
+ last_component);
+
+ DEBUG(5,("unix_convert optimize1: name "
+ "= %s, dirpath = %s, "
+ "start = %s\n",
+ smb_fname->base_name,
+ dirpath,
+ start));
+ }
}
+
/*
* Missing last component is ok - new file.
* Also deal with permission denied elsewhere.
@@ -410,6 +445,61 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
goto done;
}
}
+ } else {
+ /*
+ * We have a wildcard in the pathname.
+ *
+ * Optimization for common case where the wildcard
+ * is in the last component and the client already
+ * sent the correct case.
+ */
+ struct smb_filename parent_fname;
+ const char *last_component = NULL;
+
+ ZERO_STRUCT(parent_fname);
+ if (!parent_dirname(ctx, smb_fname->base_name,
+ &parent_fname.base_name,
+ &last_component)) {
+ status = NT_STATUS_NO_MEMORY;
+ goto fail;
+ }
+ /*
+ * If there was no parent component in
+ * smb_fname->base_name then
+ * don't do this optimization.
+ */
+ if ((smb_fname->base_name != last_component) &&
+ !ms_has_wild(parent_fname.base_name)) {
+ /*
+ * Wildcard isn't in the parent, i.e.
+ * it must be in the last component.
+ */
+ if (posix_pathnames) {
+ ret = SMB_VFS_LSTAT(conn, &parent_fname);
+ } else {
+ ret = SMB_VFS_STAT(conn, &parent_fname);
+ }
+ if (ret == 0) {
+ status = check_for_dot_component(&parent_fname);
+ if (!NT_STATUS_IS_OK(status)) {
+ goto fail;
+ }
+
+ /*
+ * Safe to use CONST_DISCARD
+ * here as last_component points
+ * into our smb_fname->base_name.
+ */
+ start = CONST_DISCARD(char *,last_component);
+
+ DEBUG(5,("unix_convert optimize2: name "
+ "= %s, dirpath = %s, "
+ "start = %s\n",
+ smb_fname->base_name,
+ dirpath,
+ start));
+ }
+ }
}
/*
@@ -480,6 +570,12 @@ NTSTATUS unix_convert(TALLOC_CTX *ctx,
goto fail;
}
+ /* Skip the stat call if it's a wildcard end. */
+ if (name_has_wildcard) {
+ DEBUG(5,("Wildcard %s\n",start));
+ goto done;
+ }
+
/*
* Check if the name exists up to this point.
*/
More information about the samba-technical
mailing list