[PATCHES] - Statlite Support in Samba/GPFS vfs

Ira Cooper ira at samba.org
Mon Oct 6 17:25:58 MDT 2014

On Mon, Oct 6, 2014 at 1:47 PM, Shekhar Amlekar <samlekar at in.ibm.com> wrote:

> Hi Volker,
> Volker Lendecke <Volker.Lendecke at SerNet.DE> wrote on 09/30/2014 07:19:59 PM:
> > From: Volker Lendecke <Volker.Lendecke at SerNet.DE>
> > To: Shekhar Amlekar/India/IBM at IBMIN
> > Cc: samba-technical <samba-technical at lists.samba.org>
> > Date: 09/30/2014 07:17 PM
> > Subject: Re: [PATCHES] - Statlite Support in Samba/GPFS vfs
> >
> > On Tue, Sep 30, 2014 at 04:38:29PM +0530, Shekhar Amlekar wrote:
> > > Please find attached patches that implement statlite support in Samba
> > > and gpfs vfs. Would you please review and let me know any
> > > comments/suggestions that you may have.
> >
> > In general, the patches look good, thanks for that!
> >
> > > It is known that in clustered file systems, a writer on a node can be
> > > slowed down significantly by exact stat calls happening on other
> > > nodes. A partial stat, wherever sufficient, helps in such cases. For
> > > this purpose, we have support for (l)statlite() in gpfs, that we would
> > > like to make use of.
> > >
> > > I've defined the following vfs entry points:
> > >
> > > statlite(handle, dirfd, smb_fname, atflag, slitemask);
> > > fstatlite(handle, fsp, sbuf, slitemask);
> > >
> > > The existing stat/lstat/fstat calls in Samba can be mapped onto the
> > > above lite variants wherever required. Later, if fstatatlite becomes
> > > available, it'd be possible to map the fstatat to statlite vfs call
> > > using the dirfd. I'm identifying places in the code where these lite
> > > calls can be utilized and can post those patches later.
> >
> > Before we put the VFS changes in, I'd like to see some of
> > the users of the calls. The one very obvious call is in
> > unix_convert for directories that we just traverse. Here we
> > don't care about any metadata except whether the name exists
> > and whether it's a directory or something else. Apart from
> > that I think we need to take a close look. For example we
>
> Please find attached patches ("stlite-callers") that
> use the statlite vfs call. These apply on top of the
> ones I sent earlier ("statlite", attached here again).
> I've identified ~10 places in the code, but we could
> use it in more places. Please let me know your
> comments.
>
> I've defined the following wrapper functions around the
> statlite vfs calls for the callers to use:
> vfs_path_exists() and
> vfs_path_get_partial_stat()
>
> > might be able to avoid asking for the write timestamp for
> > files we do have open. There we tend to store the values we
> > have to report in locking.tdb. What we don't store is the
> > current file size, so we have to ask for that. It would be
> > interesting to know from the GPFS people whether we gain
> > anything by asking only for precise st_size but not for
> > st_mtime. GPFS might have to pull the inode data anyway if
> > we have to ask for one precise field. Is GPFS fine-grained
> > enough here?
>
> Currently, in gpfs, we have support for statlite()
> and lstatlite(), but no fstatlite(). I'm told that
> the statlite call is most beneficial when done on
> a path instead of an open file descriptor. However,
> once fstatlite is available, maybe we can look at
> the above instance that you mention?
>
> > This might also revive the discussion that was mentioned in
> > http://lwn.net/Articles/394298/ about Linux xstat. I think
> > that never went anywhere, or am I missing something?
>
> Yes, in fact, the statlite interface in this
> patchset is the same as the one that was proposed
> for the xstat/fxstat system calls.
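
The mask-driven partial stat discussed in the quoted thread can be sketched
roughly as below. This is only an illustration, not the code from the
attached patches: the names partial_stat(), path_exists() and the
PSTAT_EXACT_* bits are hypothetical, and the body simply falls back to a
plain stat(), so it shows the calling convention rather than any GPFS
behaviour. The idea is that the caller states which fields it needs to be
exact, and the backend reports which fields it can actually vouch for.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical mask bits: fields the caller needs to be exact. */
#define PSTAT_EXACT_SIZE  0x1u
#define PSTAT_EXACT_MTIME 0x2u
#define PSTAT_EXACT_ALL   (PSTAT_EXACT_SIZE | PSTAT_EXACT_MTIME)

/*
 * Hypothetical partial stat. On success, *exact_mask holds the fields
 * that are guaranteed exact. This portable fallback calls plain stat(),
 * so every field is exact; a cluster file system backend could instead
 * return cheaper, possibly stale values for fields not in want_mask.
 */
int partial_stat(const char *path, unsigned int want_mask,
                 struct stat *st, unsigned int *exact_mask)
{
    assert(path != NULL && st != NULL && exact_mask != NULL);
    (void)want_mask; /* the fallback cannot do less than a full stat */

    if (stat(path, st) != 0) {
        return -1;
    }
    *exact_mask = PSTAT_EXACT_ALL;
    return 0;
}

/*
 * Existence check in the spirit of the unix_convert case: only the
 * existence and type of the name matter, so no exact fields are requested.
 */
bool path_exists(const char *path, bool *is_dir)
{
    struct stat st;
    unsigned int exact = 0;

    if (partial_stat(path, 0, &st, &exact) != 0) {
        return false;
    }
    if (is_dir != NULL) {
        *is_dir = S_ISDIR(st.st_mode);
    }
    return true;
}
```

A caller traversing a path would call path_exists(component, &is_dir) for
each directory component and request PSTAT_EXACT_SIZE or PSTAT_EXACT_MTIME
only where the reply to the client really depends on those fields.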


I'm sorry I didn't catch this thread earlier.

Jeremy asked about other systems that implement similar functionality.

Ceph: Not sure.  I strongly suspect the answer is no.
Gluster:  I'm fairly sure that Gluster doesn't do this.  (85%+)  At the
least we don't expose it at the user API level.

As far as the change goes: Changes like this are *highly* implementation
dependent, IMHO.  The performance implications will be all over the map for
various systems.

Can you provide some benchmarks supporting this change, and telling us how
much it is helping you?  Supporting data is a huge part of making an
argument like this....

The discussion usually goes: "No, that sounds like a bad idea..." "Here is
my data supporting the change, isn't it cool?  1000% better performance and
5 puppies saved." "Pushed."

Now, the talk about the architectural issues around xstat, etc... Those are
also valid.  I'd like to know why the kernel declined the interface.  So I
don't expect a "Pushed" immediately here... but it at least motivates us to
actually consider a change like this.


-Ira / ira@(samba.org|redhat.com)
