[Samba] [lustre-discuss] Odd "File exists" behavior when copy-pasting many files to an SMB exported Lustre FS

Michael Weiser michael.weiser at atos.net
Thu Sep 22 09:21:42 UTC 2022

Hey Daniel! :)
Hi Jeremy,
Hi Bjoern,

I'm cross-posting to the samba list again as I think this might be of interest there as well and to keep the threads together.

> > That leaves the question, where that extended attribute user.lustre.lov is coming from. It
> > appears that Lustre itself exposes a number of extended attributes for every file which
> > reflect internal housekeeping data:
> >
> > $ getfattr -m - blarf
> > # file: blarf
> > lustre.lov
> > trusted.link
> > trusted.lma
> > trusted.lov

> Try adding

>   lustre.* skip

> to /etc/xattr.conf (cf.
> https://doc.lustre.org/lustre_manual.xhtml#lustre_configure_multiple_fs).
> Haven't tested yet, but Samba's EA handling seems to be libattr-based,
> so the above tweak should be sufficient to keep smbd off of these
> fs-specific attributes.

Thanks for the tip! I tried it and unfortunately it didn't work. From my reading of the code, it seems that samba uses libattr only as a fallback if no compatible system implementation of fgetxattr can be found. In both of my cases (RHEL 8.6/samba-4.15.5 and my debug system, Debian testing/samba-4.17.0) it seems to use the system interface directly.

Running with that: From the Lustre documentation it appears that there have been problems with exposing Lustre internal data via extended attributes in the past, prompting the xattr.conf workaround:

[from the docs]:
> If a client(s) will be mounted on several file systems, add the following line to /etc/xattr.conf
> file to avoid problems when files are moved between the file systems: lustre.* skip

What exactly were those problems hinted at in the documentation?
Is the visibility of the lustre.lov attribute for unprivileged users actually needed for anything?
Can exposing it to unprivileged users be switched off Lustre-side?

Looking at xattr.conf highlights the fact that Lustre isn't as singular as I thought in exposing filesystem-specific attributes such as lustre.lov. At least AFS and XFS seem to do the same, or at least have done so at some point in time. (From the comment "obsolete" on xfsroot.* it appears that may have been changed.)

Jeremy wrote:

> Great analysis Michael ! As we're emulating NTFS CreateFile
> we can't do the 'create with EA's' atomically.

> Lustre really should not be exposing EA's to callers if
> it doesn't actually support EA's.

> An elegant solution might be to add a Samba VFS module
> vfs_lustre.c that intercepts fgetxattr/fsetxattr/flistxattr calls and simply
> strips out the lustre.lov EA's from being seen.

I had a first implementation of that attached to my first mail. Did that get through? In my case it was enough to mask lustre.lov in flistxattr and fgetxattr so that clients never get to see it and fsetxattr is never attempted.
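
To illustrate the masking idea (this is not the attached patch and not the Samba VFS API), here is a minimal Python sketch. It assumes the name list arrives as a NUL-separated byte buffer, as flistxattr returns it, and that the attribute to hide is spelled lustre.lov at that layer. The function names mask_xattr_names and masked_getxattr are hypothetical.

```python
import errno
import os


def mask_xattr_names(raw: bytes, hidden=(b"lustre.lov",)) -> bytes:
    """Drop hidden names from a NUL-separated xattr name list."""
    names = [n for n in raw.split(b"\0") if n]
    kept = [n for n in names if n not in hidden]
    # Re-terminate every surviving name with NUL, like flistxattr does.
    return b"".join(n + b"\0" for n in kept)


def masked_getxattr(name: bytes, real_get, hidden=(b"lustre.lov",)) -> bytes:
    """Pretend hidden attributes do not exist; defer to real_get otherwise.

    real_get is a hypothetical callable standing in for the real lookup.
    """
    if name in hidden:
        raise OSError(errno.ENODATA, os.strerror(errno.ENODATA))
    return real_get(name)
```

With the listing hidden, a client never learns about the attribute and so never tries to write it back, which is why fsetxattr needs no special handling in this sketch.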

What's bugging me about this approach as well as the xattr.conf workaround is that the error behaviour on the client side is so very unintuitive. How will we get people to correlate some "file already exists" error with peculiar extended attributes behaviour of their file system so they become aware they need to configure a workaround? It'd certainly be nice if we could find a way for samba to "just do the right thing"[tm].

> Can you log a bug in our bugzilla and upload all this info
> so we can track it ?

That's underway (requested an account) and I'll upload everything there.

> >is this really Lustre specific? I assume we see the same effect on Linux with
> >other filesystems that don't support EAs.

> No. Lustre is returning "fictional" EA's that
> cannot be set. Linux filesystems that don't have
> EA's don't do that.

That the attributes use a non-existent namespace (lustre.*) doesn't seem exactly right[tm] to me either. And it would wreak havoc if samba were actually able to set the canonicalised user.lustre.lov attribute when copying it back, duplicating it and likely overwriting it somewhat non-deterministically later.

But more crucially, what seems problematic here is that Lustre supports listing and reading extended attributes for unprivileged users but does not allow setting them, and returns ENOTSUP rather than EPERM or something else at that. So samba would need to take into account that a filesystem might not support extended attributes as a whole, but might support some operations on them and not others. I'm with Bjoern that there likely are or will be other filesystems with peculiar extended attribute behaviour.
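
A rough Python sketch of what "supports some operations but not others" could look like when probed per operation. The function names and the result labels are made up for illustration. On Linux one would pass os.listxattr, os.getxattr and os.setxattr as the three callables; note that the set probe really writes the given value if the filesystem allows it.

```python
import errno


def classify(call) -> str:
    """Run one xattr operation and reduce its outcome to a coarse label."""
    try:
        call()
    except OSError as exc:
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return "unsupported"
        if exc.errno in (errno.EPERM, errno.EACCES):
            return "denied"
        if exc.errno == errno.ENODATA:
            return "absent"  # operation works, attribute is just not there
        return "error"
    return "ok"


def probe_xattr_ops(path, name, value, list_fn, get_fn, set_fn) -> dict:
    """Probe list/get/set separately instead of assuming all-or-nothing."""
    return {
        "list": classify(lambda: list_fn(path)),
        "get": classify(lambda: get_fn(path, name)),
        "set": classify(lambda: set_fn(path, name, value)),
    }
```

For the behaviour described above, such a probe would be expected to report list and get as "ok" and set as "unsupported".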

What might be the possible fallout from removing the created file in the error code path? Shouldn't it be safe with proper locking in place as it appears to be?
Wouldn't a best-effort cleanup in the error path be better than leaving a known-to-be-incorrect state behind?
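
To make the question concrete, a minimal Python sketch of the pattern I have in mind. It is not samba code: create_with_xattrs and its set_fn parameter are hypothetical, and os.setxattr is only available on Linux.

```python
import os


def create_with_xattrs(path, xattrs, set_fn=None):
    """Create path exclusively, then apply xattrs; remove it on failure."""
    set_fn = set_fn or os.setxattr  # assumption: Linux-only default
    # O_EXCL guarantees that the file we may later unlink is one we created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        for name, value in xattrs.items():
            set_fn(fd, name, value)
    except OSError:
        # Best effort: a failed cleanup must not hide the original error.
        try:
            os.unlink(path)
        except OSError:
            pass
        raise
    finally:
        os.close(fd)
```

The unlink goes by path, so without locking another process could have replaced the file in between. That is the part where I'm relying on the locking that appears to be in place.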

# /etc/xattr.conf
# Format:
# <pattern> <action>
# Actions:
#   permissions - copy when trying to preserve permissions.
#   skip - do not copy.

system.nfs4_acl                 permissions
system.nfs4acl                  permissions
system.posix_acl_access         permissions
system.posix_acl_default        permissions
trusted.SGI_ACL_DEFAULT         skip            # xfs specific
trusted.SGI_ACL_FILE            skip            # xfs specific
trusted.SGI_CAP_FILE            skip            # xfs specific
trusted.SGI_DMI_*               skip            # xfs specific
trusted.SGI_MAC_FILE            skip            # xfs specific
xfsroot.*                       skip            # xfs specific; obsolete
user.Beagle.*                   skip            # ignore Beagle index data
security.evm                    skip            # may only be written by kernel
afs.*                           skip            # AFS metadata and ACLs
lustre.*                        skip
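
For illustration, the pattern/action format above could be interpreted roughly like this in Python. This is a sketch under the assumption that patterns are shell-style globs and that the first matching line wins; it is not the libattr implementation, and parse_xattr_conf and action_for are made-up names.

```python
import fnmatch


def parse_xattr_conf(text):
    """Return (pattern, action) pairs, ignoring comments and blank lines."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            rules.append((fields[0], fields[1]))
    return rules


def action_for(name, rules):
    """Return the action of the first matching pattern, or None."""
    for pattern, action in rules:
        if fnmatch.fnmatchcase(name, pattern):
            return action
    return None
```

Under these assumptions lustre.lov maps to skip, while the canonicalised user.lustre.lov would not match the lustre.* pattern at all.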
