vfs_acl_xattr and Linux memory fragmentation

Uri Simchoni uri at samba.org
Wed Apr 12 17:09:50 UTC 2017


Hi,

Here's a patch that adds a 4K buffer step between 1K and 64K.

Starting with 1K means we keep the same behavior for the majority of
security descriptors out there - no negative impact in user mode for
what aims to be an improvement in kernel mode.

I've been wrong stating that Linux file systems can't handle an EA
larger than 4K - both XFS and BTRFS (as well as others) can do that.

About efficiency - what I've been able to discern, without going into
too much detail, is that:
a. There's no per-object overhead, you kmalloc 4K you use 4K.
b. There should be an advantage to allocating 4K over 64, but not
because it happens to be page size, but rather because it has a backing
pool (see  kmalloc-4096 in /proc/slabinfo). If that pool needs to
allocate more 4K buffers, it does so in 32K units, which is also prone
to fragmentation issues, but more often than not, you have 4K objects in
the pool. 64K allocations don't have such a pool so every time you
either have contiguous 64K, or you don't (and once memory becomes
fragmented, you normally don't).

Thanks,
Uri.

On 04/10/2017 11:26 AM, Christoph Hellwig wrote:
> On Fri, Apr 07, 2017 at 11:18:20AM +0300, Uri Simchoni via samba-technical wrote:
>> 2. In older Linux (prior to 3.4), getxattr() used to require a
>> physically-contiguous (kmalloc'd) buffer whose size equals the max size
>> requested by the user. So even though no Linux in-tree file system
>> supports EA's larger than 4K, the getxattr() system call would try to
>> allocate 64K if Samba passes this number.
> 
> XFS supports larger than 4k xattr values.
> 
> That beeing said a 4k try after 1k seems inherently reasonable to me.
> 

-------------- next part --------------
From f4514efa95371a8517e14373db5f0ad25d2a6d48 Mon Sep 17 00:00:00 2001
From: Uri Simchoni <uri at samba.org>
Date: Sun, 9 Apr 2017 00:20:40 +0300
Subject: [PATCH 1/3] selftest: test fetching a large ACL from vfs_acl_xattr

Add a test that fetches an ACL whose size is larger than 4K.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=12737

Signed-off-by: Uri Simchoni <uri at samba.org>
---
 source3/script/tests/test_large_acl.sh | 59 ++++++++++++++++++++++++++++++++++
 source3/selftest/tests.py              |  1 +
 2 files changed, 60 insertions(+)
 create mode 100755 source3/script/tests/test_large_acl.sh

diff --git a/source3/script/tests/test_large_acl.sh b/source3/script/tests/test_large_acl.sh
new file mode 100755
index 0000000..9b6901f
--- /dev/null
+++ b/source3/script/tests/test_large_acl.sh
@@ -0,0 +1,59 @@
+#!/bin/bash
+#
+# Blackbox test for fetching a large ACL
+#
+
+if [ $# -lt 5 ]; then
+cat <<EOF
+Usage: $0 SERVER USERNAME PASSWORD SMBCLIENT SMBCACLS PARAMS
+EOF
+exit 1;
+fi
+
+SERVER=${1}
+USERNAME=${2}
+PASSWORD=${3}
+SMBCLIENT=${4}
+SMBCACLS=${5}
+shift 5
+ADDARGS="$*"
+SMBCLIENT="$VALGRIND ${SMBCLIENT} ${ADDARGS}"
+SMBCACLS="$VALGRIND ${SMBCACLS} ${ADDARGS}"
+
+incdir=`dirname $0`/../../../testprogs/blackbox
+. $incdir/subunit.sh
+
+# build a file to work with
+build_files()
+{
+    touch large_acl
+    $SMBCLIENT //$SERVER/acl_xattr_ign_sysacl_windows -U $USERNAME%$PASSWORD -c 'put large_acl' > /dev/null 2>&1
+    rm -rf large_acl > /dev/null
+}
+
+cleanup()
+{
+    $SMBCLIENT //$SERVER/acl_xattr_ign_sysacl_windows -U $USERNAME%$PASSWORD -c 'rm large_acl' > /dev/null 2>&1
+}
+
+build_files
+
+test_large_acl()
+{
+    #An ACL with 200 entries, ~7K
+    new_acl=$(seq 1001 1200 | sed -r -e '1 i\D:(A;;0x001f01ff;;;WD)' -e 's/(.*)/(A;;0x001f01ff;;;S-1-5-21-11111111-22222222-33333333-\1)/' | tr -d '\n')
+    $SMBCACLS //$SERVER/acl_xattr_ign_sysacl_windows -U $USERNAME%$PASSWORD --sddl -S $new_acl large_acl
+    actual_acl=$($SMBCACLS //$SERVER/acl_xattr_ign_sysacl_windows -U $USERNAME%$PASSWORD --sddl --numeric large_acl 2>/dev/null | sed -rn 's/.*(D:.*)/\1/p' | tr -d '\n')
+    if [ ! "$new_acl" = "$actual_acl" ] ; then
+        echo -e "expected:\n$new_acl\nactual:\n$actual_acl\n"
+        return 1
+    fi
+}
+
+failed=0
+
+testit "able to retrieve a large ACL if VFS supports it" test_large_acl || failed=`expr $failed + 1`
+
+cleanup
+
+exit $failed
diff --git a/source3/selftest/tests.py b/source3/selftest/tests.py
index d0e2ae6..3959439 100755
--- a/source3/selftest/tests.py
+++ b/source3/selftest/tests.py
@@ -228,6 +228,7 @@ for env in ["fileserver"]:
     plantestsuite("samba3.blackbox.inherit_owner.default(%s)" % env, env, [os.path.join(samba3srcdir, "script/tests/test_inherit_owner.sh"), '$SERVER', '$USERNAME', '$PASSWORD', '$PREFIX', smbclient3, smbcacls, 'tmp', '0', '0', '-m', 'NT1'])
     plantestsuite("samba3.blackbox.inherit_owner.full (%s)" % env, env, [os.path.join(samba3srcdir, "script/tests/test_inherit_owner.sh"), '$SERVER', '$USERNAME', '$PASSWORD', '$PREFIX', smbclient3, smbcacls, 'inherit_owner', '1', '1', '-m', 'NT1'])
     plantestsuite("samba3.blackbox.inherit_owner.unix (%s)" % env, env, [os.path.join(samba3srcdir, "script/tests/test_inherit_owner.sh"), '$SERVER', '$USERNAME', '$PASSWORD', '$PREFIX', smbclient3, smbcacls, 'inherit_owner_u', '0', '1', '-m', 'NT1'])
+    plantestsuite("samba3.blackbox.large_acl (%s)" % env, env, [os.path.join(samba3srcdir, "script/tests/test_large_acl.sh"), '$SERVER', '$USERNAME', '$PASSWORD', smbclient3, smbcacls])
 
     #
     # tar command tests
-- 
2.9.3


From d506aa1eb1de923301d3ca0b7c05e40156ebe0e3 Mon Sep 17 00:00:00 2001
From: Uri Simchoni <uri at samba.org>
Date: Sun, 9 Apr 2017 00:40:44 +0300
Subject: [PATCH 2/3] vfs_acl_xattr: factor out fetching of an extended
 attribute

Pure refactoring - add a function that fetches an extended attribute
based on either the file descriptor or the file name.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=12737

Signed-off-by: Uri Simchoni <uri at samba.org>
---
 source3/modules/vfs_acl_xattr.c | 44 +++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/source3/modules/vfs_acl_xattr.c b/source3/modules/vfs_acl_xattr.c
index e1f90ff..85b9a0d 100644
--- a/source3/modules/vfs_acl_xattr.c
+++ b/source3/modules/vfs_acl_xattr.c
@@ -37,6 +37,35 @@
  Pull a security descriptor into a DATA_BLOB from a xattr.
 *******************************************************************/
 
+static ssize_t getxattr_do(vfs_handle_struct *handle,
+			   files_struct *fsp,
+			   const struct smb_filename *smb_fname,
+			   const char *xattr_name,
+			   uint8_t *val,
+			   size_t size)
+{
+	ssize_t sizeret;
+	int saved_errno = 0;
+
+	become_root();
+	if (fsp && fsp->fh->fd != -1) {
+		sizeret = SMB_VFS_FGETXATTR(fsp, xattr_name, val, size);
+	} else {
+		sizeret = SMB_VFS_GETXATTR(handle->conn, smb_fname->base_name,
+					   XATTR_NTACL_NAME, val, size);
+	}
+	if (sizeret == -1) {
+		saved_errno = errno;
+	}
+	unbecome_root();
+
+	if (saved_errno != 0) {
+		errno = saved_errno;
+	}
+
+	return sizeret;
+}
+
 static NTSTATUS get_acl_blob(TALLOC_CTX *ctx,
 			vfs_handle_struct *handle,
 			files_struct *fsp,
@@ -47,7 +76,6 @@ static NTSTATUS get_acl_blob(TALLOC_CTX *ctx,
 	uint8_t *val = NULL;
 	uint8_t *tmp;
 	ssize_t sizeret;
-	int saved_errno = 0;
 
 	ZERO_STRUCTP(pblob);
 
@@ -60,21 +88,11 @@ static NTSTATUS get_acl_blob(TALLOC_CTX *ctx,
 	}
 	val = tmp;
 
-	become_root();
-	if (fsp && fsp->fh->fd != -1) {
-		sizeret = SMB_VFS_FGETXATTR(fsp, XATTR_NTACL_NAME, val, size);
-	} else {
-		sizeret = SMB_VFS_GETXATTR(handle->conn, smb_fname->base_name,
-					XATTR_NTACL_NAME, val, size);
-	}
-	if (sizeret == -1) {
-		saved_errno = errno;
-	}
-	unbecome_root();
+	sizeret =
+	    getxattr_do(handle, fsp, smb_fname, XATTR_NTACL_NAME, val, size);
 
 	/* Max ACL size is 65536 bytes. */
 	if (sizeret == -1) {
-		errno = saved_errno;
 		if ((errno == ERANGE) && (size != 65536)) {
 			/* Too small, try again. */
 			size = 65536;
-- 
2.9.3


From 0f18115c36c625aa7cd810fc38eca5296888a7a8 Mon Sep 17 00:00:00 2001
From: Uri Simchoni <uri at samba.org>
Date: Sun, 9 Apr 2017 01:08:22 +0300
Subject: [PATCH 3/3] vfs_acl_xattr: Linux-friendly getxattr

When fetching the ACL extended attribute, try a 4K
buffer between the 1K and the 64K buffers.

The reason is Linux-specific, although other OSs may
find this advantagous as well. Under Linux, to handle
getxattr() call, the Linux VFS layer first allocates a
physically-contiguous buffer with the size of the getxattr
request, irrespective of the actual EA size. The previous
algorithm jumped from 1K to 64K. Using a buffer of more
than 8K means there's no pooled object to back it.
On kernels prior to 3.6, that could mean the call would fail with
ENOMEM as there's no physically-coniguous 64K block. Newer
kernels fall back to virtual memory allocation. Whether
the fallback exists or not, we don't want to reach it. The
4K step between 1K and 64K allocates memory that is backed
by an object pool (slab) and is sufficient for most uses.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=12737

Signed-off-by: Uri Simchoni <uri at samba.org>
---
 source3/modules/vfs_acl_xattr.c | 47 ++++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/source3/modules/vfs_acl_xattr.c b/source3/modules/vfs_acl_xattr.c
index 85b9a0d..e114b70 100644
--- a/source3/modules/vfs_acl_xattr.c
+++ b/source3/modules/vfs_acl_xattr.c
@@ -72,41 +72,44 @@ static NTSTATUS get_acl_blob(TALLOC_CTX *ctx,
 			const struct smb_filename *smb_fname,
 			DATA_BLOB *pblob)
 {
-	size_t size = 1024;
+	size_t size;
 	uint8_t *val = NULL;
 	uint8_t *tmp;
-	ssize_t sizeret;
+	ssize_t sizeret = -1;
+	size_t sizes[] = {1024, 4 * 1024, 64 * 1024};
+	int size_idx;
 
 	ZERO_STRUCTP(pblob);
 
-  again:
+	for (size_idx = 0; size_idx < ARRAY_SIZE(sizes); ++size_idx) {
+		size = sizes[size_idx];
+		tmp = talloc_realloc(ctx, val, uint8_t, size);
+		if (tmp == NULL) {
+			TALLOC_FREE(val);
+			return NT_STATUS_NO_MEMORY;
+		}
+		val = tmp;
 
-	tmp = talloc_realloc(ctx, val, uint8_t, size);
-	if (tmp == NULL) {
-		TALLOC_FREE(val);
-		return NT_STATUS_NO_MEMORY;
-	}
-	val = tmp;
+		sizeret = getxattr_do(handle, fsp, smb_fname, XATTR_NTACL_NAME,
+				      val, size);
 
-	sizeret =
-	    getxattr_do(handle, fsp, smb_fname, XATTR_NTACL_NAME, val, size);
+		if (sizeret != -1) {
+			break;
+		}
 
-	/* Max ACL size is 65536 bytes. */
-	if (sizeret == -1) {
-		if ((errno == ERANGE) && (size != 65536)) {
-			/* Too small, try again. */
-			size = 65536;
-			goto again;
+		if (errno != ERANGE) {
+			break;
 		}
+	}
 
-		/* Real error - exit here. */
+	if (sizeret >= 0) {
+		pblob->data = val;
+		pblob->length = sizeret;
+		return NT_STATUS_OK;
+	} else {
 		TALLOC_FREE(val);
 		return map_nt_error_from_unix(errno);
 	}
-
-	pblob->data = val;
-	pblob->length = sizeret;
-	return NT_STATUS_OK;
 }
 
 /*******************************************************************
-- 
2.9.3



More information about the samba-technical mailing list