[PATCHES] Issue fsync for SMB2 FLUSH asynchronously

Jeremy Allison jra at samba.org
Thu Nov 12 22:59:23 UTC 2015


On Thu, Nov 12, 2015 at 02:41:36PM -0700, Christof Schmitt wrote:
> On Thu, Nov 12, 2015 at 11:56:47AM -0800, Jeremy Allison wrote:
> > On Wed, Nov 11, 2015 at 06:05:57PM -0800, Jeremy Allison wrote:
> > > > TEST SMB2-BASIC FAILED!
> > > > 
> > > > Let me look at this some more..
> > > 
> > > Found the bug. It's my fault of old..
> > > 
> > > There is a *HIDEOUS* global variable that counts
> > > the number of outstanding aio calls:
> > > 
> > > extern int outstanding_aio_calls;
> > > 
> > > and when source3/modules/vfs_default.c
> > > wants to read the results of an async
> > > call in vfswrap_asys_finished() it
> > > uses a variable length array defined
> > > as:
> > > 
> > > struct asys_result results[outstanding_aio_calls];
> > > 
> > > ret = asys_results(asys_ctx, results, outstanding_aio_calls);
> > > 
> > > to work out how many to read. The vfswrap_fsync_send()
> > > doesn't increment outstanding_aio_calls so when
> > > the fflush finishes it calls asys_results(asys_ctx, results, 0);
> > > which doesn't work too well :-).
> > > 
> > > I need to ensure outstanding_aio_calls is kept up
> > > to date everywhere.
> > > 
> > > Going to clock off now - I'll post a fixed
> > > patch (that passes make test) tomorrow.
> > > 
> > > Sorry for the problem !
> > 
> > OK, here is the revised and working patch.
> > 
> > I removed the global access to the problematic
> > variables, created wrapper functions for them, and
> > ensure that the number of outstanding
> > aio calls is incremented and decremented
> > correctly on the async fsync calls.
> > 
> > Passes local make test ! Please review and
> > push if happy.
> 
> Looks good, pushed to autobuild.
> 
> Thanks for the patches, it would have taken me some time to figure out
> the details.

No problem, thanks for the review !

After some private conversations with Volker,
here is a follow-up patch that will apply
on top of the previous one and removes one
of the global varables completely.

Volker pointed out that we need to rethink
the aio_pending_size limit on outstanding
aio requests.

What happens when we reach this limit is
that we start processing SMB requests
synchronously, which slows down the system
further and we never get to see the replies
from the asys_results write to the internal
pipe as we're too busy reading and processing
incoming SMB requests.

So here is a patchset that does a few
things on top of the one you pushed !

a). Removes --with-aio-support. We no longer
need to run on systems that don't have
pthread support but use POSIX-RT aio_read()
aio_write() and friends. These days we
require pthreads.

This has the advantage that it removes
vfs_aio_posix as it's no longer used
(and indeed it was never even documented !).

Plus is removes a boatload of code and
options processing (as we no longer have
to check for POSIX-RT io calls).

b). Removes the fallback to processing SMB requests
synchronously which as Volker pointed out makes
the problem worse. Instead always put them onto
the internal pthreadpool queue, which will
dispatch them as a thread becomes free.

c). Change the internal aio_pending_size static variable
which is hard-coded at compile time to 100 to a
proper smb.conf global variable "aio max threads"
which is set to 100 by default but is now a
system-tunable.

It passes make test.

We still need to fix the prioritization of
the asys pipe ahead of the incoming SMB
socket problem, but this moves us along
to the point that the priority is the
only remaining issue.

Please review and comment !

Cheers,

	Jeremy.
-------------- next part --------------
From ed30f632048dc001563ebce547ad2aaeb2a22500 Mon Sep 17 00:00:00 2001
From: Jeremy Allison <jra at samba.org>
Date: Thu, 12 Nov 2015 12:44:50 -0800
Subject: [PATCH 1/3] s3: smbd: Remove --with-aio-support. We no longer would
 ever prefer POSIX-RT aio, use pthread_aio instead.

Signed-off-by: Jeremy Allison <jra at samba.org>
---
 packaging/RHEL-CTDB/configure.rpm   |   1 -
 packaging/RHEL-CTDB/samba.spec.tmpl |   2 -
 source3/modules/vfs_aio_posix.c     | 301 ------------------------------------
 source3/modules/wscript_build       |   8 -
 source3/wscript                     |  25 ---
 5 files changed, 337 deletions(-)
 delete mode 100644 source3/modules/vfs_aio_posix.c

diff --git a/packaging/RHEL-CTDB/configure.rpm b/packaging/RHEL-CTDB/configure.rpm
index c36d4bd..a2d35ea 100755
--- a/packaging/RHEL-CTDB/configure.rpm
+++ b/packaging/RHEL-CTDB/configure.rpm
@@ -62,7 +62,6 @@ CC="$CC" CFLAGS="-Wall -g -D_GNU_SOURCE -O3" ./configure -C \
 	--with-ctdb=/usr/include \
 	--without-ldb \
 	--without-dnsupdate \
-	--with-aio-support \
 	--disable-external-libtalloc \
 	--disable-external-libtdb \
 	$*
diff --git a/packaging/RHEL-CTDB/samba.spec.tmpl b/packaging/RHEL-CTDB/samba.spec.tmpl
index 0d8b5a6..6380158 100644
--- a/packaging/RHEL-CTDB/samba.spec.tmpl
+++ b/packaging/RHEL-CTDB/samba.spec.tmpl
@@ -180,7 +180,6 @@ CFLAGS="$RPM_OPT_FLAGS $EXTRA -D_GNU_SOURCE" ./configure \
 	--with-ctdb=/usr/include \
 	--without-ldb \
 	--without-dnsupdate \
-	--with-aio-support \
 	--disable-external-libtalloc \
 	--disable-external-libtdb
 
@@ -428,7 +427,6 @@ exit 0
 %{_libarchdir}/samba/vfs/tsmsm.so
 %endif
 %{_libarchdir}/samba/vfs/xattr_tdb.so
-%{_libarchdir}/samba/vfs/aio_posix.so
 %{_libarchdir}/samba/vfs/aio_pthread.so
 %{_libarchdir}/samba/vfs/media_harmony.so
 
diff --git a/source3/modules/vfs_aio_posix.c b/source3/modules/vfs_aio_posix.c
deleted file mode 100644
index bca69b4..0000000
--- a/source3/modules/vfs_aio_posix.c
+++ /dev/null
@@ -1,301 +0,0 @@
-/*
- * Simulate pread_send/recv and pwrite_send/recv using posix aio
- *
- * Copyright (C) Volker Lendecke 2012
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#include "includes.h"
-#include "system/filesys.h"
-#include "system/shmem.h"
-#include "smbd/smbd.h"
-#include "smbd/globals.h"
-#include "lib/util/tevent_unix.h"
-#include "lib/util/sys_rw.h"
-#include <aio.h>
-
-/* The signal we'll use to signify aio done. */
-#ifndef RT_SIGNAL_AIO
-#define RT_SIGNAL_AIO	(SIGRTMIN+3)
-#endif
-
-#ifndef HAVE_STRUCT_SIGEVENT_SIGEV_VALUE_SIVAL_PTR
-#ifdef HAVE_STRUCT_SIGEVENT_SIGEV_VALUE_SIGVAL_PTR
-#define sival_int	sigval_int
-#define sival_ptr	sigval_ptr
-#endif
-#endif
-
-static struct tevent_signal *aio_signal_event = NULL;
-
-struct aio_posix_state {
-	struct aiocb acb;
-	ssize_t ret;
-	int err;
-};
-
-static int aio_posix_state_destructor(struct aio_posix_state *s)
-{
-	int ret;
-
-	/*
-	 * We could do better here. This destructor is run when a
-	 * request is prematurely cancelled. We wait for the aio to
-	 * complete, so that we do not have to maintain aiocb structs
-	 * beyond the life of an aio_posix_state. Possible, but not
-	 * sure the effort is worth it right now.
-	 */
-
-	do {
-		const struct aiocb *a = &s->acb;
-		ret = aio_suspend(&a, 1, NULL);
-	} while ((ret == -1) && (errno == EINTR));
-
-	return 0;
-}
-
-static struct tevent_req *aio_posix_pread_send(
-	struct vfs_handle_struct *handle,
-	TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-	struct files_struct *fsp, void *data, size_t n, off_t offset)
-{
-	struct tevent_req *req;
-	struct aio_posix_state *state;
-	struct aiocb *a;
-	int ret;
-
-	req = tevent_req_create(mem_ctx, &state, struct aio_posix_state);
-	if (req == NULL) {
-		return NULL;
-	}
-
-	a = &state->acb;
-
-	a->aio_fildes = fsp->fh->fd;
-	a->aio_buf = data;
-	a->aio_nbytes = n;
-	a->aio_offset = offset;
-	a->aio_sigevent.sigev_notify = SIGEV_SIGNAL;
-	a->aio_sigevent.sigev_signo  = RT_SIGNAL_AIO;
-	a->aio_sigevent.sigev_value.sival_ptr = req;
-
-	ret = aio_read(a);
-	if (ret == 0) {
-		talloc_set_destructor(state, aio_posix_state_destructor);
-		return req;
-	}
-
-	if (errno == EAGAIN) {
-		/*
-		 * aio overloaded, do the sync fallback
-		 */
-		state->ret = sys_pread(fsp->fh->fd, data, n, offset);
-		if (state->ret == -1) {
-			state->err = errno;
-		}
-		tevent_req_done(req);
-		return tevent_req_post(req, ev);
-	}
-
-	tevent_req_error(req, errno);
-	return tevent_req_post(req, ev);
-}
-
-static struct tevent_req *aio_posix_pwrite_send(
-	struct vfs_handle_struct *handle,
-	TALLOC_CTX *mem_ctx, struct tevent_context *ev,
-	struct files_struct *fsp, const void *data, size_t n, off_t offset)
-{
-	struct tevent_req *req;
-	struct aio_posix_state *state;
-	struct aiocb *a;
-	int ret;
-
-	req = tevent_req_create(mem_ctx, &state, struct aio_posix_state);
-	if (req == NULL) {
-		return NULL;
-	}
-
-	a = &state->acb;
-
-	a->aio_fildes = fsp->fh->fd;
-	a->aio_buf = discard_const(data);
-	a->aio_nbytes = n;
-	a->aio_offset = offset;
-	a->aio_sigevent.sigev_notify = SIGEV_SIGNAL;
-	a->aio_sigevent.sigev_signo  = RT_SIGNAL_AIO;
-	a->aio_sigevent.sigev_value.sival_ptr = req;
-
-	ret = aio_write(a);
-	if (ret == 0) {
-		talloc_set_destructor(state, aio_posix_state_destructor);
-		return req;
-	}
-
-	if (errno == EAGAIN) {
-		/*
-		 * aio overloaded, do the sync fallback
-		 */
-		state->ret = sys_pwrite(fsp->fh->fd, data, n, offset);
-		if (state->ret == -1) {
-			state->err = errno;
-		}
-		tevent_req_done(req);
-		return tevent_req_post(req, ev);
-	}
-
-	tevent_req_error(req, errno);
-	return tevent_req_post(req, ev);
-}
-
-static void aio_posix_signal_handler(struct tevent_context *ev,
-				     struct tevent_signal *se,
-				     int signum, int count,
-				     void *_info, void *private_data)
-{
-	siginfo_t *info;
-	struct tevent_req *req;
-	struct aio_posix_state *state;
-	int err;
-
-	info = (siginfo_t *)_info;
-	req = talloc_get_type_abort(info->si_value.sival_ptr,
-				    struct tevent_req);
-	state = tevent_req_data(req, struct aio_posix_state);
-
-	err = aio_error(&state->acb);
-	if (err == EINPROGRESS) {
-		DEBUG(10, ("aio_posix_signal_handler: operation req %p "
-			   "still in progress\n", req));
-		return;
-	}
-	if (err == ECANCELED) {
-		DEBUG(10, ("aio_posix_signal_handler: operation req %p "
-			   "canceled\n", req));
-		return;
-	}
-
-	/*
-	 * No need to suspend for this in the destructor anymore
-	 */
-	talloc_set_destructor(state, NULL);
-
-	state->ret = aio_return(&state->acb);
-	state->err = err;
-	tevent_req_done(req);
-}
-
-static ssize_t aio_posix_recv(struct tevent_req *req, int *err)
-{
-	struct aio_posix_state *state = tevent_req_data(
-		req, struct aio_posix_state);
-
-	if (tevent_req_is_unix_error(req, err)) {
-		return -1;
-	}
-	*err = state->err;
-	return state->ret;
-}
-
-static struct tevent_req *aio_posix_fsync_send(
-	struct vfs_handle_struct *handle, TALLOC_CTX *mem_ctx,
-	struct tevent_context *ev, struct files_struct *fsp)
-{
-	struct tevent_req *req;
-	struct aio_posix_state *state;
-	struct aiocb *a;
-	int ret;
-
-	req = tevent_req_create(mem_ctx, &state, struct aio_posix_state);
-	if (req == NULL) {
-		return NULL;
-	}
-
-	a = &state->acb;
-
-	a->aio_fildes = fsp->fh->fd;
-	a->aio_sigevent.sigev_notify = SIGEV_SIGNAL;
-	a->aio_sigevent.sigev_signo  = RT_SIGNAL_AIO;
-	a->aio_sigevent.sigev_value.sival_ptr = req;
-
-	ret = aio_fsync(O_SYNC, a);
-	if (ret == 0) {
-		talloc_set_destructor(state, aio_posix_state_destructor);
-		return req;
-	}
-
-	if (errno == EAGAIN) {
-		/*
-		 * aio overloaded, do the sync fallback
-		 */
-		state->ret = fsync(fsp->fh->fd);
-		if (state->ret == -1) {
-			state->err = errno;
-		}
-		tevent_req_done(req);
-		return tevent_req_post(req, ev);
-	}
-
-	tevent_req_error(req, errno);
-	return tevent_req_post(req, ev);
-}
-
-static int aio_posix_int_recv(struct tevent_req *req, int *err)
-{
-	struct aio_posix_state *state = tevent_req_data(
-		req, struct aio_posix_state);
-
-	if (tevent_req_is_unix_error(req, err)) {
-		return -1;
-	}
-	*err = state->err;
-	return state->ret;
-}
-
-static int aio_posix_connect(vfs_handle_struct *handle, const char *service,
-			     const char *user)
-{
-	if (aio_signal_event == NULL) {
-		struct tevent_context *ev = handle->conn->sconn->ev_ctx;
-
-		aio_signal_event = tevent_add_signal(
-			ev, ev, RT_SIGNAL_AIO, SA_SIGINFO,
-			aio_posix_signal_handler, NULL);
-
-		if (aio_signal_event == NULL) {
-			DEBUG(1, ("tevent_add_signal failed\n"));
-			return -1;
-		}
-	}
-	return SMB_VFS_NEXT_CONNECT(handle, service, user);
-}
-
-static struct vfs_fn_pointers vfs_aio_posix_fns = {
-	.connect_fn = aio_posix_connect,
-	.pread_send_fn = aio_posix_pread_send,
-	.pread_recv_fn = aio_posix_recv,
-	.pwrite_send_fn = aio_posix_pwrite_send,
-	.pwrite_recv_fn = aio_posix_recv,
-	.fsync_send_fn = aio_posix_fsync_send,
-	.fsync_recv_fn = aio_posix_int_recv,
-};
-
-NTSTATUS vfs_aio_posix_init(void);
-NTSTATUS vfs_aio_posix_init(void)
-{
-	return smb_register_vfs(SMB_VFS_INTERFACE_VERSION,
-				"aio_posix", &vfs_aio_posix_fns);
-}
diff --git a/source3/modules/wscript_build b/source3/modules/wscript_build
index 635b780..4dc6653 100644
--- a/source3/modules/wscript_build
+++ b/source3/modules/wscript_build
@@ -321,14 +321,6 @@ bld.SAMBA3_MODULE('vfs_aio_pthread',
                  internal_module=bld.SAMBA3_IS_STATIC_MODULE('vfs_aio_pthread'),
                  enabled=bld.SAMBA3_IS_ENABLED_MODULE('vfs_aio_pthread'))
 
-bld.SAMBA3_MODULE('vfs_aio_posix',
-                 subsystem='vfs',
-                 source='vfs_aio_posix.c',
-                 deps='samba-util tevent',
-                 init_function='',
-                 internal_module=bld.SAMBA3_IS_STATIC_MODULE('vfs_aio_posix'),
-                 enabled=bld.SAMBA3_IS_ENABLED_MODULE('vfs_aio_posix'))
-
 bld.SAMBA3_MODULE('vfs_aio_linux',
                  subsystem='vfs',
                  source='vfs_aio_linux.c',
diff --git a/source3/wscript b/source3/wscript
index 82cb858..990afcd 100644
--- a/source3/wscript
+++ b/source3/wscript
@@ -52,7 +52,6 @@ def set_options(opt):
     opt.SAMBA3_ADD_OPTION('dnsupdate')
     opt.SAMBA3_ADD_OPTION('syslog')
     opt.SAMBA3_ADD_OPTION('automount')
-    opt.SAMBA3_ADD_OPTION('aio-support')
     opt.SAMBA3_ADD_OPTION('dmapi', default=None) # None means autodetection
     opt.SAMBA3_ADD_OPTION('fam', default=None) # None means autodetection
     opt.SAMBA3_ADD_OPTION('profiling-data', default=False)
@@ -547,27 +546,6 @@ return acl_get_perm_np(permset_d, perm);
 		msg='Checking for openat',
 		headers='fcntl.h')
 
-    if Options.options.with_aio_support:
-        conf.CHECK_FUNCS_IN('aio_read', 'aio')
-        conf.CHECK_FUNCS_IN('aio_read', 'rt')
-        conf.CHECK_CODE('struct aiocb a; return aio_read(&a);',
-                        'HAVE_AIO',
-                        msg='Checking for asynchronous io support',
-                        headers='sys/types.h aio.h',
-                        lib='aio rt')
-        if conf.CONFIG_SET('HAVE_AIO'):
-            conf.CHECK_CODE('struct aiocb a; return aio_read(&a);', 'HAVE_AIO_READ', msg='Checking for aio_read', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('struct aiocb a; return aio_write(&a);', 'HAVE_AIO_WRITE', msg='Checking for aio_write', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('struct aiocb a; return aio_fsync(1, &a);', 'HAVE_AIO_FSYNC', msg='Checking for aio_fsync', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('struct aiocb a; return aio_return(&a);', 'HAVE_AIO_RETURN', msg='Checking for aio_return', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('struct aiocb a; return aio_error(&a);', 'HAVE_AIO_ERROR', msg='Checking for aio_error', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('struct aiocb a; return aio_cancel(1, &a);', 'HAVE_AIO_CANCEL', msg='Checking for aio_cancel', headers='aio.h', lib='aio rt')
-            conf.CHECK_CODE('const struct aiocb * const a[1]; struct timespec t; return aio_suspend(a, 1, &t);', 'HAVE_AIO_SUSPEND', msg='Checking for aio_suspend', headers='aio.h', lib='aio rt')
-        if not conf.CONFIG_SET('HAVE_AIO'):
-            conf.DEFINE('HAVE_NO_AIO', '1')
-    else:
-        conf.DEFINE('HAVE_NO_AIO', '1')
-
     if host_os.rfind('linux') > -1:
 	conf.CHECK_FUNCS_IN('io_submit', 'aio')
 	conf.CHECK_CODE('''
@@ -1633,9 +1611,6 @@ main() {
     if Options.options.with_pthreadpool:
         default_shared_modules.extend(TO_LIST('vfs_aio_pthread'))
 
-    if conf.CONFIG_SET('HAVE_AIO'):
-        default_shared_modules.extend(TO_LIST('vfs_aio_posix'))
-
     if conf.CONFIG_SET('HAVE_LINUX_KERNEL_AIO'):
         default_shared_modules.extend(TO_LIST('vfs_aio_linux'))
 
-- 
2.6.0.rc2.230.g3dd15c0


From 15ba1f61c788f5348b7d394925c3bcc1f98cb810 Mon Sep 17 00:00:00 2001
From: Jeremy Allison <jra at samba.org>
Date: Thu, 12 Nov 2015 13:07:21 -0800
Subject: [PATCH 2/3] s3: smbd: Remove checks causing fallback to sync on
 pread/pwrite/fsync.

Rely on pthreadpool queueing instead of falling back.

Signed-off-by: Jeremy Allison <jra at samba.org>
---
 source3/smbd/aio.c        | 36 +-----------------------------------
 source3/smbd/smb2_flush.c | 14 --------------
 2 files changed, 1 insertion(+), 49 deletions(-)

diff --git a/source3/smbd/aio.c b/source3/smbd/aio.c
index 1216a0d..3923eac 100644
--- a/source3/smbd/aio.c
+++ b/source3/smbd/aio.c
@@ -29,7 +29,7 @@
  Statics plus accessor functions.
 *****************************************************************************/
 
-static int aio_pending_size = 100;  /* tevent supports 100 signals SA_SIGINFO */
+static int aio_pending_size = 100; /* Current max threads. */
 static int outstanding_aio_calls;
 
 int get_aio_pending_size(void)
@@ -218,13 +218,6 @@ NTSTATUS schedule_aio_read_and_X(connection_struct *conn,
 		return NT_STATUS_RETRY;
 	}
 
-	if (get_outstanding_aio_calls() >= get_aio_pending_size()) {
-		DEBUG(10,("schedule_aio_read_and_X: Already have %d aio "
-			  "activities outstanding.\n",
-			  get_outstanding_aio_calls() ));
-		return NT_STATUS_RETRY;
-	}
-
 	/* The following is safe from integer wrap as we've already checked
 	   smb_maxcnt is 128k or less. Wct is 12 for read replies */
 
@@ -484,19 +477,6 @@ NTSTATUS schedule_aio_write_and_X(connection_struct *conn,
 		return NT_STATUS_RETRY;
 	}
 
-	if (get_outstanding_aio_calls() >= get_aio_pending_size()) {
-		DEBUG(3,("schedule_aio_write_and_X: Already have %d aio "
-			 "activities outstanding.\n",
-			  get_outstanding_aio_calls() ));
-		DEBUG(10,("schedule_aio_write_and_X: failed to schedule "
-			  "aio_write for file %s, offset %.0f, len = %u "
-			  "(mid = %u)\n",
-			  fsp_str_dbg(fsp), (double)startpos,
-			  (unsigned int)numtowrite,
-			  (unsigned int)smbreq->mid ));
-		return NT_STATUS_RETRY;
-	}
-
 	bufsize = smb_size + 6*2;
 
 	if (!(aio_ex = create_aio_extra(NULL, fsp, bufsize))) {
@@ -744,13 +724,6 @@ NTSTATUS schedule_smb2_aio_read(connection_struct *conn,
 		return NT_STATUS_RETRY;
 	}
 
-	if (get_outstanding_aio_calls() >= get_aio_pending_size()) {
-		DEBUG(10,("smb2: Already have %d aio "
-			"activities outstanding.\n",
-			get_outstanding_aio_calls() ));
-		return NT_STATUS_RETRY;
-	}
-
 	/* Create the out buffer. */
 	*preadbuf = data_blob_talloc(ctx, NULL, smb_maxcnt);
 	if (preadbuf->data == NULL) {
@@ -900,13 +873,6 @@ NTSTATUS schedule_aio_smb2_write(connection_struct *conn,
 		return NT_STATUS_RETRY;
 	}
 
-	if (get_outstanding_aio_calls() >= get_aio_pending_size()) {
-		DEBUG(3,("smb2: Already have %d aio "
-			"activities outstanding.\n",
-			get_outstanding_aio_calls() ));
-		return NT_STATUS_RETRY;
-	}
-
 	if (smbreq->unread_bytes) {
 		/* Can't do async with recvfile. */
 		return NT_STATUS_RETRY;
diff --git a/source3/smbd/smb2_flush.c b/source3/smbd/smb2_flush.c
index d26707a..00b0535 100644
--- a/source3/smbd/smb2_flush.c
+++ b/source3/smbd/smb2_flush.c
@@ -162,20 +162,6 @@ static struct tevent_req *smbd_smb2_flush_send(TALLOC_CTX *mem_ctx,
 		return tevent_req_post(req, ev);
 	}
 
-	if (get_outstanding_aio_calls() >= get_aio_pending_size()) {
-		/* No more allowed aio. Synchronous flush. */
-		NTSTATUS status = sync_file(smbreq->conn, fsp, true);
-		if (!NT_STATUS_IS_OK(status)) {
-			DEBUG(5,("sync_file for %s returned %s\n",
-				fsp_str_dbg(fsp),
-				nt_errstr(status)));
-			tevent_req_nterror(req, status);
-			return tevent_req_post(req, ev);
-		}
-		tevent_req_done(req);
-		return tevent_req_post(req, ev);
-	}
-
 	ret = flush_write_cache(fsp, SAMBA_SYNC_FLUSH);
 	if (ret == -1) {
 		tevent_req_nterror(req,  map_nt_error_from_unix(errno));
-- 
2.6.0.rc2.230.g3dd15c0


From e0b45b96c58d4dddf4a4960125f118b50b8eaaa1 Mon Sep 17 00:00:00 2001
From: Jeremy Allison <jra at samba.org>
Date: Thu, 12 Nov 2015 13:23:30 -0800
Subject: [PATCH 3/3] s3: smbd: Change aio_pending_size static variable to a
 new "aio max threads" smb.conf parameter.

Removes accessor functions as now this parameter is set
under user control in smb.conf. Default is 100.

Note that this doesn't limit the number of outstanding
aio requests, it just causes them to go onto the
pthreadpool queue.

Now we need to prioritize pthreadpool pipe replies
ahead of incoming SMB2 requests, but that's a patch
for another day.

Based on ideas from Volker.

Signed-off-by: Jeremy Allison <jra at samba.org>
---
 docs-xml/smbdotconf/tuning/aiomaxthreads.xml | 19 +++++++++++++++++++
 source3/modules/vfs_aio_fork.c               | 11 -----------
 source3/modules/vfs_aio_linux.c              | 22 ++--------------------
 source3/modules/vfs_aio_pthread.c            |  4 ++--
 source3/modules/vfs_default.c                |  2 +-
 source3/smbd/aio.c                           | 11 -----------
 source3/smbd/proto.h                         |  2 --
 7 files changed, 24 insertions(+), 47 deletions(-)
 create mode 100644 docs-xml/smbdotconf/tuning/aiomaxthreads.xml

diff --git a/docs-xml/smbdotconf/tuning/aiomaxthreads.xml b/docs-xml/smbdotconf/tuning/aiomaxthreads.xml
new file mode 100644
index 0000000..df06a2b
--- /dev/null
+++ b/docs-xml/smbdotconf/tuning/aiomaxthreads.xml
@@ -0,0 +1,19 @@
+<samba:parameter name="aio max threads"
+                 type="integer"
+                 context="G"
+                 xmlns:samba="http://www.samba.org/samba/DTD/samba-doc">
+<description>
+  <para>
+    The integer parameter specifies the maximum number of
+    threads Samba will create when doing parallel asynchronous IO
+    calls. If the number of outstanding calls is greater than this
+    number the requests will not be refused but go onto a queue
+    and will be scheduled in turn as outstanding requests complete.
+  </para>
+
+  <related>aio read size</related>
+  <related>aio write size</related>
+</description>
+
+<value type="default">100</value>
+</samba:parameter>
diff --git a/source3/modules/vfs_aio_fork.c b/source3/modules/vfs_aio_fork.c
index 25a72c6..472ef0c 100644
--- a/source3/modules/vfs_aio_fork.c
+++ b/source3/modules/vfs_aio_fork.c
@@ -899,17 +899,6 @@ static int aio_fork_connect(vfs_handle_struct *handle, const char *service,
 				NULL, struct aio_fork_config,
 				return -1);
 
-	/*********************************************************************
-	 * How many threads to initialize ?
-	 * 100 per process seems insane as a default until you realize that
-	 * (a) Threads terminate after 1 second when idle.
-	 * (b) Throttling is done in SMB2 via the crediting algorithm.
-	 * (c) SMB1 clients are limited to max_mux (50) outstanding
-	 *     requests and Windows clients don't use this anyway.
-	 * Essentially we want this to be unlimited unless smb.conf
-	 * says different.
-	 *********************************************************************/
-	set_aio_pending_size(100);
 	return 0;
 }
 
diff --git a/source3/modules/vfs_aio_linux.c b/source3/modules/vfs_aio_linux.c
index 599272e..4f6230a 100644
--- a/source3/modules/vfs_aio_linux.c
+++ b/source3/modules/vfs_aio_linux.c
@@ -113,12 +113,12 @@ static bool init_aio_linux(struct vfs_handle_struct *handle)
 		goto fail;
 	}
 
-	if (io_queue_init(get_aio_pending_size(), &io_ctx)) {
+	if (io_queue_init(lp_aio_max_threads(), &io_ctx)) {
 		goto fail;
 	}
 
 	DEBUG(10,("init_aio_linux: initialized with up to %d events\n",
-		  get_aio_pending_size()));
+		  (int)lp_aio_max_threads()));
 
 	return true;
 
@@ -321,25 +321,7 @@ static int aio_linux_int_recv(struct tevent_req *req, int *err)
 	return aio_linux_recv(req, err);
 }
 
-static int aio_linux_connect(vfs_handle_struct *handle, const char *service,
-			       const char *user)
-{
-	/*********************************************************************
-	 * How many io_events to initialize ?
-	 * 128 per process seems insane as a default until you realize that
-	 * (a) Throttling is done in SMB2 via the crediting algorithm.
-	 * (b) SMB1 clients are limited to max_mux (50) outstanding
-	 *     requests and Windows clients don't use this anyway.
-	 * Essentially we want this to be unlimited unless smb.conf
-	 * says different.
-	 *********************************************************************/
-	set_aio_pending_size(lp_parm_int(
-		SNUM(handle->conn), "aio_linux", "aio num events", 128));
-	return SMB_VFS_NEXT_CONNECT(handle, service, user);
-}
-
 static struct vfs_fn_pointers vfs_aio_linux_fns = {
-	.connect_fn = aio_linux_connect,
 	.pread_send_fn = aio_linux_pread_send,
 	.pread_recv_fn = aio_linux_recv,
 	.pwrite_send_fn = aio_linux_pwrite_send,
diff --git a/source3/modules/vfs_aio_pthread.c b/source3/modules/vfs_aio_pthread.c
index 72c812f..10a3a23 100644
--- a/source3/modules/vfs_aio_pthread.c
+++ b/source3/modules/vfs_aio_pthread.c
@@ -51,7 +51,7 @@ static bool init_aio_threadpool(struct tevent_context *ev_ctx,
 		return true;
 	}
 
-	ret = pthreadpool_init(get_aio_pending_size(), pp_pool);
+	ret = pthreadpool_init(lp_aio_max_threads(), pp_pool);
 	if (ret) {
 		errno = ret;
 		return false;
@@ -69,7 +69,7 @@ static bool init_aio_threadpool(struct tevent_context *ev_ctx,
 	}
 
 	DEBUG(10,("init_aio_threadpool: initialized with up to %d threads\n",
-		  get_aio_pending_size()));
+		  (int)lp_aio_max_threads()));
 
 	return true;
 }
diff --git a/source3/modules/vfs_default.c b/source3/modules/vfs_default.c
index f3ebb89..819a1a1 100644
--- a/source3/modules/vfs_default.c
+++ b/source3/modules/vfs_default.c
@@ -716,7 +716,7 @@ static bool vfswrap_init_asys_ctx(struct smbd_server_connection *conn)
 		return true;
 	}
 
-	ret = asys_context_init(&ctx, get_aio_pending_size());
+	ret = asys_context_init(&ctx, lp_aio_max_threads());
 	if (ret != 0) {
 		DEBUG(1, ("asys_context_init failed: %s\n", strerror(ret)));
 		return false;
diff --git a/source3/smbd/aio.c b/source3/smbd/aio.c
index 3923eac..32a1ce0 100644
--- a/source3/smbd/aio.c
+++ b/source3/smbd/aio.c
@@ -29,19 +29,8 @@
  Statics plus accessor functions.
 *****************************************************************************/
 
-static int aio_pending_size = 100; /* Current max threads. */
 static int outstanding_aio_calls;
 
-int get_aio_pending_size(void)
-{
-	return aio_pending_size;
-}
-
-void set_aio_pending_size(int newsize)
-{
-	aio_pending_size = newsize;
-}
-
 int get_outstanding_aio_calls(void)
 {
 	return outstanding_aio_calls;
diff --git a/source3/smbd/proto.h b/source3/smbd/proto.h
index 95414e6..7926dd6 100644
--- a/source3/smbd/proto.h
+++ b/source3/smbd/proto.h
@@ -66,8 +66,6 @@ void srv_set_signing(struct smbXsrv_connection *conn,
 
 /* The following definitions come from smbd/aio.c  */
 
-int get_aio_pending_size(void);
-void set_aio_pending_size(int newsize);
 int get_outstanding_aio_calls(void);
 void increment_outstanding_aio_calls(void);
 void decrement_outstanding_aio_calls(void);
-- 
2.6.0.rc2.230.g3dd15c0



More information about the samba-technical mailing list