[SCM] Samba Shared Repository - branch v4-10-test updated

Wed Sep 4 12:51:02 UTC 2019

The branch, v4-10-test, has been updated
       via  7e07bc4f289 vfs_glusterfs: Use pthreadpool for scheduling aio operations
       via  f5017935a7b ctdb-recoverd: Fix typo in previous fix
       via  25dacde5c8f ctdb-tests: Clear deleted record via recovery instead of vacuuming
       via  f39a9c2a4be ctdb-tests: Strengthen volatile DB traverse test
       via  530119888c6 ctdb-recoverd: Only check for LMASTER nodes in the VNN map
       via  9cbb50d2e9d ctdb-tests: Don't retrieve the VNN map from target node for notlmaster
       via  3e0205ec026 ctdb-tests: Handle special cases first and return
       via  576f5e30351 ctdb-tests: Inline handling of recovered and notlmaster statuses
       via  d0b666a1a8d ctdb-tests: Drop unused node statuses frozen/unfrozen
       via  594a2a95cea ctdb-tests: Reformat node_has_status()
      from  981f8b164d3 VERSION: Bump version up to 4.10.9.

https://git.samba.org/?p=samba.git;a=shortlog;h=v4-10-test


- Log -----------------------------------------------------------------
commit 7e07bc4f2893ceedeb9f01537be2770f5bf6dda9
Author: Poornima G <pgurusid at redhat.com>
Date:   Wed Jul 24 15:15:33 2019 +0530

    vfs_glusterfs: Use pthreadpool for scheduling aio operations
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14098
    
    Signed-off-by: Poornima G <pgurusid at redhat.com>
    Reviewed-by: Guenther Deschner <gd at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>
    
    Autobuild-User(master): Jeremy Allison <jra at samba.org>
    Autobuild-Date(master): Fri Aug 23 18:40:08 UTC 2019 on sn-devel-184
    
    (cherry picked from commit d8863dd8cb74bb0534457ca930a71e77c367d994)
    
    Autobuild-User(v4-10-test): Karolin Seeger <kseeger at samba.org>
    Autobuild-Date(v4-10-test): Wed Sep  4 12:49:59 UTC 2019 on sn-devel-144

commit f5017935a7b517d982310097322f98681a8a1608
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 27 12:13:51 2019 +1000

    ctdb-recoverd: Fix typo in previous fix
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Tue Aug 27 15:29:11 UTC 2019 on sn-devel-184
    
    (cherry picked from commit 8190993d99284162bd8699780248bb2edfec2673)

commit 25dacde5c8f6bf6d94fe0e3bada4108003418412
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 13 14:45:33 2019 +1000

    ctdb-tests: Clear deleted record via recovery instead of vacuuming
    
    This test has been flapping because sometimes the record is not
    vacuumed within the expected time period, perhaps even because the
    check for the record can interfere with vacuuming.  However, instead
    of waiting for vacuuming, the record can be cleared by doing a
    recovery.  This should be much more reliable.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    RN: Fix flapping CTDB tests
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Martin Schwenke <martins at samba.org>
    Autobuild-Date(master): Wed Aug 21 13:06:57 UTC 2019 on sn-devel-184
    
    (cherry picked from commit 71ad473ba805abe23bbe6c1a1290612e448e73f3)

commit f39a9c2a4be089e4a3c99357f5494bb03362abf2
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 17:22:50 2019 +1000

    ctdb-tests: Strengthen volatile DB traverse test
    
    Check the record count more often, from multiple nodes.  Add a case
    with multiple records.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit ca4df06080709adf0cbebc95b0a70b4090dad5ba)
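
A minimal sketch of the record-count check described in this commit, assuming the `catdb` output ends with a line of the form `Dumped N records` (the form matched by the sed expression in the test). The helper name `count_dumped_records` and the sample text are hypothetical and only illustrate the parsing step; they are not taken from a real cluster.

```shell
# Hypothetical helper: extract N from a "Dumped N records" line on stdin.
count_dumped_records ()
{
	sed -n -e 's|^Dumped \(.*\) records$|\1|p'
}

# Illustrative sample only, not captured from a real cluster.
sample_catdb_output='data(4) = "bar1"

Dumped 1 records'

num=$(printf '%s\n' "$sample_catdb_output" | count_dumped_records)
if [ "$num" = 1 ] ; then
	echo "OK: Number of records=${num}"
else
	echo "BAD: There were ${num} (!= 1) records"
fi
```

Running the same check against every node, as the commit does, catches the case where one node's traverse returns a different number of records from another's.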

commit 530119888c68c3951a084844a3c4fbd92b52a0ec
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Aug 21 14:35:09 2019 +1000

    ctdb-recoverd: Only check for LMASTER nodes in the VNN map
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 5d655ac6f2ff82f8f1c89b06870d600a1a3c7a8a)

commit 9cbb50d2e9d11bc7314cf5685e1f5af9bb70bbb0
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 16:45:07 2019 +1000

    ctdb-tests: Don't retrieve the VNN map from target node for notlmaster
    
    Use the VNN map from the node running node_has_status().
    
    This means that
    
      wait_until_node_has_status 1 notlmaster 10 0
    
    will run "ctdb status" on node 0 and check (for up to 10 seconds) if
    node 1 is in the VNN map.
    
    If the LMASTER capability has been dropped on node 1 then the above
    will wait for the VNN map to be updated on node 0.  This will happen
    as part of the recovery that is triggered by the change of LMASTER
    capability.  The next command will then only be able to attach to
    $TESTDB after the recovery is complete, thus guaranteeing a sane state
    for the test to continue.
    
    This stops simple/79_volatile_db_traverse.sh from going into recovery
    during the traverse or at some other inconvenient time.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 53daeb2f878af1634a26e05cb86d87e2faf20173)
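
The notlmaster check described above can be sketched as follows, assuming the VNN map section of "ctdb status" output contains lines of the form `hash:H lmaster:PNN` (the form matched by the grep pattern in node_has_status()). The helper name `not_lmaster` and the sample text are hypothetical; the sample stands in for the output seen on node 0 after node 1 has dropped the LMASTER capability.

```shell
# Hypothetical helper: succeed if PNN $1 has no "lmaster" entry in the
# "ctdb status" style text read from stdin.
not_lmaster ()
{
	! grep -Eq "^hash:.* lmaster:${1}\$"
}

# Illustrative VNN map section, not captured from a real cluster.
sample_status='Size:2
hash:0 lmaster:0
hash:1 lmaster:2
Recovery mode:NORMAL (0)'

if printf '%s\n' "$sample_status" | not_lmaster 1 ; then
	echo "node 1 is not in the VNN map"
fi
```

Anchoring the pattern with `$` keeps a check for PNN 1 from matching an entry for, say, PNN 10.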

commit 3e0205ec0263104bd31ba3da1efc9dedae41fff3
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 16:43:09 2019 +1000

    ctdb-tests: Handle special cases first and return
    
    All the other cases involve matching bits.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit bff1a3a548a2cace997b767d78bb824438664cb7)
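
A sketch of the bit matching that the remaining cases share, assuming "ctdb -X status" prints one header line followed by lines of the form `|PNN|IP|flag|flag|...|`. The helper name `node_bits_match` and the sample text are hypothetical; the two-step prefix strip mirrors the loop in node_has_status(), written here without `local` or here-strings so that it does not depend on bash.

```shell
# Hypothetical helper: succeed if node $1 matches flag glob $2 in the
# "ctdb -X status" style text read from stdin.
node_bits_match ()
{
	_pnn="$1"
	_bits="$2"

	read -r _header
	while read -r _line ; do
		# Step 1: strip "|PNN|IP|"; no change means another node's line.
		_line_bits="${_line#|${_pnn}|*|}"
		[ "$_line_bits" = "$_line" ] && continue
		# Step 2: a successful prefix strip means the flags match.
		[ "${_line_bits#${_bits}}" != "$_line_bits" ] && return 0
	done
	return 1
}

# Illustrative sample only, not captured from a real cluster.
sample_x_status='|Node|IP|Disconnected|Banned|Disabled|Unhealthy|Stopped|Inactive|PartiallyOnline|ThisNode|
|0|192.168.1.1|0|0|0|0|0|0|0|Y|
|1|192.168.1.2|0|0|0|1|0|0|0|N|'

printf '%s\n' "$sample_x_status" | node_bits_match 1 '?|?|?|1|*' && \
	echo "node 1 is unhealthy"
```

Doing the match in two steps avoids false matches: the flag glob is only ever applied to the line belonging to the requested node.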

commit 576f5e3035114bef2822249495269bdcbb8cf5e6
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 15:45:41 2019 +1000

    ctdb-tests: Inline handling of recovered and notlmaster statuses
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit bb59073515ee5f7886b5d9a20d7b2805857c2708)

commit d0b666a1a8d893eafeb67727323d013f7d784aa6
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 15:40:16 2019 +1000

    ctdb-tests: Drop unused node statuses frozen/unfrozen
    
    Silently drop unused local variable mpat.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 9b09a87326af28877301ad27bcec5bb13744e2b6)

commit 594a2a95ceac56b4e7a85b7adee703a04c04f908
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 29 15:31:55 2019 +1000

    ctdb-tests: Reformat node_has_status()
    
    Re-indent and drop non-POSIX left-parenthesis from case labels.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 52227d19735a3305ad633672c70385f443f222f0)

-----------------------------------------------------------------------

Summary of changes:
 ctdb/server/ctdb_recoverd.c                        |  14 +-
 ctdb/tests/scripts/integration.bash                |  80 +--
 ctdb/tests/simple/69_recovery_resurrect_deleted.sh |  17 +-
 ctdb/tests/simple/79_volatile_db_traverse.sh       |  67 ++-
 source3/modules/vfs_glusterfs.c                    | 562 +++++++++++----------
 5 files changed, 401 insertions(+), 339 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 3e63bd1e7a5..31e72f139ff 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -2981,13 +2981,19 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 		return;
 	}
 
-	/* verify that all active nodes in the nodemap also exist in 
-	   the vnnmap.
+	/*
+	 * Verify that all active lmaster nodes in the nodemap also
+	 * exist in the vnnmap
 	 */
 	for (j=0; j<nodemap->num; j++) {
 		if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {
 			continue;
 		}
+		if (! ctdb_node_has_capabilities(rec->caps,
+						 nodemap->nodes[j].pnn,
+						 CTDB_CAP_LMASTER)) {
+			continue;
+		}
 		if (nodemap->nodes[j].pnn == pnn) {
 			continue;
 		}
@@ -2998,8 +3004,8 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 			}
 		}
 		if (i == vnnmap->size) {
-			DEBUG(DEBUG_ERR, (__location__ " Node %u is active in the nodemap but did not exist in the vnnmap\n", 
-				  nodemap->nodes[j].pnn));
+			D_ERR("Active LMASTER node %u is not in the vnnmap\n",
+			      nodemap->nodes[j].pnn);
 			ctdb_set_culprit(rec, nodemap->nodes[j].pnn);
 			do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);
 			return;
diff --git a/ctdb/tests/scripts/integration.bash b/ctdb/tests/scripts/integration.bash
index 30725c48e53..a4d45fb9ac2 100644
--- a/ctdb/tests/scripts/integration.bash
+++ b/ctdb/tests/scripts/integration.bash
@@ -311,53 +311,53 @@ wait_until_ready ()
 # This function is becoming nicely overloaded.  Soon it will collapse!  :-)
 node_has_status ()
 {
-    local pnn="$1"
-    local status="$2"
-
-    local bits fpat mpat rpat
-    case "$status" in
-	(unhealthy)    bits="?|?|?|1|*" ;;
-	(healthy)      bits="?|?|?|0|*" ;;
-	(disconnected) bits="1|*" ;;
-	(connected)    bits="0|*" ;;
-	(banned)       bits="?|1|*" ;;
-	(unbanned)     bits="?|0|*" ;;
-	(disabled)     bits="?|?|1|*" ;;
-	(enabled)      bits="?|?|0|*" ;;
-	(stopped)      bits="?|?|?|?|1|*" ;;
-	(notstopped)   bits="?|?|?|?|0|*" ;;
-	(frozen)       fpat='^[[:space:]]+frozen[[:space:]]+1$' ;;
-	(unfrozen)     fpat='^[[:space:]]+frozen[[:space:]]+0$' ;;
-	(recovered)    rpat='^Recovery mode:RECOVERY \(1\)$' ;;
-	(notlmaster)   rpat="^hash:.* lmaster:${pnn}\$" ;;
+	local pnn="$1"
+	local status="$2"
+
+	case "$status" in
+	recovered)
+		! $CTDB status -n "$pnn" | \
+			grep -Eq '^Recovery mode:RECOVERY \(1\)$'
+		return
+		;;
+	notlmaster)
+		! $CTDB status | grep -Eq "^hash:.* lmaster:${pnn}\$"
+		return
+		;;
+	esac
+
+	local bits
+	case "$status" in
+	unhealthy)    bits="?|?|?|1|*" ;;
+	healthy)      bits="?|?|?|0|*" ;;
+	disconnected) bits="1|*" ;;
+	connected)    bits="0|*" ;;
+	banned)       bits="?|1|*" ;;
+	unbanned)     bits="?|0|*" ;;
+	disabled)     bits="?|?|1|*" ;;
+	enabled)      bits="?|?|0|*" ;;
+	stopped)      bits="?|?|?|?|1|*" ;;
+	notstopped)   bits="?|?|?|?|0|*" ;;
 	*)
-	    echo "node_has_status: unknown status \"$status\""
-	    return 1
-    esac
-
-    if [ -n "$bits" ] ; then
+		echo "node_has_status: unknown status \"$status\""
+		return 1
+	esac
 	local out x line
 
 	out=$($CTDB -X status 2>&1) || return 1
 
 	{
-            read x
-            while read line ; do
-		# This needs to be done in 2 steps to avoid false matches.
-		local line_bits="${line#|${pnn}|*|}"
-		[ "$line_bits" = "$line" ] && continue
-		[ "${line_bits#${bits}}" != "$line_bits" ] && return 0
-            done
-	    return 1
+		read x
+		while read line ; do
+			# This needs to be done in 2 steps to
+			# avoid false matches.
+			local line_bits="${line#|${pnn}|*|}"
+			[ "$line_bits" = "$line" ] && continue
+			[ "${line_bits#${bits}}" != "$line_bits" ] && \
+				return 0
+		done
+		return 1
 	} <<<"$out" # Yay bash!
-    elif [ -n "$fpat" ] ; then
-	$CTDB statistics -n "$pnn" | egrep -q "$fpat"
-    elif [ -n "$rpat" ] ; then
-        ! $CTDB status -n "$pnn" | egrep -q "$rpat"
-    else
-	echo 'node_has_status: unknown mode, neither $bits nor $fpat is set'
-	return 1
-    fi
 }
 
 wait_until_node_has_status ()
diff --git a/ctdb/tests/simple/69_recovery_resurrect_deleted.sh b/ctdb/tests/simple/69_recovery_resurrect_deleted.sh
index 8126c49b83c..f6c72c59f2a 100755
--- a/ctdb/tests/simple/69_recovery_resurrect_deleted.sh
+++ b/ctdb/tests/simple/69_recovery_resurrect_deleted.sh
@@ -54,18 +54,11 @@ database_has_zero_records ()
 	return 0
 }
 
-echo "Get vacuum interval"
-try_command_on_node -v $second $CTDB getvar VacuumInterval
-vacuum_interval="${out#* = }"
-
-echo "Wait until vacuuming deletes the record on active nodes"
-# Why 4?  Steps are:
-# 1. Original node processes delete queue, asks lmaster to fetch
-# 2. lmaster recoverd fetches
-# 3. lmaster processes delete queue
-# If vacuuming is just missed then need an extra interval
-t=$((vacuum_interval * 4))
-wait_until "${t}/10" database_has_zero_records
+echo "Trigger a recovery"
+try_command_on_node "$second" $CTDB recover
+
+echo "Checking that database has 0 records"
+database_has_zero_records
 
 echo "Continue node ${first}"
 try_command_on_node $first $CTDB continue
diff --git a/ctdb/tests/simple/79_volatile_db_traverse.sh b/ctdb/tests/simple/79_volatile_db_traverse.sh
index af7e962f579..7f3007d5105 100755
--- a/ctdb/tests/simple/79_volatile_db_traverse.sh
+++ b/ctdb/tests/simple/79_volatile_db_traverse.sh
@@ -42,11 +42,56 @@ try_command_on_node 0 $CTDB writekey "$TESTDB" "foo" "bar0"
 echo "write foo=bar1 on node 1"
 try_command_on_node 1 $CTDB writekey "$TESTDB" "foo" "bar1"
 
-echo "do traverse on node 0"
-try_command_on_node -v 0 $CTDB catdb "$TESTDB"
+echo
 
-echo "do traverse on node 1"
-try_command_on_node -v 1 $CTDB catdb "$TESTDB"
+check_db_num_records ()
+{
+	local node="$1"
+	local db="$2"
+	local n="$3"
+
+	echo "Checking on node ${node} to ensure ${db} has ${n} records..."
+	try_command_on_node "$node" "${CTDB} catdb ${db}"
+
+	num=$(sed -n -e 's|^Dumped \(.*\) records$|\1|p' "$outfile")
+	if [ "$num" = "$n" ] ; then
+		echo "OK: Number of records=${num}"
+		echo
+	else
+		echo "BAD: There were ${num} (!= ${n}) records"
+		cat "$outfile"
+		exit 1
+	fi
+}
+
+check_db_num_records 0 "$TESTDB" 1
+check_db_num_records 1 "$TESTDB" 1
+
+cat <<EOF
+
+Again, this time with 10 records, rewriting 5 of them on the 2nd node
+
+EOF
+
+echo "wipe test database $TESTDB"
+try_command_on_node 0 $CTDB wipedb "$TESTDB"
+
+for i in $(seq 0 9) ; do
+	k="foo${i}"
+	v="bar${i}@0"
+	echo "write ${k}=${v} on node 0"
+	try_command_on_node 0 "${CTDB} writekey ${TESTDB} ${k} ${v}"
+done
+
+for i in $(seq 1 5) ; do
+	k="foo${i}"
+	v="bar${i}@1"
+	echo "write ${k}=${v} on node 1"
+	try_command_on_node 1 "${CTDB} writekey ${TESTDB} ${k} ${v}"
+done
+
+check_db_num_records 0 "$TESTDB" 10
+check_db_num_records 1 "$TESTDB" 10
 
 cat <<EOF
 
@@ -63,8 +108,6 @@ try_command_on_node 1 $CTDB setlmasterrole off
 try_command_on_node -v 1 $CTDB getcapabilities
 
 wait_until_node_has_status 1 notlmaster 10 0
-# Wait for recovery and new VNN map to be pushed
-#sleep_for 10
 
 echo "write foo=bar0 on node 0"
 try_command_on_node 0 $CTDB writekey "$TESTDB" "foo" "bar0"
@@ -72,16 +115,10 @@ try_command_on_node 0 $CTDB writekey "$TESTDB" "foo" "bar0"
 echo "write foo=bar1 on node 1"
 try_command_on_node 1 $CTDB writekey "$TESTDB" "foo" "bar1"
 
-echo "do traverse on node 0"
-try_command_on_node -v 0 $CTDB catdb "$TESTDB"
+echo
 
-num=$(sed -n -e 's|^Dumped \(.*\) records$|\1|p' "$outfile")
-if [ "$num" = 1 ] ; then
-	echo "OK: There was 1 record"
-else
-	echo "BAD: There were ${num} (!= 1) records"
-	exit 1
-fi
+check_db_num_records 0 "$TESTDB" 1
+check_db_num_records 1 "$TESTDB" 1
 
 if grep -q "^data(4) = \"bar1\"\$" "$outfile" ; then
 	echo "OK: Data from node 1 was returned"
diff --git a/source3/modules/vfs_glusterfs.c b/source3/modules/vfs_glusterfs.c
index 16f8d004294..f07ad233934 100644
--- a/source3/modules/vfs_glusterfs.c
+++ b/source3/modules/vfs_glusterfs.c
@@ -45,14 +45,11 @@
 #include "lib/util/sys_rw.h"
 #include "smbprofile.h"
 #include "modules/posixacl_xattr.h"
+#include "lib/pthreadpool/pthreadpool_tevent.h"
 
 #define DEFAULT_VOLFILE_SERVER "localhost"
 #define GLUSTER_NAME_MAX 255
 
-static int read_fd = -1;
-static int write_fd = -1;
-static struct tevent_fd *aio_read_event = NULL;
-
 /**
  * Helper to convert struct stat to struct stat_ex.
  */
@@ -700,326 +697,283 @@ static ssize_t vfs_gluster_pread(struct vfs_handle_struct *handle,
 	return ret;
 }
 
-struct glusterfs_aio_state;
-
-struct glusterfs_aio_wrapper {
-	struct glusterfs_aio_state *state;
-};
-
-struct glusterfs_aio_state {
+struct vfs_gluster_pread_state {
 	ssize_t ret;
-	struct tevent_req *req;
-	bool cancelled;
+	glfs_fd_t *fd;
+	void *buf;
+	size_t count;
+	off_t offset;
+
 	struct vfs_aio_state vfs_aio_state;
-	struct timespec start;
 	SMBPROFILE_BYTES_ASYNC_STATE(profile_bytes);
 };
 
-static int aio_wrapper_destructor(struct glusterfs_aio_wrapper *wrap)
-{
-	if (wrap->state != NULL) {
-		wrap->state->cancelled = true;
-	}
-
-	return 0;
-}
+static void vfs_gluster_pread_do(void *private_data);
+static void vfs_gluster_pread_done(struct tevent_req *subreq);
+static int vfs_gluster_pread_state_destructor(struct vfs_gluster_pread_state *state);
 
-/*
- * This function is the callback that will be called on glusterfs
- * threads once the async IO submitted is complete. To notify
- * Samba of the completion we use a pipe based queue.
- */
-#ifdef HAVE_GFAPI_VER_7_6
-static void aio_glusterfs_done(glfs_fd_t *fd, ssize_t ret,
-			       struct glfs_stat *prestat,
-			       struct glfs_stat *poststat,
-			       void *data)
-#else
-static void aio_glusterfs_done(glfs_fd_t *fd, ssize_t ret, void *data)
-#endif
+static struct tevent_req *vfs_gluster_pread_send(struct vfs_handle_struct
+						  *handle, TALLOC_CTX *mem_ctx,
+						  struct tevent_context *ev,
+						  files_struct *fsp,
+						  void *data, size_t n,
+						  off_t offset)
 {
-	struct glusterfs_aio_state *state = NULL;
-	int sts = 0;
-	struct timespec end;
-
-	state = (struct glusterfs_aio_state *)data;
+	struct vfs_gluster_pread_state *state;
+	struct tevent_req *req, *subreq;
 
-	PROFILE_TIMESTAMP(&end);
+	glfs_fd_t *glfd = vfs_gluster_fetch_glfd(handle, fsp);
+	if (glfd == NULL) {
+		DBG_ERR("Failed to fetch gluster fd\n");
+		return NULL;
+	}
 
-	if (ret < 0) {
-		state->ret = -1;
-		state->vfs_aio_state.error = errno;
-	} else {
-		state->ret = ret;
+	req = tevent_req_create(mem_ctx, &state, struct vfs_gluster_pread_state);
+	if (req == NULL) {
+		return NULL;
 	}
-	state->vfs_aio_state.duration = nsec_time_diff(&end, &state->start);
 
-	SMBPROFILE_BYTES_ASYNC_END(state->profile_bytes);
+	state->ret = -1;
+	state->fd = glfd;
+	state->buf = data;
+	state->count = n;
+	state->offset = offset;
 
-	/*
-	 * Write the state pointer to glusterfs_aio_state to the
-	 * pipe, so we can call tevent_req_done() from the main thread,
-	 * because tevent_req_done() is not designed to be executed in
-	 * the multithread environment, so tevent_req_done() must be
-	 * executed from the smbd main thread.
-	 *
-	 * write(2) on pipes with sizes under _POSIX_PIPE_BUF
-	 * in size is atomic, without this, the use op pipes in this
-	 * code would not work.
-	 *
-	 * sys_write is a thin enough wrapper around write(2)
-	 * that we can trust it here.
-	 */
+	SMBPROFILE_BYTES_ASYNC_START(syscall_asys_pread, profile_p,
+				     state->profile_bytes, n);
+	SMBPROFILE_BYTES_ASYNC_SET_IDLE(state->profile_bytes);
 
-	sts = sys_write(write_fd, &state, sizeof(struct glusterfs_aio_state *));
-	if (sts < 0) {
-		DEBUG(0,("\nWrite to pipe failed (%s)", strerror(errno)));
+	subreq = pthreadpool_tevent_job_send(
+		state, ev, handle->conn->sconn->pool,
+		vfs_gluster_pread_do, state);
+	if (tevent_req_nomem(subreq, req)) {
+		return tevent_req_post(req, ev);
 	}
+	tevent_req_set_callback(subreq, vfs_gluster_pread_done, req);
+
+	talloc_set_destructor(state, vfs_gluster_pread_state_destructor);
 
-	return;
+	return req;
 }
 
-/*
- * Read each req off the pipe and process it.
- */
-static void aio_tevent_fd_done(struct tevent_context *event_ctx,
-				struct tevent_fd *fde,
-				uint16_t flags, void *data)
+static void vfs_gluster_pread_do(void *private_data)
 {
-	struct tevent_req *req = NULL;
-	struct glusterfs_aio_state *state = NULL;
-	int sts = 0;
+	struct vfs_gluster_pread_state *state = talloc_get_type_abort(
+		private_data, struct vfs_gluster_pread_state);
+	struct timespec start_time;
+	struct timespec end_time;
 
-	/*
-	 * read(2) on pipes is atomic if the needed data is available
-	 * in the pipe, per SUS and POSIX.  Because we always write
-	 * to the pipe in sizeof(struct tevent_req *) chunks, we can
-	 * always read in those chunks, atomically.
-	 *
-	 * sys_read is a thin enough wrapper around read(2) that we
-	 * can trust it here.
-	 */
+	SMBPROFILE_BYTES_ASYNC_SET_BUSY(state->profile_bytes);
 
-	sts = sys_read(read_fd, &state, sizeof(struct glusterfs_aio_state *));
+	PROFILE_TIMESTAMP(&start_time);
 
-	if (sts < 0) {
-		DEBUG(0,("\nRead from pipe failed (%s)", strerror(errno)));
-	}
+	do {
+#ifdef HAVE_GFAPI_VER_7_6
+		state->ret = glfs_pread(state->fd, state->buf, state->count,
+					state->offset, 0, NULL);
+#else
+		state->ret = glfs_pread(state->fd, state->buf, state->count,
+					state->offset, 0);
+#endif
+	} while ((state->ret == -1) && (errno == EINTR));
 
-	/* if we've cancelled the op, there is no req, so just clean up. */
-	if (state->cancelled == true) {
-		TALLOC_FREE(state);
-		return;
+	if (state->ret == -1) {
+		state->vfs_aio_state.error = errno;
 	}
 
-	req = state->req;
+	PROFILE_TIMESTAMP(&end_time);
 
-	if (req) {
-		tevent_req_done(req);
-	}
-	return;
+	state->vfs_aio_state.duration = nsec_time_diff(&end_time, &start_time);
+
+	SMBPROFILE_BYTES_ASYNC_SET_IDLE(state->profile_bytes);
 }
 
-static bool init_gluster_aio(struct vfs_handle_struct *handle)
+static int vfs_gluster_pread_state_destructor(struct vfs_gluster_pread_state *state)
 {
-	int fds[2];
-	int ret = -1;
+	return -1;
+}
 
-	if (read_fd != -1) {
+static void vfs_gluster_pread_done(struct tevent_req *subreq)
+{
+	struct tevent_req *req = tevent_req_callback_data(
+		subreq, struct tevent_req);
+	struct vfs_gluster_pread_state *state = tevent_req_data(
+		req, struct vfs_gluster_pread_state);
+	int ret;
+
+	ret = pthreadpool_tevent_job_recv(subreq);
+	TALLOC_FREE(subreq);
+	SMBPROFILE_BYTES_ASYNC_END(state->profile_bytes);
+	talloc_set_destructor(state, NULL);
+	if (ret != 0) {
+		if (ret != EAGAIN) {
+			tevent_req_error(req, ret);
+			return;
+		}
 		/*
-		 * Already initialized.
+		 * If we get EAGAIN from pthreadpool_tevent_job_recv() this
+		 * means the lower level pthreadpool failed to create a new
+		 * thread. Fallback to sync processing in that case to allow
+		 * some progress for the client.
 		 */
-		return true;
-	}


-- 
Samba Shared Repository