First fixes for https://bugzilla.samba.org/show_bug.cgi?id=11141

Stefan (metze) Metzmacher metze at samba.org
Wed May 20 09:27:02 MDT 2015


Hi,

here're the first patches for
https://bugzilla.samba.org/show_bug.cgi?id=11141.

From the bugreport:

> What happens is the following:
>
> - The related winbindd_cli_state has a wb_req_read_send()
>   or wb_resp_write_send() pending. That means there's
>   tevent_fd registered with the kernel via epoll_ctl(...ADD)
> - A new winbindd child is fork'ed and winbindd_reinit_after_fork()
>   -> close_conns_after_fork() starts to run, but didn't
>   reach the related winbindd_cli_state yet.
> - The parent calls remove_client() on the related winbindd_cli_state
>   maybe via remove_timed_out_clients().
> - remove_client() calls close() before TALLOC_FREE(state)
>   the close() doesn't remove the fd from the parents epoll
>   set yet, as the socket fd is still open in the child.
>   TALLOC_FREE(state) triggers tevent_req_received() on the pending
>   wb_req_read or wb_req_write, which then triggers
>   epoll_event_fd_destructor() -> epoll_update_event() -> epoll_del_event().
>   The results in epoll_ctl(DEL) returning EBADF because the fd is already
>   closed in the parent.
> - As the socket fd was still open in the child, epoll_wait returns
>   on event in the parent epoll queue, but the pointer in events.data.ptr
>   is not valid anymore.
>
> So we need to make sure the close() comes after epoll_ctl(DEL)
>
> As a safe-guard epoll_del_event() should also call epoll_panic()
> if it gets EBADF, this would result in a fallback to the poll backend
> instead of crashing later.

We may need to fix other places with similar problems,
we need to basically audit all direct and indirect callers
of writev_send() and read_packet_send(), we need to free the related
subrequests before calling close() on the socket.

But the winbind timeout cleanup together with forking new winbindd
childs seems to be the most common case to trigger crashes.

It would be cool to have this in 4.2.2 :-)

4.0 doesn't use the epoll backend, but 4.1.next should also get the fixes.

Please review and push.
Thanks!
metze
-------------- next part --------------
From 08336259eaaabb59b82de60659391fe64016bdbd Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze at samba.org>
Date: Mon, 18 May 2015 13:25:33 +0200
Subject: [PATCH 1/2] tevent: add a note to tevent_add_fd()

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11141

Signed-off-by: Stefan Metzmacher <metze at samba.org>
---
 lib/tevent/tevent.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/tevent/tevent.h b/lib/tevent/tevent.h
index c54cbe2..206d98e 100644
--- a/lib/tevent/tevent.h
+++ b/lib/tevent/tevent.h
@@ -176,6 +176,10 @@ void tevent_set_default_backend(const char *backend);
  *
  * @note To cancel the monitoring of a file descriptor, call talloc_free()
  * on the object returned by this function.
+ *
+ * @note The caller should avoid closing the file descriptor before
+ * calling talloc_free()! Otherwise the behaviour is undefined which
+ * might result in crashes.
  */
 struct tevent_fd *tevent_add_fd(struct tevent_context *ev,
 				TALLOC_CTX *mem_ctx,
-- 
1.9.1


From dc131b12c370880b5609164a073cba4dbc2f45c0 Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze at samba.org>
Date: Mon, 18 May 2015 13:17:40 +0200
Subject: [PATCH 2/2] s3:winbindd: make sure we remove pending io requests
 before closing client sockets

This avoids a crash inside the tevent epoll backend.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11141

Signed-off-by: Stefan Metzmacher <metze at samba.org>
---
 source3/winbindd/winbindd.c | 26 ++++++++++++++++++++++++++
 source3/winbindd/winbindd.h |  1 +
 2 files changed, 27 insertions(+)

diff --git a/source3/winbindd/winbindd.c b/source3/winbindd/winbindd.c
index 87bc532..4ffe79c 100644
--- a/source3/winbindd/winbindd.c
+++ b/source3/winbindd/winbindd.c
@@ -824,6 +824,7 @@ static void request_finished(struct winbindd_cli_state *state)
 		return;
 	}
 	tevent_req_set_callback(req, winbind_client_response_written, state);
+	state->io_req = req;
 }
 
 static void winbind_client_response_written(struct tevent_req *req)
@@ -833,6 +834,8 @@ static void winbind_client_response_written(struct tevent_req *req)
 	ssize_t ret;
 	int err;
 
+	state->io_req = NULL;
+
 	ret = wb_resp_write_recv(req, &err);
 	TALLOC_FREE(req);
 	if (ret == -1) {
@@ -859,6 +862,7 @@ static void winbind_client_response_written(struct tevent_req *req)
 		return;
 	}
 	tevent_req_set_callback(req, winbind_client_request_read, state);
+	state->io_req = req;
 }
 
 void request_error(struct winbindd_cli_state *state)
@@ -929,6 +933,7 @@ static void new_connection(int listen_sock, bool privileged)
 		return;
 	}
 	tevent_req_set_callback(req, winbind_client_request_read, state);
+	state->io_req = req;
 
 	/* Add to connection list */
 
@@ -942,6 +947,8 @@ static void winbind_client_request_read(struct tevent_req *req)
 	ssize_t ret;
 	int err;
 
+	state->io_req = NULL;
+
 	ret = wb_req_read_recv(req, state, &state->request, &err);
 	TALLOC_FREE(req);
 	if (ret == -1) {
@@ -973,6 +980,25 @@ static void remove_client(struct winbindd_cli_state *state)
 		return;
 	}
 
+	/*
+	 * We need to remove a pending wb_req_read_*
+	 * or wb_resp_write_* request before closing the
+	 * socket.
+	 *
+	 * This is important as they might have used tevent_add_fd() and we
+	 * use the epoll * backend on linux. So we must remove the tevent_fd
+	 * before closing the fd.
+	 *
+	 * Otherwise we might hit a race with close_conns_after_fork() (via
+	 * winbindd_reinit_after_fork()) where a file description
+	 * is still open in a child, which means it's still active in
+	 * the parents epoll queue, but the related tevent_fd is already
+	 * already gone in the parent.
+	 *
+	 * See bug #11141.
+	 */
+	TALLOC_FREE(state->io_req);
+
 	if (state->sock != -1) {
 		/* tell client, we are closing ... */
 		nwritten = write(state->sock, &c, sizeof(c));
diff --git a/source3/winbindd/winbindd.h b/source3/winbindd/winbindd.h
index f676b1e..b2105e3 100644
--- a/source3/winbindd/winbindd.h
+++ b/source3/winbindd/winbindd.h
@@ -67,6 +67,7 @@ struct winbindd_cli_state {
 	struct winbindd_request *request;         /* Request from client */
 	struct tevent_queue *out_queue;
 	struct winbindd_response *response;        /* Respose to client */
+	struct tevent_req *io_req; /* wb_req_read_* or wb_resp_write_* */
 
 	struct getpwent_state *pwent_state; /* State for getpwent() */
 	struct getgrent_state *grent_state; /* State for getgrent() */
-- 
1.9.1
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: OpenPGP digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20150520/2e2138aa/attachment.pgp>


More information about the samba-technical mailing list