How to detect a client-closed connection during a write from our LDAP server?

Stefan Metzmacher metze at samba.org
Thu Oct 13 15:23:14 UTC 2022


Hi Andrew,

>> Ah, we register tstream_bsd_fde_handler() with
>> TEVENT_FD_READ|TEVENT_FD_WRITE
>> and have this logic:
>>
>> static void tstream_bsd_fde_handler(struct tevent_context *ev,
>>                                       struct tevent_fd *fde,
>>                                       uint16_t flags,
>>                                       void *private_data)
>> {
>>           struct tstream_bsd *bsds =
>>                   talloc_get_type_abort(private_data, struct tstream_bsd);
>>
>>           if (flags & TEVENT_FD_WRITE) {
>>                   bsds->writeable_handler(bsds->writeable_private);
>>                   return;
>>           }
>>           if (flags & TEVENT_FD_READ) {
>>                   if (!bsds->readable_handler) {
>>                           if (bsds->writeable_handler) {
>>
>> =============> here we have the loop
>>
> 
> Awesome debugging!


I created https://bugzilla.samba.org/show_bug.cgi?id=15202 for the problem.

And I understand what happens in the kernel...

tcp_fin() has this:
         struct tcp_sock *tp = tcp_sk(sk);

         inet_csk_schedule_ack(sk);

         sk->sk_shutdown |= RCV_SHUTDOWN;
         sock_set_flag(sk, SOCK_DONE);

         switch (sk->sk_state) {
         case TCP_SYN_RECV:
         case TCP_ESTABLISHED:
                 /* Move to CLOSE_WAIT */
                 tcp_set_state(sk, TCP_CLOSE_WAIT);
                 inet_csk_enter_pingpong_mode(sk);
                 break;


It means RCV_SHUTDOWN gets set and the socket moves to TCP_CLOSE_WAIT, but
sk->sk_err is not changed to indicate an error.

tcp_sendmsg_locked() has this:
...
         err = -EPIPE;
	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
		goto do_error;

	while (msg_data_left(msg)) {
		int copy = 0;

		skb = tcp_write_queue_tail(sk);
		if (skb)
			copy = size_goal - skb->len;

		if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
			bool first_skb;

new_segment:
			if (!sk_stream_memory_free(sk))
				goto wait_for_space;

...

wait_for_space:
		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
		if (copied)
			tcp_push(sk, flags & ~MSG_MORE, mss_now,
				 TCP_NAGLE_PUSH, size_goal);

		err = sk_stream_wait_memory(sk, &timeo);
		if (err != 0)
			goto do_error;

It means the if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) check doesn't
hit, as we only have RCV_SHUTDOWN set, and sk_stream_wait_memory() returns
-EAGAIN for a non-blocking socket.

And tcp_poll has this:

         if (sk->sk_shutdown & RCV_SHUTDOWN)
                 mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;

So we'll keep getting EPOLLIN | EPOLLRDNORM | EPOLLRDHUP, which triggers
TEVENT_FD_READ, while writev()/sendmsg() keeps returning EAGAIN: the busy loop.

metze


>>                                   bsds->writeable_handler(bsds->writeable_private);
>>                                   return;
>>                           }
>>                           TEVENT_FD_NOT_READABLE(bsds->fde);
>>                           return;
>>                   }
>>                   bsds->readable_handler(bsds->readable_private);
>>                   return;
>>           }
>> }
>>
>> We call the writeable handler (which is non-blocking) when we get
>> TEVENT_FD_READ, because we assume it will get an error if
>> TEVENT_FD_READ was generated by EPOLLERR. I think moving
>> TEVENT_FD_NOT_READABLE(bsds->fde); before
>> if (bsds->writeable_handler) would make sure we only try that path
>> once and not many times in a busy loop.
> 
> How would we then, in this case, detect that the socket is closed?
> 
> While we would start processing other connections, wouldn't we change
> to a non-busy wait that never closes the socket, as there is still data
> to send but we never get to read the EOF?
> 
> How can we proceed to closing the socket?
> 
>> And in ldapsrv_call_writev_start() we may want to use
>> tevent_req_set_endtime() on
>> the subreq of tstream_writev_queue_send(), so that
>> ldapsrv_call_writev_done()
>> will get ETIMEDOUT from tstream_writev_queue_recv() and call
>> ldapsrv_terminate_connection().
> 
> Do we have a good precedent for what the send timeout should be?
> 
> I can't see any good examples elsewhere in the code sadly.
> 
> One option is to reuse the search timeout, so that a query could at
> most take 240sec, e.g. 2x 120sec, once for the search itself and once to
> send over the network.  This would give only one knob to change (which
> is good and bad).
> 
> In the meantime I'm going to try and mock this up in a self-contained
> cmocka test.
> 
> Andrew Bartlett




More information about the samba-technical mailing list