How to detect a client-closed connection during a write from our LDAP server?
tom at talpey.com
Fri Oct 14 13:52:16 UTC 2022
On 10/14/2022 9:45 AM, Stefan Metzmacher wrote:
> Hi Tom,
>>> It means RCV_SHUTDOWN gets set as well as TCP_CLOSE_WAIT, but
>>> sk->sk_err is not changed to indicate an error.
>> This is correct, because the TCP connection is in "half-closed" state.
>> The peer has closed, but the outgoing stream is still open. The TCP
>> protocol has supported this since forever.
>> This is not a transitory state. The connection can remain in it forever.
>> The peer is now in FIN_WAIT_2 and will send no further data. It's
>> waiting for our FIN, and in turn the local socket is waiting for a
>> close() call to do so. But pretty much any other socket operation
>> can still be performed.
> Thanks for the explanation!
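A minimal sketch of how the half-closed state described above looks from
user space, assuming a Linux stream socket. Once the peer's FIN has been
received and any queued data consumed, a non-blocking peek returns 0, yet
send() on the same descriptor is still legal. The helper name
peer_sent_fin() is hypothetical and not taken from the Samba sources.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report whether the peer has closed its sending
 * direction. Returns 1 if EOF (FIN) is next in the receive queue,
 * 0 if not (no data yet, or unread data still queued ahead of any FIN),
 * and -1 on a real socket error. Nothing is consumed from the socket.
 */
int peer_sent_fin(int fd)
{
    char c;
    ssize_t n;

    assert(fd >= 0);

    n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n == 0)
        return 1;               /* orderly shutdown by the peer */
    if (n > 0)
        return 0;               /* unread data is still pending */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return 0;               /* nothing to read right now */
    return -1;                  /* e.g. ECONNRESET */
}
```

A return of 1 here says nothing about whether the peer is still reading;
it only says the peer will send no more data.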
>>> It means if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) doesn't
>>> hit as we only have RCV_SHUTDOWN and sk_stream_wait_memory returns
>> Probably because the peer has stopped reading the socket. FIN_WAIT_2 is
>> a super-problematic state, because the only way to exit it is to receive
>> a FIN or RST, which we're evidently not sending. Most implementations
>> run a timer as failsafe, but it's always rather long (minutes).
> Yes, we need 'socket options' with TCP_KEEPCNT, TCP_KEEPIDLE,
> TCP_KEEPINTVL and/or TCP_USER_TIMEOUT
> and/or a user space timer in order to have lower timeouts.
That won't help. The peer is there, and the connection is up.
The keepalive will succeed! Even if it failed, it's not prompt,
and reducing the KEEPINTVL is a very bad idea. Servers should not
be pinging their clients in any event.
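For reference, the options named in the quoted suggestion are set with
setsockopt() on Linux. This is only a sketch: the helper name
set_tcp_timeouts() and all values passed to it are placeholders, not
recommendations. As argued above, keepalive probes do not detect a
half-closed peer whose TCP stack is still answering.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: enable keepalive and set the Linux-specific
 * timing options. idle_s, intvl_s and cnt map to TCP_KEEPIDLE,
 * TCP_KEEPINTVL and TCP_KEEPCNT; user_timeout_ms maps to
 * TCP_USER_TIMEOUT. Returns 0 on success, -1 with errno set on failure.
 */
int set_tcp_timeouts(int fd, int idle_s, int intvl_s, int cnt,
                     unsigned int user_timeout_ms)
{
    int on = 1;

    assert(fd >= 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &cnt, sizeof(cnt)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) < 0)
        return -1;
    return 0;
}
```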
What peer is doing this? Most Windows clients will perform an
abortive close, but this one is doing it gracefully. The
server should deal with either, of course, so I'm mostly just
>>> And tcp_poll has this:
>>>     if (sk->sk_shutdown & RCV_SHUTDOWN)
>>>         mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
>>> So we'll get EPOLLIN | EPOLLRDNORM | EPOLLRDHUP triggering
>>> and writev/sendmsg keep getting EAGAIN.
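The situation quoted above (send returns EAGAIN while the poll mask
reports the peer's FIN) can be handled in user space roughly as follows.
This is a sketch under stated assumptions, not the fix proposed in the
thread: the helper name send_all_or_give_up() is hypothetical, and
treating a half-close as fatal is a policy choice that only suits
protocols where a client that has sent FIN will never read the reply.

```c
#define _GNU_SOURCE             /* for POLLRDHUP */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send len bytes without blocking forever.
 * When the send buffer is full, wait for space but also watch for the
 * peer's FIN (POLLRDHUP). Returns 0 once everything is sent, otherwise
 * -1 with errno set: ECONNRESET is used here for "peer half-closed or
 * hung up", ETIMEDOUT for "no progress within timeout_ms".
 */
int send_all_or_give_up(int fd, const char *buf, size_t len, int timeout_ms)
{
    assert(fd >= 0);

    while (len > 0) {
        ssize_t n = send(fd, buf, len, MSG_DONTWAIT | MSG_NOSIGNAL);

        if (n > 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* hard error, errno already set */

        /* Send buffer is full: wait for space or for the peer's FIN. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT | POLLRDHUP };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if (pfd.revents & (POLLRDHUP | POLLHUP | POLLERR)) {
            errno = ECONNRESET; /* give up: the peer has closed */
            return -1;
        }
    }
    return 0;
}
```

The half-close check only runs once a send would block, so a client that
half-closes but keeps draining its receive side still gets its data.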
>> I think the code needs to detect the half-close and give up. It's not
>> going to happen promptly any other way.
>> I may have missed some other message - is a fix proposed?
> It was in the next message: