How to detect a client-closed connection during a write from our LDAP server?
Stefan Metzmacher
metze at samba.org
Thu Oct 13 16:21:02 UTC 2022
On 13.10.22 at 17:23, Stefan Metzmacher via samba-technical wrote:
> Hi Andrew,
>
>>> Ah, we register tstream_bsd_fde_handler() with
>>> TEVENT_FD_READ|TEVENT_FD_WRITE
>>> and have this logic:
>>>
>>> static void tstream_bsd_fde_handler(struct tevent_context *ev,
>>>                                     struct tevent_fd *fde,
>>>                                     uint16_t flags,
>>>                                     void *private_data)
>>> {
>>>         struct tstream_bsd *bsds =
>>>                 talloc_get_type_abort(private_data,
>>>                 struct tstream_bsd);
>>>
>>>         if (flags & TEVENT_FD_WRITE) {
>>>                 bsds->writeable_handler(bsds->writeable_private);
>>>                 return;
>>>         }
>>>         if (flags & TEVENT_FD_READ) {
>>>                 if (!bsds->readable_handler) {
>>>                         if (bsds->writeable_handler) {
>>>
>>> =============> here we have the loop
>>>
>>
>> Awesome debugging!
>
>
> I created https://bugzilla.samba.org/show_bug.cgi?id=15202 for the problem.
And here's a merge request that tries to fix it.
I haven't done any runtime testing with it...
>>>                                 bsds->writeable_handler(bsds->writeable_private);
>>>                                 return;
>>>                         }
>>>                         TEVENT_FD_NOT_READABLE(bsds->fde);
>>>                         return;
>>>                 }
>>>                 bsds->readable_handler(bsds->readable_private);
>>>                 return;
>>>         }
>>> }
>>>
>>> We call the writeable handler (which is non-blocking) when we get
>>> TEVENT_FD_READ
>>> because we assume it will get an error if TEVENT_FD_READ was
>>> generated by
>>> EPOLLERR. I think moving TEVENT_FD_NOT_READABLE(bsds->fde); before
>>> if (bsds->writeable_handler) would make sure we only try that path
>>> once
>>> and not many times in a busy loop.
>>
>> How would we then, in this case, detect that the socket is closed?
>>
>> While we would start processing other connections, wouldn't we change
>> to a not-busy wait never closing the socket as there is still data to
>> send but never getting to read the EOF?
>>
>> How can we proceed to closing the socket?
The first time we get TEVENT_FD_READ with just a writeable handler,
we'll use poll() in order to detect POLLRDHUP which indicates
we're in CLOSE_WAIT state and map it to ECONNRESET.
Other than that we just wait for the kernel to detect the fully broken
connection, which may take a while. This can be altered via
the 'socket options' option in smb.conf setting
TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL and/or TCP_USER_TIMEOUT
(or by changing the kernel defaults).
>>> And in ldapsrv_call_writev_start() we may want to use
>>> tevent_req_set_endtime() on
>>> the subreq of tstream_writev_queue_send(), so that
>>> ldapsrv_call_writev_done()
>>> will get ETIMEDOUT from tstream_writev_queue_recv() and call
>>> ldapsrv_terminate_connection().
>>
>> Do we have a good precedent for what the send timeout should be?
>>
>> I can't see any good examples elsewhere in the code sadly.
>>
>> One option is to reuse the search timeout, so that a query could at
>> most take 240sec, eg 2x 120sec, once for the search itself and once to
>> send over the network. This would give only one knob to change (which
>> is good and bad).
I used the idle timeout (which is the closest logical match),
which is 900s.
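In ldapsrv_call_writev_start() that would look roughly like this (an illustrative fragment, not the actual merge request; the surrounding variable names are assumed):

```c
/* Illustrative fragment (assumed surrounding names): arm a 900s
 * endtime on the writev subrequest so a client that never drains
 * our send queue eventually yields ETIMEDOUT in
 * ldapsrv_call_writev_done() instead of hanging forever. */
subreq = tstream_writev_queue_send(state, ev, stream, send_queue,
				   iov, count);
if (subreq == NULL) {
	return; /* real code handles the allocation failure */
}
if (!tevent_req_set_endtime(subreq, ev,
			    tevent_timeval_current_ofs(900, 0))) {
	return; /* treat as fatal, e.g. terminate the connection */
}
tevent_req_set_callback(subreq, ldapsrv_call_writev_done, call);
```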
>> In the meantime I'm going to try and mock this up in a self-contained
>> cmocka test.
Reading unix_poll() in the Linux kernel, a shutdown() on the client
might be able to reproduce this (hopefully) both with and without socket_wrapper.
Are you able to test this, even if it's just by hand?
metze