tevent: fd events do not work correctly for UDP socket?

Pavel Březina pbrezina at redhat.com
Wed Aug 26 09:54:07 UTC 2020


On 8/19/20 9:24 PM, Stefan Metzmacher wrote:
> Hi Pavel,
> 
>>> Do strace ./client to see why this is:
>>>
>>> The key is here:
>>>
>>> write(1, "File descriptor is readable!\n", 29) = 29
>>> read(4, 0x7fffeee02750, 254)            = -1 ECONNREFUSED (Connection refused)
>>>
>>> Your connect call succeeds, as it's setting up the local
>>> binding to the remote address, but as it hasn't sent any
>>> data yet the client hasn't noticed there's no one listening.
>>>
>>> Once you do the:
>>>
>>>        const char *msg = "I AM CONNECTED\n";
>>>        write(fd, msg, strlen(msg));
>>>
>>> call then the kernel tries to send the data, notices
>>> there's nothing listening and so the read fd becomes
>>> ready via EPOLL - it needs to return the error
>>> ECONNREFUSED (we select for EPOLLIN|EPOLLERR|EPOLLHUP).
>>>
>>> So when you call the read() in the tevent handler,
>>> that's when you'd pick up the errno = ECONNREFUSED
>>> error.
>>
>> I see. If I understand it correctly, epoll returns EPOLLERR and the code hits this line [1]?
>>
>>> I don't think this is tevent specific behavior.
>>
>> If the above is true, then tevent should either provide a way for the handler to check for errors, or not call the read handler on an error, so that read does not get called.
>>
>> My use case is that I'm trying to implement a CLDAP ping over UDP in SSSD. When a Domain Controller is unreachable, the read handler is fired, then ldap tries to receive
>> a reply and blocks until the network timeout is reached, which is undesirable.
>>
>> [1] https://github.com/samba-team/samba/blob/master/lib/tevent/tevent_epoll.c#L707
> 
> You need to mark the socket non-blocking in order to avoid any blocking.
> 
> I have some patches to add TEVENT_FD_ERROR, but they are not upstream as there wasn't a strict need for them.
> In the beginning select was the main backend of tevent, and there was no POLLERR or POLLHUP.
> So errors are reported with TEVENT_FD_READ, as a read/recv is typically required to get the error.
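
Since errors arrive through the read event, a read handler has to be prepared for recv() to fail instead of returning data. A minimal sketch of such a handler body follows, independent of tevent itself. The name read_ready_fd and the read_result enum are made up for illustration, and it assumes MSG_DONTWAIT is available (it is on Linux) so the call cannot block even if the socket was left in blocking mode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical classification of one read attempt. */
enum read_result { READ_DATA, READ_NOT_READY, READ_ERROR };

/*
 * Hypothetical handler body: try one non-blocking recv() on a descriptor
 * that was reported as readable and tell data, "nothing there" and a
 * pending socket error (for example ECONNREFUSED) apart.
 */
static enum read_result read_ready_fd(int fd, void *buf, size_t len,
				      ssize_t *nread, int *err)
{
	ssize_t n;

	assert(buf != NULL && nread != NULL && err != NULL);

	/* MSG_DONTWAIT makes only this call non-blocking. */
	n = recv(fd, buf, len, MSG_DONTWAIT);
	if (n >= 0) {
		*nread = n;
		*err = 0;
		return READ_DATA;
	}

	/* Save errno before anything else can overwrite it. */
	*err = errno;
	*nread = 0;

	if (*err == EAGAIN || *err == EWOULDBLOCK || *err == EINTR) {
		return READ_NOT_READY;
	}

	/* A real error; recv() has consumed it from the socket. */
	return READ_ERROR;
}
```

A caller would invoke this once per read event and stop reading as soon as it sees READ_ERROR or READ_NOT_READY.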

Thank you for all the hints. It turned out to be a bug in libldap [1] 
which calls recvfrom twice in a row for UDP calls. The first call will 
consume the error (ECONNREFUSED) and the other call will then block 
because there's nothing to receive.

[1] https://bugs.openldap.org/show_bug.cgi?id=9328
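
For anyone hitting the same symptom, the sequence can be reproduced without libldap. The sketch below is not the libldap code; the helper names unused_udp_port and probe_refused_udp are made up for illustration. It assumes Linux, where a connected UDP socket on loopback receives the ICMP port unreachable as ECONNREFUSED, and it assumes the port picked by the helper stays unused for the duration of the test. The socket is non-blocking, so the second recv() reports EAGAIN where a blocking socket would hang.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: pick a loopback UDP port that most likely has no
 * listener by binding to port 0, reading the port back and closing again.
 */
static int unused_udp_port(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	int ret;

	if (fd < 0) {
		return -1;
	}

	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	ret = bind(fd, (struct sockaddr *)addr, sizeof(*addr));
	if (ret == 0) {
		ret = getsockname(fd, (struct sockaddr *)addr, &len);
	}
	close(fd);
	return ret;
}

/*
 * Hypothetical probe: connect a non-blocking UDP socket to a port nobody
 * listens on, send one datagram, wait for the socket to be reported and
 * store the errno of two consecutive recv() calls.
 */
static int probe_refused_udp(int *first_errno, int *second_errno)
{
	const char *msg = "I AM CONNECTED\n";
	struct sockaddr_in addr;
	struct pollfd pfd;
	char buf[254];
	int fd, flags;

	assert(first_errno != NULL && second_errno != NULL);
	*first_errno = 0;
	*second_errno = 0;

	if (unused_udp_port(&addr) < 0) {
		return -1;
	}

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) {
		return -1;
	}

	/* Mark the socket non-blocking, as suggested in the thread. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
		goto fail;
	}

	/* connect() only records the peer address, so it succeeds. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		goto fail;
	}

	/* The send triggers the ICMP error from the closed port. */
	if (send(fd, msg, strlen(msg), 0) < 0) {
		goto fail;
	}

	/* POLLERR is always reported, even though only POLLIN is requested. */
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	if (poll(&pfd, 1, 1000) <= 0) {
		goto fail;
	}

	/* The first call consumes the pending error. */
	if (recv(fd, buf, sizeof(buf), 0) < 0) {
		*first_errno = errno;
	}
	/* The second call finds nothing left to receive. */
	if (recv(fd, buf, sizeof(buf), 0) < 0) {
		*second_errno = errno;
	}

	close(fd);
	return 0;

fail:
	close(fd);
	return -1;
}
```

Under those assumptions, the first errno should be ECONNREFUSED and the second EAGAIN, which is the point where a blocking socket would wait until the timeout.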



