tevent: fd events do not work correctly for UDP socket?
pbrezina at redhat.com
Wed Aug 26 09:54:07 UTC 2020
On 8/19/20 9:24 PM, Stefan Metzmacher wrote:
> Hi Pavel,
>>> Do strace ./client to see why this is:
>>> The key is here:
>>> write(1, "File descriptor is readable!\n", 29File descriptor is readable!
>>> ) = 29
>>> read(4, 0x7fffeee02750, 254) = -1 ECONNREFUSED (Connection refused)
>>> Your connect call succeeds, as it's setting up the local
>>> binding to the remote address, but as it hasn't sent any
>>> data yet the client hasn't noticed there's no one listening.
>>> Once you do the:
>>> const char *msg = "I AM CONNECTED\n";
>>> write(fd, msg, strlen(msg));
>>> call then the kernel tries to send the data, notices
>>> there's nothing listening and so the read fd becomes
>>> ready via EPOLL - it needs to return the error
>>> ECONNREFUSED (we select for EPOLLIN|EPOLLERR|EPOLLHUP).
>>> So when you call the read() in the tevent handler,
>>> that's when you'd pick up the errno = ECONNREFUSED
>> I see. If I understand it correctly, epoll returns EPOLLERR and the code hits this line?
>>  https://github.com/samba-team/samba/blob/master/lib/tevent/tevent_epoll.c#L707
>>> I don't think this is tevent specific behavior.
>> If the above is true, then tevent should either provide a way for the handler to check for errors or not call the read handler on an error, so that read does not get called.
>> My use case is that I'm trying to implement a CLDAP ping over UDP in SSSD. When a Domain Controller is unreachable, the read handler is fired, then ldap tries to receive
>> a reply and blocks until the network timeout is reached, which is undesirable.
> You need to mark the socket non-blocking in order to avoid any blocking.
> I have some patches to add TEVENT_FD_ERROR, but they are not upstream as there wasn't a strict need for them.
> In the beginning select was the main backend of tevent, and there was no POLLERR or POLLHUP.
> So errors are reported with TEVENT_FD_READ, as a read/recv is typically required to get the error.
Thank you for all the hints. It turned out to be a bug in libldap,
which calls recvfrom twice in a row for UDP. The first call
consumes the error (ECONNREFUSED) and the second call then blocks
because there's nothing to receive.
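
For the archives, here is a minimal sketch of the sequence discussed in
this thread, using plain POSIX sockets (no tevent, no libldap). It
assumes Linux and a loopback port with no listener. The helper names
pick_unused_udp_port() and probe_refused_udp() are made up for this
illustration. The socket is marked non-blocking as Stefan suggested, so
the second recv() returns EAGAIN where a blocking socket would hang.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a throwaway socket to port 0 to learn a free
 * loopback UDP port, then close it so that nothing listens there. */
static int pick_unused_udp_port(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int tmp = socket(AF_INET, SOCK_DGRAM, 0);

    if (tmp < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;
    if (bind(tmp, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        getsockname(tmp, (struct sockaddr *)addr, &len) < 0) {
        close(tmp);
        return -1;
    }
    close(tmp);
    return 0;
}

/* Hypothetical helper: connect a non-blocking UDP socket to a closed port,
 * send one datagram, and record the errno of two consecutive recv() calls.
 * Returns 0 if the probe ran, -1 on a setup failure. */
int probe_refused_udp(int *first_errno, int *second_errno)
{
    const char *msg = "I AM CONNECTED\n";
    struct sockaddr_in addr;
    struct pollfd pfd;
    char buf[254];
    int fd, flags;

    assert(first_errno != NULL && second_errno != NULL);

    if (pick_unused_udp_port(&addr) < 0)
        return -1;
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Non-blocking, so a recv() with nothing queued cannot hang. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* connect() on UDP only records the peer address; it succeeds even
     * though nobody is listening. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Only now does the kernel send something and learn about the error. */
    if (send(fd, msg, strlen(msg), 0) < 0)
        goto fail;

    /* The pending error makes the fd ready, as seen with epoll above. */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) <= 0)
        goto fail;

    /* The first recv() consumes the pending socket error. */
    errno = 0;
    *first_errno = (recv(fd, buf, sizeof(buf), 0) < 0) ? errno : 0;

    /* The second recv() has nothing left to report. */
    errno = 0;
    *second_errno = (recv(fd, buf, sizeof(buf), 0) < 0) ? errno : 0;

    close(fd);
    return 0;

fail:
    close(fd);
    return -1;
}
```

Calling probe_refused_udp(&e1, &e2) from a small main() should leave
ECONNREFUSED in e1 and EAGAIN in e2, provided the ICMP port-unreachable
reply is not filtered on the loopback interface. On a blocking socket
that second recv() is the call that waits for the network timeout.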