select semantics on unix domain sockets?

tridge at tridge at
Thu Sep 29 00:37:52 GMT 2005


 > Watch the select call telling us socket 16 is writable and the write call at
 > timestamp 16:29:55.870372 block for some seconds.

I think this can be explained by the fact that, according to POSIX, a
select return saying an fd is writeable only guarantees that 1 byte can
be written. With stream based sockets this isn't a problem, as the write
can happily send just 1 byte then return. With dgram based sockets it
is not so useful, as a datagram can't be split up.

What I don't understand is how this tallies with sock_writeable() in
Linux. It looks like this:

 static inline int sock_writeable(const struct sock *sk)
 {
	return atomic_read(&sk->sk_wmem_alloc) < (sk->sk_sndbuf / 2);
 }

and that function is used to determine if select() should see a
datagram socket as writeable. It looks to me as though it should only
say it's writeable when the socket's send buffer is at least half
empty. How big is your send buffer? Try a getsockopt() with SO_SNDBUF.
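
For reference, a rough sketch of that SO_SNDBUF query. The helper names
get_sndbuf() and writeable_threshold() are made up for illustration,
and the second one just mirrors the comparison in the sock_writeable()
snippet quoted above; it is not a kernel interface.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: return the send buffer size of fd in bytes,
 * or -1 on error (errno is set by getsockopt). */
int get_sndbuf(int fd)
{
	int size = 0;
	socklen_t len = sizeof(size);

	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, &len) == -1)
		return -1;
	/* SO_SNDBUF is an int option, so the kernel should fill a full int */
	assert(len == sizeof(size));
	return size;
}

/* Hypothetical helper: the amount of queued write memory below which
 * the quoted sock_writeable() comparison would be true, assuming the
 * value reported by getsockopt() is the one the kernel compares with. */
int writeable_threshold(int sndbuf)
{
	return sndbuf / 2;
}
```

Calling get_sndbuf() on the socket that blocks and printing the result
next to the size of the datagrams being written should show whether a
single packet can be bigger than half the buffer.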

I guess we could add a userspace backoff mechanism if we need to. We
would use non-blocking sockets, and when the write indicates that
nothing can be written we would back off for a short time before
retrying.

btw, I tend to use "strace -Tt" so you see exactly how much time was
spent in the call itself, as well as the timestamp. That means you
don't have to infer the time spent in the call by subtracting two
times (which is error prone, as you don't know how much time is spent
running code in between).

Cheers, Tridge
