[SCM] Samba Shared Repository - branch master updated

Volker Lendecke Volker.Lendecke at SerNet.DE
Wed Dec 17 19:31:12 MST 2014


Sorry for the long mail. I'm really passionate about this, even at 3am
when I could not sleep (though not because of this topic) :-)

On Wed, Dec 17, 2014 at 02:31:37PM -0500, Ira Cooper wrote:
> I don't believe there's any real standard on the use of threads in VFS
> modules.  In fact there is a vfs_aio_pthread.c module which tends to make
> me think that threads ARE allowed at the VFS layer, as long as you don't
> expose them to the rest of Samba.

If you look at the vfs_aio_pthread code, it does not make any thread
calls itself. All it uses is the pthreadpool code, and I would like to
keep thread use hidden behind that abstraction or, if that does not fit
at all, behind a similar one.  It took me quite some time to get the
pthreadpool code right. Just take a look at 1c4284c7395f23: I discovered
this only quite some time after the code was initially developed. I also
tried to make sure that pthreadpool is fork-safe, something that is from
my point of view pretty difficult to achieve in a library that uses
threads.

> But to your actual question, about Gluster's AIO.  libgfapi, the Gluster
> userland interface library,  is threaded and the callback we get can come
> in from any thread.  We have no control over that.  The code you see is
> code to actually queue the events, and handle them in the main thread,
> safely.

Is there no other way to get this done? Glusterfs is a pure networked file
system with a library, right? Can't we implement the protocol for
read/write ourselves using sockets and tevent_fds?  Forcing threads
upon the users of the library just to do asynchronous I/O is a pretty
heavy burden on the library user.

Is it really the case that the result of an async call pops up in another
library-created thread? That's at least unusual I'd say :-)

If the library is thread-safe, I'd rather go and wrap the normal pread
calls in a pthreadpool. Sync calls should not trigger new threads
to appear. This way we have control over the threads and can use
pthreadpool. Would that be a way to do it?
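
A minimal sketch of that idea, under stated assumptions: the names
pread_job, pread_job_start and pread_job_finish are hypothetical and
are not Samba or libgfapi APIs, and plain pthread_create() stands in
for the pthreadpool code, whose interface is not shown here. The
thread runs exactly one blocking pread() and then wakes the main
thread through a pipe, which an event loop could watch like any other
file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical job description, invented for this sketch. */
struct pread_job {
	int fd;        /* file to read from */
	void *buf;     /* destination buffer */
	size_t count;  /* number of bytes requested */
	off_t offset;  /* file offset to read at */
	ssize_t ret;   /* result of pread(), filled in by the worker */
	int err;       /* saved errno if ret == -1 */
	int done_fd;   /* write end of a pipe that wakes the main thread */
};

/* Runs in the worker thread: one blocking syscall, no shared state. */
static void *pread_job_run(void *arg)
{
	struct pread_job *job = arg;
	char byte = 0;
	ssize_t written;

	job->ret = pread(job->fd, job->buf, job->count, job->offset);
	job->err = (job->ret == -1) ? errno : 0;

	/* Notify the main thread; the job is not touched after this. */
	do {
		written = write(job->done_fd, &byte, 1);
	} while (written == -1 && errno == EINTR);
	return NULL;
}

/* Start the job in a thread we created ourselves. Returns 0 or an errno. */
static int pread_job_start(struct pread_job *job, pthread_t *thread)
{
	assert(job != NULL);
	return pthread_create(thread, NULL, pread_job_run, job);
}

/*
 * Called by the main thread when wake_fd (the read end of the pipe)
 * is readable. Returns what pread() returned and restores its errno.
 */
static ssize_t pread_job_finish(struct pread_job *job, pthread_t thread,
				int wake_fd)
{
	char byte;
	ssize_t nread;

	do {
		nread = read(wake_fd, &byte, 1);
	} while (nread == -1 && errno == EINTR);
	assert(nread == 1);

	pthread_join(thread, NULL);
	if (job->ret == -1) {
		errno = job->err;
	}
	return job->ret;
}
```

One thread per job is used only to keep the sketch short; a pool
would reuse its threads. The point is that every thread in the process
is one the caller started, and the only data shared between threads is
the job structure, which is handed over and handed back.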

> I can see ways to eliminate the locking and eventfd code, using pipes, but
> beyond that, I don't see much we can do here.  The issue there will be
> making sure we maintain the performance of the current implementation.

pthreadpool might not have the best performance for queueing, but at least
it does not malloc per job. It uses condition variables and mutexes,
the standard technique here. I've tried to get something lock-free to
queue jobs, but this seems pretty difficult to do. I'm willing to steal
ideas from libgfapi here :-)
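
The standard technique mentioned above can be sketched as follows.
This is not the pthreadpool implementation; the jobq names, the fixed
capacity of 16 slots and the sample job are assumptions made for
illustration. Jobs live in a preallocated ring, so queueing one does
not call malloc, and a mutex plus a condition variable coordinate the
producer and the worker.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define JOBQ_SLOTS 16  /* arbitrary capacity chosen for this sketch */

struct job {
	void (*fn)(void *private_data);
	void *private_data;
};

struct jobq {
	pthread_mutex_t mutex;
	pthread_cond_t nonempty;
	struct job slots[JOBQ_SLOTS];  /* preallocated, no malloc per job */
	size_t head;                   /* index of the oldest queued job */
	size_t num;                    /* number of queued jobs */
	bool shutdown;
};

static void jobq_init(struct jobq *q)
{
	pthread_mutex_init(&q->mutex, NULL);
	pthread_cond_init(&q->nonempty, NULL);
	q->head = 0;
	q->num = 0;
	q->shutdown = false;
}

/* Producer side. Returns false if the ring is full or shut down. */
static bool jobq_put(struct jobq *q, void (*fn)(void *), void *private_data)
{
	bool ok = false;

	assert(fn != NULL);
	pthread_mutex_lock(&q->mutex);
	if (q->num < JOBQ_SLOTS && !q->shutdown) {
		size_t tail = (q->head + q->num) % JOBQ_SLOTS;
		q->slots[tail] = (struct job){ fn, private_data };
		q->num += 1;
		ok = true;
		pthread_cond_signal(&q->nonempty);
	}
	pthread_mutex_unlock(&q->mutex);
	return ok;
}

/* Consumer side. Blocks for a job; false once shut down and drained. */
static bool jobq_get(struct jobq *q, struct job *out)
{
	bool ok = false;

	pthread_mutex_lock(&q->mutex);
	while (q->num == 0 && !q->shutdown) {
		pthread_cond_wait(&q->nonempty, &q->mutex);
	}
	if (q->num > 0) {
		*out = q->slots[q->head];
		q->head = (q->head + 1) % JOBQ_SLOTS;
		q->num -= 1;
		ok = true;
	}
	pthread_mutex_unlock(&q->mutex);
	return ok;
}

/* Wake all waiting workers; they exit after the queue is drained. */
static void jobq_shutdown(struct jobq *q)
{
	pthread_mutex_lock(&q->mutex);
	q->shutdown = true;
	pthread_cond_broadcast(&q->nonempty);
	pthread_mutex_unlock(&q->mutex);
}

/* Worker thread body: run jobs outside the lock until shutdown. */
static void *jobq_worker(void *arg)
{
	struct jobq *q = arg;
	struct job job;

	while (jobq_get(q, &job)) {
		job.fn(job.private_data);
	}
	return NULL;
}

/* Sample job for the sketch: increments the int it is given. */
static void job_increment(void *private_data)
{
	int *counter = private_data;
	*counter += 1;
}
```

A caller would start one or more threads running jobq_worker(), queue
work with jobq_put(), and call jobq_shutdown() followed by
pthread_join() to stop them. Every queue operation takes the mutex,
which is the cost a lock-free design would try to avoid.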

Also, in case libgfapi does caching itself, we might benefit from an
idea similar to what is being discussed in Linux kernel land with
preadv2 right now: Try a nonblocking sync call first, and only if it
would block use the thread pool.
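
A rough sketch of that try-first pattern for a local file descriptor,
assuming a Linux and glibc recent enough to provide preadv2() with the
RWF_NOWAIT flag (where the flag is missing at build time, the sketch
always takes the fallback). The function names are hypothetical,
nothing here is a libgfapi call, and the plain pread() in the fallback
merely marks the place where the job would be handed to the thread
pool.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Try to satisfy the read without blocking. Returns true and stores
 * the byte count in *result if data (possibly a short read) or EOF
 * was available immediately. Returns false on EAGAIN or any other
 * error, leaving it to the blocking path to report the real outcome.
 */
static bool pread_try_nowait(int fd, void *buf, size_t count, off_t offset,
			     ssize_t *result)
{
#ifdef RWF_NOWAIT
	struct iovec iov = { .iov_base = buf, .iov_len = count };
	ssize_t ret = preadv2(fd, &iov, 1, offset, RWF_NOWAIT);

	if (ret >= 0) {
		*result = ret;
		return true;
	}
#else
	(void)fd;
	(void)buf;
	(void)count;
	(void)offset;
	(void)result;
#endif
	return false;
}

/*
 * Fast path first, slow path only when needed. *used_fallback tells
 * the caller which path produced the result.
 */
static ssize_t pread_fast_or_fallback(int fd, void *buf, size_t count,
				      off_t offset, bool *used_fallback)
{
	ssize_t ret;

	assert(used_fallback != NULL);
	if (pread_try_nowait(fd, buf, count, offset, &ret)) {
		*used_fallback = false;
		return ret;
	}

	/* Stand-in for queueing the read to a worker thread. */
	*used_fallback = true;
	return pread(fd, buf, count, offset);
}
```

When the data is already cached, the read completes in the calling
thread with no queueing and no thread switch; only reads that would
block pay the cost of the pool.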

In the discussion about nested event loops Tridge once told me that I
don't understand the nature of concurrency. There might be some truth
to it: I'm scared of threads and want to keep away from them as far as
possible. For example take a look at d8af3e76a362: This makes the job
executed in the thread a simple syscall again. unix_dgram_send_job()
is so obviously simple and free of any data locking problems that it's
easy to reason about. Same for asys_pread_do and the other jobs in
lib/asys. I'm willing to go a long way to keep it that way.


SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt at sernet.de
