A Modest Proposal for Preventing the Event loops of Samba DCs From Being a Burthen to Their Implementers or Users, and for Making Them Beneficial to the Publick

Thu Apr 17 07:53:17 MDT 2014

On 17.04.2014 15:25, Simo wrote:
> On Thu, 2014-04-17 at 14:49 +0200, Stefan (metze) Metzmacher wrote:
>> Hi Simo,
>>
>>>> It does make ldb less 'async' as far as the caller is concerned, but we
>>>> simply don't use ldb in an async way in Samba.  (It is very unfortunate
>>>> we carry the great complexity and risk of an async ldb without any
>>>> significant use). 
>>>>
>>>> I have this under a private autobuild, and I would appreciate your
>>>> thoughts. 
>>>
>>> I do not really see the point of a separate event context honestly.
>>> All you need is clearly some locking, so that a new toplevel ldb
>>> operation can only be started from within the transaction, while any
>>> other is received but not scheduled until a previous transaction is
>>> finished.
>>> Blocking for long periods on heavy I/O in an async server will provide
>>> terrible outcomes.
>>
>> We're only doing local tdb operations during a transaction, so I think
>> it's really good to use a separate event context and do everything isolated
>> without any side effects.
>>
>> All locking hacks will result in deadlocks.
> 
> Note that a separate context is the same thing, it is just locking with
> a bigger hammer, and called differently. You can still have deadlocks if
> in the transaction you create an operation that decides to wait on the
> global event loop (which is now stopped), or creates a new event loop
> and blocks there.

The difference is that we know that we don't use the global event loop.

> I don't see much difference from the point of view of possible
> deadlocks, but I see issues with the main event loop being blocked for
> long periods of time, making the whole server completely non-responsive.

It isn't blocked any longer than needed. The single local transaction
should be very fast; otherwise we have other problems.

There's a big difference.

With a nested event context (with its own isolated loop), we won't even
start unrelated operations, but finish the transaction as fast as possible.
Any theoretical deadlock in this situation is based on a bug in the code,
where we somehow use the wrong event context.

With a nested event loop (on the global context), we might start
processing an unrelated rpc request, which calls a sync ldb function.
How do you want to avoid a deadlock in that situation?
This can be triggered by a special request order from a client.
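
To make the difference concrete, here is a toy model in C. It is only a
sketch and not tevent or ldb code: ev_ctx, ev_add, ev_loop_once,
run_transaction and the two scenario functions are made-up names for
illustration. The deadlock itself is not reproduced; it is modelled as a
counter of unrelated requests that got to run while the transaction was
still open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 8

typedef void (*handler_fn)(void *arg);
struct event { handler_fn fn; void *arg; };

/* Hypothetical toy event context: a FIFO queue of pending events. */
struct ev_ctx {
    struct event queue[MAX_EVENTS];
    size_t head, tail;
};

struct state {
    bool in_transaction;
    bool txn_done;
    int reentered; /* unrelated requests that ran mid-transaction */
};

static bool ev_add(struct ev_ctx *ev, handler_fn fn, void *arg)
{
    if (ev->tail == MAX_EVENTS) return false;
    ev->queue[ev->tail++] = (struct event){ fn, arg };
    return true;
}

/* Run one pending event; returns false if nothing is pending. */
static bool ev_loop_once(struct ev_ctx *ev)
{
    if (ev->head == ev->tail) return false;
    struct event e = ev->queue[ev->head++];
    e.fn(e.arg);
    return true;
}

/* An unrelated rpc request that would call a sync ldb function. */
static void unrelated_rpc(void *arg)
{
    struct state *s = arg;
    if (s->in_transaction) s->reentered++; /* the dangerous case */
}

/* The transaction's own internal completion event. */
static void txn_step(void *arg)
{
    struct state *s = arg;
    s->txn_done = true;
}

/* Drive loop_ctx until the transaction's internal event has run. */
static int run_transaction(struct ev_ctx *loop_ctx, struct state *s)
{
    s->in_transaction = true;
    s->txn_done = false;
    ev_add(loop_ctx, txn_step, s);
    while (!s->txn_done && ev_loop_once(loop_ctx)) {
        /* keep looping until the transaction is finished */
    }
    s->in_transaction = false;
    return s->reentered;
}

/* Nested loop on the global context: the queued rpc runs first. */
int nested_on_global(void)
{
    struct ev_ctx global = {0};
    struct state s = {0};
    ev_add(&global, unrelated_rpc, &s);
    return run_transaction(&global, &s);
}

/* Isolated nested context: only the transaction's events run. */
int nested_isolated(void)
{
    struct ev_ctx global = {0}, private_ctx = {0};
    struct state s = {0};
    ev_add(&global, unrelated_rpc, &s); /* stays queued on global */
    return run_transaction(&private_ctx, &s);
}
```

In this model nested_on_global() lets the unrelated request run inside the
open transaction, while nested_isolated() leaves it queued on the global
context until the transaction has finished.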

> I would rather strip away the ldb async layer if the aim is to avoid
> looping in a context, so then you cannot at all create new calls and
> wait on them because you have no event loop to pass at all.
> It will also greatly simplify some code.

Removing the ldb async layer requires some work, but might be a good idea;
then we can use a sane async ldap library if we need async remote ldap
calls.

> The main problem I see here is the LDAP server, if we can agree
> officially to move to OpenLDAP + overlays (ie fully threaded LDAP
> server) and throw away our own home grown thing, then we'll be in a much
> better position, and a fully sync LDB will be just fine.

This has nothing to do with OpenLDAP. Even if it did, it would mean
that we need a fully async interface to avoid blocking while waiting for
external processes.

metze