A Modest Proposal for Preventing the Event loops of Samba DCs From Being a Burthen to Their Implementers or Users, and for Making Them Beneficial to the Publick
abartlet at samba.org
Thu Apr 17 18:04:02 MDT 2014
On Thu, 2014-04-17 at 10:06 -0400, Simo wrote:
> On Thu, 2014-04-17 at 15:53 +0200, Stefan (metze) Metzmacher wrote:
> > Am 17.04.2014 15:25, schrieb Simo:
> > > On Thu, 2014-04-17 at 14:49 +0200, Stefan (metze) Metzmacher wrote:
> > >> Hi Simo,
> > >>
> > >>>> It does make ldb less 'async' as far as the caller is concerned, but we
> > >>>> simply don't use ldb in an async way in Samba. (It is very unfortunate
> > >>>> we carry the great complexity and risk of an async ldb without any
> > >>>> significant use).
> > >>>>
> > >>>> I have this under a private autobuild, and I would appreciate your
> > >>>> thoughts.
> > >>>
> > >>> I do not really see the point of a separate event context honestly.
> > >>> All you need is clearly some locking, so that a new toplevel ldb
> > >>> operation can only be started from within the transaction, while any
> > >>> other is received but not scheduled until a previous transaction is
> > >>> finished.
> > >>> Blocking for long periods on heavy I/O in an async server will provide
> > >>> terrible outcomes.
> > >>
> > >> We're only doing local tdb operations during a transaction, so I think
> > >> it's really good to use a separate event context and do everything isolated
> > >> without any side effects.
> > >>
> > >> All locking hacks will result in deadlocks.
> > >
> > > Note that a separate context is the same thing, it is just locking with
> > > a bigger hammer, and called differently. You can still have deadlocks if
> > > in the transaction you create an operation that decides to wait on the
> > > global event loop (which is now stopped), or create a new event loop and
> > > block there.
> > The difference is that we know that we don't use the global event loop.
> > > I don't see much difference from the point of view of possible
> > > deadlocks, but I see issues with the main event loop being blocked for
> > > long periods of time, making the whole server completely non-responsive.
> > It isn't blocked any longer than needed; the single local transaction
> > should be very fast, otherwise we have other problems.
> > There's a big difference.
> > This is a nested event context (with its isolated loop); we won't even start
> > unrelated operations, but finish the transaction as fast as possible.
> > Any theoretical deadlock in this situation is based on a bug in the code,
> > where we somehow use the wrong event context.
> > With a nested event loop (on the global context), we might start
> > processing an unrelated rpc request, which calls a sync ldb function,
> > how do you want to avoid a deadlock in that situation?
> > This can be triggered by special request order from a client.
> The idea I had was that if you see there is already a transaction going
> but this operation is not a child of that transaction, you simply defer
> starting it.
How would you defer starting it? Enter yet another nested event loop
waiting for the transaction to finish? Remember, this can happen inside
almost any Samba DC code, it is all tied to LDB one way or the other.
> Ideally this would be done in LDB by adding a transaction
> handle that must be passed down to you from your parent, or else you don't have one.
> If you do not have one you need to create one, and while a transaction
> handle is active all operations bearing another one are skipped in the
> event loop and go back to sleep.
How do you identify those operations?
A new event context, containing only events related to the transaction,
is the only safe way I can see to ensure no other operations start. The
alternative would be to remove all other events from the event context,
save them and restore them - and I think that's the same thing in the
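
To make the difference concrete, here is a toy model in Python. It is only a sketch: it does not use the tevent API and is not the actual patch, and EventContext, run_transaction and demo are hypothetical names invented for the illustration. It assumes an event context is nothing more than a FIFO queue of callbacks. When the transaction spins a nested loop on the shared global context, a request that was already queued runs in the middle of the transaction; with its own context, the transaction finishes first.

```python
from collections import deque


class EventContext:
    """Toy stand-in for an event context: a FIFO queue of callbacks."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()

    def add(self, callback):
        self.pending.append(callback)

    def loop_once(self):
        """Run one pending callback; return False if nothing is queued."""
        if not self.pending:
            return False
        self.pending.popleft()()
        return True


def run_transaction(ctx, log):
    """Queue the transaction's steps on ctx and spin ctx until they finish."""
    state = {"done": False}

    def commit():
        log.append("txn: commit")
        state["done"] = True

    def write():
        log.append("txn: write")
        ctx.add(commit)  # the next step is scheduled as an event

    log.append("txn: start")
    ctx.add(write)
    # Nested loop: runs whatever is queued on ctx, related or not.
    while not state["done"]:
        ctx.loop_once()


def demo(isolated):
    log = []
    global_ctx = EventContext("global")
    # An unrelated request is already waiting on the global context.
    global_ctx.add(lambda: log.append("unrelated rpc request"))
    # Assumption: "isolated" means the transaction gets a fresh context.
    txn_ctx = EventContext("txn") if isolated else global_ctx
    run_transaction(txn_ctx, log)
    # Back in the main loop: drain whatever is still queued globally.
    while global_ctx.loop_once():
        pass
    return log


if __name__ == "__main__":
    print("shared context:  ", demo(isolated=False))
    print("isolated context:", demo(isolated=True))
```

In the shared case the unrelated request is dispatched between "txn: start" and "txn: write", which is exactly the window in which a sync ldb call from that request could deadlock. In the isolated case it only runs after "txn: commit".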
Andrew Bartlett http://samba.org/~abartlet/
Authentication Developer, Samba Team http://samba.org
Samba Developer, Catalyst IT http://catalyst.net.nz/services/samba