[PATCH] CTDB protocol and client fixes
Amitay Isaacs
amitay at gmail.com
Tue Nov 24 06:33:23 UTC 2015
On Sat, Nov 21, 2015 at 8:59 AM, Amitay Isaacs <amitay at gmail.com> wrote:
> On Fri, Nov 20, 2015 at 11:43 PM, Volker Lendecke <
> Volker.Lendecke at sernet.de> wrote:
>
>> On Fri, Nov 20, 2015 at 11:25:58AM +1100, Amitay Isaacs wrote:
>> > The premise here is the "nested event loops are bad". With single event
>> > context nesting is definitely evil. However, using multiple event
>> contexts
>> > also leads to nesting, which is not identified as nesting in the tevent
>> > framework.
>>
>> The idea is that with what we call nested event contexts is
>> that in tevent_loop_once only the one event source can fire.
>>
>> > Creating temporary event context in the sync api essentially creates
>> nested
>> > event loops. If the code requires the creation of nested event
>> contexts,
>>
>> Just to get a common understanding: When I talk about nested
>> event loops I always mean multiple tevent_loop_once calls on
>> the stack for the same event context.
>>
>> > then it's just bad code and it should be rewritten using async api to
>> get
>> > rid of the nested event loops.
>>
>> 100% ack. If we were there, we would not need the sync
>> wrappers anymore.
>>
>> > Also, the abstractions in the new ctdb client code (pkt_read, comm) are
>> > designed with single event context in mind and they will not work with
>> > temporary contexts. To support nested event loops, I will have to
>> > redesign those abstractions.
>>
>> Just picking a random sync wrapper sample from source3/libsmb:
>>
>> if (smbXcli_conn_has_async_calls(cli->conn)) {
>> /*
>> * Can't use sync call while an async call
>> * is in flight
>> */
>> status = NT_STATUS_INVALID_PARAMETER;
>> goto fail;
>> }
>>
>> It's right that it is *very* hard to support both sync and
>> async calls simultaneously. In source3/libsmb we took the
>> path to just bail if this happens.
>>
>> The real problem in the ctdb protocol is async messages. You
>> do a sync call on a nested event *context*, and an async
>> message comes in. What do you do? We need some provision for
>> that. But apart from this, I do not see why having nested
>> event contexts is really a problem.
>>
>
> Exactly. I have written the whole new client API to take care of the
> async messages which can be received while in sync call. This requires
> that you use the same event context where the callbacks are registered
> correctly to handle all types of packets that can be received. Question is
> what do you want to do in the callback functions for these async messages
> when using sync API. If you want to do simple things like setting flags or
> change some state, then you are in luck since that's how the new ctdb tool
> re-implementation using new client API looks like. However, if you want to
> do more messaging from the callbacks then you are doing it wrong. You
> should not be using sync API in the first place.
>
> If you are trying to solve the problem of how to allow the simultaneous
> use of sync and async API, then I say don't do it, that's a very bad idea.
> Either use sync API or async API. I don't want to support consumers of
> client API using both sync and async versions simultaneously. From your
> example above, you are also trying to avoid the mix of sync and async API
> simultaneously.
>
> CTDB client API is not yet a library for others to consume. Only consumer
> will be samba (for now). So we can stick to using async API. The only
> reason for providing sync API is to make code very easy for trivial
> applications (e.g. pcp-pmda sensor or ctdb tool re-implementation).
> Anything more complex requires async API (e.g. recovery helper and
> re-implementation of test code).
>
>
>>
>> > I don't see any benefit of using temporary event contexts. Can you
>> > elaborate why using nested event loops is a good thing?
>>
>> Nested event *loops* on the same event *context* are evil,
>> essentially that's pthreads without mutexes. Nested event
>> *contexts* are okay, they are a good way to limit the events
>> that can fire to just the ones we expect and silence
>> everything else.
>>
>> What I find difficult to wrap my head around is that you have a single fd
> as a connection to ctdbd. You cannot separate events by creating multiple
> contexts on that single fd. Worst case is you might end up ignoring some
> incoming async messages because you are not interested in processing them
> at a nested event context. Also, there is the question of re-registering
> callbacks for nested contexts. If you are registering the same callbacks
> at the nested context level, then you are going to process all the async
> messages even in the nested event context. So why bother? There is no
> difference between top-level event context and a nested event context as
> far as processing packets from ctdb is concerned. In fact using nested
> contexts is more expensive since you have to create a new context and
> re-register the callbacks.
>
> If you want to keep a separate event context exclusively for dealing with
> ctdb messaging, then that's different and that's fine. You might not want
> to keep any smb processing separate from ctdb messaging. Then I fully
> agree that using nested contexts can be helpful. However, I am not
> convinced that using nested event contexts on a single fd gets you any
> benefit.
>
> Hope that helps,
>>
>
> It doesn't. I am happy to accept the terminology and not argue over
> semantics of nested event loops. However, I still don't understand how
> nested contexts offer any benefit for ctdb client API.
>
> If you ignore the programming idioms and focus on the state transitions,
> then it doesn't really matter how the events happen. You have to take care
> of all events. Whether you consider all the states globally (using a
> single event context) or partition them into groups based on types of
> events (e.g. ctdb messaging), you still have to deal with the same state
> transitions.
>
>
>> Volker
>>
>
> Amitay.
>
Since no one has replied to this email, let me summarize.
1. For some reason I cannot fathom, async processing is treated *different*
from sync processing. A global event context for all async processing is
ok, but for sync processing need lots of temporary contexts.
2. I can drop the tevent_context from sync API, but it will still use the
cached event context rather than creating a new one for every sync API
call. This is because of the current design of the client API. It was
designed to not use tevent_add_fd() for reading each individual packet from
an fd.
Till we figure out the way ahead, I will split this patch set and post the
unrelated header and protocol fixes separately.
Amitay.
More information about the samba-technical
mailing list