[PATCH] CTDB protocol and client fixes

Wed Nov 25 04:55:56 UTC 2015

On Tue, Nov 24, 2015 at 10:17 PM, Stefan Metzmacher <metze at samba.org> wrote:

> Am 24.11.2015 um 10:15 schrieb Volker Lendecke:
> > On Tue, Nov 24, 2015 at 07:35:32PM +1100, Amitay Isaacs wrote:
> >>> If the sync APIs are only ever called inside ctdb tool or
> >>> tests, why is the additional overhead a problem?
> >>>
> >>
> >> The problem is not of the overhead.  Since the API is designed around
> >> single tevent_fd, you always need the same event context around.
> Otherwise
> >> you end up having to do tevent_add_fd() in each sync API.   This is even
> >> trickier since the tevent_add_fd() is burried in a lower level
> abstraction
> >> (comm).
> >
> > In the smb libraries and others, look at tsocket, we have started from
> > the basic socket calls to not cache a tevent_fd.
>
> In tstream_bsd_set_writeable_handler and tstream_bsd_set_readable_handler
> we do some tevent_fd caching.
>
> If the same tevent_context is used multiple times in a row,
> we reuse the tevent_fd, we're also lazy in changing its flags.
>
> That might be also a possible mode for the ctdb client code.
>
> metze
>
>
I will take a look at tsocket implementation.  From the brief look at the
tsocket guide, it looks very useful for re-implementation of CTDB server
code (as separate daemons).

As far as the CTDB client code is concerned, I have not heard any arguments
against tevent_fd caching in the async API.  I have spent quite some time
to come up with tevent_req infrastructure which avoids tevent_add_fd() for
each packet read (tevent_fd caching).  If there are serious shortcomings of
this approach, I would really like to know since I plan to the same
approach for re-implementing separate daemons.  I would not hesitate to
re-work the entire framework if someone convinces me (may be using small
words) of the failings of tevent_fd caching.

I would like to ensure that there are no nested event loops in the CTDB
client async API.  I did realize the problem of mixing async and sync API
in the client code.  The fixes in this patch set get rid of all the sync
API calls from within async API implementation.  So apart from any such
oversights I guarantee that using the async client API will never result in
a nested event loop.  However, there is an assumption (in the
implementation) of using single event context for all of the client async
API, specifically the same event context used to initialize the client code
using ctdb_client_init().  This raises the following issue:

If you have server code using only async APIs, would you need to use more
than one event contexts?

Consider server code as an implementation of a state transition graph.  The
beauty of the tevent_req framework is that it allows to track the state
information (global or partitioned) correctly even when all the events are
considered at global level (i.e. single event context).  Someone please
correct me if I have misunderstood.  If that's the case, then what's wrong
with the assumption of using single event context in CTDB client async API?

I have understood the concerns with passing event context to sync API.  As
proposed earlier, I can separate the client sync API for the use only in
the ctdb tool and the test binaries (and NOT in the samba code).  However,
the only way (currently) I can think of getting rid of event context from
the sync API is to use the cached event context.  (The reason for needing
cached event context is the assumption of single event context in the
design of async API.)  Is that acceptable?

The other approach is to get rid of the sync API completely and use only
async API everywhere.  I am fine with that too.

I would appreciate any clear direction that I can follow.

Amitay.