[RFC] [WIP] tevent/glib integration

Mon Dec 28 11:01:22 UTC 2015

Hi Ralph,

thanks *alot* for taking the time to have a look at this (and I look
forward to your findings when you test more) I see already you have some
positive results, I'll just comment on the your mail below for the
moment (maybe some of this is already out of date, sorry I haven't had a
chance to look at your latest code you mention in your next mail, I hope
to look at in a couple of days (will be travelling)

On 22/12/15 19:55, Ralph Boehme wrote:
> Hi Noel,
>
> On Mon, Dec 21, 2015 at 03:07:36PM +0000, Noel Power wrote:
>> On 16/12/15 14:08, Stefan Metzmacher wrote:
>>> very interesting work! It would be really good to have this
>>> problem solved cleanly in future.
>> [...]
>>> Instead we can use a tracecallback and hook into the
>>> TEVENT_TRACE_BEFORE_WAIT (and maybe TEVENT_TRACE_AFTER_WAIT)
>>> events.
> I believe you meant the TEVENT_TRACE_BEFORE_LOOP_ONCE and
> TEVENT_TRACE_BEFORE_AFTER_ONCE trace points.
I also looked at using those but they stretch the scope of the
prepare_poll/check_poll much much longer than I wanted. e.g. in this
specific implementation case the glib context being acquired and held
for the entire event_loop_once call. I was purposely trying to keep that
to a minimum (and additionally now with your patch  the glib context
could be held even while normal tevent events are processed, I don't
know if that specifically can cause a problem but it looked like it
might have potential for unwanted/unexpected sideaffects).

Also now there is more possibility for useless prepare_poll/check_poll
cycles where the poll function might not even be called (note: in 
tevent_poll.c the 'poll_event_loop_once' function processes timers,
immediate events and signals first and can return before the poll
function is even called ) In the 's3' backend effectively the same thing
can happen because 'run_events_poll' will also process timeouts/signals
etc. first and may not process fds at all (until a subsequent iteration)
If you want flexibility around when/where you call
prepare_poll/check_poll the trace points imo just don't cut it.

If it was possible to ignore the points above (and maybe it is, perhaps
I am being too careful or extreme) using the trace points imo only makes
sense if it is possible to use them in external code, by external code I
mean code that links to the tevent library and uses the public headers
available from tevent. In your code the 'prepare_poll' and 'check_poll'
code is internal to the 's3' tevent backend and thus for example the
timeout information can be easily transferred. But.. how could the code
called by TEVENT_TRACE_BEFORE and TEVENT_TRACE_BEFORE transfer timeout
information to generic 'tevent' backend event_loop_once function(s), I
can't see how it can be done without introducing some new api, if we had
to introduce new api then I would say it would be easier to allow the
new api to pass the callbacks similar to how I suggested before and
leave it up to each backend to prepare it's own code to trigger the
callbacks at the best call sites appropriate for each backend. If we
don't do this in a generic way (and e.g. just do it for s3) I don't see
the point in using the trace points (I think it makes the code
unnecessarily complicated)
>>> For samba this could be done within smbd_tevent_trace_callback().
>>>
>>> In order to make this more generic we could add a helper function
>>> which can be called in a TEVENT_TRACE_BEFORE_WAIT hook.
>>>
>>> This function would maintain some metadata, it needs an array
>>> of tevent_fd pointers and old array returned by g_main_context_query()
>>> and have a 2nd array to get the new g_main_context_query() result.
>>> Then it uses this information to work out which tevent_fd pointers need
>>> to be added/modified/removed.
>> so, I played around with the TRACE stuff for a bit, my impression is
>> that it makes things more awkward and more complicated, unfortunately
>> trying to do this generically via the trace_callback functionality does
>> not seem to work easily for a number of reasons
>>
>> + imho it is not at all an intuitive api to use, that the fds of the
>> poll function could be modified by these trace points isn't at all
>> obvious even by name
> afaict we're not modifyint the pollfds, are we?
well not directly but indirectly we are, calling tevent_add_fd
essentially is the same as modifying the pollfds isn't it ? Also now
because you removed process_foreign_fds we no longer can get
FOREIGN_FD_XYZ flags set (the code only works now because 
FOREIGN_FD_READ  &  FOREIGN_FD_WRITE have the same values as
TEVENT_FD_READ & TEVENT_FD_WRITE)  Also 'run_events_poll besides' not
being aware of the FOREIGN_FD_XYZ flags additionally doesn't monitor
POLL or ERR flags (and glib for example at least in the api wants to
have the possibility to register interest in those events) It's true
that the example (as you have seen) and perhaps even libtracker-sparql
in general doesn't need to know about POLL or ERR events (I wouldn't
like to bet on that though) But like I mentioned earlier I would like to
see the relevant results from the poll function for foreign fds
processed separately, I know the way I do it is inefficient (I iterate
all fds to find the foreign ones) but it should be possible to do better
and iterate only the foreign fds

[...]
>
>> e.g. in the s3
>> tevent backend at least 'event_add_to_poll_args' and maybe even
>> 'run_events_poll'. In anycase I would prefer to process the foreign
>> events separately for 2 reasons and that activity also needs to be done
>> inbetween the 'BEFORE' and 'AFTER' trace points. My reasons for
>> processing *all* foreign fds at once are
>>    a) I want the behaviour and results of the poll to be as close as
>> possible to what the external code would expect if it polled the fds
>> itself, that means if events for more than one fd are available I want
>> them all and not one at a time
>>    b) I think it could be wise to ensure the sequence (add foreign fds,
>> poll foreign fds, report foreign fd poll results, react to foreign fds)
>> happens all together without normal 'tevent' events happening inbetween
> See attached patch, I think the glib loop is only entered once in
> s3_tevent_foreign_loop_process().
not sure what you mean, what I meant was I wanted to avoid the following
scenario

prepare_poll (add some glib fds)
poll  (some glib fds have events reported by poll)
check_poll (is called but doesn't receive all the available events
available from the glib fds because)

   a) some tevent has happened e.g. a normal fd monitored by tevent, a
tevent timeout or a tevent signal
   b) only a single glib fd event is available (because tevent loop only
processes a single fd at a time) but the poll actually reported multiple
glib fds have available events

Instead I wanted glib (or whatever foreign loop) to get the same results
as if they called the poll function themselves. I know perhaps this is
breaking the semantics of the tevent loop_once function but I look at
this as something as 'outside' of tevent, I'm not really looking to
impose tevent behaviour around polling foreign fds but rather just
gather the results from whatever 'poll/select/epoll' function the 
tevent backend is using and pass that on to whatever integration code is
plugged in.
[...]
>>> Instead of using g_main_context_default(), we should make use of
>>> g_main_context_new(), g_main_loop_new() and
>>> g_main_context_{push,pop}_thread_default()
>> I wonder do I miss something about needing to use the above (note: I
>> don't have any deep knowledge of glib, I'm pretty clueless about it
>> really) but it seems to me from reading
>> https://developer.gnome.org/glib/stable/glib-The-Main-Event-Loop.html
>> and the sections referenced from 'Customizing the main loop iteration'
>> that I am doing the correct thing so I am wondering what the specific
>> concern is and why I should use those methods
> We'll see. It's all about getting the callbacks coming in back in the
> right context. Ymmv.
>
I can only go by the docs, like I said I've little knowledge of glib,
I'm hoping your real life implementation will show it works or if not
point to what needs to be changed

Noel