[RFC] [WIP] tevent/glib integration

Mon Jan 25 19:07:41 UTC 2016

Hi Ralph,

On 22/01/16 15:56, Ralph Boehme wrote:
> Hi Noel,
>
> On Mon, Jan 11, 2016 at 12:56:12PM +0000, Noel Power wrote:
>> ... long snip ...
> ok, so we both agree that both solutions are ugly. ;)
>
> Metze had a better idea in a private conversation: use epoll to create
> an epoll instance and add the returned fd to tevent via
> tevent_add_fd(). Later add all glib fds to the new epoll instance via
> epoll_ctl(). epoll_wait() will then return all glib fds with events
> pending. This can all be done on top of tevent.
I am (or was) a little unsure what you meant; I've never used epoll myself.
Reading a bit, I was interested to see that it is possible to add the
fd returned for an epoll instance A to another epoll instance B.
Presumably epoll instance B, having been given A's fd, will itself
monitor the fds associated with A, is that correct? Does that mean it
will work only when 'epoll' is the default tevent backend?
Or... does epoll instance A not actually need anybody to call
epoll_wait for its fd(s) to be monitored, meaning you can just pass an
epoll instance fd to select or poll and get an event on the epoll
instance fd if any of the fd(s) monitored by that instance has an
available event? If I've got that right, it is a great idea and offers
a really cool way to separate and distinguish the glib fds from normal
tevent_fds.
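
On the second reading of the question: epoll(7) says an epoll fd is itself
poll/select-able and indicates readable when it has events waiting. A
minimal, Linux-only sketch to check that on a given machine follows. The
helper name poll_epoll_fd is hypothetical, and a pipe stands in for a glib
fd; nothing here is taken from the branch.

```c
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns the revents that poll() reports for an epoll fd watching the
 * read end of a pipe, or -1 on error. If write_first is non-zero, one
 * byte is written to the pipe before polling.
 */
int poll_epoll_fd(int write_first)
{
    int pipefd[2];
    int epfd;
    int result = -1;
    struct epoll_event ev = { .events = EPOLLIN };
    struct pollfd pfd;

    if (pipe(pipefd) != 0) {
        return -1;
    }
    epfd = epoll_create1(0);
    if (epfd < 0) {
        goto out_pipe;
    }
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) != 0) {
        goto out;
    }
    if (write_first && write(pipefd[1], "x", 1) != 1) {
        goto out;
    }
    /* Nobody calls epoll_wait() here: only the epoll fd goes to poll(). */
    pfd.fd = epfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) < 0) {
        goto out;
    }
    result = pfd.revents;
out:
    close(epfd);
out_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
```

poll_epoll_fd(0) should report no events and poll_epoll_fd(1) should
report POLLIN, which would mean the epoll fd can be handed to any tevent
backend and the handler can then call epoll_wait() with a zero timeout
to learn which fds are ready.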
> In case epoll is not available, in the glue code as a fallback use a
> tevent_fd per glib fd and in the handler call poll() a second time on
> the glib fds. That way we get the raw revents and can handle all
> pending events in one swoop.
>
> This would work without any modifications to the tevent code.
>
> I have a wip branch here:
> <https://git.samba.org/?p=slow/samba.git;a=log;h=refs/heads/tevent-glib-glue>
>
> This just adds the fallback method without any optimisations,
> particularly epoll is left out.
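
The fallback in the quoted text (one tevent_fd per glib fd, then a second
poll() in the handler) relies on poll() with a zero timeout filling in the
raw revents for every fd in one call. A minimal sketch of just that step,
not the code in the branch: collect_pending and demo_second_poll are
hypothetical names, pipes stand in for glib fds, and plain struct pollfd
stands in for glib's GPollFD.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Re-poll a set of fds without blocking. Returns the number of fds with
 * events pending (or -1 on error); the raw revents for each fd are left
 * in fds[i].revents for the caller to hand on.
 */
int collect_pending(struct pollfd *fds, nfds_t nfds)
{
    return poll(fds, nfds, 0);
}

/* Three pipes, two of them readable: expect 2, or -1 on any failure. */
int demo_second_poll(void)
{
    int p[3][2];
    struct pollfd fds[3];
    int created = 0;
    int n = -1;
    int i;

    for (i = 0; i < 3; i++) {
        if (pipe(p[i]) != 0) {
            goto out;
        }
        created++;
        fds[i].fd = p[i][0];
        fds[i].events = POLLIN;
        fds[i].revents = 0;
    }
    /* Make the first and the last fd readable. */
    if (write(p[0][1], "a", 1) != 1 || write(p[2][1], "b", 1) != 1) {
        goto out;
    }
    n = collect_pending(fds, 3);
    /* Check that the revents landed on the fds we expect. */
    if (!(fds[0].revents & POLLIN) || fds[1].revents != 0 ||
        !(fds[2].revents & POLLIN)) {
        n = -1;
    }
out:
    for (i = 0; i < created; i++) {
        close(p[i][0]);
        close(p[i][1]);
    }
    return n;
}
```

The point of the second call is that a single handler invocation sees the
state of all fds at once, rather than one fd per tevent callback.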
ok, I had a look at least at the fallback code. This is really, really
clever and self-contained; the use of the timers and even the second
poll is very neat.

But doesn't it still suffer from the same problem of being throttled by
tevent's policy of processing one event per loop iteration? That is, it
depends on either a timer or one of the monitored glib fds (with
available events) firing, but any number of ready non-glib timers or
fds could be queued for processing ahead of it, and in the s3 case a
tevent_fd that fires is put to the back of the list after firing :/.
So I still fear it could quite easily take quite a few tevent_loop
iterations (with the situation worsening as more and more normal events
and timers are added by a samba process) before the available glib
events get processed and the glib context is released. Granted, when it
does get processed it processes *all* glib events, which is a
significant improvement. Presumably this would be the same with the
extra epoll source tacked on.

I don't know :/ In a busy smbd process, for example, glib won't afaics
get a fair slice of the action, and I don't see any wriggle room for
improving things if that is the case. At least if you process the glib
fds and the normal tevent fds together (the ugly approach), then glib
processing doesn't affect tevent's own event processing and vice versa:
each event loop gets a fair share of the event processing action.
Anyway, that's my worry about this: how it will scale. I suppose you
don't share that opinion (and I'd even reluctantly sacrifice a generic
solution if it meant that s3/lib, for example, would have that
flexibility).
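
To put a rough number on the worry above, here is a toy model (not tevent
code, and the function name is hypothetical). It assumes n handlers that
are always ready, one dispatched per loop iteration, with the dispatched
handler going to the back of the queue as described for s3.

```c
/*
 * Toy model: returns how many loop iterations pass until the handler
 * that starts at queue position `target` (0-based) has run `runs`
 * times. Returns 0 for invalid arguments.
 */
unsigned iterations_until_served(unsigned n, unsigned target, unsigned runs)
{
    unsigned head = 0;       /* handler currently at the front */
    unsigned iterations = 0;
    unsigned served = 0;

    if (n == 0 || target >= n || runs == 0) {
        return 0;
    }
    while (served < runs) {
        iterations++;            /* one event per loop iteration */
        if (head == target) {
            served++;            /* the glue handler finally runs */
        }
        head = (head + 1) % n;   /* front handler moves to the back */
    }
    return iterations;
}
```

Under those assumptions a glue handler that starts last among 10 busy
sources waits 10 iterations for its first run and 10 more for each run
after that, so the delay grows linearly with the number of busy sources.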

>
> The good:
> - completely on top of tevent
>
> The bad:
> - calls poll() twice (once in tevent, once in the glue code)
[...]
> This patch puts the patch into tevent itself, we could also put it
> into source3/lib/.
well, I guess source3/lib is where I would have started because of
spotlight (and my toy), but if you use your approach above you
won't need to do anything with s3.

On 25/01/16 16:49, Ralph Boehme wrote:
> updated the branch. Two changes:
>
> - cache tevent_fds
>
> - run g_main_context_check/g_main_context_dispatch in a loop until
>   g_main_context_check returns false
>
> With these changes the test binary completes as fast with the tevent
> glue as with the native glib loop. 

well, that's not what I encountered previously, where tevent was a little
(but not much) slower. Now, in fact, on my machine the test is a little
(but consistently) faster than with native glib, and that doesn't sound
right. I am suspicious of

> - run g_main_context_check/g_main_context_dispatch in a loop until
>   g_main_context_check returns false

That sounds suspiciously like you are artificially tweaking the priority
(in other words, the priority passed into g_main_context_check probably
determines whether an event source is marked as ready for dispatch or
not). If that is the case I'd say you are playing with fire. I bet
without the change above the performance sucks?

Noel