[RFC] [WIP] tevent/glib integration

Tue Jan 26 10:34:51 UTC 2016

Hi Ralph
On 26/01/16 08:05, Ralph Boehme wrote:
> Hi Noel,
>
> On Mon, Jan 25, 2016 at 07:07:41PM +0000, Noel Power wrote:
[...]
>> But doesn't it still suffer from the same problem of being throttled by
>> tevent's policy of processing one event per loop iteration?
>> e.g. it depends on either a timer or one of the glib fd(s) (with
>> available events) being monitored to fire, but any number of ready
>> non-glib related timers or fds could be queued for processing before and
>> in the s3 case a tevent_fd that fires is put to the back of the list
>> after firing :/.
> timer, immediate and signal events are rare compared to fd events, so
> I wouldn't worry about those.
>
> fd event sources are maintained in a list in tevent. When checking
> pending fd events the list is walked from the start and the first
> pending event source is processed and then migrated to the end of the
> list. This prevents us from a single fd starving other event sources.
I understood that, but it is precisely for this reason that I think, in a
busy server with many clients hitting it, the glib fd (or the epoll
fd) could be shifted as you describe and not get fired for many iterations.
>
> And when one of the glib fds is active, we process all pending glibs
sure, as it should be, otherwise there is further throttling
> *and* we keep polling the g_main_context until the returned timeout is
> greater than 0 and g_main_context_check() returns false.
Like I mentioned, that worries me. It's not what glib does itself (iirc
it wasn't what 'g_main_context_iteration' did in whatever glib source
version I looked at a couple of weeks ago), and it is not how glib
seems to suggest the steps for integrating an external loop should work.
I would be very wary of depending on the glib api being used in such a way.
I would expect such event loop processing code to be very sensitive to
how it is used: using it in an unexpected way might work now, might
not work later, or may introduce some bad behaviour in some scenarios.
Either way there is a risk of the rug being pulled. It's even possible
that you might favour glib unfairly against tevent by looping many times
per outer tevent loop.

I guess I should step back and stop harping on now. I feel that I am just
continually putting forward negative opinions; that isn't my intention at
all, and it makes me feel bad that this is the extent of my contribution.

>
>> So I still fear it could quite easily take quite a few
>> tevent_loop iterations (and that situation worsening as more & more
>> normal events & timers are added by a samba process) before available
>> glib events will get processed and the glib context released, granted
>> when it does get processed it does process *all* glib events which is a
>> significant improvement. Presumably this would be the same with the
>> extra epoll source tacked on.  I don't know :/ in a busy smbd process
>> for example glib won't afaics get a fair slice of the action,
> I'm quite confident it will for the reasons described above and I'm
> willing to test-drive it with the Spotlight RPC server. :)
is that a single client per rpc instance model? If so, yeah, it might work
well. At the moment my own use case is a single daemon
where there could be many simultaneous clients [...]
>> well, I guess source3/lib is where I would have started because of
>> spotlight (and my toy), but if you use your approach above you
>> won't need to do anything with s3
> This was more about whether tevent upstream (yuck, that's us :)) is
> willing to include this new feature in tevent or whether we maintain
> it outside. I think it's a nice addition to tevent so it would be nice
> to have it in, but maybe we want to give it some time and testing
> outside before including it.
well, locking down an api is always best deferred until some run-in period
has passed

[...]
>> > - run g_main_context_check/g_main_context_dispatch in a loop until
>> >   g_main_context_check returns false
>>
>> that sounds suspiciously like you are artificially tweaking the priority
>> (in other words, the priority passed into g_main_context_check probably
>> determines whether an event source is marked as ready for dispatch or
>> not). If that is the case I'd say you are playing with fire. I bet
>> without the change above the performance sucks?
> no it doesn't suck, it just takes three times as long.
well that sounds like it sucks to me :-)
>
> The plan was to fix this short circuiting anyway, by calling the
> tevent_glib_prepare(). I've updated the branch with this change, it
> now takes twice as long as native glib (~0.8 ms compared to ~0.4 ms on
> my system where the tracker SPARQL query got ~220 results).
>
do I understand correctly that you intend to change the

    do {
        gok = g_main_context_check(...);
    } while (true);


it's just that with the updated branch I don't see such a change


Noel