[PATCH] Restarting cleanupd when ctdb-messaging is down
Ralph Böhme
slow at samba.org
Mon Jul 18 16:20:49 UTC 2016
On Mon, Jul 18, 2016 at 12:49:13PM +0200, Volker Lendecke wrote:
> Hi!
>
> On Fri, Jul 15, 2016 at 10:34:27AM +0200, Ralph Boehme wrote:
>
> > +static void cleanupd_init_timer(struct tevent_context *ev,
> > + struct tevent_timer *te,
> > + struct timeval current_time,
> > + void *private_data);
> > +
> > +struct cleanup_init_state {
> > + bool ok;
> > + struct tevent_context *ev;
> > + struct messaging_context *msg;
> > + struct tevent_timer *te;
> > + struct server_id *ppid;
> > +};
> > +
> > +static struct tevent_req *cleanupd_init_send(struct tevent_context *ev,
> > + TALLOC_CTX *mem_ctx,
> > + struct messaging_context *msg,
> > + struct server_id *ppid)
> > +{
> > + struct tevent_req *req = NULL;
> > + struct cleanup_init_state *state = NULL;
> > + struct timeval tv;
> > +
> > + req = tevent_req_create(mem_ctx, &state, struct cleanup_init_state);
> > + if (req == NULL) {
> > + return NULL;
> > + }
> > +
> > + *state = (struct cleanup_init_state) {
> > + .msg = msg,
> > + .ev = ev,
> > + .ppid = ppid
> > + };
> > +
> > + tv = tevent_timeval_current_ofs(0, 0);
> > +
> > + state->te = tevent_add_timer(ev, state, tv, cleanupd_init_timer, req);
> > + if (tevent_req_nomem(state->te, req)) {
> > + tevent_req_post(req, ev);
> > + }
>
> From my point of view tevent_wakeup_send/recv is more in line with the
> _send/_recv pattern we're using lately. I try to avoid dealing with
> direct events wherever possible. Any reason why you did not use it
> here?
Good idea, tevent_wakeup_send/recv is the right thing to use here.
> > +
> > + return req;
> > +}
> > +
> > +static void cleanupd_init_timer(struct tevent_context *ev,
> > + struct tevent_timer *te,
> > + struct timeval current_time,
> > + void *private_data)
> > +{
> > + struct tevent_req *req = talloc_get_type_abort(
> > + private_data, struct tevent_req);
> > + struct cleanup_init_state *state = tevent_req_data(
> > + req, struct cleanup_init_state);
> > + struct timeval tv;
> > +
> > + DBG_NOTICE("Initializing cleanupd from restart timer()\n");
> > +
> > + state->ok = cleanupd_init(state->msg, false, state->ppid);
> > + if (state->ok) {
> > + DBG_WARNING("cleanupd restarted\n");
> > + tevent_req_done(req);
> > + return;
> > + }
> > +
> > + DBG_ERR("Scheduled cleanupd restart failed, rescheduling\n");
> > +
> > + tv = tevent_timeval_current_ofs(1, 0);
> > +
> > + state->te = tevent_add_timer(ev, state, tv, cleanupd_init_timer, req);
> > + if (tevent_req_nomem(state->te, req)) {
> > + tevent_req_post(req, ev);
>
> Is this tevent_req_post really correct? We're in a callback and went
> through the event loop already.
good catch, thanks! That part was copy&pasted from the send function.
As discussed off-list, this will also need some sort of protection
against cleanup messages lost in-flight, like an ACK message or
possibly use a local tbd for posting cleanup events. Looking into
it...
Cheerio!
-slow
More information about the samba-technical
mailing list