[PATCH] ctdb-daemon: Make STOP_NODE control wait until complete (bug 14087)

Amitay Isaacs amitay at gmail.com
Tue Aug 20 07:14:29 UTC 2019


On Tue, Aug 20, 2019 at 5:06 PM Martin Schwenke via samba-technical
<samba-technical at lists.samba.org> wrote:
>
> On Mon, 19 Aug 2019 14:34:31 +1000, Martin Schwenke via samba-technical
> <samba-technical at lists.samba.org> wrote:
>
> > On Sat, 17 Aug 2019 10:24:47 +1000, Martin Schwenke via samba-technical
> > <samba-technical at lists.samba.org> wrote:
> >
> > > STOP_NODE is supported by a periodic check in the recovery daemon's
> > > main_loop(), which notices the flag change, and schedules a recovery
> > > and freezes databases.  If STOP_NODE returns immediately then the
> > > associated recovery can complete and the node can be continued before
> > > databases are actually frozen.  This means that the databases on the
> > > > stopped node will never be marked invalid and the recovery
> > > following CONTINUE_NODE can resurrect deleted records.
> > >
> > > CONTINUE_NODE must wait for an in-progress STOP_NODE to complete
> > > before commencing.
> > >
> > > Multiple STOP_NODE controls are also serialised.  This isn't strictly
> > > necessary but will stop more deeply nested event loops.
> > >
> > > > Went through this pipeline with a slightly different commit message:
> > >
> > >   https://gitlab.com/samba-team/devel/samba/pipelines/76501176
> > >
> > > Now running in this one:
> > >
> > >   https://gitlab.com/samba-team/devel/samba/pipelines/76849217
> > >
> > > Please review and maybe push...
> >
> > Now with patch!  :-)
>
> After further discussion with Amitay, we determined that this was the
> wrong approach because things could loop forever. So, it either needs
> timeouts and the associated timeout handling, which all adds a lot of
> complication and is hard to get right, or it needs to be done
> properly.  ;-)
>
> New patch set attached to do it properly.  This does the steps to make
> the node inactive in the STOP_NODE control.  For consistency it also
> does the same thing in the SET_BAN_STATE control.
>
> A variation (different commit order/messages, slightly different
> logging, didn't drop unused code) passed the following GitLab CI
> pipeline:
>
>   https://gitlab.com/samba-team/devel/samba/pipelines/77082113
>
> Please review and maybe push...
>
> peace & happiness,
> martin

Pushed to autobuild.

Amitay.
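
For readers following the thread, the race described in the original
commit message can be sketched as a toy model.  This is an illustration
only, written in Python for brevity rather than CTDB's C.  Every name in
it (Node, make_inactive, periodic_check and so on) is hypothetical and
does not correspond to CTDB's real structures or functions.  It assumes
that "making the node inactive" amounts to freezing the databases and
marking them invalid, and that the periodic check is a stand-in for the
recovery daemon's main_loop().

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified node model; not CTDB's real data structures."""
    stopped: bool = False      # flag set by the STOP_NODE control
    dbs_frozen: bool = False   # databases currently frozen
    dbs_invalid: bool = False  # databases marked invalid for the next recovery


def make_inactive(node):
    """Assumed steps for making a node inactive: freeze and invalidate."""
    node.dbs_frozen = True
    node.dbs_invalid = True


def stop_node_deferred(node):
    """Old behaviour: set the flag and reply immediately."""
    node.stopped = True


def periodic_check(node):
    """Stand-in for the periodic check that notices the flag change."""
    if node.stopped and not node.dbs_frozen:
        make_inactive(node)


def stop_node_sync(node):
    """New behaviour: do the inactive steps inside the control itself."""
    node.stopped = True
    make_inactive(node)


def continue_node(node):
    """CONTINUE_NODE: clear the flag and thaw the databases."""
    node.stopped = False
    node.dbs_frozen = False


def stop_then_continue(stop):
    """Stop, then continue before the periodic check has had a chance to run."""
    node = Node()
    stop(node)
    continue_node(node)
    periodic_check(node)
    return node.dbs_invalid


if __name__ == "__main__":
    print("deferred:", stop_then_continue(stop_node_deferred))
    print("synchronous:", stop_then_continue(stop_node_sync))
```

In the deferred case the periodic check runs after the flag has already
been cleared, so the databases are never marked invalid, which is the
window in which deleted records could be resurrected.  Doing the work in
the control closes that window without nested event loops or timeouts.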


