[PATCH] ctdb-daemon: Make STOP_NODE control wait until complete (bug 14087)

Martin Schwenke martin at meltin.net
Tue Aug 20 07:05:39 UTC 2019


On Mon, 19 Aug 2019 14:34:31 +1000, Martin Schwenke via samba-technical
<samba-technical at lists.samba.org> wrote:

> On Sat, 17 Aug 2019 10:24:47 +1000, Martin Schwenke via samba-technical
> <samba-technical at lists.samba.org> wrote:
> 
> > STOP_NODE is supported by a periodic check in the recovery daemon's
> > main_loop(), which notices the flag change, and schedules a recovery
> > and freezes databases.  If STOP_NODE returns immediately then the
> > associated recovery can complete and the node can be continued before
> > databases are actually frozen.  This means that the databases on the
> > stopped node will never be marked invalid and the recovery
> > following CONTINUE_NODE can resurrect deleted records.
> > 
> > CONTINUE_NODE must wait for an in-progress STOP_NODE to complete
> > before commencing.
> > 
> > Multiple STOP_NODE controls are also serialised.  This isn't strictly
> > necessary but will stop more deeply nested event loops.
> > 
> > Went through this pipeline with a slightly different commit message:
> > 
> >   https://gitlab.com/samba-team/devel/samba/pipelines/76501176
> > 
> > Now running in this one:
> > 
> >   https://gitlab.com/samba-team/devel/samba/pipelines/76849217
> > 
> > Please review and maybe push...  
> 
> Now with patch!  :-)

After further discussion with Amitay, we determined that this was the
wrong approach because things could loop forever.  So, it either needed
timeouts and the associated timeout handling, which adds a lot of
complication and is hard to get right, or it needed to be done
properly.  ;-)

New patch set attached to do it properly.  This does the steps to make
the node inactive in the STOP_NODE control itself, so the node is
already inactive when the control returns.  For consistency it also
does the same thing in the SET_BAN_STATE control.
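
As a rough illustration of the idea only (this is a toy model, not the
attached patch): the control handler does the work of making the node
inactive before it returns, so a reply to the control implies that the
databases are already frozen and marked invalid.  All types and names
below (toy_node, toy_db, toy_make_inactive, and so on) are hypothetical
and are not CTDB's actual data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not CTDB's real structures. */
enum { TOY_FLAG_STOPPED = 0x1, TOY_FLAG_BANNED = 0x2 };

struct toy_db {
	bool frozen;   /* no further local updates allowed */
	bool invalid;  /* contents must not be trusted by a later recovery */
};

struct toy_node {
	unsigned int flags;
	struct toy_db *dbs;
	size_t num_dbs;
};

/* Freeze every database and mark it invalid, so that a later recovery
 * cannot resurrect records from this node's stale copies. */
void toy_make_inactive(struct toy_node *node)
{
	size_t i;

	for (i = 0; i < node->num_dbs; i++) {
		node->dbs[i].frozen = true;
		node->dbs[i].invalid = true;
	}
}

/* Stop control: set the flag AND make the node inactive before
 * returning, instead of leaving that to a later periodic check. */
int toy_control_stop_node(struct toy_node *node)
{
	assert(node != NULL);
	node->flags |= TOY_FLAG_STOPPED;
	toy_make_inactive(node);
	return 0;
}

/* Ban control: same treatment, for consistency with stop. */
int toy_control_set_ban_state(struct toy_node *node, bool banned)
{
	assert(node != NULL);
	if (banned) {
		node->flags |= TOY_FLAG_BANNED;
		toy_make_inactive(node);
	} else {
		node->flags &= ~TOY_FLAG_BANNED;
	}
	return 0;
}
```

In this model there is no window between the control returning and the
databases being frozen, so nothing needs to wait or time out.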

A variation (different commit order/messages, slightly different
logging, didn't drop unused code) passed the following GitLab CI
pipeline:

  https://gitlab.com/samba-team/devel/samba/pipelines/77082113

Please review and maybe push...

peace & happiness,
martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ctdb-stop.patch
Type: text/x-patch
Size: 7483 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20190820/3003b70a/ctdb-stop.bin>