[PATCH] ctdb-daemon: Make STOP_NODE control wait until complete (bug 14087)

Martin Schwenke martin at meltin.net
Sat Aug 17 00:24:47 UTC 2019

STOP_NODE is supported by a periodic check in the recovery daemon's
main_loop(), which notices the flag change, and schedules a recovery
and freezes databases.  If STOP_NODE returns immediately then the
associated recovery can complete and the node can be continued before
databases are actually frozen.  This means that the databases on the
stopped node will never be marked invalid and the recovery
following CONTINUE_NODE can resurrect deleted records.

CONTINUE_NODE must wait for an in-progress STOP_NODE to complete
before commencing.

Multiple STOP_NODE controls are also serialised.  This isn't strictly
necessary but avoids nesting event loops more deeply.
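
For reviewers who want the ordering spelled out, here is a toy model of
the race and of the waiting behaviour described above.  It is only a
sketch: every name in it (struct toy_node, toy_loop_once(),
toy_stop_node_wait() and so on) is hypothetical and not part of ctdb,
the single toy_loop_once() call stands in for both an event loop
iteration and the recovery daemon's periodic check, and the thaw in
toy_continue_node() is a simplification.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not ctdb's API. */
struct toy_node {
	bool stopped;          /* flag set by STOP_NODE */
	bool stop_in_progress; /* a STOP_NODE is still waiting for the freeze */
	bool frozen;           /* databases are frozen */
	bool db_invalid;       /* databases marked invalid for the next recovery */
};

/* One event loop iteration, standing in for the periodic check that
 * notices the flag change and freezes the databases. */
static void toy_loop_once(struct toy_node *n)
{
	if (n->stopped) {
		if (!n->frozen) {
			n->frozen = true;
			n->db_invalid = true;
		}
		n->stop_in_progress = false;
	}
}

/* Old behaviour: set the flag and reply immediately. */
static void toy_stop_node_nowait(struct toy_node *n)
{
	n->stopped = true;
}

/* Behaviour described above: serialise with any earlier STOP_NODE,
 * then do not return until the freeze has actually happened. */
static void toy_stop_node_wait(struct toy_node *n)
{
	while (n->stop_in_progress) {
		toy_loop_once(n);
	}
	n->stopped = true;
	n->stop_in_progress = true;
	while (n->stop_in_progress) {
		toy_loop_once(n);
	}
}

/* CONTINUE_NODE waits for an in-progress STOP_NODE before commencing.
 * Returns true if the databases were marked invalid while stopped. */
static bool toy_continue_node(struct toy_node *n)
{
	while (n->stop_in_progress) {
		toy_loop_once(n);
	}
	n->stopped = false;
	n->frozen = false; /* toy simplification: thaw on continue */
	return n->db_invalid;
}
```

In this model, toy_stop_node_nowait() followed directly by
toy_continue_node() leaves db_invalid unset, because the periodic check
never sees the flag.  With toy_stop_node_wait() the databases are
always frozen and marked invalid before the node can be continued.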

Went through this pipeline with a slightly different commit message:


Now running in this one:


Please review and maybe push...

peace & happiness,
