CTDB_RECOVERY_ACTIVE while in CTDB_RUNSTATE_STARTUP

Kenny Dinh kdinh at peaxy.net
Mon May 23 20:10:56 UTC 2016


Hello,

I saw a one-off error in CTDB that I have not been able to reproduce. My setup
has 3 CTDB nodes.  Attached is the ctdb log from the failed node.  On the
failed node, the ctdb process (17202) was starting up and its
ctdb->runstate was CTDB_RUNSTATE_STARTUP.


   1. At 18:42:55, it invoked "ctdb_run_startup", but the
   "49.winbind startup" script timed out.  The attached log shows that
   the recovery mode was set to ACTIVE just a few seconds after
   "ctdb_run_startup" was invoked.
   2. At 18:44:48, "ctdb_run_startup" was rescheduled, but the script
   was not allowed to run while in recovery mode. The error was "*Refusing
   to run event scripts call 'startup' while in recovery*".
   3. From then on, this error kept repeating.
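
To make the sequence above concrete, here is a minimal model of the stuck
state as I understand it. It is only a sketch: the types and the function
model_try_startup are hypothetical stand-ins, not CTDB's real code, and the
only behaviour it assumes is the one the log shows (the startup event is
refused while recovery mode is ACTIVE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the daemon's state; not CTDB's real types. */
enum model_runstate { MODEL_RUNSTATE_STARTUP, MODEL_RUNSTATE_RUNNING };
enum model_recmode { MODEL_RECOVERY_NORMAL, MODEL_RECOVERY_ACTIVE };

struct model_node {
	enum model_runstate runstate;
	enum model_recmode recmode;
};

/*
 * Try to run the "startup" event.  It is refused while recovery is
 * active, so the run state cannot advance past STARTUP until the
 * recovery mode goes back to NORMAL.
 */
static bool model_try_startup(struct model_node *node)
{
	assert(node != NULL);

	if (node->recmode == MODEL_RECOVERY_ACTIVE) {
		return false;	/* "Refusing to run ... while in recovery" */
	}

	node->runstate = MODEL_RUNSTATE_RUNNING;
	return true;
}
```

In this model, a node that starts in MODEL_RUNSTATE_STARTUP with
MODEL_RECOVERY_ACTIVE gets false from every retry and stays in STARTUP, which
matches the repeating error in step 3.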

As for the first issue, I don't know why winbind failed to start because I
lost the winbind log from that time.

For the second issue, it seems to me that we should not allow recovery
mode to be set to ACTIVE while the run state is still
CTDB_RUNSTATE_STARTUP.  Does anyone see a problem with the following
change?


diff --git a/server/ctdb_recover.c b/server/ctdb_recover.c
index 21e0427..4c5030f 100644
--- a/server/ctdb_recover.c
+++ b/server/ctdb_recover.c
@@ -595,6 +595,12 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb,
        struct ctdb_set_recmode_state *state;
        pid_t parent = getpid();

+       if (ctdb->runstate < CTDB_RUNSTATE_RUNNING &&
+           recmode == CTDB_RECOVERY_ACTIVE) {
+               DEBUG(DEBUG_ERR, (__location__ " Not setting state to ACTIVE when runstate (%d) is < CTDB_RUNSTATE_RUNNING\n", ctdb->runstate));
+               return -1;
+       }
+
        /* if we enter recovery but stay in recovery for too long
           we will eventually drop all our ip addresses
        */
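
For anyone who wants to reason about the check without the rest of the
daemon, the guard in the patch can be sketched in isolation. Everything here
is hypothetical (the sketch_ names and the two-value enums are mine, not
CTDB's); the only thing it takes from the patch is the condition itself:
reject ACTIVE while the run state is ordered before running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the ordering mirrors the "<" test in the patch. */
enum sketch_runstate {
	SKETCH_RUNSTATE_STARTUP,
	SKETCH_RUNSTATE_RUNNING
};

enum sketch_recmode {
	SKETCH_RECOVERY_NORMAL,
	SKETCH_RECOVERY_ACTIVE
};

struct sketch_node {
	enum sketch_runstate runstate;
	enum sketch_recmode recmode;
};

/* Returns 0 if the mode was set, -1 if the request was rejected. */
static int sketch_set_recmode(struct sketch_node *node,
			      enum sketch_recmode recmode)
{
	assert(node != NULL);

	if (node->runstate < SKETCH_RUNSTATE_RUNNING &&
	    recmode == SKETCH_RECOVERY_ACTIVE) {
		fprintf(stderr,
			"Not setting recmode to ACTIVE while runstate (%d) "
			"is before RUNNING\n", (int)node->runstate);
		return -1;
	}

	node->recmode = recmode;
	return 0;
}
```

Note that the comparison rejects ACTIVE in every run state ordered before
running, not only in the startup state, so the effect of the check depends on
which states precede it in the real enum.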
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mail_list_ctdbd.log
Type: application/octet-stream
Size: 76592 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20160523/ef3a3068/mail_list_ctdbd.obj>

