CTDB_RECOVERY_ACTIVE while in CTDB_RUNSTATE_STARTUP

Kenny Dinh kdinh at peaxy.net
Mon May 23 22:38:08 UTC 2016


I found out my previous change was not correct.

I traced the history of the file server/eventscript.c.  At commit
fd06167caa2c194e74c651e1374047213c6cd9d5, the function
ctdb_event_script_callback_v() was updated to allow CTDB_EVENT_INIT to run
while ctdb->recovery_mode is ACTIVE.  I believe that CTDB_EVENT_STARTUP
should also be allowed to run while ctdb->recovery_mode is ACTIVE.
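
To make the idea concrete, here is a minimal, self-contained sketch of such a
check.  It is not the CTDB source: the enums and the helper may_run_event()
are simplified, hypothetical stand-ins, and the real
ctdb_event_script_callback_v() may structure this differently.

```c
#include <stdbool.h>

/* Simplified stand-ins for the CTDB definitions (assumptions, not the real headers). */
enum recovery_mode { RECOVERY_NORMAL, RECOVERY_ACTIVE };
enum event_type { EVENT_INIT, EVENT_STARTUP, EVENT_MONITOR };

/*
 * Hypothetical helper: decide whether an event script may run.
 * Outside recovery every event is allowed.  During recovery only
 * an explicit allow-list of events gets through.
 */
bool may_run_event(enum recovery_mode recmode, enum event_type ev)
{
	if (recmode != RECOVERY_ACTIVE) {
		return true;
	}

	switch (ev) {
	case EVENT_INIT:    /* already allowed during recovery */
	case EVENT_STARTUP: /* the addition proposed in this mail */
		return true;
	default:
		return false;   /* everything else is refused during recovery */
	}
}
```

With this allow-list, may_run_event(RECOVERY_ACTIVE, EVENT_STARTUP) returns
true, so a rescheduled startup event would no longer be refused just because
recovery is active.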

Attached is what I think should be the correct fix.

Thanks,
Kenny


On Mon, May 23, 2016 at 1:10 PM, Kenny Dinh <kdinh at peaxy.net> wrote:

> Hello,
>
> I saw a one-off error in CTDB that I was not able to reproduce. My
> setup has 3 CTDB nodes.  Attached is the ctdb log from the failed node.  On
> the failed CTDB node, the ctdb process (17202) was starting up and its
> ctdb->runstate was CTDB_RUNSTATE_STARTUP.
>
>
>    1. At 18:42:55, it tried to invoke "ctdb_run_startup" but the
>    "49.winbind startup" script timed out.  If you look at the attached log,
>    the recovery mode was set to ACTIVE just a few seconds after
>    "ctdb_run_startup" was invoked.
>    2. At 18:44:48, "ctdb_run_startup" was rescheduled but the script
>    was not allowed to run while in recovery mode. The error was "*Refusing
>    to run event scripts call 'startup' while in recovery*"
>    3. From then on, this error kept repeating.
>
> As for the first issue, I don't know why winbind failed to start because I
> lost the winbind log from that time.
>
> For the second issue, it occurs to me that we should not allow recovery
> mode to be set to ACTIVE while the run state is still
> CTDB_RUNSTATE_STARTUP.  Does anyone see a problem with the following
> change?
>
>
> diff --git a/server/ctdb_recover.c b/server/ctdb_recover.c
> index 21e0427..4c5030f 100644
> --- a/server/ctdb_recover.c
> +++ b/server/ctdb_recover.c
> @@ -595,6 +595,12 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb,
>         struct ctdb_set_recmode_state *state;
>         pid_t parent = getpid();
>
> +       if (ctdb->runstate < CTDB_RUNSTATE_RUNNING &&
> +           recmode == CTDB_RECOVERY_ACTIVE) {
> +               DEBUG(DEBUG_ERR, (__location__ " Not setting state to ACTIVE when runstate (%d) is < CTDB_RUNSTATE_RUNNING\n", ctdb->runstate));
> +               return -1;
> +       }
> +
>         /* if we enter recovery but stay in recovery for too long
>            we will eventually drop all our ip addresses
>         */
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: look
Type: application/octet-stream
Size: 806 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20160523/5d28f670/look.obj>

