ctdb event scripts

Mon Nov 14 04:53:40 UTC 2016

On Mon, Nov 14, 2016 at 1:17 PM, Martin Schwenke <martin at meltin.net> wrote:

> On Mon, 14 Nov 2016 12:42:28 +1100, Amitay Isaacs <amitay at gmail.com>
> wrote:
>
> > On Sat, Nov 12, 2016 at 9:07 PM, Martin Schwenke <martin at meltin.net>
> > wrote:
>
> > > > > Cool feature idea: implement an option to scriptstatus that shows
> > > > > the output of the last failure.  This would require daemon support,
> > > > > since that's where the information comes from.
> > >
> > > > Yes - that is a cool idea.
> > >
> > > Let's see if Amitay agrees and wants to add it to his new event
> > > daemon.  :-)
>
> > What's so cool about it?  It's in the logs already when the node
> > becomes UNHEALTHY.  I don't think there is any point adding it to
> > eventd.
>
> So you don't need to grovel through the logs for the simple cases.
>
> > If we want to provide a hook to capture that information, then it can be
> > done via notify script.
>
> You could do it for any event.  It doesn't have to be tied to something
> that generates a notification.
>
> If you've had "startup" problems and you think they're now solved then
> you can check with "ctdb scriptstatus startup lastfail" or similar.
> Alternatively, if some higher-level monitoring software has logged an
> unexpected fail-over then you could use "ctdb scriptstatus monitor
> lastfail" for an initial look.  It might tell you enough...
>
>
What about timeouts? Are they always failures from eventd's point of view?

We have weird rules in CTDB for when a timeout is handled as a failure.
Not all timeouts are considered failures.

Amitay.