CTDB IP takeover/failover tunables - do you use them?
martin at meltin.net
Thu Nov 10 03:01:56 UTC 2016
I'm currently hacking on CTDB's IP takeover/failover code. For Samba
4.6, I would like to rationalise the IP takeover-related tunables.
I would like to know if there are any users who set the values of these
tunables to non-default values. The tunables in question are:
DisableIPFailover
When set to non-zero, ctdb will not perform failover or failback. Even
if a node fails while holding public IPs, ctdb will not recover the IPs
or assign them to another node.
When this tunable is enabled, ctdb will no longer attempt to recover
the cluster by failing IP addresses over to other nodes. This leads to
a service outage until the administrator has manually performed IP
failover to replacement nodes using the 'ctdb moveip' command.
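For example, the workflow might look like this (the address 10.1.1.1 and
node number 2 are illustrative, and onnode is used to run the setvar on
every node):

```shell
# Tell ctdb never to fail addresses over automatically
onnode all ctdb setvar DisableIPFailover 1

# After a node failure, move an orphaned public IP to a
# surviving node by hand
ctdb moveip 10.1.1.1 2
```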
NoIPFailback

When set to 1, ctdb will not perform failback of IP addresses when a
node becomes healthy. When a node becomes UNHEALTHY, ctdb WILL perform
failover of public IP addresses, but when the node becomes HEALTHY
again, ctdb will not fail the addresses back.
Use with caution! Normally, when a node becomes available to the
cluster, ctdb will try to reassign public IP addresses to the new node
as a way to distribute the workload evenly across the cluster nodes.
Ctdb tries to make sure that all running nodes host approximately the
same number of public addresses.
When you enable this tunable, ctdb will no longer attempt to rebalance
the cluster by failing IP addresses back to the new nodes. An
unbalanced cluster will therefore remain unbalanced until there is
manual intervention from the administrator. When this parameter is set,
you can manually fail public IP addresses over to the new node(s) using
the 'ctdb moveip' command.
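A possible manual rebalancing session, once the recovered node is
HEALTHY again (the addresses and node number are made up for
illustration):

```shell
# Disable automatic failback cluster-wide
onnode all ctdb setvar NoIPFailback 1

# Later, once node 1 is HEALTHY again, rebalance by hand
ctdb moveip 10.1.1.2 1
ctdb moveip 10.1.1.3 1
```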
NoIPHostOnAllDisabled

If no nodes are HEALTHY then by default ctdb will happily host public
IPs on disabled (unhealthy or administratively disabled) nodes. This
can cause problems, for example if the underlying cluster filesystem is
not mounted. When set to 1 on a node and that node is disabled, any IPs
hosted by this node will be released and the node will not takeover any
IPs until it is no longer disabled.
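For instance, to enable this behaviour on a single node only (the node
number is illustrative):

```shell
# Node 3's cluster filesystem mount is known to be unreliable,
# so never host public IPs there while it is disabled
onnode 3 ctdb setvar NoIPHostOnAllDisabled 1
```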
NoIPTakeover

When set to 1, ctdb will not allow IP addresses to be failed over onto
this node. Any IP addresses that the node currently hosts will remain
on the node but no new IP addresses can be failed over to the node.
In particular, I would like to know if anyone has a use case where they
set any of these tunables to different values on different nodes. This
only really matters for the last 2 (NoIPHostOnAllDisabled,
NoIPTakeover), since only the recovery master's value is used for the
other 2. If you do this, can you please explain why? :-)
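If you do run with per-node values, something like the following shows
what each node is currently using:

```shell
# Show the current value of each tunable on every node
onnode all ctdb getvar NoIPTakeover
onnode all ctdb getvar NoIPHostOnAllDisabled
```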
I would like to make all of the above tunables global, but I will
not do that if it breaks an existing use case and I can't find a
workable alternative.
There are also 2 tunables to choose the algorithm used to calculate the
IP address layout:
DeterministicIPs

When set to 1, ctdb will try to keep public IP addresses locked to
specific nodes as far as possible. This makes it easier for debugging
since you can know that as long as all nodes are healthy public IP X
will always be hosted by node Y.
The cost of using deterministic IP address assignment is that it
disables part of the logic where ctdb tries to reduce the number of
public IP assignment changes in the cluster. This tunable may increase
the number of IP failover/failbacks that are performed on the cluster
by a small margin.
LCP2PublicIPs

When set to 1, ctdb uses the LCP2 IP allocation algorithm.
I plan to replace these with a single tunable to select the algorithm
(0 = deterministic, 1 = non-deterministic, 2 = LCP2 (default)).
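As I understand the current code, LCP2PublicIPs takes precedence, so
the three algorithms map onto the existing pair of tunables roughly
like this (a sketch, not a definitive reference):

```shell
# LCP2 (the current default):
ctdb setvar LCP2PublicIPs 1

# Deterministic assignment (LCP2 must be disabled first):
ctdb setvar LCP2PublicIPs 0
ctdb setvar DeterministicIPs 1

# Non-deterministic assignment:
ctdb setvar LCP2PublicIPs 0
ctdb setvar DeterministicIPs 0
```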
Thanks for any feedback...
peace & happiness,