Advice on extending CTDB to support multiple NIC interfaces per node

Thu Mar 27 17:52:33 MDT 2014

Hi Kevin,

On Thu, 27 Mar 2014 14:51:36 +0000, Kevin Osborn
<kosborn at overlandstorage.com> wrote:

> Thanks Martin, that did the trick. Now the IP address just moves
> from one interface to the other when one link goes down.

Cool.

> Hosting the two IP addresses on the same node is required to allow
> us to provide two different paths to the same iSCSI targets. This
> means that these two IP addresses must stay together if the node
> fails entirely.  Because of this we still plan to add a new tunable
> (but we will probably call it IP_ADDR_TUPLES instead of
> MULTI_INTERFACE_PER_NODE). We will also add a new
> ip_alloc_ip_tuple_interface() to the ctdb_takeover_run_core()
> function. And finally, we would add a new configuration file that
> would list the valid IP address tuples that any node can host. This
> file would be saved on the cluster file system.

I can't think of a better way of doing what you describe, but this
sounds like it might be difficult.  The main issues are:

* Public IP address configuration can be heterogeneous across nodes in
  the cluster.  The current IP allocation framework collects a list of
  IPs that can be hosted from each node.

  So you might need to extend the information that is collected from
  all nodes.  We have long-term plans to do this (e.g. include allowed
  interfaces for each IP address).  It could be that different IP
  allocation algorithms would collect different information.

  Both the main daemon and the recovery daemon have knowledge of the
  public IP address configuration, so this makes the task even more
  complex.  In the future we're hoping to break the public IP address
  handling out into a separate daemon, which would do IP address
  allocation and consistency checking.  The main daemon and the
  recovery daemon would then have no knowledge of IP addresses.  At
  that point adding more information to share between nodes would
  become easier.  It would also become easier to simply replace the
  whole public IP daemon with one that makes different assumptions.
  We're a couple of steps away from doing this...  perhaps some time
  this year.

* I would encourage you not to save configuration in the cluster
  filesystem.  When cluster filesystem performance hits its limits
  then CTDB would be unable to reload the configuration.  Also, if you
  assume a common file then you break the assumption of heterogeneous
  IP configuration.  At a minimum, can you please make the location of
  the file configurable?

  In fact, how is this for a crazy but mostly backward compatible
  hack?  Extend the current public addresses file to allow multiple IP
  addresses:

    ip-address iface-list [ip-address iface-list ...]

  You could extend the ctdb_vnn structure to link to a list of "slave"
  ctdb_vnn structures, which would contain all but the first IP from
  each line.  The IP allocation stuff could just (continue to) work
  with the primary IP, since the other IPs have to follow
  (i.e. whenever the primary is taken/released then all the slaves
  would follow - that could be implemented in the main daemon).  To
  keep certain functions simple (e.g. killtcp/tickle handling) the
  slave ctdb_vnn structures could also be in the main list but tagged
  as slaves (so that IP allocation ignores them).  There are some
  potential wrinkles in the public IP consistency checking that the
  recovery daemon does...

  I clearly need to think this through more but it might work.  :-)
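
  To make the idea a bit more concrete, here is a rough, standalone
  sketch in C.  It is not CTDB code: struct vnn is a hypothetical,
  heavily simplified stand-in for ctdb_vnn, and parse_public_line(),
  count_allocatable() and assign_group() are names made up for this
  illustration.  It assumes each line alternates address and interface
  list as in the format above, that the first address on a line is the
  primary, and that fields are shorter than 64 characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ADDRS 8  /* arbitrary limit for this sketch */

/* Hypothetical, heavily simplified stand-in for struct ctdb_vnn. */
struct vnn {
    char addr[64];        /* address as text, e.g. "10.0.0.1/24" */
    char ifaces[64];      /* comma-separated interface list */
    bool is_slave;        /* tagged so that IP allocation skips it */
    struct vnn *primary;  /* NULL for the primary entry itself */
    int pnn;              /* hosting node, -1 if unassigned */
};

/*
 * Parse one line of the proposed format:
 *     ip-address iface-list [ip-address iface-list ...]
 * Fills out[0] with the primary and out[1..] with its slaves.
 * Returns the number of entries, or -1 if the line is empty, has an
 * address without an interface list, or has more than max entries.
 */
int parse_public_line(const char *line, struct vnn *out, int max)
{
    const char *p = line;
    char addr[64], ifaces[64];
    int used = 0;
    int n = 0;

    assert(line != NULL && out != NULL && max > 0);

    while (sscanf(p, " %63s%n", addr, &used) == 1) {
        p += used;
        if (n >= max || sscanf(p, " %63s%n", ifaces, &used) != 1) {
            return -1;
        }
        p += used;
        strcpy(out[n].addr, addr);
        strcpy(out[n].ifaces, ifaces);
        out[n].is_slave = (n > 0);
        out[n].primary = (n > 0) ? &out[0] : NULL;
        out[n].pnn = -1;
        n++;
    }
    return (n > 0) ? n : -1;
}

/* IP allocation would only see entries that are not tagged as slaves. */
int count_allocatable(const struct vnn *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (!v[i].is_slave) {
            count++;
        }
    }
    return count;
}

/*
 * When the primary is taken (or released, pnn = -1), every slave
 * follows.  v points at the primary, followed by its n - 1 slaves.
 */
void assign_group(struct vnn *v, int n, int pnn)
{
    for (int i = 0; i < n; i++) {
        v[i].pnn = pnn;
    }
}
```

  A line such as "10.0.0.1/24 eth0,eth1 10.0.1.1/24 eth2,eth3" (made-up
  addresses) would give one allocatable primary and one slave, and
  assigning the primary to a node moves both.  This says nothing about
  the recovery daemon's consistency checking, which is where the
  wrinkles are likely to be.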

peace & happiness,
martin