Advice on extending CTDB to support multiple NIC interfaces per node

Kevin Osborn kosborn at overlandstorage.com
Wed Mar 26 15:40:34 MDT 2014


Hi Michael,

Thanks for your help. I tried listing the interfaces after each IP, but both IP addresses on a node are failed over whenever any single interface on that node goes down. I would have expected the IP from the down interface to move to the "up" interface on the same node. Instead, all IPs are moved off that node.

Here is my public_addresses file (same on every node):
192.168.51.161/22 bond1,bond2
192.168.51.162/22 bond1,bond2
192.168.51.163/22 bond1,bond2
192.168.51.171/22 bond2,bond1
192.168.51.172/22 bond2,bond1
192.168.51.173/22 bond2,bond1
 
Here is the ctdb status from before unplugging a cable from one of the NICs on a node:
Number of nodes:3
pnn:0 192.0.2.59       OK (THIS NODE)
pnn:1 192.0.2.216      OK
pnn:2 192.0.2.24       OK
Generation:1168736351
Size:3
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
Recovery mode:NORMAL (0)
Recovery master:0

And the output from ctdb ip:
Public IPs on node 0
192.168.51.161 2
192.168.51.162 0
192.168.51.163 1
192.168.51.171 1
192.168.51.172 2
192.168.51.173 0


Now the same commands after bringing an interface down on this node:
Number of nodes:3
pnn:0 192.0.2.59       UNHEALTHY (THIS NODE)
pnn:1 192.0.2.216      OK
pnn:2 192.0.2.24       OK
Generation:100668884
Size:3
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
Recovery mode:NORMAL (0)
Recovery master:0

Public IPs on node 0
192.168.51.161 2
192.168.51.162 2
192.168.51.163 1
192.168.51.171 1
192.168.51.172 2
192.168.51.173 1
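
To make the difference between the two "ctdb ip" listings easier to see, here is a minimal Python sketch that parses both outputs as pasted above and reports which addresses changed nodes. The helper names parse_ctdb_ip and moved_ips are made up for this illustration and are not part of ctdb; the parser only assumes the two-column "address pnn" layout shown in this message.

```python
def parse_ctdb_ip(text):
    """Map each public IP to the pnn hosting it, from `ctdb ip`-style lines."""
    assignment = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the "Public IPs on node N" header and any blank lines.
        if len(parts) != 2 or not parts[1].lstrip("-").isdigit():
            continue
        assignment[parts[0]] = int(parts[1])
    return assignment


def moved_ips(before, after):
    """Return {ip: (old_pnn, new_pnn)} for every IP whose host changed."""
    return {ip: (before[ip], after[ip])
            for ip in before
            if ip in after and before[ip] != after[ip]}


BEFORE = """Public IPs on node 0
192.168.51.161 2
192.168.51.162 0
192.168.51.163 1
192.168.51.171 1
192.168.51.172 2
192.168.51.173 0
"""

AFTER = """Public IPs on node 0
192.168.51.161 2
192.168.51.162 2
192.168.51.163 1
192.168.51.171 1
192.168.51.172 2
192.168.51.173 1
"""

before = parse_ctdb_ip(BEFORE)
after = parse_ctdb_ip(AFTER)
moved = moved_ips(before, after)
for ip, (old, new) in sorted(moved.items()):
    print(f"{ip}: node {old} -> node {new}")
```

With the listings above, the two addresses that were on node 0 (.162 and .173) are the only ones that move, and node 0 ends up hosting nothing.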

ctdb ifaces
Interfaces on node 0
name:bond2 link:down references:0
name:bond1 link:up references:0

ctdb listvars shows these settings:
DeterministicIPs        = 0
LCP2PublicIPs           = 0

Thanks again,

-Kevin

-----Original Message-----
From: Michael Adam [mailto:obnox at samba.org] 
Sent: Wednesday, March 26, 2014 1:57 PM
To: Kevin Osborn
Cc: samba-technical at lists.samba.org
Subject: Re: Advice on extending CTDB to support multiple NIC interfaces per node

Hi Kevin,

If I understand your description correctly, the feature you are requesting already exists in ctdb.

You can name multiple interfaces after an IP in the public_addresses file, like this:

10.11.12.1/24 eth1,eth2,eth3

In that case, local failover will be done if possible. If an interface goes down, but others are still up, the node will not be UNHEALTHY but PARTIALLY ONLINE.
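
As a rough illustration of that expectation (not ctdb's actual takeover code), the Python sketch below parses public_addresses-style lines and lists, for each address, the configured interfaces whose link is still up. The function names and the link_up dictionary are assumptions made for this example; the sample file and link states are the ones reported in this thread.

```python
def parse_public_addresses(text):
    """Return {address: [interfaces]} from public_addresses-style lines."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        address = fields[0]
        # The second column is a comma-separated interface list.
        ifaces = fields[1].split(",") if len(fields) > 1 else []
        table[address] = ifaces
    return table


def local_candidates(table, link_up):
    """For each address, list its configured interfaces whose link is up."""
    return {addr: [i for i in ifaces if link_up.get(i, False)]
            for addr, ifaces in table.items()}


PUBLIC_ADDRESSES = """192.168.51.161/22 bond1,bond2
192.168.51.162/22 bond1,bond2
192.168.51.163/22 bond1,bond2
192.168.51.171/22 bond2,bond1
192.168.51.172/22 bond2,bond1
192.168.51.173/22 bond2,bond1
"""

# Link states as reported by "ctdb ifaces" on node 0 in this thread.
link_up = {"bond1": True, "bond2": False}

table = parse_public_addresses(PUBLIC_ADDRESSES)
candidates = local_candidates(table, link_up)
for addr, ifaces in candidates.items():
    status = "can stay local on " + ",".join(ifaces) if ifaces else "must leave the node"
    print(f"{addr}: {status}")
```

Under this simple model, every address in that file still has bond1 available when bond2 is down, so none of them would need to leave the node.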

Cheers - Michael

On 2014-03-26 at 15:49 +0000, Kevin Osborn wrote:
> Hi,
> 
> We are thinking of extending CTDB to support multiple public NIC interfaces per node and we would like to ask your advice on the correct approach. I will describe what necessitates this feature as well as the areas we expect to have to modify. We would love some advice from the experts, especially if our approach is leading us off into dangerous territory. Thanks in advance for any help you can offer.
> 
> Why do we need multiple NIC interfaces per node? 
> We are working on passing the VMware iSCSI certification test suite and there are some tests that expect at least two physical routes to the same iSCSI target. Each iSCSI target must be hosted by just one node in our architecture. So we will need to add another Ethernet interface to the node to facilitate the VMware failover scheme and pass the certification test.
> 
> What failover behavior would we expect?
> Each node would host two Ethernet interfaces and two public IP addresses allocated by CTDB. Failure of a single interface would result in the failed IP address being moved to the other interface on the same node. Failure of the entire node would move both IP addresses to some other node, but both would be served up by the same node. The two IP addresses would always be hosted by a single node. 
> 
> How would we have to change CTDB?
> We see the following areas that will need to change to support this new feature.
> Failover:
> 1. Introduce a new tunable that would activate this mode, say MULTI_INTERFACE_PER_NODE. This tunable would activate several new code paths including a new failover mechanism and introduce a new IP allocation scheme.
> 2. We would not fail an entire node when a single interface goes down. This means that the failover logic needs to be changed from node based to interface based. (This looks like it could get complicated!)
> 3. Add a new ip_alloc_multi_interface() to the ctdb_takeover_run_core() function.
> 4. We would add a new configuration file that would list the valid ip address tuples that any node can host. This file would be saved on the cluster file system.
> 
> I have made this description as brief as possible to sketch out our intentions. We are open to other approaches too. Please feel free to ask detailed questions by contacting me directly.
> 
> Thanks again for your help,
> 
> -Kevin
> 

