[Samba] Network oscillations on the leader node result in split brain

tu.qiuping tu.qiuping at qq.com
Mon Jun 12 16:46:07 UTC 2023


My ctdb version is 4.17.7


Hello, everyone.
My CTDB cluster configuration is correct, and the cluster was healthy before the test.
The cluster has three nodes: 192.168.40.131 (node 0), 192.168.40.132 (node 1), and 192.168.40.133 (node 2). Node 2 (192.168.40.133) was the leader.


I ran a network oscillation test on node 192.168.40.133. After a while, that node failed to update the cluster lock, and node 0 took the lock.
Surprisingly, after node 0 acquired the lock, it sent a message with leader=0 to node 1 but not to node 2. Shortly afterwards, node 0 and node 1 received a broadcast with leader=2. At that point node 0 did not release the lock, yet it believed that it was not the leader, even though the network of node 2 was healthy at this time.
After I restored the network of node 2, node 1 and node 2 kept trying to acquire the lock and reported the error: Unable to take cluster lock - contention.



The logs of the three nodes are attached.


ctdb status on node 0:
[root@host-192-168-40-131 ~]# ctdb status
Number of nodes:3
pnn:0 192.136.40.131   OK (THIS NODE)
pnn:1 192.136.40.132   OK
pnn:2 192.136.40.133   UNHEALTHY
Generation:629720908
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:RECOVERY (1)
Leader:UNKNOWN



ctdb status on node 1:
[root@host-192-168-40-132 tecs]# ctdb status
Number of nodes:3
pnn:0 192.136.40.131   OK
pnn:1 192.136.40.132   OK (THIS NODE)
pnn:2 192.136.40.133   UNHEALTHY
Generation:629720908
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:RECOVERY (1)
Leader:UNKNOWN



ctdb status on node 2:
[root@host-192-168-40-133 tecs]# ctdb status
Number of nodes:3
pnn:0 192.136.40.131   UNHEALTHY
pnn:1 192.136.40.132   UNHEALTHY
pnn:2 192.136.40.133   OK (THIS NODE)
Generation:1185443889
Size:1
hash:0 lmaster:2
Recovery mode:RECOVERY (1)
Leader:UNKNOWN
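
To make the disagreement between the three views easier to see, here is a minimal Python sketch that parses the status text pasted above and groups the nodes by generation. It is only an illustration: the helper names (parse_ctdb_status, group_by_generation) are hypothetical and not part of CTDB, and the parser assumes the plain-text layout shown in this mail rather than any documented output format.

```python
import re

# Status text copied from the three outputs above (shell prompt lines omitted).
STATUS_NODE0 = """Number of nodes:3
pnn:0 192.136.40.131   OK (THIS NODE)
pnn:1 192.136.40.132   OK
pnn:2 192.136.40.133   UNHEALTHY
Generation:629720908
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:RECOVERY (1)
Leader:UNKNOWN
"""

STATUS_NODE1 = """Number of nodes:3
pnn:0 192.136.40.131   OK
pnn:1 192.136.40.132   OK (THIS NODE)
pnn:2 192.136.40.133   UNHEALTHY
Generation:629720908
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:RECOVERY (1)
Leader:UNKNOWN
"""

STATUS_NODE2 = """Number of nodes:3
pnn:0 192.136.40.131   UNHEALTHY
pnn:1 192.136.40.132   UNHEALTHY
pnn:2 192.136.40.133   OK (THIS NODE)
Generation:1185443889
Size:1
hash:0 lmaster:2
Recovery mode:RECOVERY (1)
Leader:UNKNOWN
"""


def parse_ctdb_status(text):
    """Parse the status text shown above into a dict (hypothetical helper)."""
    info = {"nodes": {}, "generation": None, "size": None,
            "recovery_mode": None, "leader": None}
    for raw in text.splitlines():
        line = raw.strip()
        node = re.match(r"pnn:(\d+)\s+(\S+)\s+(.*)$", line)
        if node:
            # Keep only the health flags; drop the "(THIS NODE)" marker.
            flags = node.group(3).replace("(THIS NODE)", "").strip()
            info["nodes"][int(node.group(1))] = flags
        elif line.startswith("Generation:"):
            info["generation"] = int(line.split(":", 1)[1])
        elif line.startswith("Size:"):
            info["size"] = int(line.split(":", 1)[1])
        elif line.startswith("Recovery mode:"):
            info["recovery_mode"] = line.split(":", 1)[1].strip()
        elif line.startswith("Leader:"):
            info["leader"] = line.split(":", 1)[1].strip()
    return info


def group_by_generation(statuses):
    """Map each generation to the sorted list of nodes reporting it."""
    groups = {}
    for pnn, info in statuses.items():
        groups.setdefault(info["generation"], []).append(pnn)
    return {gen: sorted(pnns) for gen, pnns in groups.items()}


statuses = {
    0: parse_ctdb_status(STATUS_NODE0),
    1: parse_ctdb_status(STATUS_NODE1),
    2: parse_ctdb_status(STATUS_NODE2),
}
groups = group_by_generation(statuses)

for generation, pnns in groups.items():
    print(f"generation {generation}: reported by nodes {pnns}")
for pnn, info in statuses.items():
    print(f"node {pnn}: leader={info['leader']} mode={info['recovery_mode']}")
```

With the outputs above, nodes 0 and 1 share generation 629720908 while node 2 alone is on generation 1185443889, and all three nodes report Leader:UNKNOWN while in recovery mode.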

