[Samba] ctdb: Strange behaviour after upgrade

eisofen at eisofen.de eisofen at eisofen.de
Thu Nov 18 08:14:06 MST 2010


Hi,

Last weekend I updated samba and ctdb on my 2-node cluster. Samba is
now on 3.5.6 (from 3.3.4), ctdb on 1.0.114 (from 1.0.84). Both were
installed from the repo via yum and the ctdb packages.

After restarting both nodes everything was fine, we could access files on
the cluster.

On Monday I noticed that the nodes no longer had their initial addresses:

Node 1:
hostname dscln01, public IP 10.0.0.41/8, now 10.0.0.42/8
/etc/sysconfig/network-scripts/ifcfg-bond0:

DEVICE=bond0
BOOTPROTO=none
IPADDR=10.0.0.41
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
NETMASK=255.0.0.0
ONBOOT=yes
USERCTL=no



Node 2:
hostname dscln02, public IP 10.0.0.42/8, now 10.0.0.41/8
/etc/sysconfig/network-scripts/ifcfg-bond0:

DEVICE=bond0
BOOTPROTO=none
IPADDR=10.0.0.42
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
NETMASK=255.0.0.0
ONBOOT=yes
USERCTL=no

Yesterday it fell over, so we had to reboot both nodes, and the IPs were
still mixed up.

log.ctdb had some interesting entries after the reboot:

2010/11/17 09:48:02.613807 [ 4383]: killed 30 TCP connections to released IP 10.0.0.42
2010/11/17 09:48:02.633263 [ 4383]: re-adding secondary address 10.0.0.41/8 to dev bond0
2010/11/17 09:48:02.646140 [ 4383]: /etc/ctdb/interface_modify.sh: line 71: /etc/ctdb/state/interface_modify/bond0.readd.d/10.0.0.41.8/*: No such file or directory
2010/11/17 09:48:02.646446 [ 4383]: /etc/ctdb/state/interface_modify/bond0.readd.d/10.0.0.41.8/* 'bond0' '10.0.0.41' '8' - failed - 127
2010/11/17 09:48:02.646514 [ 4383]: call /etc/ctdb/state/interface_modify/bond0.readd.d/10.0.0.41.8/* 'bond0' '10.0.0.41' '8'
2010/11/17 09:48:02.647412 [ 4383]: Failed to del 10.0.0.42 on dev bond0
2010/11/17 09:48:02.649354 [ 4383]: server/ctdb_daemon.c:688 waitpid() returned error. errno:10
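
On the "failed - 127" entry above: in a POSIX shell, 127 is the exit
status for a command that could not be found, and the path in the message
still ends in a literal "*". That is consistent with a glob that matched
nothing (an empty or missing directory) being executed as if it were a
file name. The following is only a minimal sketch of that effect, not the
actual contents of interface_modify.sh; the temporary directory and the
guarded loop are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for the state directory named in the log;
# it is deliberately never created.
base=$(mktemp -d)
state_dir="$base/bond0.readd.d/10.0.0.41.8"

# With no match, the glob is left as a literal "*", so the shell
# tries to execute a path that does not exist.
status=0
"$state_dir"/* bond0 10.0.0.41 8 2>/dev/null || status=$?
echo "unguarded call exit status: $status"

# Guarded variant: skip anything that is not an executable file,
# which includes the unexpanded pattern itself.
ran=0
for hook in "$state_dir"/*; do
    [ -x "$hook" ] || continue
    "$hook" bond0 10.0.0.41 8
    ran=$((ran + 1))
done
echo "hooks run by guarded loop: $ran"

rm -rf "$base"
```

If this reading applies, the 127 would point at an empty or missing
readd.d directory rather than at a script that ran and failed.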

I have also noticed, or rather users report, slow performance when
shutting down their PCs. At closing time the load climbs to ~70 on both
nodes, with high CPU load on ctdbd and mmfsd. Granted, that is 220 PCs
writing back their profiles.

Could ctdb be the blocking element when writing to its persistent DB,
since the local disks are not particularly fast?

Both nodes are attached to an Infortrend SAN via FC-AL. The filesystem
is GPFS, running on CentOS 5.3.
Did I do something wrong before or after upgrading?


Matthias


