Setting up CTDB on OCFS2 and VMs ...
Rowland Penny
repenny241155 at gmail.com
Thu Jan 1 11:51:21 MST 2015
On 01/01/15 18:30, Michael Adam wrote:
> Oops, I replied before realizing that Martin Schwenke
> and Stefan Kania had already replied to your mail,
> and that Martin has nailed your issues to the
> corresponding debian backports packaging issues
> already. So most of my mail can be ignored. :-)
>
> Still: You need to get your clusterfs setup right
> so that you can use a reclock....
>
> Michael
>
> On 2015-01-01 at 19:25 +0100, Michael Adam wrote:
>> Happy new year!
>>
>> On 2014-12-31 at 18:40 +0000, Rowland Penny wrote:
>>> On 31/12/14 17:59, Michael Adam wrote:
>>>> On 2014-12-31 at 15:46 +0000, Rowland Penny wrote:
>>>>> OK, I have been having another attempt at the ctdb cluster, I cannot get
>>>>> both nodes healthy if I use a lockfile in /etc/default/ctdb,
>>>> This can't be expected to work, since a recovery lock file needs
>>>> to be on shared storage (clustered file system, providing posix
>>>> fcntl byte range lock semantics), and /etc/default/ is not
>>>> generally such a place, unless you are building a fake cluster
>>>> with multiple ctdb instances running on one host.
>>> The lockfile *is* on the shared area i.e. I am sharing /cluster and that is
>>> what I have in /etc/default/ctdb:
>>>
>>> CTDB_RECOVERY_LOCK=/cluster/lockfile
>> Oh, ok... I misread your words above.
>>
>>>>> so I have commented it out, both nodes are now showing OK.
>>>> It is possible to run w/o recovery lock, but it is not
>>>> recommended in a production setup at least.
>>> I am aware of this, but it seems to be the only way of getting ctdb to
>>> start.
>> Which probably means that your clustered fs setup is still
>> not correct. Or there is still a flaw in your ctdb setup.
>> Could you (re-)post your /etc/default/ctdb and /etc/ctdb/nodes
>> and also your network config (ip a l) on the nodes?
>>
>> It is really important to get this right before starting to
>> seriously play with clustered samba on top.
>>
>>>>> Why have very similar data in 3 places ? why have the conf (which
>>>>> incidentaly isn't called a conf file) in a different place from the other
>>>>> ctdb files in /etc ?
>>>> That's essentially two places, one hierarchy under /var/ctdb
>>>> (old ctdb versions) and one hierarchy under /var/lib/ctdb (new
>>>> ctb versions) so my guess is that this stems from earlier
>>>> installs of older versions.
>>>>
>>>> If you stop ctdb, remove both these directory trees, and then
>>>> restart ctdb, do both trees reappear?
>>> No idea, I have only installed ctdb *once*, there is no earlier version.
>> Ok. Does that mean that you performed the above steps and
>> both directory trees reappeared?
>>
>>>>> More to the point, Why, oh why doesn't it work.
>>>> Has the samba version been compiled against the used ctdb
>>>> version. One possible source of such problems is that
>>>> samba might have beenm compiled against an older version
>>>> of ctdb and then you install the latest version of ctdb.
>>> Again, no idea, I am using the samba4 & ctdb packages from backports,
>>> versions 4.1.9 & 2.5.3
>>>> The problem that could explain the "Could not initialize ..."
>>>> message would be that samba tries to access CTDB under the
>>>> socket file /tmp/ctdbd.socket (default in old ctdb versions)
>>>> and the new ctdbd uses /var/run/ctdb/ctdbd.socket by default.
>>> Now that is interesting, because if I do not put a line in smb.conf saying
>>> where ctdbd.socket is, it tries to use /tmp.
>> That confirms my theory. And it means that
>> the samba and ctdv packages from backports simply don't match.
>> Samba has apparently been compiled against an oder version
>> of ctdb that still used /tmp.
>>
>>> With the line in smb.conf, it
>>> just errors with: connect(/var/lib/ctdb/ctdb.socket) failed: No such file or
>>> directory
>> Er strange, and you have entered "/var/run/ctdb/..." into
>> smb.conf and not by accient "/var/lib/ctdb/... ?
>>
>>>> So you could (without needing to recompile) test if things
>>>> work out more nicely if you set:
>>>>
>>>> "ctdbd socket = /var/run/ctdb/ctdbd.socket"
>>>> in smb.conf
>>> No, but finding out where the socket is and altering the line to: ctdbd
>>> socket = /var/lib/run/ctdb/ctdbd.socket
>>>
>>> and running: net ads join -U Administrator at EXAMPLE.COM -d5
>>>
>>> Gets me (after a lot of output)
>>>
>>> Using short domain name -- EXAMPLE
>>> Joined 'SMBCLUSTER' to dns domain 'example.com'
>>> Not doing automatic DNS update in a clustered setup.
>>> return code = 0
>>>
>>> Good Grief!!!! It actually seems to have worked =-O
>> Yay!
>>
>> That path is strange.
>> Are there some symlinks involved or maybe missing?
>> Or does the debian ctdb package have an altered
>> path for the socket? That may well be.
>>
>> Which debian version are you using? (I could inspect it locally.)
>>
>>> Now to try altering the conf file to get it start smbd, nmbd and winbind.
>>>
>>>> and (for the sake of explicitness):
>>>> "CTDB_SOCKET=/var/run/ctdb/ctdbd.socket"
>>>> in /etc/default/ctdb.
>>> I have tried similar lines in /etc/default/ctdb, but whatever I tried, it
>>> just wouldn't let ctdb start.
>> Er, that should not be. Could you post the exact
>> /etc/default/ctdb file used and the error messages
>> that ctdbd prints?
>>
>> Cheers - Michael
>
OK, I understand that I don't really know how to setup a cluster and in
a lot of ways I don't know how they are supposed to work, but this is
how I have setup my test cluster, it is based on the instructions that
Richard Sharp posted, but installed on Debian instead of Centos.
Could someone look at it and tell me where I am going wrong :-)
Can anybody confirm that the ctdb package in Debian backports isn't
built against the samba package available from backports.
1. Create two VirtualBox VMs with enough memory and disk for your Linux
Distro. I used Debian 7.7 with 512MB and 8GB. You will also need an
extra interface on each VM for the clustering private network. I set
them to an internal type.
2. Because you will need a shared disk, create one:
vboxmanage createhd --filename ~/VirtualBox\ VMs/SharedHD1 --size 10240
--variant Fixed --format VDI # Creates a 10GB fixed sized disk
vboxmanage modifyhd ~/VirtualBox\ VMs/SharedHD1.vdi --type shareable
Also, use the GUI to add the shared disk to both VMs.
3. Install the OS on each of the VM's.
4. Install clustering packages:
apt-get install openais corosync pacemaker ocfs2-tools-pacemaker dlm-pcmk
5. Configure corosync
nano /etc/corosync/corosync.conf
Make sure that bindnetaddr is defined and points to your private
interface. I set it to 192.168.1.0
# Copy the file to the other node.
scp /etc/corosync/corosync.conf root at 192.168.0.7:/etc/corosync/
# ON BOTH NODES
service pacemaker stop # Stop these in case they were running
service corosync stop # Same here
nano /etc/default/corosync
Change:
START=no
To:
START=yes
# now start the cluster
service corosync start
# Also start it on the other node(s).
# Now check the status:
root at cluster1:~# crm_mon -1
============
Last updated: Mon Dec 15 10:46:20 2014
Last change: Mon Dec 15 10:44:18 2014 via crmd on cluster1
Stack: openais
Current DC: cluster1 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ cluster1 cluster2 ]
If you do not see all the other nodes online, then you have to debug the
problem.
6. Configure the Oracle cluster
dpkg-reconfigure ocfs2-tools
Configuring ocfs2-tools
# Would you like to start an OCFS2 cluster (O2CB) at boot time?
#
# <Yes>
#
# Name of the cluster to start at boot time:
#
# ctdbdemo
# Create the ocfs2 cluster conf file
o2cb_ctl -C -n ctdbdemo -t cluster
o2cb_ctl -C -n cluster1 -t node -a number=1 -a ip_address=192.168.1.10
-a ip_port=7777 -a cluster=ctdbdemo
o2cb_ctl -C -n cluster2 -t node -a number=2 -a ip_address=192.168.1.11
-a ip_port=7777 -a cluster=ctdbdemo
# ON BOTH NODES
service corosync stop
# Copy files to the other node.
scp /etc/default/o2cb root at 192.168.0.7:/etc/default/
scp /etc/ocfs2/cluster.conf root at 192.168.0.7:/etc/ocfs2/
service o2cb start
service corosync start
7. Create the shared shared file system on one node:
mkfs.ocfs2 -L CTDBdemocommon -T datafiles -N 4 /dev/sdb
8. Mount it on both and ensure that you can create files/dirs on one
node and see them on the other node.
mkdir /cluster
mount -t ocfs2 /dev/sdb /cluster
umount /cluster
Create a bash script
nano config.sh
#!/bin/bash
crm configure<<EOF
primitive p_dlm_controld ocf:pacemaker:controld \
op start interval="0" timeout="90" \
op stop interval="0" timeout="100" \
op monitor interval="10"
primitive p_fs_ocfs2 ocf:heartbeat:Filesystem \
params device="/dev/sdb" \
directory="/cluster" \
fstype="ocfs2" \
meta target-role=Started \
op monitor interval="10"
group g_ocfs2 p_dlm_controld p_fs_ocfs2
clone cl_ocfs2 g_ocfs2 \
meta interleave="true"
EOF
exit 0
bash ./config.sh
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore
9. Install ctdb
apt-get -t wheezy-backports install ctdb
10. Configure ctdb
nano /etc/default/ctdb
# Options to ctdbd, read by ctdbd_wrapper(1)
#
# See ctdbd.conf(5) for more information about CTDB configuration variables.
# Shared recovery lock file to avoid split brain. No default.
#
# Do NOT run CTDB without a recovery lock file unless you know exactly
# what you are doing.
#CTDB_RECOVERY_LOCK=/some/place/on/shared/storage
#CTDB_RECOVERY_LOCK=/cluster/lockfile
# List of nodes in the cluster. Default is below.
CTDB_NODES=/etc/ctdb/nodes
# List of public addresses for providing NAS services. No default.
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
# What services should CTDB manage? Default is none.
#CTDB_MANAGES_SAMBA=yes
#CTDB_MANAGES_WINBIND=yes
# CTDB_MANAGES_NFS=yes
# Raise the file descriptor limit for CTDB?
# ulimit -n 10000
# Default is to use the log file below instead of syslog.
CTDB_LOGFILE=/var/log/ctdb/log.ctdb
CTDB_SYSLOG=no
# Default log level is ERR. NOTICE is a little more verbose.
CTDB_DEBUGLEVEL=NOTICE
# Set some CTDB tunable variables during CTDB startup?
CTDB_VARDIR=/var/lib/ctdb
# CTDB_SET_TraverseTimeout=60
export CTDB_SOCKET
nano /etc/ctdb/nodes
192.168.1.10
192.168.1.11
nano /etc/ctdb/public_addresses # NOTE: These addresses *SHOULD NOT* exist
192.168.0.8/8 eth0
192.168.0.9/8 eth0
# Copy files to the other node.
scp /etc/default/ctdb root at 192.168.0.7:/etc/default/
scp /etc/ctdb/nodes root at 192.168.0.7:/etc/ctdb/
scp /etc/ctdb/public_addresses root at 192.168.0.7:/etc/ctdb/
# Create a missing directory (on both nodes)
mkdif -p /var/lib/run/ctdb
11. Start ctdb on all nodes
# You must have ctdb started so that the secrets file will get distributed
service ctdb start
Check status:
root at cluster1:~# ctdb status
Number of nodes:3 (including 1 deleted nodes)
pnn:1 192.168.1.10 OK (THIS NODE)
pnn:2 192.168.1.11 OK
Generation:1073761636
Size:2
hash:0 lmaster:1
hash:1 lmaster:2
Recovery mode:NORMAL (0)
Recovery master:1
#12. Turn on the lockfile
#nano /etc/default/ctdb
#set the lockfile:
#CTDB_RECOVERY_LOCK=/cluster/lockfile
#restart ctdb on all nodes.
#service ctdb restart
#Wait a short while and then check the status again.
13. Install samba
apt-get -t wheezy-backports install samba attr krb5-config krb5-user ntp
dnsutils winbind libpam-winbind libpam-krb5 libnss-winbind libsmbclient
smbclient
service smbd stop
service nmbd stop
service winbind stop
update-rc.d -f smbd remove
update-rc.d -f nmbd remove
update-rc.d -f winbind remove
14. Configure samba for the domain you want to join
[global]
workgroup = EXAMPLE
netbios name = SMBCLUSTER
security = ADS
realm = EXAMPLE.COM
dedicated keytab file = /etc/krb5.keytab
kerberos method = secrets and keytab
server string = Samba 4 Client %h
winbind enum users = yes
winbind enum groups = yes
winbind use default domain = yes
winbind expand groups = 4
winbind nss info = rfc2307
winbind refresh tickets = Yes
winbind normalize names = Yes
idmap config * : backend = tdb
idmap config * : range = 2000-9999
idmap config EXAMPLE : backend = ad
idmap config EXAMPLE : range = 10000-999999
idmap config EXAMPLE : schema_mode = rfc2307
clustering = Yes
ctdbd socket = /var/lib/run/ctdb/ctdbd.socket
printcap name = cups
cups options = raw
usershare allow guests = yes
domain master = no
local master = no
preferred master = no
os level = 20
map to guest = bad user
username map = /etc/samba/smbmap
vfs objects = acl_xattr
map acl inherit = Yes
store dos attributes = Yes
log level = 6
wins server = 192.168.0.2
[users]
comment = Home Directories
path = /cluster/users
browseable = no
read only = No
[profiles]
path = /cluster/profiles
read only = No
[testdir]
path = /cluster/testdir
read only = no
15. join the domain
join the domain from node 1 only:
net ads join -UAdministrator
16. Enable samba & winbind in the ctdb config
CTDB_MANAGES_SAMBA=yes
CTDB_MANAGES_WINBIND=yes
17. Restart ctdb on all nodes
These are the hosts & interfaces files from the two cluster machines.
cluster1)
/etc/hosts
127.0.0.1 localhost
192.168.0.6 cluster1.example.com cluster1
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
/etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.0.6
netmask 255.255.255.0
network 192.168.0.0
broadcast 192.168.0.255
gateway 192.168.0.1
auto eth1
iface eth1 inet static
address 192.168.1.10
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
cluster2)
/etc/hosts
127.0.0.1 localhost
192.168.0.7 cluster2.example.com cluster2
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
/etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
# The loopback network interface
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.0.7
netmask 255.255.255.0
network 192.168.0.0
broadcast 192.168.0.255
gateway 192.168.0.1
auto eth1
iface eth1 inet static
address 192.168.1.11
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
Rowland
More information about the samba-technical
mailing list