Setting up CTDB on OCFS2 and VMs ...

Wed Dec 31 11:40:35 MST 2014

On 31/12/14 17:59, Michael Adam wrote:
> On 2014-12-31 at 15:46 +0000, Rowland Penny wrote:
>> OK, I have had another attempt at the ctdb cluster. I cannot get
>> both nodes healthy if I use a lockfile in /etc/default/ctdb,
> This can't be expected to work, since a recovery lock file needs
> to be on shared storage (clustered file system, providing posix
> fcntl byte range lock semantics), and /etc/default/ is not
> generally such a place, unless you are building a fake cluster
> with multiple ctdb instances running on one host.

The lockfile *is* on the shared area, i.e. I am sharing /cluster, and this
is what I have in /etc/default/ctdb:

CTDB_RECOVERY_LOCK=/cluster/lockfile
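A rough way to check whether a mount offers the POSIX fcntl byte-range locking mentioned above is to try to take such a lock on a scratch file. The following is only a minimal sketch: the function name, the script name and the scratch file path are hypothetical, and passing this check does not by itself prove that the recovery lock will work.

```python
import fcntl
import os
import sys
import time


def can_take_byte_range_lock(path, hold_seconds=0):
    """Return True if a non-blocking exclusive fcntl lock on byte 0 succeeds.

    The lock is held for hold_seconds so that a second node can be tested
    while the first one still owns it.
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    except OSError:
        return False
    try:
        # lockf() uses fcntl-style locks: lock 1 byte starting at offset 0.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
    except OSError:
        # Another process (possibly on another node) holds the lock,
        # or the file system does not support this kind of lock.
        return False
    else:
        time.sleep(hold_seconds)
        return True
    finally:
        os.close(fd)  # closing the descriptor also releases the lock


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): locktest.py <scratch file> [hold seconds]
    hold = float(sys.argv[2]) if len(sys.argv) > 2 else 0
    ok = can_take_byte_range_lock(sys.argv[1], hold)
    print(sys.argv[1], "lockable" if ok else "NOT lockable")
```

Assuming the locks are coherent across the cluster, running this with a hold of, say, 30 seconds on one node and then immediately on the other node against the same scratch file under /cluster should report "NOT lockable" on the second node. Use a scratch file for this, not the real lockfile of a running ctdb.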

>
>> so I have commented it out, both nodes are now showing OK.
> It is possible to run w/o recovery lock, but it is not
> recommended in a production setup at least.

I am aware of this, but it seems to be the only way of getting ctdb to 
start.

>
>
>> I then moved on to trying
>> to get samba to join the domain, but it always fails with this error
>> message:
>>
>> Could not initialise message context. Try running as root
>> Failed to join domain: Access is denied
>>
>> I have investigated ctdb on my system and have come to the conclusion that
>> ctdb is a *MESS*, don't believe me ? then consider this:
>>
>> root@cluster1:~# ls /var/ctdb
>> iptables-ctdb.flock  persistent  state
>> root@cluster1:~# ls /var/lib/ctdb
>> iptables-ctdb.flock  persistent  state
>> root@cluster1:~# ls /var/lib/lib/ctdb
>> brlock.tdb.1           notify_index.tdb.1         smbXsrv_session_global.tdb.1
>> dbwrap_watchers.tdb.1  persistent                 smbXsrv_tcon_global.tdb.1
>> g_lock.tdb.1           printer_list.tdb.1         smbXsrv_version_global.tdb.1
>> iptables-ctdb.flock    serverid.tdb.1             state
>> locking.tdb.1          smbXsrv_open_global.tdb.1
>> root@cluster1:~# ls /var/ctdb/persistent/
>> root@cluster1:~# ls /var/ctdb/state/
>> failcount  interface_modify_eth0.flock  service_state
>> root@cluster1:~# ls /var/lib/ctdb/persistent/
>> root@cluster1:~# ls /var/lib/ctdb/state/
>> failcount  interface_modify_eth0.flock  service_state
>> root@cluster1:~# ls /var/lib/lib/ctdb/persistent/
>> account_policy.tdb.1  ctdb.tdb.0  ctdb.tdb.1  group_mapping.tdb.1
>> passdb.tdb.1  registry.tdb.1  secrets.tdb.1  share_info.tdb.1
>> root@cluster1:~# ls /var/lib/lib/ctdb/state
>> failcount  interface_modify_eth0.flock  persistent_health.tdb.1
>> recdb.tdb.1  service_state
>>
>> Why have very similar data in 3 places? Why have the conf (which
>> incidentally isn't called a conf file) in a different place from the other
>> ctdb files in /etc?
> That's essentially two places, one hierarchy under /var/ctdb
> (old ctdb versions) and one hierarchy under /var/lib/ctdb (new
> ctdb versions), so my guess is that this stems from earlier
> installs of older versions.
>
> If you stop ctdb, remove both these directory trees, and then
> restart ctdb, do both trees reappear?

No idea, I have only installed ctdb *once*, there is no earlier version.
>
>> More to the point, why, oh why, doesn't it work?
> Has the samba version been compiled against the ctdb version
> in use? One possible source of such problems is that
> samba might have been compiled against an older version
> of ctdb and then you install the latest version of ctdb.

Again, no idea. I am using the samba4 and ctdb packages from backports,
versions 4.1.9 and 2.5.3 respectively.

>
> The problem that could explain the "Could not initialise ..."
> message would be that samba tries to access CTDB under the
> socket file /tmp/ctdbd.socket (default in old ctdb versions)
> and the new ctdbd uses /var/run/ctdb/ctdbd.socket by default.

Now that is interesting, because if I do not put a line in smb.conf 
saying where ctdbd.socket is, it tries to use /tmp. With the line in 
smb.conf, it just errors with: connect(/var/lib/ctdb/ctdb.socket) 
failed: No such file or directory

>
> So you could (without needing to recompile) test if things
> work out more nicely if you set:
>
> "ctdbd socket = /var/run/ctdb/ctdbd.socket"
> in smb.conf

No, but finding out where the socket actually is, altering the line to:

ctdbd socket = /var/lib/run/ctdb/ctdbd.socket

and running:

net ads join -U Administrator@EXAMPLE.COM -d5

gets me (after a lot of output):

Using short domain name -- EXAMPLE
Joined 'SMBCLUSTER' to dns domain 'example.com'
Not doing automatic DNS update in a clustered setup.
return code = 0

Good Grief!!!! It actually seems to have worked =-O
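The "finding out where the socket is" step can be sketched roughly as follows. This is only an illustration: the candidate list is simply the set of ctdbd.socket paths mentioned in this thread, the function name is hypothetical, and other installs may put the socket somewhere else entirely.

```python
import os
import stat

# Candidate paths taken from this thread; extend the list for other layouts.
CANDIDATES = [
    "/tmp/ctdbd.socket",
    "/var/run/ctdb/ctdbd.socket",
    "/var/lib/run/ctdb/ctdbd.socket",
]


def existing_sockets(paths):
    """Return the subset of paths that exist and are UNIX domain sockets."""
    found = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Missing file or unreadable directory: not a usable socket.
            continue
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found


if __name__ == "__main__":
    # Print each hit in the form expected by the smb.conf parameter.
    for path in existing_sockets(CANDIDATES):
        print("ctdbd socket =", path)
```

Run as root on a node where ctdbd is already up; whichever path is printed is the one to put into smb.conf.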

Now to try altering the conf file to get it to start smbd, nmbd and winbind.

> and (for the sake of explicitness):
> "CTDB_SOCKET=/var/run/ctdb/ctdbd.socket"
> in /etc/default/ctdb.

I have tried similar lines in /etc/default/ctdb, but whatever I tried, 
it just wouldn't let ctdb start.

Rowland

>
>
> And then of course restart everything.
> Let's see if this improves anything...
>
>
> Cheers - Michael
>
>