Setting up CTDB on OCFS2 and VMs ...
Rowland Penny
repenny241155 at gmail.com
Wed Dec 31 08:46:30 MST 2014
On 17/12/14 09:08, Rowland Penny wrote:
> On 16/12/14 23:45, Martin Schwenke wrote:
>> On Tue, 16 Dec 2014 21:12:12 +0000, Rowland Penny
>> <repenny241155 at gmail.com> wrote:
>>
>>> I ran the ping_pong test this morning, following the wiki page and as
>>> far as I could see it passed all tests.
>> When I run "ping_pong /clusterfs/test.dat 3" on 1 node of a 2 node OCFS2
>> cluster, I see a very high locking rate - in the 10000s. When I run it
>> on another node I see the same high locking rate and I don't see the
>> rate drop on the 1st node. That's a fail.
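The lock that ping_pong contends on is an ordinary POSIX byte-range lock taken with fcntl(); a minimal sketch (not ping_pong itself, and the file path is only an illustration) of taking and releasing that kind of lock:

```python
import fcntl
import os
import tempfile

def hold_byte_lock(path, offset=0):
    # ping_pong contends on an exclusive fcntl byte-range lock that every
    # node of the cluster filesystem must see; this takes and drops the
    # same kind of lock on one byte of a file.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # exclusive write lock on 1 byte at `offset`; blocks if another
        # process (or, on a coherent cluster fs, another node) holds it
        fcntl.lockf(fd, fcntl.LOCK_EX, 1, offset)
        # ... critical section: ping_pong bounces a counter here ...
        fcntl.lockf(fd, fcntl.LOCK_UN, 1, offset)
    finally:
        os.close(fd)

# usage: take and release the lock on a scratch file
path = os.path.join(tempfile.mkdtemp(), "ping_pong.dat")
hold_byte_lock(path)
```

ping_pong takes this lock in a tight loop; on a lock-coherent filesystem the per-node locking rate drops sharply when a second node joins, which is exactly the drop described above that does not happen on this OCFS2 setup.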
>
> All I can say is that it did what the page said it would.
>
>>
>> This is on a cluster where I haven't worked out the extra steps to get
>> lock coherence.
>>
>>> I have come to the conclusion that you need to be a CTDB dev to set
>>> CTDB
>>> up, only they seem to have ALL the information required.
>> Sorry, but that line is starting to grate. I'm concerned that
>> statements like this are likely to put people off using CTDB. There are
>> many non-CTDB-devs out there running CTDB with other cluster
>> filesystems.
>
> Sorry if what I said upset you, but I have put a lot of time into
> trying to get this setup to work, and it seems to fail as soon as I
> add CTDB.
>
>> When the CTDB recovery lock is configured then CTDB has a hard
>> requirement that the cluster filesystem *must* provide lock coherence.
>> So the problem you have is a lack of lock coherence in OCFS2.
>
> But it passes the ping_pong test.
>
>> I am a CTDB dev. I haven't yet got OCFS2 working, partly due to lack
>> of time to figure out which pieces I'm missing. I have a simple recipe
>> that gets me to a similar point to where you are at and I haven't even
>> looked at corosync. At some time I will try to go through Richard's
>> instructions and try to distill out the part that adds lock coherence.
>>
>> I was confused by the ping pong test results so I tried to clarify the
>> documentation for that test.
>>
>> It seems like OCFS2 is stupendously difficult to set up with lock
>> coherence. This is not CTDB's fault. Perhaps you need to be an OCFS2
>> dev to set up CTDB with OCFS2? ;-)
>
> You could be right :-D
>>> I absolutely give up; I cannot make it work, god knows I have tried,
>>> but I just cannot make it work with the information available. I can
>>> find bits here and bits there, but there still seems to be something
>>> missing, or is it just me? Debian 7.7, Pacemaker, Corosync and OCFS2
>>> work OK; it is only when you try to add CTDB.
>> If all those other things provided lock coherence on the cluster
>> filesystem then CTDB would work. So adding CTDB makes you notice the
>> problem but CTDB does not cause it. :-)
>
> I can well believe what you are saying, so it might help if CTDB could
> print something in the logs.
>
> Rowland
>
>>
>> peace & happiness,
>> martin
>
OK, I have been having another attempt at the CTDB cluster. I cannot get
both nodes healthy if I use a lock file in /etc/default/ctdb, so I have
commented it out; both nodes now show OK. I then moved on to trying to
get Samba to join the domain, but it always fails with this error
message:
Could not initialise message context. Try running as root
Failed to join domain: Access is denied
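For reference, the lock file commented out above is the recovery lock set in /etc/default/ctdb; a minimal fragment, assuming the Debian packaging, with an illustrative path on the shared OCFS2 mount:

```shell
# /etc/default/ctdb (Debian) -- illustrative fragment
# The recovery lock must live on the cluster filesystem: CTDB holds a
# byte-range lock on it to elect a recovery master, so the filesystem
# must provide lock coherence or the nodes never all go healthy.
CTDB_RECOVERY_LOCK=/clusterfs/.ctdb/recovery.lock
```

With no lock coherence the nodes can each take the lock independently, which matches both nodes only showing healthy once the lock file is commented out.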
I have investigated ctdb on my system and have come to the conclusion
that ctdb is a *MESS*. Don't believe me? Then consider this:
root at cluster1:~# ls /var/ctdb
iptables-ctdb.flock persistent state
root at cluster1:~# ls /var/lib/ctdb
iptables-ctdb.flock persistent state
root at cluster1:~# ls /var/lib/lib/ctdb
brlock.tdb.1           iptables-ctdb.flock  persistent          smbXsrv_open_global.tdb.1     smbXsrv_version_global.tdb.1
dbwrap_watchers.tdb.1  locking.tdb.1        printer_list.tdb.1  smbXsrv_session_global.tdb.1  state
g_lock.tdb.1           notify_index.tdb.1   serverid.tdb.1      smbXsrv_tcon_global.tdb.1
root at cluster1:~# ls /var/ctdb/persistent/
root at cluster1:~# ls /var/ctdb/state/
failcount interface_modify_eth0.flock service_state
root at cluster1:~# ls /var/lib/ctdb/persistent/
root at cluster1:~# ls /var/lib/ctdb/state/
failcount interface_modify_eth0.flock service_state
root at cluster1:~# ls /var/lib/lib/ctdb/persistent/
account_policy.tdb.1 ctdb.tdb.0 ctdb.tdb.1 group_mapping.tdb.1
passdb.tdb.1 registry.tdb.1 secrets.tdb.1 share_info.tdb.1
root at cluster1:~# ls /var/lib/lib/ctdb/state
failcount interface_modify_eth0.flock persistent_health.tdb.1
recdb.tdb.1 service_state
Why have very similar data in three places? Why is the conf file (which,
incidentally, isn't called a conf file) in a different place from the
other ctdb files in /etc?
More to the point: why, oh why, doesn't it work?
Rowland