ctdb reclock + fcntl posix locks with ocfs2
Nicolas Ecarnot
nicolas at ecarnot.net
Thu Nov 28 04:19:49 MST 2013
[Short version]
What to tweak for ctdb reclock dir to handle correct fcntl locks on
ocfs2 fs?
[Less short version]
Hi,
We are using in production a 2 nodes cluster that is working fine :
Ubuntu server 12.10, cman, samba, ctdb.
The ctdb reclock are done on a dedicated GFS2 partition, and the data
storage (samba user shares) are done on a dedicated OCFS2 partition.
At the time being, I am setting up a similar cluster but with updated
versions, distribs and with the will to simplify some parts :
- 2 nodes, CentOS 6.4 64 bits, as two oVirt VM
- cman 3.0.12
- corosync 1.4.1
- samba 3.6.9
- ctdb 1.0.114.5-3
- ocfs2 1.8.0-10.el6
- ctdb lock LUN and ocfs2 user data both stored in a remote equalogic
iSCSI SAN
- UEK kernel : 2.6.39-400.211.1.el6uek.x86_64
I would like to get rid of two different clustered filesystem types
(gfs2, ocfs2), and keep only ocfs2, as my 3 To of data are already
stored this way.
Once everything is setup, I run ctdb on the first node, then on the
second, and I'm facing a problem I remember I already had to cope with 2
years ago on the production cluster :
ctdb_recovery_lock: Got recovery lock on '/ctdb/.ctdb.lock'
ERROR: recovery lock file /ctdb/.ctdb.lock not locked when recovering!
This was two years ago, but I remember I had to switch back to store
this lock mechanism on a GFS2 partition.
I took the time to search and test many things.
Ping_pong tests showed this :
* 1 *
ping_pong /ctdb/test.dat 3
showing ~ 2.1 M locks/s, and /NOT/ dropping to a different value when
ran on the second node. Symmetrical, though.
* 2 *
ping_pong -rw /ctdb/test.dat 3
showing ~ 330 k locks/s, and dropping to 190 locks/s when ran on node 2.
Symmetrical again.
* 3 *
ping_pong -rw -m /ctdb/test.dat 3
showing ~ 2.1 M locks/s, and dropping to 88 k locks/s when ran on node
2, and showing several data increment oscillating between 1 and
30,150,180...
In the test 1, according to the ctdb webpage advices, the value _NOT_
dropping seems to indicate an issue, so I tried to validate the correct
POSIX locking, following the advices found there :
http://serverfault.com/questions/531813/how-to-determine-posix-advisory-file-locks-are-working-in-simfs-in-the-vm-im-us
and the tests were successful.
I found that this samba-technical mailing list helped someone to solve a
similar problem when storing the ctdb lock on a gluster FS
(https://lists.samba.org/archive/samba-technical/2012-October/087515.html),
but on ocfs2, I have no mean to use the --direct-io-mode=enable mount
option). In the OCFS2 man page, I read this :
"datavolume
This mount option has been deprecated in OCFS2 1.6. It has been used in
the past (OCFS2 1.2 and OCFS2 1.4), to force the Oracle RDBMS to
issue direct IOs to the hosted data files, control files, redo logs,
archive logs, voting disk, cluster registry, etc. It has been deprecated
because it is no longer required. Oracle RDBMS users should instead
use the init.ora parameter, filesystemio_options, to enable direct IOs."
I also found this relevant bug :
https://bugzilla.samba.org/show_bug.cgi?id=6777
but the version of filesystem and kernel I'm using should be more than
capable of handling all this. This is confirmed on Oracle's ocfs2 1.6
release notes.
My cluster.conf contains the following line :
<dlm plock_ownership="1" plock_rate_limit="0"/>
because this seemed correct to me according to what I understood so far.
I'd be very glad to hear your advices on what I could change next, or
what to check.
On my side, I will now test the setup I wanted to avoid : store the ctdb
lock on a GFS2 partition, and see if it works as on our old production
cluster.
Thank you.
--
Nicolas Ecarnot
More information about the samba-technical
mailing list