[Samba] ctdb_recovery_lock: Failed to get recovery lock
Nicolas Ecarnot
nicolas at ecarnot.net
Tue Mar 27 03:14:16 MDT 2012
Hi,
I'm happily progressing toward a successful setup of my two-node
Samba cluster: cman, qdisk, clvm, gfs2, ctdb, samba, winbind, AD.
I'm now in the testing phase.
When my cluster is up and running, I can move each IP address to
one node or the other, seamlessly.
They can fence each other.
But I still have one big issue: though they have been set up as clones,
they don't behave identically. When I shut down node 1, node 0 takes
over every part of the ctdb setup (IP, recmaster, services).
But when I stop the ctdb daemon on node 1, though ctdb node 0 correctly
stops its child daemons (nmbd, smbd and winbind) and kills itself,
node 1 reports:
ctdb_recovery_lock: Failed to get recovery lock on '/ctdb/.ctdb.lock'
(This directory is clvm + gfs2 shared, writable and correctly accessible
from both nodes)
This leads to node 1 getting banned.
Then (I guess), once it is unbanned, a re-election occurs, but I get:
Recmaster node 1 no longer available. Force reelection
I suppose that node 1 can't become recmaster because it cannot get the
recovery lock. But I don't see why this node claims it cannot take
this lock.
I don't know if this helps, but:
- I removed the lock file, and restarting ctdb recreates it correctly
- Every process is run as root, which can obviously write to this directory
- I don't know if this is correct, but the file is zero bytes in size
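
To check whether a node can take a lock on that file at all, independently
of ctdb, something like the following could be run on each node. It is only
a sketch: it assumes the recovery lock is an fcntl()-style exclusive lock on
the first byte of the file, and the helper name try_lock is made up for
illustration, not something ctdb provides.

```python
import errno
import fcntl
import os


def try_lock(path):
    """Try a non-blocking exclusive fcntl lock on the first byte of path.

    Returns the open file descriptor if the lock was obtained (keep it open
    to keep holding the lock), or None if another process, possibly on the
    other node, already holds a conflicting lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Exclusive, non-blocking POSIX byte-range lock: 1 byte at offset 0.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held elsewhere
        raise  # any other error (e.g. ENOLCK) points at the filesystem
    return fd
```

Calling try_lock('/ctdb/.ctdb.lock') on one node while ctdb holds the lock
on the other should return None; with ctdb stopped on both nodes it should
return a descriptor. A byte-range lock does not need any data in the file,
so a zero-byte lock file is not a problem for this kind of locking.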
While waiting for your advice, I'm going to read the source code, in the
hope of understanding what's wrong.
--
Nicolas Ecarnot