CTDB - all nodes "unhealthy"

Andy D'Arcy Jewell andy.jewell at sysmicro.co.uk
Fri Mar 23 05:10:30 MDT 2012

On 23/03/12 10:16, ronnie sahlberg wrote:
> Ok so it doesnt even startup properly, so it fails in the startup
> event or the initial recovery.
> Shut down completely and delete the /var/log/log.ctdb files so we get
> a clean trace
> and restart ctdb on all nodes.
> Then post the log.ctdb file after it has been running for about 3 minutes

I'm such a n00b at this!

I did as you suggested - cleared down the logs and restarted ctdb. When 
I looked after a few minutes, I found "ctdb_control error: 'managed to 
lock reclock file from inside daemon'" in the ctdb log, which I hadn't 
seen before; when I looked that up, I found a list post that says (in part):

"Ok, you have a problem with the posix fcntl byte range lock support on your
file system"
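
For anyone following along, here is a minimal sketch of what "fcntl byte
range lock support" means in practice. It assumes the check behind that
error boils down to: one process holds an exclusive lock on the file, and
a second process must be refused the same lock. The function name
child_can_lock is made up for illustration and this is not ctdb's own
code. It only exercises locking between two processes on one node.

```python
import fcntl
import os


def child_can_lock(path):
    """Hold an exclusive fcntl lock on byte 0 of `path`, then report whether
    a forked child process manages to take the same lock.

    True means the lock was NOT enforced (the broken case).
    """
    with open(path, "r+b") as holder:
        # Parent takes a non-blocking exclusive lock on the first byte.
        fcntl.lockf(holder, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
        pid = os.fork()
        if pid == 0:
            # Child: open the file independently and try the same byte range.
            code = 1
            try:
                with open(path, "r+b") as contender:
                    fcntl.lockf(contender, fcntl.LOCK_EX | fcntl.LOCK_NB,
                                1, 0, os.SEEK_SET)
                code = 0  # lock granted: the file system did not enforce it
            except OSError:
                code = 1  # lock refused, which is the correct behaviour
            finally:
                os._exit(code)  # never fall back into the parent's code path
        _, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status) == 0
```

Called on an existing file in the directory that holds the reclock file,
a True result would point at the same kind of locking problem. Coherence
between nodes is a separate question; the ping_pong utility is the usual
tool for that case.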

GFS2 is *supposed* to be posix(ish) compliant, isn't it? So I checked my
GFS mounts... and found I didn't have any. :-( D'oh!
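
A rough sketch of that mount check, assuming a Linux /proc/mounts style
table (device, mount point, file system type, ...). The helper name
find_mount and the paths in the usage note are hypothetical. It reports
which mount a path really lives on, so a missing GFS2 mount shows up as
the path falling through to the parent file system.

```python
def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point that
    contains `path`, given text in /proc/mounts format, or None."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # "/" must match everything; "/mnt/a" must not match "/mnt/ab".
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later entry on the same mount point win.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best
```

For example, find_mount("/clusterfs/ctdb/lock", open("/proc/mounts").read())
should come back with a gfs2 entry; if it returns something like
("/", "ext4") instead, the cluster file system is not mounted. Mount
points containing spaces are escaped in /proc/mounts and are not handled
here.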

My GFS2 clvm volumes had become "unavailable", and I needed to lvchange 

So, guess what?

[root@ctdb-samba01 ~]# ctdb status
Number of nodes:4
pnn:0     OK (THIS NODE)
pnn:1     OK
pnn:2     OK
pnn:3     OK
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
hash:3 lmaster:3
Recovery mode:NORMAL (0)
Recovery master:0
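
For scripting the same check, a small sketch that picks out any node not
reporting OK. It assumes the "pnn:N  STATE" line layout exactly as printed
above (other ctdb versions may print extra columns), and the function name
unhealthy_nodes is made up for illustration.

```python
import re

# Matches e.g. "pnn:0     OK (THIS NODE)" or "pnn:2     UNHEALTHY".
_PNN_LINE = re.compile(r"^pnn:(\d+)\s+(.*?)(?:\s+\(THIS NODE\))?\s*$")


def unhealthy_nodes(status_text):
    """Return {pnn: state} for every node whose state is not exactly "OK"."""
    bad = {}
    for line in status_text.splitlines():
        match = _PNN_LINE.match(line.strip())
        if match and match.group(2) != "OK":
            bad[int(match.group(1))] = match.group(2)
    return bad
```

Fed the output above, this returns an empty dict; any node in another
state would appear with its state string.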


Edging slowly up the ctdb learning curve...

Andy D'Arcy Jewell

SysMicro Limited
Linux Support

More information about the samba-technical mailing list