CTDB - all nodes "unhealthy"
ronnie sahlberg
ronniesahlberg at gmail.com
Fri Mar 23 04:16:07 MDT 2012
OK, so it doesn't even start up properly: it fails either in the startup
event or in the initial recovery.
Shut down ctdb completely and delete the /var/log/log.ctdb files so we get
a clean trace, then restart ctdb on all nodes.
Then post the log.ctdb file after it has been running for about 3 minutes.
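In case it helps, the sequence could be sketched roughly like this. It is
only a sketch under assumptions: it assumes ctdb is controlled with
"service ctdb stop|start" (use whatever init script your distribution
ships), and the function names and the CTDB_LOG / CTDB_WAIT variables are
made up for illustration. Run the first function on every node before
running the second one anywhere, so that ctdb is really down everywhere.

```shell
#!/bin/sh
# Sketch only: clean restart of ctdb on ONE node; repeat on every node.
# Assumption: ctdb is managed via "service ctdb ..." on this system.
# The log path is the one mentioned in this mail; override it if yours differs.
CTDB_LOG="${CTDB_LOG:-/var/log/log.ctdb}"

# Phase 1: stop ctdb and remove the old log so the next trace starts empty.
ctdb_stop_and_clear() {
    service ctdb stop
    rm -f "$CTDB_LOG"
}

# Phase 2: start ctdb, wait (about 3 minutes by default), print the new log.
ctdb_start_and_collect() {
    service ctdb start
    sleep "${CTDB_WAIT:-180}"
    cat "$CTDB_LOG"
}
```

For example, after sourcing the file, run ctdb_stop_and_clear on each node,
and once all nodes are down run "ctdb_start_and_collect > log.ctdb.$(hostname)"
on each node to get a file you can post.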
On Fri, Mar 23, 2012 at 9:11 PM, Andy D'Arcy Jewell
<andy.jewell at sysmicro.co.uk> wrote:
> On 23/03/12 09:12, ronnie sahlberg wrote:
>>
>> ctdb scriptstatus
>>
>> will often tell the reason why the nodes are unhealthy
>>
>
> Here are the results of running this:
>
> [root@ctdb-samba01 ~]# ctdb scriptstatus
> monitor cycle never run
> [root@ctdb-samba01 ~]# ctdb scriptstatus all
> 14 scripts were executed last startup cycle
> 00.ctdb Status:OK Duration:0.024 Fri Mar 23 10:00:21 2012
> 01.reclock Status:OK Duration:0.017 Fri Mar 23 10:00:21 2012
> 10.interface Status:OK Duration:0.021 Fri Mar 23 10:00:21 2012
> 11.natgw Status:OK Duration:0.010 Fri Mar 23 10:00:21 2012
> 11.routing Status:OK Duration:0.010 Fri Mar 23 10:00:21 2012
> 20.multipathd Status:OK Duration:0.010 Fri Mar 23 10:00:21 2012
> 31.clamd Status:OK Duration:0.011 Fri Mar 23 10:00:21 2012
> 40.vsftpd Status:OK Duration:0.011 Fri Mar 23 10:00:21 2012
> 41.httpd Status:OK Duration:0.011 Fri Mar 23 10:00:21 2012
> 50.samba Status:OK Duration:0.834 Fri Mar 23 10:00:21 2012
> 60.nfs Status:OK Duration:0.011 Fri Mar 23 10:00:22 2012
> 61.nfstickle Status:OK Duration:0.013 Fri Mar 23 10:00:22 2012
> 70.iscsi Status:OK Duration:0.008 Fri Mar 23 10:00:22 2012
> 91.lvs Status:OK Duration:0.008 Fri Mar 23 10:00:22 2012
> 14 scripts were executed last startrecovery cycle
> 00.ctdb Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 01.reclock Status:OK Duration:0.010 Fri Mar 23 10:06:00 2012
> 10.interface Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 11.natgw Status:OK Duration:0.010 Fri Mar 23 10:06:00 2012
> 11.routing Status:OK Duration:0.010 Fri Mar 23 10:06:00 2012
> 20.multipathd Status:OK Duration:0.010 Fri Mar 23 10:06:00 2012
> 31.clamd Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 40.vsftpd Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 41.httpd Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 50.samba Status:OK Duration:0.011 Fri Mar 23 10:06:00 2012
> 60.nfs Status:OK Duration:0.010 Fri Mar 23 10:06:00 2012
> 61.nfstickle Status:OK Duration:0.012 Fri Mar 23 10:06:00 2012
> 70.iscsi Status:OK Duration:0.008 Fri Mar 23 10:06:00 2012
> 91.lvs Status:OK Duration:0.008 Fri Mar 23 10:06:00 2012
> recovered cycle never run
> takeip cycle never run
> releaseip cycle never run
> stopped cycle never run
> monitor cycle never run
> status cycle never run
> shutdown cycle never run
> reload cycle never run
>
> That's confusing, as it all looks ok (to my neophyte eye).
>
> Just for good measure, here's ctdb status:
> [root@ctdb-samba01 ~]# ctdb status
> Number of nodes:4
> pnn:0 172.16.6.180 UNHEALTHY (THIS NODE)
> pnn:1 172.16.6.181 UNHEALTHY
> pnn:2 172.16.6.182 UNHEALTHY
> pnn:3 172.16.6.183 UNHEALTHY
> Generation:974418464
> Size:4
> hash:0 lmaster:0
> hash:1 lmaster:1
> hash:2 lmaster:2
> hash:3 lmaster:3
> Recovery mode:RECOVERY (1)
> Recovery master:0
>
> It oscillates between "Recovery mode:RECOVERY (1)" and
> "Recovery mode:NORMAL (0)" every few seconds.
>
>
> --
> Andy D'Arcy Jewell
>
> SysMicro Limited
> Linux Support
>