CTDB asymmetric (non-)recovery

Nicolas Ecarnot nicolas at ecarnot.net
Wed Jun 6 15:02:57 MDT 2012


On 06/06/2012 22:32, Martin Schwenke wrote:
> On Wed, 06 Jun 2012 16:48:29 +0200, Nicolas Ecarnot
> <nicolas at ecarnot.net>  wrote:
>
>> Test 04
>> - both nodes down (ctdb stop)
>> - node 0 : ctdb start : OK
>> - node 1 : ctdb start : both OK
>> - node 0 : ctdb stop : OK
>> - node 0 : ctdb start : node 0 down, only node 1 OK !!!
>
>> [...]
>
>> What is my cluster trying to whisper to my deaf ears?
>
> What does "onnode all ctdb scriptstatus" say?
>
> peace & happiness,
> martin

Hi Martin,

During the bad part of the cycle, with node 0 (192.168.42.200) unable
to recover, here is the output you asked for:

root@node0:~# onnode all ctdb scriptstatus

 >> NODE: 192.168.42.200 <<
monitor cycle never run

 >> NODE: 192.168.42.201 <<
18 scripts were executed last monitor cycle
00.ctdb              Status:OK    Duration:0.006 Wed Jun  6 22:47:52 2012
01.reclock           Status:OK    Duration:0.014 Wed Jun  6 22:47:52 2012
10.interface         Status:OK    Duration:0.021 Wed Jun  6 22:47:52 2012
11.routing           Status:OK    Duration:0.005 Wed Jun  6 22:47:52 2012
11.natgw             Status:DISABLED
13.per_ip_routing    Status:OK    Duration:0.006 Wed Jun  6 22:47:52 2012
20.multipathd        Status:DISABLED
31.clamd             Status:DISABLED
40.fs_use            Status:DISABLED
40.vsftpd            Status:DISABLED
41.httpd             Status:DISABLED
50.samba             Status:OK    Duration:0.024 Wed Jun  6 22:47:52 2012
50.samba.dpkg-old    Status:DISABLED
60.ganesha           Status:DISABLED
60.nfs               Status:DISABLED
62.cnfs              Status:DISABLED
70.iscsi             Status:DISABLED
91.lvs               Status:DISABLED


In a correct (OK | OK) situation, both nodes display the same list of
scripts.
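
For anyone who wants to compare the per-node output mechanically rather
than by eye, here is a minimal Python sketch. It assumes the text layout
is exactly what is pasted above (a ">> NODE: ip <<" header, then either
"monitor cycle never run" or one "name Status:..." line per script);
other CTDB versions may print it differently. The names
parse_scriptstatus and SAMPLE are made up for this illustration and are
not part of CTDB.

```python
import re

# Patterns derived only from the output pasted in this message (assumption).
NODE_RE = re.compile(r">>\s*NODE:\s*(\S+)\s*<<")
SCRIPT_RE = re.compile(r"^(\S+)\s+Status:(\w+)")


def parse_scriptstatus(text):
    """Return {node_ip: {"never_run": bool, "scripts": {name: status}}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = NODE_RE.search(line)
        if header:
            # Start collecting results for a new node.
            current = nodes.setdefault(
                header.group(1), {"never_run": False, "scripts": {}}
            )
            continue
        if current is None or not line:
            continue
        if line == "monitor cycle never run":
            current["never_run"] = True
            continue
        script = SCRIPT_RE.match(line)
        if script:
            current["scripts"][script.group(1)] = script.group(2)
    return nodes


# Shortened copy of the output above, used as hypothetical test input.
SAMPLE = """\
 >> NODE: 192.168.42.200 <<
monitor cycle never run

 >> NODE: 192.168.42.201 <<
18 scripts were executed last monitor cycle
00.ctdb              Status:OK    Duration:0.006 Wed Jun  6 22:47:52 2012
11.natgw             Status:DISABLED
50.samba             Status:OK    Duration:0.024 Wed Jun  6 22:47:52 2012
"""

if __name__ == "__main__":
    for ip, info in parse_scriptstatus(SAMPLE).items():
        if info["never_run"]:
            print(f"{ip}: monitor cycle never run")
        else:
            ok = sorted(n for n, s in info["scripts"].items() if s == "OK")
            print(f"{ip}: {len(info['scripts'])} scripts, OK: {', '.join(ok)}")
```

Feeding it the real output of "onnode all ctdb scriptstatus" (for
example from a file) should make it easy to spot a node whose monitor
cycle never ran, or two nodes whose script lists differ.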

I should add: while in the infinite cycle of death (node 0 unable to
recover), stopping ctdb on node 1 lets node 0 recover properly.

Martin: does your question suggest that the issue lies in the event scripts?

-- 
Nicolas Ecarnot


More information about the samba-technical mailing list