CTDB asymmetric (non-)recovery

ronnie sahlberg ronniesahlberg at gmail.com
Thu Jun 7 06:45:40 MDT 2012


What does 'ctdb scriptstatus all' show?
It sounds like it is stuck, failing to start up the services properly.

Also, check 'pstree -p -a' and see what child processes ctdbd has
during startup. There is probably a shell script among them, running
as a child, that is failing.
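
A rough sketch of how the two checks could be wrapped together so they
can be repeated while the node is stuck. The function name
diagnose_ctdb is made up for illustration, and taking the oldest
process named ctdbd as the main daemon is an assumption; it also
assumes ctdb, pgrep and pstree are installed on the node.

```shell
#!/bin/sh
# Sketch only: snapshot the event script status and the children of ctdbd.
# diagnose_ctdb is a hypothetical helper name, not part of ctdb.

diagnose_ctdb() {
    # Assumption: the oldest process named exactly "ctdbd" is the main daemon.
    pid=$(pgrep -o -x ctdbd) || {
        echo "ctdbd is not running" >&2
        return 1
    }

    echo "== event script status =="
    ctdb scriptstatus all

    echo "== ctdbd process tree (pid $pid) =="
    # -p prints PIDs, -a prints command-line arguments of each child.
    pstree -p -a "$pid"
}
```

Calling diagnose_ctdb a few times during startup should show whether
the same event script stays listed as a child of ctdbd, which would
point at the script that is hanging or failing.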



On Thu, Jun 7, 2012 at 10:15 PM, Nicolas Ecarnot <nicolas at ecarnot.net> wrote:
> On 07/06/2012 12:22, Nicolas Ecarnot wrote:
>
>> I increased the log level to 9 (damn, this IS verbose), and I tried to
>> extract the relevant part of the loop on the failing node (though so far
>> nothing proves to me that the unhealthy node _is_ the faulty one).
>>
>> The log file is here : http://pastebin.com/YEwrkmPx
>
>
> Comparing the verbose log files of a successful recovery and a failing one,
> I see this in the good case:
>
> | [recoverd: 3351]: The interfaces status has changed on local node 1 - force takeover run
>
> Followed by :
>
> | [recoverd: 3351]: Trigger takeoverrun
>
> In the bad case, this takeoverrun never gets triggered (nothing in the log
> file).
> Reading the source code, I see the function involved is
> verify_local_ip_allocation, but I don't understand how one could get out of
> this function without yielding an error message in between.
>
> --
> Nicolas Ecarnot

