RAFT and CTDB

Richard Sharpe realrichardsharpe at gmail.com
Tue Nov 25 13:53:08 MST 2014


On Tue, Nov 25, 2014 at 12:25 PM, Rowland Penny <repenny241155 at gmail.com> wrote:
> On 25/11/14 20:09, Richard Sharpe wrote:
>>
>> On Tue, Nov 25, 2014 at 10:40 AM, Min Wai Chan <dcmwai at gmail.com> wrote:
>>>
>>> Dear All,
>>>
>>> What fail...
>>> Both CTDB will start non-stop recovery...
>>> When there is only one node, it is still working
>>> but not on both node...
>>>
>> This appears to be the problem:
>>
>> 2014/11/26 02:18:26.363173 [recoverd: 9883]: ctdb_control error:
>> 'managed to lock reclock file from inside daemon'
>> 2014/11/26 02:18:26.363257 [recoverd: 9883]: ctdb_control error:
>> 'managed to lock reclock file from inside daemon'
>> 2014/11/26 02:18:26.363292 [recoverd: 9883]: Async operation failed
>> with ret=-1 res=-1 opcode=16
>> 2014/11/26 02:18:26.363315 [recoverd: 9883]: Async wait failed -
>> fail_count=1
>> 2014/11/26 02:18:26.363334 [recoverd: 9883]:
>> server/ctdb_recoverd.c:393 Unable to set recovery mode. Recovery
>> failed.
>>
>> Something is going wrong with locking.
>>
>
> Do you think it could have anything to do with posix file locking ??

Hmmm, here is where the error is coming from:

samba-v.x.y.z/ctdb/server/ctdb_recoverd.c

        /* read the child's status when trying to lock the reclock file.
           child wrote 0 if everything is fine and 1 if it did manage
           to lock the file, which would be a problem since that means
           we got a request to exit from recovery but we could still lock
           the file, which at this time SHOULD be locked by the recovery
           daemon on the recmaster
        */
        ret = sys_read(state->fd[0], &c, 1);
        if (ret != 1 || c != 0) {
                ctdb_request_control_reply(state->ctdb, state->c,
                                           NULL, -1,
                                           "managed to lock reclock file from inside daemon");
                talloc_free(state);
                return;
        }
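
To make the check in that comment concrete, here is a rough sketch of
the pattern: a forked child tries a non-blocking lock on the file and
reports a single status byte through a pipe, and the parent treats
anything other than a clean 0 as a failure. This is not the CTDB code.
try_lock() and child_lock_status() are names I made up, and I am
assuming a plain fcntl() byte-range lock on the first byte of the file.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not CTDB code: try to take an exclusive fcntl()
 * write lock on the first byte of fd without blocking.
 * Returns 1 if the lock was obtained, 0 if it was refused. */
static int try_lock(int fd)
{
        struct flock fl = {0};

        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 1;
        return fcntl(fd, F_SETLK, &fl) == 0 ? 1 : 0;
}

/* Hypothetical helper, not CTDB code: fork a child that tries to lock
 * path and writes one status byte to a pipe:
 *   0 = could not lock (someone else holds it, the expected case)
 *   1 = managed to lock (nobody else holds it, a problem)
 * Returns that byte, or -1 if the child reported nothing. */
static int child_lock_status(const char *path)
{
        int fds[2];
        char c = 1;
        pid_t pid;
        ssize_t n;

        if (pipe(fds) != 0) {
                return -1;
        }
        pid = fork();
        if (pid < 0) {
                close(fds[0]);
                close(fds[1]);
                return -1;
        }
        if (pid == 0) {
                /* Child: open the file itself so the parent never opens
                 * and closes an extra descriptor on it (closing any
                 * descriptor drops all of a process's fcntl() locks on
                 * that file). */
                int fd = open(path, O_RDWR);

                close(fds[0]);
                if (fd < 0) {
                        _exit(1); /* nothing written: parent sees EOF */
                }
                c = (char)try_lock(fd);
                if (write(fds[1], &c, 1) != 1) {
                        _exit(1);
                }
                _exit(0);
        }

        /* Parent: read the single status byte, then reap the child. */
        close(fds[1]);
        n = read(fds[0], &c, 1);
        close(fds[0]);
        waitpid(pid, NULL, 0);
        return n == 1 ? c : -1;
}
```

With fcntl() locks a forked child does not inherit its parent's locks,
so on a filesystem with working locking child_lock_status() should
return 0 for as long as another process holds the lock. A return of 1
corresponds to the "managed to lock" case in the snippet above.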

What version of Samba is Chan Min Wai running? Some log messages that
are in master are missing from the log above, so I suspect he/she is
running a different version from the code I currently have available.

-- 
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)


More information about the samba-technical mailing list