Some thoughts on the external recovery lock helper

Richard Sharpe realrichardsharpe at gmail.com
Fri Jun 17 16:08:43 UTC 2016


On Fri, Jun 17, 2016 at 8:59 AM, Ira Cooper <ira at wakeful.net> wrote:
>
>
> On Fri, Jun 17, 2016 at 11:16 AM, Richard Sharpe
> <realrichardsharpe at gmail.com> wrote:
>>
>> On Fri, Jun 17, 2016 at 8:10 AM, Ira Cooper <ira at wakeful.net> wrote:
>> >
>> >
>> > On Fri, Jun 17, 2016 at 10:56 AM, Richard Sharpe
>> > <realrichardsharpe at gmail.com> wrote:
>> >>
>> >>
>> >> OK, so this part is the harder part. If the parent crashes then the
>> >> helper needs to exit. It is not impossible, but a design where the
>> >> helper is called as a (shared) library function gets this for free. We
>> >> have lots of experience with this sort of thing in Samba.
>> >>
>> >> However, it is not impossible.
>> >>
>> >
>> > Shouldn't we be able to detect that the pipe/socket that we are
>> > writing to has closed?
>>
>> (Replied on-list this time.)
>>
>> Maybe so. However, what about a failure where the ctdb process is in
>> an infinite loop somewhere and not responding to election requests,
>> but is holding the lock? The external process is then never told to
>> release the lock.
>>
>> Of course, we could add keep-alives :-)
>
>
> "What would we do today?"  The problem should be identical, if I understand
> it correctly?

Hmmm, yes, you are correct. This is the benefit of discussing things.
We would need a watchdog to kill that instance of ctdb if it is hung in
that manner.
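
As a rough sketch of what such a watchdog could look like (nothing here
is ctdb code; the Watchdog class, its method names and the on_hang
callback are made up for illustration, and the clock is injectable only
so the logic can be exercised without sleeping):

```python
import time


class Watchdog:
    """Calls on_hang once if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout, on_hang, clock=time.monotonic):
        self.timeout = timeout
        self.on_hang = on_hang      # hypothetical action, e.g. kill the daemon
        self.clock = clock
        self.last_beat = clock()
        self.fired = False

    def beat(self):
        # Called from the daemon's main loop. A loop that is spinning
        # somewhere else simply stops calling this.
        self.last_beat = self.clock()

    def check(self):
        # Called periodically from outside the main loop (a separate
        # process or timer), so it still runs when the main loop is hung.
        if not self.fired and self.clock() - self.last_beat > self.timeout:
            self.fired = True
            self.on_hang()
        return self.fired
```

In a real deployment on_hang might be something like
os.kill(pid, signal.SIGKILL) on the hung daemon, at which point the
helper sees its parent go away and can drop the lock.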

> BTW:
>
> Your API also doesn't cover the problem, you need a refresh call from CTDB
> to confirm it wants to hold the lock.

Hmmm, yes, that seems to be so. A separate thread (in the etcd case)
could keep refreshing the TTL even while the main loop is hung, and in
the Zookeeper case the client would not lose its connection to ZK, so
the ephemeral node would not go away.
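
To make the refresh idea concrete, here is a small in-memory model of a
lease that only stays alive while the holder keeps confirming it. This
is a sketch under assumptions: Lease, acquire, refresh and release are
hypothetical names, not the etcd or Zookeeper client APIs and not the
proposed helper API. The point is only that refresh() has to be driven
by the holder's main loop rather than by an independent thread:

```python
import time


class Lease:
    """In-memory stand-in for a TTL key (etcd) or ephemeral node (ZK)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.holder = None
        self.expires_at = 0.0

    def _expired(self):
        return self.holder is None or self.clock() >= self.expires_at

    def acquire(self, who):
        # Succeeds if the lease is free, has lapsed, or is already ours.
        if self._expired() or self.holder == who:
            self.holder = who
            self.expires_at = self.clock() + self.ttl
            return True
        return False

    def refresh(self, who):
        # Meant to be called from the holder's main loop. If that loop
        # hangs, no refresh arrives and the lease lapses after `ttl`.
        if self.holder == who and not self._expired():
            self.expires_at = self.clock() + self.ttl
            return True
        return False

    def release(self, who):
        if self.holder == who:
            self.holder = None
```

With a 10 second TTL, a holder that last refreshed at t=8 and then
hangs loses the lease at t=18, and another node can take it from there.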

> Otherwise the library likely will have an infinite lease on the key, or
> it'll have a thread refreshing it.  (which won't be in the infinite spin
> probably ;) )
>
> The infinite lease is the easiest implementation, and probably the one I'd
> use personally.  I guess we need a keep alive.
>
> Don't think about only one third party implementing this API.  I can see...
> 3-4 implementations off the top of my head?

Indeed. In a separate email that did not get sent to the list I
mentioned Zookeeper and maybe doozerd or whatever.

> etcd
> consul
> zookeeper
> Ceph MON (Possibly, I have talked to people about it.  Nothing concrete yet.
> But they have PAXOS, there.)
>
> Possibly Gluster could have its own API.  But I tend to doubt that today,
> based on what I'm seeing.
>
> In each case, the language of choice may be different.
>
> In the case of etcd, this discussion actually is making me wonder if I
> should write the "locking" code in go, instead of Python.

No opinions on this.

> I chose Python because it is one of the languages we use in Samba.  But is
> that a good reason not to use go here?
>
> ---
>
> That said, if this integration grows deeper, I may change my mind, quite
> easily.
>
> The potential for pushing our "persistent" databases up into etcd and
> friends is also quite real, and something we may want to think about as we
> do this.

Yes, this is indeed an interesting topic. We pushed some TDBs into
Zookeeper at Nutanix ... and used dbwrap as the entry point for that.

-- 
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)


