Some thoughts on the external recovery lock helper
realrichardsharpe at gmail.com
Fri Jun 17 16:53:50 UTC 2016
On Fri, Jun 17, 2016 at 9:39 AM, Ira Cooper <ira at wakeful.net> wrote:
> On Fri, Jun 17, 2016 at 12:08 PM, Richard Sharpe
> <realrichardsharpe at gmail.com> wrote:
>> On Fri, Jun 17, 2016 at 8:59 AM, Ira Cooper <ira at wakeful.net> wrote:
>> > On Fri, Jun 17, 2016 at 11:16 AM, Richard Sharpe
>> > <realrichardsharpe at gmail.com> wrote:
>> > "What would we do today?" The problem should be identical, if I
>> > understand it correctly?
>> Hmmm, yes, you are correct. This is the benefit of discussing things.
>> We would need a watchdog to kill that instance of ctdb if it hangs in
>> that manner.
> I think that would be useful, in general, though you'll see I have some
> reservations about it. :)
>> > BTW:
>> > Your API also doesn't cover the problem, you need a refresh call from
>> > CTDB to confirm it wants to hold the lock.
>> Hmmm, yes, that seems to be so. In the etcd case a separate thread
>> could keep refreshing the TTL; in the Zookeeper case, as long as the
>> client keeps its connection to ZK alive, the ephemeral node would not
>> go away.
> The reverse problem is also there... What if CTDB does get caught up in
> bookkeeping for X seconds, and trips the timeout?
> Nobody says this is easy. It isn't. What I want is something where I can
> understand how it fails.
>> > Otherwise the library likely will have an infinite lease on the key, or
>> > it'll have a thread refreshing it. (which won't be in the infinite spin
>> > probably ;) )
>> > The infinite lease is the easiest implementation, and probably the one
>> > I'd use personally. I guess we need a keep alive.
>> > Don't think about only one third party implementing this API. I can
>> > see... 3-4 implementations off the top of my head?
>> Indeed. In a separate email that did not get sent to the list I
>> mentioned Zookeeper and maybe doozerd or whatever.
>> > etcd
>> > consul
>> > zookeeper
>> > Ceph MON (Possibly, I have talked to people about it. Nothing concrete
>> > yet. But they have PAXOS, there.)
>> > Possibly Gluster could have its own API. But I tend to doubt that
>> > today, based on what I'm seeing.
>> > In each case, the language of choice may be different.
>> > In the case of etcd, this discussion actually is making me wonder if I
>> > should write the "locking" code in go, instead of Python.
>> No opinions on this.
> That makes two of us. I'll think on it a bit ;).
>> > I chose Python because it is one of the languages we use in Samba. But
>> > is that a good reason not to use go here?
>> > ---
>> > That said, if this integration grows deeper, I may change my mind, quite
>> > easily.
>> > The potential for pushing our "persistent" databases up into etcd and
>> > friends is also quite real, and something we may want to think about
>> > as we do this.
>> Yes, this is indeed an interesting topic. We pushed some TDBs into
>> Zookeeper at Nutanix ... and used dbwrap as the entry point for that.
> Is there any chance that code could get out? :)
In principle :-) I will ask. I can outline the code that was written.
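Since that code is not public, here is only a hedged outline of what a dbwrap-style entry point over an external store might look like: a minimal fetch/store/delete surface in front of a pluggable backend, with an in-memory dict standing in for Zookeeper or etcd. All class and method names here are hypothetical, not the actual Nutanix or Samba dbwrap API.

```python
class DictBackend:
    """Stand-in for a remote store such as Zookeeper or etcd."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class DbwrapStyleDB:
    """Minimal fetch/store/delete surface in the spirit of dbwrap.

    Each logical TDB maps to a key prefix in the backing store, so several
    databases can share one backend without colliding.
    """

    def __init__(self, backend, prefix):
        self.backend, self.prefix = backend, prefix

    def _k(self, key):
        return "%s/%s" % (self.prefix, key)

    def fetch(self, key):
        return self.backend.get(self._k(key))

    def store(self, key, value):
        self.backend.put(self._k(key), value)

    def delete(self, key):
        self.backend.delete(self._k(key))
```

Because only `DictBackend` knows about the actual store, swapping in an etcd or ZK client would mean replacing that one class, which is the point of routing everything through dbwrap in the first place.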
> I might want to do the same thing for etcd, or some other database.
> We may also want the "locking" and the "database" to be separate.
Indeed. There are two locking issues as well:
1. The recovery lock that ctdb needs, and
2. The need to lock records in the tdb store as they are being messed with.
These things are logically separate although they might use the same
underlying mechanism.
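That separation can be sketched as follows: one cluster-wide recovery lock and many per-record locks, kept in distinct key namespaces even if they end up in the same external store. The names below are illustrative assumptions only.

```python
import threading


class ClusterLocks:
    """Toy lock table: one recovery lock plus per-record locks."""

    RECOVERY_KEY = "__recovery__"  # single, cluster-wide key

    def __init__(self):
        self._held = {}  # key -> holder
        self._mu = threading.Lock()

    def try_lock(self, key, holder):
        with self._mu:
            if key in self._held:
                return self._held[key] == holder
            self._held[key] = holder
            return True

    def unlock(self, key, holder):
        with self._mu:
            if self._held.get(key) == holder:
                del self._held[key]

    # Separate namespaces: holding the recovery lock never implies
    # holding any record lock, and vice versa.
    def try_recovery_lock(self, node):
        return self.try_lock(self.RECOVERY_KEY, node)

    def try_record_lock(self, db, record, node):
        return self.try_lock("rec/%s/%s" % (db, record), node)
```

Keeping the two namespaces distinct is what lets the recovery lock and the record locks be implemented by different backends later, even if they start out in the same one.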