RAFT and CTDB

Min Wai Chan dcmwai at gmail.com
Sun Nov 23 22:31:52 MST 2014


Dear Steve,

OK, I will send it over when no one is logged in to the server, probably
around midnight.

I saw something about the ping_pong -rw test.
Is it something like the command below?

ping_pong -rw test.dat N

Do you need this test result as well?

I just wonder whether we mounted the ocfs2 volume incorrectly.
Is there a way to mount ocfs2 via user space?
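
Before running ping_pong across both nodes, a quick single-node sanity check
of fcntl locking on the mount might look like the minimal Python sketch below.
It is only an illustration, not part of CTDB or ping_pong: the helper names
try_lock and second_process_can_lock are made up for this example, and it
only shows whether a second process on the same node is refused a lock that
is already held. It says nothing about locking between nodes, which is what
ping_pong exercises.

```python
import errno
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on byte 0 of path.

    Returns the open file descriptor on success, or None if another
    process already holds a conflicting lock. Other errors (for example
    a file system without lock support) are raised to the caller.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fd


def second_process_can_lock(path):
    """Fork a child and report whether it manages to lock the same byte."""
    pid = os.fork()
    if pid == 0:
        # Child: report the result through the exit status and never return.
        status = 2
        try:
            status = 0 if try_lock(path) is not None else 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wait_status)
    if code == 2:
        raise RuntimeError("child failed for a reason other than lock contention")
    return code == 0
```

To use it, call try_lock() on a file inside the ocfs2 mount (the path is
whatever your setup uses), keep the returned descriptor open, and check that
second_process_can_lock() returns False for the same file. If it returns True
while the first lock is still held, fcntl locks are not being enforced on
that mount.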


Thank You.




On Mon, Nov 24, 2014 at 3:50 AM, steve <steve at steve-ss.com> wrote:

> On 21/11/14 03:08, Chan Min Wai wrote:
> > Dear Martin,
> >
> > Since we have touched on the lock:
> > I have some experience with where the lock is defined.
> >
> > I point the lock to the shared ocfs2 cluster.
> >
> > CTDB will not start and keeps asking for the lock.
> >
> > I'm not sure why.
> >
> > I followed this guide.
> >
> http://linuxcostablanca.blogspot.com/2014/07/samba4-cluster-for-ad-drbd-ocfs2-ctdb.html?m=1
> >
> > The difference is that my ocfs2 is on storage shared between the 2 nodes,
> > and thus there is no DRBD.
> >
> > Does the lock really work in this scenario?
> >
> > Thank you.
> >
> > P.S. Sorry to cut in like this.
> >
> > Regards,
> > Min Wai, Chan
> >
> >
> >
> >> On 21 Nov 2014, at 8:04 AM, Martin Schwenke <martin at meltin.net> wrote:
> >>
> >> On Thu, 20 Nov 2014 15:55:39 -0800, Richard Sharpe
> >> <realrichardsharpe at gmail.com> wrote:
> >>
> >>>> On Thu, Nov 20, 2014 at 3:41 PM, Martin Schwenke <martin at meltin.net> wrote:
> >>>> On Thu, 20 Nov 2014 15:24:39 -0800, Richard Sharpe
> >>>> <realrichardsharpe at gmail.com> wrote:
> >>>>
> >>>>> Hmmm, so the essential abstraction here is that any node that is no
> >>>>> longer a member of the cluster (because it can't get a lock on that
> >>>>> file) cannot try to run recovery. Ie, in ctdb_recovery_lock we try to
> >>>>> open the recovery lock file and then take out a lock on it.
> >>>>>
> >>>>> The first should/will fail if we are no longer a member of the cluster
> >>>>> and the second will fail if the cluster properly supports fcntl locks
> >>>>> but another recovery daemon has already locked the file ...
> >>>>
> >>>> No, only the recovery master can hold the recovery lock.  Other nodes
> >>>> would not be able to take the lock but they are still cluster members.
> >>>
> >>> Isn't that what I said? When I said cluster above I was referring to a
> >>> GPFS cluster.
> >>
> >> CTDB has its own independent notion of cluster membership and I thought
> >> you were referring to that.  I didn't notice you mentioning GPFS.  :-)
> >>
> >>>> Cluster membership is defined by being connected to the node that is
> >>>> currently the recovery master.  That is, nodes that the recovery master
> >>>> knows about (i.e. connected) and are active (i.e. not stopped or
> >>>> banned) will take part in recovery.
> >>>
> >>> OK, that is a wrinkle I had not thought of. What if they have lost
> >>> connection to the GPFS cluster but are still talking to the recovery
> >>> master?
> >>
> >> Then you would hope that they can't take the recovery lock.  ;-)
> >>
> >> If a node in a break-away cluster (i.e. lost CTDB connection with
> >> main cluster - perhaps just 1 node) wins an election then it will try to
> >> become recovery master.  When it tries to take the recovery lock and
> >> fails it will ban itself.  Rinse and repeat for other nodes in the
> >> break-away cluster.
> >>
> >> So, provided nodes in a break-away cluster can't take the recovery lock
> >> then they will all get banned and can do no harm.
> >>
> >> If such nodes can still take the recovery lock after being expelled
> >> from the GPFS cluster then you should probably have the appropriate GPFS
> >> callback shut down CTDB.  Depending on the CTDB configuration, this will
> >> probably take down Samba and other services, preventing any issues.
> >>
> >> peace & happiness,
> >> martin
>
> @Chan: Please see the thread: 'Re: posix locking on OCFS2'
> We are being asked for information to solve the lock problem:)
> You will most likely be able to supply:
>
> - precise versions of software used (file system, ctdb, ...)
> - exact description of what fails
> - configuration (ctdb, file system, ...)
> - logs (ctdb, syslog/file system ...)
>
> Cheers,
> Steve
>
>

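Martin's description of the break-away cluster in the quoted thread can be
sketched as a toy model. This is not CTDB code: the RecoveryLock class, the
run_breakaway function, and the can_reach_storage flag are all invented for
illustration. The sketch only captures the stated rule that a node which wins
an election but fails to take the recovery lock bans itself, and that the
next node then repeats the attempt.

```python
class RecoveryLock:
    """Stand-in for the recovery lock file on the cluster file system."""

    def __init__(self):
        self.holder = None

    def try_take(self, node, can_reach_storage):
        # The lock cannot be taken without access to the shared storage,
        # or while a different node already holds it.
        if not can_reach_storage or self.holder not in (None, node):
            return False
        self.holder = node
        return True


def run_breakaway(nodes, lock, can_reach_storage=False):
    """Let each node in a break-away group win an election in turn.

    Returns (recovery_master, banned_nodes). recovery_master is None if
    every node failed to take the lock and banned itself.
    """
    banned = []
    for node in nodes:
        if lock.try_take(node, can_reach_storage):
            return node, banned
        banned.append(node)  # failed to take the lock: the node bans itself
    return None, banned
```

In this model, if the main cluster already holds the lock and the break-away
nodes cannot take it, run_breakaway() returns no master and every node ends
up banned, which is the safe outcome described in the thread. If the lock
can still be taken from the break-away side, one of those nodes becomes a
second recovery master, which is the case the GPFS callback is meant to
prevent.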

More information about the samba-technical mailing list