RAFT and CTDB

Scott Lovenberg scott.lovenberg at gmail.com
Fri Dec 5 22:57:08 MST 2014


On Fri, Dec 5, 2014 at 10:55 PM, Martin Schwenke <martin at meltin.net> wrote:
>
> On Thu, 04 Dec 2014 14:55:06 +0000, Rowland Penny
> <repenny241155 at gmail.com> wrote:
>
> > Steve's replies could have been worded better and Richard has spent a
> > considerable time showing that OCFS2 will work with CTDB. The only
> > problem that I can see is, whilst Richard used Centos and the Oracle
> > kernel, neither of the two people who are having/have had problems with
> > CTDB use this distro or kernel. If CTDB will only work with Centos and
> > the Oracle kernel, then in my opinion, it has problems. The only way to
> > prove this one way or the other, is for Richard to post his setup and
> > for someone to try and set it up on Debian (other distros are available
> > ;-) ).
>
> We try to make CTDB as filesystem and distribution agnostic as we can.
> Distribution-wise there simply aren't any hard dependencies, though
> there may be minor bugs.  The daemon has been tested on Linux, FreeBSD
> and AIX.  Unfortunately, the eventscripts contain a lot of
> Linux-specific code so they're not as portable.
>
> As various people have mentioned, the default filesystem requirement is
> fcntl(2) locking support, which the CTDB recovery lock relies on.  There's an assumption
> that if the ping_pong test succeeds then CTDB's recovery lock will
> work.  If that's not true then we need to create a new test.  However,
> I don't believe anyone has conclusively shown ping_pong not to be a
> reliable test.
>
>
> To support testing on RHEL we have autocluster
> (git://git.samba.org/autocluster.git), which Tridge started back in
> 2008 and I have been hacking on since.  It generates
> virtual clusters of RHEL nodes running clustered Samba.  I
> have previously (though not recently) tested it with
> CentOS.  Unfortunately we've never got around to creating a web site or
> writing extensive documentation. Earlier this year I did quite a lot of
> modularisation work on autocluster.  A couple of weeks ago, when this
> thread started, Amitay and I spent a couple of hours trying to add
> OCFS2 support to autocluster.  It was quite trivial to add something
> that we think will work but we haven't been able to test it because we
> simply couldn't make OCFS2 work with RHEL6.  With Richard's
> instructions, posted elsewhere, we'll try to finish this effort when we
> get time. Perhaps that will then encourage others to provide scripts to
> support other cluster filesystems.
>
> Can autocluster work with other (e.g. Debian-based) distros?  Sorry, not
> really.  There are a lot of assumptions about kickstart, network
> configuration and location of configuration files, along with copious
> use of yum.  :-(
>
> However, the modularisation work I did should allow much of the
> post-boot package installation and configuration to be replaced by
> something like Chef (from Opscode).  Someone would need a
> non-trivial slab of time to make that change.  When we've done that it
> should be a little easier for autocluster to be distribution
> agnostic... and I'm guessing that the part that people are really
> interested in is the post-boot configuration, since that doesn't
> depend on the cluster being virtual.
>
> It all takes time...
>
> peace & happiness,
> martin

As luck would have it, at work I wrote a Chef cookbook for clustered
MySQL (I'm getting around to fixing some serious problems with it
before I put it on GitHub), so I'd like to give some advice.  RUN!
Don't look back, just run!

Now that I have that out of the way... My cookbook was only really for
RHEL.  Chef does provide a few nice abstractions for dealing with
stuff like package management and such, but at some point you're going
to have to cut some Ruby libs if you want to stay sane.  If you're
into Ruby this might not be a huge deal, but I stumbled through stuff
like patching cookbooks I was dependent on.  You will be relying on
other cookbooks if you use the "wrapper cookbook" method, which I
would recommend.

After fixing glaring bugs in other cookbooks I had really mixed
results upstreaming my patches.  Some authors/groups were great to deal
with (mostly those who work at companies that fund the development),
while other pull requests will just sit untouched.  And finally there's
the "here's a bug fix for a glaring piece of code that never worked"
pull request, to which the response will be "yeah, this isn't really
the {chef, ruby, berkshelf, etc} way to fix that."  You'll spin up
something that works and they
won't like that either.  At this point you resign yourself to the fact
that they aren't going to fix the problem where they reference a
variable that doesn't exist that you need in a template, so now you're
maintaining a fork of another cookbook. Yes, that did happen and
nearly three months later it hasn't been fixed.  Yes, the cookbook
will still give you an invalid config for Apache and Apache will never
start on certain enterprise distros that rhyme with FELL.

Now that you've got a working cookbook, there's stateful data that you
have to keep track of, and Chef doesn't deal with race conditions (for
instance, you want to create a bunch of nodes at once and they need
config info from other nodes that are still bootstrapping Chef);
that's your problem.  For this I'd recommend just creating a
ZooKeeper server from the start and writing a lib for your cookbook to
interface with it.  You'll try data bags and vaults, but eventually
you'll end up realizing that they're not the correct tool for the job
regardless of how hard you try to pound that square peg into a round
hole.
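
To make the race concrete, here is a rough sketch of the coordination
pattern described above: every node publishes its own config to a shared
store and then blocks until its peers have published theirs.  This is
not my cookbook code and it does not talk to a real coordination
service; CoordinationStore is a hypothetical in-memory stand-in, and the
node names and the example.internal addresses are made up for
illustration.

```python
import threading


class CoordinationStore:
    """Hypothetical in-memory stand-in for a coordination service."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def publish(self, key, value):
        """Store a value and wake up anyone waiting on it."""
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()

    def wait_for(self, keys, timeout=5.0):
        """Block until every key has been published, or raise TimeoutError."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: all(k in self._data for k in keys), timeout=timeout
            )
            if not ready:
                raise TimeoutError("peers did not publish in time")
            return {k: self._data[k] for k in keys}


def bootstrap(node, peers, store, results):
    """One node's bootstrap: publish own config first, then wait for peers."""
    store.publish(node, {"address": node + ".example.internal"})
    peer_info = store.wait_for(peers)
    results[node] = sorted(info["address"] for info in peer_info.values())


def run_cluster(names):
    """Bootstrap all nodes concurrently and return each node's view of its peers."""
    store = CoordinationStore()
    results = {}
    threads = [
        threading.Thread(
            target=bootstrap,
            args=(name, [p for p in names if p != name], store, results),
        )
        for name in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for node, peers in sorted(run_cluster(["node1", "node2", "node3"]).items()):
        print(node, "sees", peers)
```

The ordering is the point: each node publishes before it waits, so no
node can block forever on a peer that is itself blocked.  A real setup
would replace the in-memory store with something that lives outside the
nodes being bootstrapped.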

I truly don't want to discourage anyone from getting this working on
Chef, but you should know ahead of time that there are some hurdles
that you don't even know you'll deal with until you are confronted by
them.  I could go into more detail and provide code (I've still got
about a dozen patches for the zookeeper-bridge library to make it
functional that I haven't upstreamed yet because they're hackish) for
anyone that is curious, but I'm sure I lost almost all of you by this
point in the post :).  I spent the better part of three months dealing
with this as a side project at work, so I might be able to save you a
bit of time and hassle.  If anyone wants to take this on, feel free to
ping me.
-- 
Peace and Blessings,
-Scott.

