Setting up CTDB on OCFS2 and VMs ...

Sat Dec 13 06:55:31 MST 2014

On 2014-12-13 at 12:31 +0000, Rowland Penny wrote:
> 
> OK, I now have a single node up and running as per the
> instructions provided by Ronnie.

Yay!

> I just have a few questions:
> 
> there is this in the ctdb log
> 
> 2014/12/13 11:52:43.522708 [ 5740]: Set runstate to INIT (1)
> 2014/12/13 11:52:43.540992 [ 5740]: 00.ctdb: awk: line 2: function gensub
> never defined
> 2014/12/13 11:52:43.543178 [ 5740]: 00.ctdb: awk: line 2: function gensub
> never defined
> 2014/12/13 11:52:43.545354 [ 5740]: 00.ctdb: awk: line 2: function gensub
> never defined

On Debian, several packages can provide awk, at least
mawk, gawk and original-awk. If more than one is installed,
the alternatives mechanism decides which one is used:
  update-alternatives --display awk
  update-alternatives --config awk

A quick web search has revealed that, of these, only the
gawk (GNU awk) variant provides the needed gensub
function.

Maybe we should change "awk" to "gawk" in our scripts;
packages would then need to adapt their dependencies.
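
To check which case a given machine is in, something like
the following could be used. It is only a sketch: the
function name awk_has_gensub is made up for illustration
and is not part of ctdb or its event scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of ctdb): return 0 if the given
# awk binary (default: "awk") implements the gensub() function.
awk_has_gensub() {
    awk_bin="${1:-awk}"
    # gensub() is a gawk extension; an awk without it fails here,
    # and a missing binary fails as well.
    out=$("$awk_bin" 'BEGIN { print gensub(/a/, "b", "g", "aaa") }' 2>/dev/null) || return 1
    [ "$out" = "bbb" ]
}

if awk_has_gensub; then
    echo "default awk provides gensub"
else
    echo "default awk lacks gensub; consider installing gawk"
fi
```

Calling it as "awk_has_gensub mawk" or "awk_has_gensub gawk"
tests a specific implementation instead of whatever the
alternatives mechanism currently selects.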

> 2014/12/13 11:52:56.931393 [recoverd: 5887]: We are still serving a public
> IP '127.0.0.3' that we should not be serving. Removing it
> 2014/12/13 11:52:56.931536 [ 5740]: Could not find which interface the ip
> address is hosted on. can not release it
> 2014/12/13 11:52:56.931648 [recoverd: 5887]: We are still serving a public
> IP '127.0.0.2' that we should not be serving. Removing it
> 
> The above three lines are there 4 times

I guess these messages will go away once you move to a
more realistic setup that does not use loopback for the
nodes' internal and public addresses, but for a start
that is ok.

> the final 4 lines are:
> 
> 2014/12/13 11:53:02.982441 [ 5740]: monitor event OK - node re-enabled
> 2014/12/13 11:53:02.982480 [ 5740]: Node became HEALTHY. Ask recovery master
> 0 to perform ip reallocation
> 2014/12/13 11:53:02.982733 [recoverd: 5887]: Node 0 has changed flags - now
> 0x0  was 0x2
> 2014/12/13 11:53:02.983266 [recoverd: 5887]: Takeover run starting
> 2014/12/13 11:53:03.046859 [recoverd: 5887]: Takeover run completed
> successfully
> 
> ctdb status shows:
> 
> Number of nodes:1
> pnn:0 127.0.0.1        OK (THIS NODE)
> Generation:740799152
> Size:1
> hash:0 lmaster:0
> Recovery mode:NORMAL (0)
> Recovery master:0

Great!

> Now I know it works, I just have to pull it all together.

Right. Next step: take a "real" ethernet interface
and first use that for the node's address. You can even
start here with a single node.

You can also go towards more realistic clusters in two
steps: First no public addresses, only the nodes file.
That is the core of a ctdb cluster. Then you can go
towards cluster-resource management and add public
addresses and also CTDB_MANAGES_SAMBA and friends.
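
As a rough illustration of these two steps, the sketch
below writes example files into a scratch directory
instead of touching /etc/ctdb. All IP addresses, the
interface name eth1, the file name ctdb.sysconfig and the
variable CTDB_EXAMPLE_DIR are assumptions made up for the
example; adapt them to your network and to wherever your
distribution keeps the ctdb configuration.

```shell
#!/bin/sh
# Write example CTDB config files into a scratch directory.
# CTDB_EXAMPLE_DIR is a made-up variable for this sketch only.
conf_dir="${CTDB_EXAMPLE_DIR:-$(mktemp -d)}"

# Step 1: the nodes file. One internal address per line,
# identical on every node (addresses are placeholders).
cat > "$conf_dir/nodes" <<'EOF'
192.168.10.1
192.168.10.2
EOF

# Step 2: public addresses that ctdb moves between nodes,
# in the form "address/netmask interface" (placeholders again).
cat > "$conf_dir/public_addresses" <<'EOF'
10.0.0.101/24 eth1
10.0.0.102/24 eth1
EOF

# Step 2, continued: settings for the ctdb defaults file
# (its location depends on the distribution).
cat > "$conf_dir/ctdb.sysconfig" <<'EOF'
CTDB_NODES=/etc/ctdb/nodes
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
CTDB_MANAGES_SAMBA=yes
EOF

echo "example files written to $conf_dir"
```

For the first step, only the nodes file is needed; the
other two files come into play once the core cluster is
running.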

One further note:
Virtual machines or even containers (lxc or docker) are
awesome for setting up such clusters for learning and
testing. I use that for development myself.

And here is one (imho) very neat trick:
If you use lxc containers (or docker can probably also
do that), you can completely take the complexity
of having to set up a cluster file system out of
the equation: You can just bind mount a directory
of the host file system into the node containers'
root file systems via the lxc fstab file.
Thereby you have a POSIX file system that is shared
between the nodes and you can use that as cluster FS.

This way, you can concentrate on ctdb and samba immediately
until you are comfortable with that.
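
A sketch of what the bind mount entries could look like:
the helper name lxc_bind_entry, the host directory
/srv/ctdb-shared, the mount point "shared" and the
container names node1..node3 are all made up for
illustration. It assumes each container's config points
at its fstab file (the lxc.mount key in lxc 1.x) and that
the mount point already exists in the container's rootfs.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab-style line that bind mounts
# a host directory into a container.
lxc_bind_entry() {
    host_dir="$1"   # directory on the host file system
    target="$2"     # mount point inside the container
    # A target without a leading slash is taken relative to the
    # container's rootfs, so strip one if present.
    printf '%s %s none bind 0 0\n' "$host_dir" "${target#/}"
}

# Only prints the lines; nothing is written or mounted here.
for node in node1 node2 node3; do
    echo "# line for the fstab of container $node:"
    lxc_bind_entry /srv/ctdb-shared /shared
done
```

Every container gets the same line, so all nodes see the
same host directory under /shared.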

I wanted at some point to provide a mechanism to set
such a thing up automatically, by just providing some
config files. Maybe I'll investigate the vagrant+puppet
approach that Ralph Böhme has recently posted in this
or a related thread...

Cheers - Michael
