CTDB We are still serving a public address...

Martin Schwenke martin at meltin.net
Sun Oct 28 04:28:05 MDT 2012


On Sat, 27 Oct 2012 20:26:18 -0600, patrick medina
<pgmedinajr at gmail.com> wrote:

> I thought I had CTDB down, but it looks like I'm running into another
> issue.
> 
> I have 2 nodes with 2 public addresses so that I can use RRDNS.  This worked
> fine and was tested until I thought it was good to go.  Last night one
> of our nodes went down, and after the reboot I'm getting the following
> in the logs.  (The 2nd node is up and happy, but this one remains
> unhealthy.)
> 
> 
> 2012/10/27 19:53:00.678119 [ 1510]: server/ctdb_takeover.c:813 release_ip
> of IP x.x.x.x is
> known to the kernel, but we have no interface assigned, has someone
> manually configured it? Ignore for now.
> [...]

This is a bug that we've fixed at a couple of different levels in
CTDB.  There should be a public release of CTDB very soon that includes
this fix.

Right now you should be able to work around this on the affected node
by manually removing the IP shown in the message using "ip addr del ...".
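
As a rough sketch of that workaround, something like the following could
be used.  The address 192.0.2.10, the prefix length 24 and the interface
eth0 are placeholders, not values from this thread: substitute the IP
from the log message and whatever "ip addr show" reports for it on the
unhealthy node.  The run() helper and the DRY_RUN variable are
hypothetical conveniences so that nothing is changed until you have
checked the command.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) only print the
# command; with DRY_RUN=0 actually execute it (needs root).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove an address the kernel still holds but CTDB no longer tracks.
#   $1 = IP from the log message
#   $2 = prefix length (assumption: as shown by "ip addr show")
#   $3 = interface the address is currently configured on
remove_stale_ip() {
    run ip addr del "$1/$2" dev "$3"
}

# Placeholder values only; replace all three before a real run.
remove_stale_ip 192.0.2.10 24 eth0
```

Run it once as-is to see the command it would issue, then again with
DRY_RUN=0 as root once the address, prefix and interface look right.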

If you've built CTDB from sources obtained via git then you could
rebuild after cherry-picking the following patches that repair a node
when it gets into this state:

  c6bf22ba5c01001b7febed73dd16a03bd3fd2bed
  f07376309e70f5ccdb7de8453caacc71b451ab48

You can use "git show <sha>" to see what the patches do.
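
The steps above might look roughly like this when run from inside your
CTDB git checkout.  This is only a sketch: it assumes both commits are
already present in your clone (fetch first if not) and that they apply
in the order listed, which has not been verified against any particular
branch.  The run() helper and DRY_RUN variable are hypothetical and only
there so the commands can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) only print the
# command; with DRY_RUN=0 execute it in the current git checkout.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The two commits named in this message.
FIX_1=c6bf22ba5c01001b7febed73dd16a03bd3fd2bed
FIX_2=f07376309e70f5ccdb7de8453caacc71b451ab48

apply_ctdb_fixes() {
    # Review what each patch touches, then apply both to the current branch.
    run git show --stat "$FIX_1" &&
    run git show --stat "$FIX_2" &&
    run git cherry-pick "$FIX_1" "$FIX_2"
}

apply_ctdb_fixes
```

If the cherry-pick stops with conflicts, "git cherry-pick --abort"
returns the tree to its previous state.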

This problem is often caused by a race between a node taking over an IP
and releasing it in quick succession.  We also have some fixes for the
race...

> Did I configure RRDNS wrong, on my dev box this worked like a charm but
> once I went production it's not so happy.  :/

Probably bad luck that you've hit the race only after going
into production...  :-(

peace & happiness,
martin


More information about the samba-technical mailing list