Problem with ctdb_sys_have_ip

Harald Klatte klatte at hrz.uni-kassel.de
Fri Aug 28 10:58:01 MDT 2009


Hi,

while testing ctdb on AIX (not on Linux) I noticed a misbehaviour:
ctdb tries to take over IP addresses even when they are still configured
according to ifconfig.

I traced the cause to the function "ctdb_sys_have_ip" in system_common.c.
It always returns false, whether or not the address in question is configured.

To investigate, I inserted a debug line (the DEBUG call after bind) into the code:


/*
  see if we currently have an interface with the given IP

  we try to bind to it, and if that fails then we don't have that IP
  on an interface
 */
bool ctdb_sys_have_ip(ctdb_sock_addr *_addr)
{
        int s;
        int ret;
        ctdb_sock_addr __addr = *_addr;
        ctdb_sock_addr *addr = &__addr;

        switch (addr->sa.sa_family) {
        case AF_INET:
                addr->ip.sin_port = 0;
                break;
        case AF_INET6:
                addr->ip6.sin6_port = 0;
                break;
        }

        s = socket(addr->sa.sa_family, SOCK_STREAM, htons(IPPROTO_TCP));
        if (s == -1) {
                return false;
        }

        ret = bind(s, (struct sockaddr *)addr, sizeof(*addr));
        if (ret == -1) {
                DEBUG(DEBUG_CRIT,("  !! failed to bind address to socket (%s)\n", strerror(errno) ));
                return false;
        }

        close(s);
        return ret == 0;
}


Syslog shows:

2009/08/28 18:19:15.730807 [188748]: server/ctdb_recoverd.c:1541 Recovery - disabled recovery mode
2009/08/28 18:19:15.730894 [188748]: Deterministic IPs enabled. Resetting all ip allocations
2009/08/28 18:19:15.731116 [147676]:   !! failed to bind address to socket (Invalid argument)
2009/08/28 18:19:15.731210 [147676]: Takeover of IP xx.xx.xx.xx/24 on interface en1
2009/08/28 18:19:15.731891 [147676]:   !! failed to bind address to socket (Invalid argument)
2009/08/28 18:19:15.731980 [147676]: Takeover of IP xx.xx.xx.xx/24 on interface en1
2009/08/28 18:19:15.733006 [188748]: server/ctdb_recoverd.c:1552 Recovery - takeip finished
2009/08/28 18:19:15.733140 [147676]: Recovery has finished


So bind() fails with EINVAL, although none of the arguments has an obviously wrong data type.

Has anyone seen this fault before?
Where could the problem be?
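
One thing that may be worth checking is the length argument: POSIX permits
bind() to fail with EINVAL when the address length is not valid for the
address family, and the code above passes sizeof(*addr), the size of the whole
ctdb_sock_addr, which may be larger than a sockaddr_in. The standalone sketch
below is not ctdb code and is untested on AIX; probe_addr, probe_addr_len,
probe_parse and probe_have_ip are names made up for this illustration. It
performs the same bind test but passes the family-specific length and closes
the socket on both paths:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for ctdb_sock_addr; not the ctdb definition. */
typedef union {
        struct sockaddr sa;
        struct sockaddr_in ip;
        struct sockaddr_in6 ip6;
} probe_addr;

/* Size of the concrete sockaddr for a family, 0 if unsupported. */
static socklen_t probe_addr_len(int family)
{
        switch (family) {
        case AF_INET:
                return sizeof(struct sockaddr_in);
        case AF_INET6:
                return sizeof(struct sockaddr_in6);
        default:
                return 0;
        }
}

/* Parse a numeric IPv4 or IPv6 string; 0 on success, -1 on failure. */
static int probe_parse(const char *text, probe_addr *out)
{
        struct in_addr v4;
        struct in6_addr v6;

        memset(out, 0, sizeof(*out));   /* port stays 0, as in the original */
        if (inet_pton(AF_INET, text, &v4) == 1) {
                out->ip.sin_family = AF_INET;
                out->ip.sin_addr = v4;
                return 0;
        }
        if (inet_pton(AF_INET6, text, &v6) == 1) {
                out->ip6.sin6_family = AF_INET6;
                out->ip6.sin6_addr = v6;
                return 0;
        }
        return -1;
}

/* 1 if bind succeeds, 0 if bind fails, -1 on parse or socket error. */
static int probe_have_ip(const char *text)
{
        probe_addr addr;
        int s;
        int ret;

        assert(text != NULL);
        if (probe_parse(text, &addr) != 0) {
                return -1;
        }

        s = socket(addr.sa.sa_family, SOCK_STREAM, IPPROTO_TCP);
        if (s == -1) {
                return -1;
        }

        /* Pass the length of the concrete sockaddr, not of the union. */
        ret = bind(s, &addr.sa, probe_addr_len(addr.sa.sa_family));
        if (ret == -1) {
                fprintf(stderr, "bind %s: %s\n", text, strerror(errno));
        }

        close(s);       /* closed on success and on failure */
        return ret == 0;
}
```

Calling probe_have_ip() with one of the addresses that ifconfig reports would
show whether the length matters: a return value of 1 would point at the length
argument, while 0 together with "Invalid argument" on stderr would point
elsewhere.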


Thanks

Harald


-- 

+----------+  Harald Klatte                       email: klatte at hrz.uni-kassel.de
|Uni-Kassel|  ITS,  Universitaet Kassel                   Tel.: (49) 561/804-2280
+----------+  Moenchebergstr. 11, 34109 Kassel            Fax:  (49) 561/804-2297


