CTDB complains about "net serverid" and Samba doesn't bind to public IPs
Alexander
forsmbg at googlemail.com
Thu Sep 2 07:55:09 MDT 2010
Hi Samba Team,
I'm trying to set up a simple test cluster with CTDB. The OS is
SLES10SP3, Samba is 3.5.4, installed with RPMs for SLES10 from
enterprisesamba.com. This is on two VMware Server 2.0.2 VMs with 1 GB
RAM each.
I've tried pulling the CTDB sources both ways listed on the Wiki and the
CTDB main page (rsync and git pull); the results don't seem to differ.
First, it looks like there's no "net serverid" command in 3.5.4, but
CTDB's events.d/50.samba tries to call it.
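One way to confirm this independently of CTDB is to look at the subcommands the installed net binary advertises in its usage text. The following is only a minimal sketch: parse_net_subcommands, has_subcommand and net_usage_text are hypothetical helper names, and it assumes the usage text has the "net <subcommand> <description>" layout seen in the log further down (optionally prefixed by a CTDB log timestamp).

```python
import re
import subprocess

# Assumed line layout: optional "<timestamp> [pid]:" prefix, then
# "net <subcommand> <description>". Lines such as
# "Invalid command: net serverid" deliberately do not match.
USAGE_LINE = re.compile(r"^(?:.*\]:)?\s*net\s+(\w+)\s+\S")

def parse_net_subcommands(usage_text):
    """Return the set of subcommand names listed in a `net` usage text."""
    names = set()
    for line in usage_text.splitlines():
        match = USAGE_LINE.match(line)
        if match:
            names.add(match.group(1))
    return names

def has_subcommand(usage_text, name):
    """True if `name` appears as a subcommand in the usage text."""
    return name in parse_net_subcommands(usage_text)

def net_usage_text(net_binary="net"):
    """Run `net help` and return whatever it printed (stdout plus stderr)."""
    result = subprocess.run([net_binary, "help"],
                            capture_output=True, text=True)
    return result.stdout + result.stderr
```

On a node, has_subcommand(net_usage_text(), "serverid") would then say whether the installed net binary lists a serverid subcommand at all.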
The second problem is that while CTDB assigns the proper public IPs to
the interface, Samba doesn't bind to them (when started without CTDB, it
does).
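Whether smbd is actually accepting connections on a given public IP can be probed directly with a plain TCP connect to port 445. This is a minimal sketch in the spirit of the "samba tcp port 445 is not responding" check in the log below, not a copy of what CTDB's event script does; tcp_port_responds and check_addresses are hypothetical names.

```python
import socket

def tcp_port_responds(host, port=445, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_addresses(addresses, port=445, timeout=2.0):
    """Map each address to True/False depending on whether the port answers."""
    return {addr: tcp_port_responds(addr, port, timeout) for addr in addresses}
```

For the setup described here, calling check_addresses(["10.125.136.21", "10.125.136.56", "10.125.136.57"]) on the node would show whether smbd answers on the static address but not on the public ones.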
And the third one is that ctdbd sometimes crashes almost right after
start; the log snippet is below.
I'm not using a recovery lock file at the moment, just to keep things
simple at the beginning and to make sure it can start at all.
Could anyone please take a look and suggest something?
=======
public_addresses:
10.125.136.56/24 eth0
10.125.136.57/24 eth0
=======
nodes:
192.168.10.128
192.168.10.129
=======
ip addr show when ctdb is running:
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:c6:7f:8f brd ff:ff:ff:ff:ff:ff
inet 10.125.136.21/24 brd 10.125.136.255 scope global eth0
inet 10.125.136.56/24 brd 10.125.136.255 scope global secondary eth0
inet 10.125.136.57/24 brd 10.125.136.255 scope global secondary eth0
3: eth1: <BROADCAST,MULTICAST,NOTRAILERS,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:c6:7f:99 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.129/24 brd 192.168.10.255 scope global eth1
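The claim that CTDB has assigned the public IPs can be cross-checked mechanically by comparing the public_addresses file with the ip addr show output above. This is a rough sketch assuming the two text layouts shown in this post; all function names are hypothetical.

```python
import re

def parse_public_addresses(text):
    """Return (cidr, interface) pairs from lines like '10.125.136.56/24 eth0'."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "/" in fields[0]:
            entries.append((fields[0], fields[1]))
    return entries

def parse_ip_addr(text):
    """Map interface name -> set of IPv4 CIDRs found in `ip addr show` output."""
    assigned = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*\d+:\s+([^:@\s]+)", line)  # e.g. "2: eth0: <...>"
        if header:
            current = header.group(1)
            assigned.setdefault(current, set())
            continue
        inet = re.match(r"\s*inet\s+(\S+)", line)  # IPv4 only; skips inet6
        if inet and current is not None:
            assigned[current].add(inet.group(1))
    return assigned

def missing_public_addresses(public_text, ip_addr_text):
    """List (cidr, interface) pairs that are configured but not assigned."""
    assigned = parse_ip_addr(ip_addr_text)
    return [(cidr, iface)
            for cidr, iface in parse_public_addresses(public_text)
            if cidr not in assigned.get(iface, set())]
```

An empty result from missing_public_addresses means every configured public address is present on its interface on this node, which matches the output shown above.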
=======
I have the following in the CTDB log (default ERR verbosity) when it does run:
2010/09/02 17:09:15.229176 [23205]: Recovery lock file set to "".
Disabling recovery lock checking
2010/09/02 17:09:15.382620 [23208]: Starting CTDBD as pid : 23208
2010/09/02 17:09:15.965742 [23208]: Freeze priority 1
2010/09/02 17:09:15.966016 [23208]: Freeze priority 2
2010/09/02 17:09:15.966166 [23208]: Freeze priority 3
2010/09/02 17:09:18.972098 [recoverd:23271]: Trigger takeoverrun
2010/09/02 17:09:18.973317 [23208]: Freeze priority 1
2010/09/02 17:09:18.973908 [23208]: Freeze priority 2
2010/09/02 17:09:18.974167 [23208]: Freeze priority 3
2010/09/02 17:09:19.657790 [23208]: Thawing priority 1
2010/09/02 17:09:19.658025 [23208]: Release freeze handler for prio 1
2010/09/02 17:09:19.658210 [23208]: Thawing priority 2
2010/09/02 17:09:19.658267 [23208]: Release freeze handler for prio 2
2010/09/02 17:09:19.658333 [23208]: Thawing priority 3
2010/09/02 17:09:19.658379 [23208]: Release freeze handler for prio 3
2010/09/02 17:09:21.001663 [recoverd:23271]: Resetting ban count to 0
for all nodes
2010/09/02 17:09:34.104006 [23208]: 2010/09/02 17:09:34.103349
[23534]: Database 'config.tdb' does not exist
2010/09/02 17:09:34.852080 [23208]: Invalid command: net serverid
2010/09/02 17:09:34.853514 [23208]: Usage:
2010/09/02 17:09:34.853633 [23208]: net rpc Run functions
using RPC transport
2010/09/02 17:09:34.853687 [23208]: net rap Run functions
using RAP transport
2010/09/02 17:09:34.853689 [23208]: net ads Run functions
using ADS transport
2010/09/02 17:09:34.853691 [23208]: net file Functions on
remote opened files
2010/09/02 17:09:34.853693 [23208]: net share Functions on shares
2010/09/02 17:09:34.853694 [23208]: net session Manage sessions
2010/09/02 17:09:34.853978 [23208]: net server List servers
in workgroup
2010/09/02 17:09:34.854017 [23208]: net domain List
domains/workgroups on network
2010/09/02 17:09:34.854054 [23208]: net printq Modify printer queue
2010/09/02 17:09:34.854089 [23208]: net user Manage users
2010/09/02 17:09:34.854123 [23208]: net group Manage groups
2010/09/02 17:09:34.854158 [23208]: net groupmap Manage group mappings
2010/09/02 17:09:34.854192 [23208]: net sam Functions on
the SAM database
2010/09/02 17:09:34.854245 [23208]: net validate Validate
username and password
2010/09/02 17:09:34.854280 [23208]: net groupmember Modify group memberships
2010/09/02 17:09:34.854315 [23208]: net admin Execute remote
command on a remote OS/2 server
2010/09/02 17:09:34.854391 [23208]: net service List/modify
running services
2010/09/02 17:09:34.854429 [23208]: net password Change user
password on target server
2010/09/02 17:09:34.854469 [23208]: net changetrustpw Change the
trust password
2010/09/02 17:09:34.854505 [23208]: net changesecretpw Change the
secret password
2010/09/02 17:09:34.854544 [23208]: net setauthuser Set the
winbind auth user
2010/09/02 17:09:34.854623 [23208]: net getauthuser Get the
winbind auth user settings
2010/09/02 17:09:34.854773 [23208]: net time Show/set time
2010/09/02 17:09:34.854807 [23208]: net lookup Look up host
names/IP addresses
2010/09/02 17:09:34.854840 [23208]: net g_lock Manipulate the
global lock table
2010/09/02 17:09:34.854874 [23208]: net join Join a domain/AD
2010/09/02 17:09:34.854908 [23208]: net dom Join/unjoin
(remote) machines to/from a domain/AD
2010/09/02 17:09:34.854943 [23208]: net cache Operate on the
cache tdb file
2010/09/02 17:09:34.855049 [23208]: net getlocalsid Get the SID
for the local domain
2010/09/02 17:09:34.855085 [23208]: net setlocalsid Set the SID
for the local domain
2010/09/02 17:09:34.855131 [23208]: net setdomainsid Set domain SID
on member servers
2010/09/02 17:09:34.855166 [23208]: net getdomainsid Get domain SID
on member servers
2010/09/02 17:09:34.855200 [23208]: net maxrid Display the
maximul RID currently used
2010/09/02 17:09:34.855234 [23208]: net idmap IDmap functions
2010/09/02 17:09:34.855269 [23208]: net status Display server status
2010/09/02 17:09:34.855275 [23208]: net usershare Manage
user-modifiable shares
2010/09/02 17:09:34.855280 [23208]: net usersidlist Display list
of all users with SID
2010/09/02 17:09:34.855284 [23208]: net conf Manage Samba
registry based configuration
2010/09/02 17:09:34.855288 [23208]: net registry Manage the
Samba registry
2010/09/02 17:09:34.855437 [23208]: net eventlog Process Win32
*.evt eventlog files
2010/09/02 17:09:34.855472 [23208]: net help Print usage information
2010/09/02 17:09:34.970420 [23208]: Unable to allocate transport
packet for operation 7 of length 1852731295
2010/09/02 17:09:34.970491 [23208]: Out of memory for c at
server/ctdb_control.c:788
2010/09/02 17:09:34.970529 [23208]: ctdb error: Out of memory at
server/ctdb_control.c:788
2010/09/02 17:09:34.970563 [23208]: server/ctdb_daemon.c:1029 Failed
to send control to remote node 1
2010/09/02 17:09:34.983807 [23208]: Starting SAMBA nmbd :..done
2010/09/02 17:09:35.020839 [recoverd:23271]: Trigger takeoverrun
2010/09/02 17:09:35.104193 [23208]: Unable to allocate transport
packet for operation 7 of length 1919116719
2010/09/02 17:09:35.104373 [23208]: Out of memory for c at
server/ctdb_control.c:788
2010/09/02 17:09:35.104417 [23208]: ctdb error: Out of memory at
server/ctdb_control.c:788
2010/09/02 17:09:35.104452 [23208]: server/ctdb_daemon.c:1029 Failed
to send control to remote node 1
2010/09/02 17:09:35.199770 [23208]: Starting SAMBA smbd :..done
2010/09/02 17:09:37.503404 [recoverd:23271]: Trigger takeoverrun
2010/09/02 17:09:43.321626 [23208]: ERROR: samba tcp port 445 is not responding
2010/09/02 17:09:49.109093 [23208]: ERROR: samba tcp port 445 is not responding
2010/09/02 17:09:59.850374 [23208]: ERROR: samba tcp port 445 is not responding
<the last line keeps coming>
=======
And the following when it crashes:
2010/09/02 16:47:12.633279 [20763]: Recovery lock file set to "".
Disabling recovery lock checking
2010/09/02 16:47:12.770414 [20765]: Starting CTDBD as pid : 20765
2010/09/02 16:47:13.457077 [20765]: Freeze priority 1
2010/09/02 16:47:13.457496 [20765]: Freeze priority 2
2010/09/02 16:47:13.457680 [20765]: Freeze priority 3
2010/09/02 16:47:16.461269 [recoverd:20828]: Trigger takeoverrun
2010/09/02 16:47:16.462253 [20765]: Freeze priority 1
2010/09/02 16:47:16.462483 [20765]: Freeze priority 2
2010/09/02 16:47:16.462609 [20765]: Freeze priority 3
2010/09/02 16:47:17.191078 [20765]: Thawing priority 1
2010/09/02 16:47:17.191259 [20765]: Release freeze handler for prio 1
2010/09/02 16:47:17.191586 [20765]: Thawing priority 2
2010/09/02 16:47:17.191589 [20765]: Release freeze handler for prio 2
2010/09/02 16:47:17.191793 [20765]: Thawing priority 3
2010/09/02 16:47:17.191841 [20765]: Release freeze handler for prio 3
2010/09/02 16:47:18.572036 [recoverd:20828]: Resetting ban count to 0
for all nodes
2010/09/02 16:47:32.563885 [20765]: 2010/09/02 16:47:32.563316
[21109]: Database 'config.tdb' does not exist
2010/09/02 16:47:33.449435 [20765]: Unable to allocate transport
packet for operation 7 of length 3086770620
2010/09/02 16:47:33.449570 [20765]: Out of memory for c at
server/ctdb_control.c:788
2010/09/02 16:47:33.449609 [20765]: ctdb error: Out of memory at
server/ctdb_control.c:788
2010/09/02 16:47:33.449644 [20765]: server/ctdb_daemon.c:1029 Failed
to send control to remote node 1
2010/09/02 16:47:33.554365 [20765]: Starting SAMBA nmbd :..done
2010/09/02 16:47:33.595716 [recoverd:20828]: Trigger takeoverrun
2010/09/02 16:47:33.676324 [20765]:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
2010/09/02 16:47:33.676459 [20765]: INTERNAL ERROR: Signal 11 in ctdbd
pid 20765
2010/09/02 16:47:33.676494 [20765]:
Please read the file BUGS.txt in the distribution
2010/09/02 16:47:33.676526 [20765]:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
2010/09/02 16:47:33.676559 [20765]: PANIC: internal error
2010/09/02 16:47:33.677444 [20765]: BACKTRACE: 19 stack frames:
#0 /usr/sbin/ctdbd [0x8092e87]
#1 /usr/sbin/ctdbd [0x8093147]
#2 /usr/sbin/ctdbd [0x8093267]
#3 /usr/sbin/ctdbd [0x809329b]
#4 [0xffffe420]
#5 /usr/sbin/ctdbd [0x804e3d7]
#6 /usr/sbin/ctdbd [0x804ccfe]
#7 /usr/sbin/ctdbd [0x804cebc]
#8 /usr/sbin/ctdbd [0x808af04]
#9 /usr/sbin/ctdbd [0x808b561]
#10 /usr/sbin/ctdbd [0x80a7c3b]
#11 /usr/sbin/ctdbd [0x80a8236]
#12 /usr/sbin/ctdbd [0x80a4584]
#13 /usr/sbin/ctdbd [0x80a478b]
#14 /usr/sbin/ctdbd [0x80a483d]
#15 /usr/sbin/ctdbd [0x804dbe8]
#16 /usr/sbin/ctdbd [0x804b91f]
#17 /lib/libc.so.6(__libc_start_main+0xdc) [0xb7eb189c]
#18 /usr/sbin/ctdbd [0x804a5f1]
2010/09/02 16:47:33.735159 [recoverd:20828]: recovery daemon parent
died - exiting
=======
cheers,
Alexander