CTDB: Split brain and banning

Michel Buijsman michelb at bit.nl
Thu Nov 1 16:38:19 UTC 2018


On Thu, Nov 01, 2018 at 12:07:36PM +1100, Martin Schwenke wrote:
> > That node doesn't seem to recover either after getting back on the network:
> > It still thinks it's the recovery master and will keep trying for a reclock,
> > getting lock contention, without resetting itself.
> 
> I don't understand this.  :-(
> 
> There are 2 distinct cases that I can see:
> 
> * The node that was recovery master reboots or, at least, ctdbd
>   restarts
> 
>   In this case it doesn't know which node is master so should call an
>   election, so this shouldn't happen.

Yes this is fine, if the node crashes it's ok.

> * The node dropped off the network and reconnected
> 
>   In this case the node has missed the election, so still thinks it is
>   master. However, in this case the run-state should not be "first
>   recovery", so the node should be banned if it can't take the lock.
> 
> Hmmm... is the real problem that the node still thinks it is master
> *and* it still thinks it has the reclock?

This is the one.

There are 2 subtly different cases here:

1. The node has ALL the IPs:

In this case, _nothing_ happens. As in, the node is OK and stays OK,
holding all the IPs and is in a separate reality from the other nodes.
All IPs are up on this node and on the other two as well.

2. The node has some of the IPs:

In this case, it thinks it has the lock and is master, it ignores the
fact that it can't renew the lock:
    "Time out getting recovery lock, allowing recmode set anyway"

Then it starts to take over IPs, which fails due to no network and the 
node gets banned, releasing the IPs.

When the network comes back, this doesn't recover. It's stuck in a loop
where lock contention triggers an instant ban. I have to manually restart 
ctdb on this node to get out of it.

I've attached the relevant logs of case #2, for all 3 servers.

> > Grepping the source, ignoring the reclock when it times out seems to be a 
> > conscious decision. This strikes me as odd since it directly leads to split 
> > brain in this case. I would expect it to fail hard on not getting a lock. 
> > Would it be possible to make this behaviour configurable with a tunable?
> 
> I obviously need to think about that.  :-)
> 
> I wonder if we need a configurable number of timeouts during which the
> lock is retried... and then finally banned.  This relates to David's
> question about whether the helper should block and retry internally -
> that seems like a better solution.  However, then we hit the fixed 5
> second time-out that is allowed for taking the recovery lock.  Perhaps
> that needs to be configurable.  Then David's suggestion could work.

If it were me... ;-)

I'd like to see ctdb not ignore the lock, or at least have this as a 
tuneable option. That should prevent all of the split brain stuff in
our scenario.

Next to that, there's a tight loop where contention leads to a ban and
it doesn't recover. At some point it'd need to look at who else has a 
lock and if something is happening with the other nodes that it might
want to join in on.

-- 
Michel Buijsman
BIT BV | Unix beheer | michelb at bit.nl | 08B90948
-------------- next part --------------
2018/11/01 16:04:12.062246 ctdbd[25978]: dead count reached for node 1
2018/11/01 16:04:12.062275 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: node 2001:7b8:80:3:0:1:9:2:4379 is dead: 1 connected
2018/11/01 16:04:12.062297 ctdbd[25978]: Tearing down connection to dead node :1
2018/11/01 16:04:12.062363 ctdbd[25978]: dead count reached for node 2
2018/11/01 16:04:12.062382 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: node 2001:7b8:80:2:0:1:9:3:4379 is dead: 0 connected
2018/11/01 16:04:12.062367 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:12.062391 ctdbd[25978]: Tearing down connection to dead node :2
2018/11/01 16:04:12.062392 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:12.062399 ctdb-recoverd[26083]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:04:12.062407 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:12.062410 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:12.062415 ctdb-recoverd[26083]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:04:12.062419 ctdb-recoverd[26083]: Async wait failed - fail_count=2
2018/11/01 16:04:12.062423 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:04:12.062428 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:04:12.062441 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:04:12.063180 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:2953 The vnnmap count is different from the number of active lmaster nodes: 3 vs 1
2018/11/01 16:04:12.063198 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:04:12.063203 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:04:12.063207 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 0
2018/11/01 16:04:12.063230 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:04:12.063289 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:04:12.064809 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:04:12.064847 ctdb-recovery[20342]: Set recovery mode to ACTIVE
2018/11/01 16:04:12.064899 ctdbd[25978]: Recovery has started
2018/11/01 16:04:12.079489 ctdb-recovery[20342]: start_recovery event finished
2018/11/01 16:04:12.079534 ctdb-recovery[20342]: updated VNNMAP
2018/11/01 16:04:12.079542 ctdb-recovery[20342]: recover database 0x6645c6c4
2018/11/01 16:04:12.079598 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:04:12.105137 ctdbd[25978]: Thaw db: ctdb.tdb generation 945981855
2018/11/01 16:04:12.105149 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:04:12.105300 ctdb-recovery[20342]: 1 of 1 databases recovered
2018/11/01 16:04:12.105324 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:04:17.106101 ctdbd[25978]: ../ctdb/server/ctdb_recover.c:787Time out getting recovery lock, allowing recmode set anyway
2018/11/01 16:04:17.106244 ctdb-recovery[20342]: Set recovery mode to NORMAL
2018/11/01 16:04:17.106275 ctdbd[25978]: Recovery has finished
2018/11/01 16:04:17.120622 ctdb-recovery[20342]: recovered event finished
2018/11/01 16:04:17.120687 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:04:17.136458 ctdb-recoverd[26083]: Takeover run completed successfully
2018/11/01 16:04:17.136501 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:04:17.136505 ctdb-recoverd[26083]: Resetting ban count to 0 for all nodes
2018/11/01 16:04:17.136508 ctdb-recoverd[26083]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:04:17.136524 ctdb-recoverd[26083]: Disabling recoveries for 10 seconds
2018/11/01 16:04:27.137984 ctdb-recoverd[26083]: Reenabling recoveries after timeout
2018/11/01 16:07:36.127777 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: connected to 2001:7b8:80:3:0:1:9:2:4379 - 1 connected
2018/11/01 16:07:36.127806 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: connected to 2001:7b8:80:2:0:1:9:3:4379 - 2 connected
2018/11/01 16:07:37.205051 ctdb-recoverd[26083]: Election period ended
2018/11/01 16:07:37.418103 ctdb-recoverd[26083]: Node 1 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.418393 ctdb-recoverd[26083]: Remote node 1 had flags 0x0, local had 0x2 - updating local
2018/11/01 16:07:37.418780 ctdb-recoverd[26083]: Node 2 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.419129 ctdb-recoverd[26083]: Remote node 2 had flags 0x0, local had 0x2 - updating local
2018/11/01 16:07:37.420099 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:2913 Remote node:1 has different flags for node 0. It has 0x02 vs our 0x00
2018/11/01 16:07:37.420121 ctdb-recoverd[26083]: Use flags 0x00 from local recmaster node for cluster update of node 0 flags
2018/11/01 16:07:37.420418 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:07:37.420440 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:07:37.420445 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 1
2018/11/01 16:07:37.421205 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:07:37.422406 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:07:37.424301 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:07:37.424599 ctdb-recovery[28668]: Set recovery mode to ACTIVE
2018/11/01 16:07:37.424638 ctdbd[25978]: Recovery has started
2018/11/01 16:07:37.439418 ctdb-recovery[28668]: start_recovery event finished
2018/11/01 16:07:37.440023 ctdb-recovery[28668]: updated VNNMAP
2018/11/01 16:07:37.440035 ctdb-recovery[28668]: recover database 0x6645c6c4
2018/11/01 16:07:37.440309 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:07:37.493856 ctdbd[25978]: Thaw db: ctdb.tdb generation 656678117
2018/11/01 16:07:37.493878 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:37.494377 ctdb-recovery[28668]: 1 of 1 databases recovered
2018/11/01 16:07:37.494403 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:07:37.512007 ctdb-recovery[28668]: failed to set recovery mode NORMAL on node 2, ret=-1
2018/11/01 16:07:37.512037 ctdb-recovery[28668]: database recovery failed, ret=-1
2018/11/01 16:07:38.420689 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:07:38.420712 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:07:38.420716 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 1
2018/11/01 16:07:38.421495 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:07:38.422280 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:07:38.424014 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:07:38.424219 ctdb-recovery[28751]: Set recovery mode to ACTIVE
2018/11/01 16:07:38.424259 ctdbd[25978]: Recovery has started
2018/11/01 16:07:38.438481 ctdb-recovery[28751]: start_recovery event finished
2018/11/01 16:07:38.439001 ctdb-recovery[28751]: updated VNNMAP
2018/11/01 16:07:38.439013 ctdb-recovery[28751]: recover database 0x6645c6c4
2018/11/01 16:07:38.439172 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:07:38.492718 ctdbd[25978]: Thaw db: ctdb.tdb generation 568703777
2018/11/01 16:07:38.492740 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:38.493312 ctdb-recovery[28751]: 1 of 1 databases recovered
2018/11/01 16:07:38.493354 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:07:38.509335 ctdb-recovery[28751]: failed to set recovery mode NORMAL on node 2, ret=-1
2018/11/01 16:07:38.509367 ctdb-recovery[28751]: database recovery failed, ret=-1
2018/11/01 16:07:39.420679 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:07:39.420713 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:07:39.420718 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 1
2018/11/01 16:07:39.421411 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:07:39.422215 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:07:39.424108 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:07:39.424302 ctdb-recovery[28809]: Set recovery mode to ACTIVE
2018/11/01 16:07:39.424338 ctdbd[25978]: Recovery has started
2018/11/01 16:07:39.440617 ctdb-recovery[28809]: start_recovery event finished
2018/11/01 16:07:39.440926 ctdb-recovery[28809]: updated VNNMAP
2018/11/01 16:07:39.440942 ctdb-recovery[28809]: recover database 0x6645c6c4
2018/11/01 16:07:39.441049 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:07:39.496225 ctdbd[25978]: Thaw db: ctdb.tdb generation 2125763535
2018/11/01 16:07:39.496248 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:39.496792 ctdb-recovery[28809]: 1 of 1 databases recovered
2018/11/01 16:07:39.496832 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:07:39.512674 ctdb-recovery[28809]: Set recovery mode to NORMAL
2018/11/01 16:07:39.512727 ctdbd[25978]: Recovery has finished
2018/11/01 16:07:39.527756 ctdb-recovery[28809]: recovered event finished
2018/11/01 16:07:39.527824 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:07:39.529817 ctdbd[25978]: Release of IP 172.31.254.9/24 on interface vlan88  node:1
2018/11/01 16:07:39.529910 ctdbd[25978]: Release of IP 172.17.243.140/24 on interface vlan243  node:2
2018/11/01 16:07:39.529927 ctdbd[25978]: Release of IP 172.17.243.139/24 on interface vlan243  node:1
2018/11/01 16:07:39.529950 ctdbd[25978]: Release of IP 172.17.243.133/24 on interface vlan243  node:1
2018/11/01 16:07:39.529963 ctdbd[25978]: Release of IP 172.17.243.9/24 on interface vlan243  node:2
2018/11/01 16:07:40.300092 ctdb-eventd[25980]: 70.iscsi: Releaseip 172.31.254.9
2018/11/01 16:07:40.300116 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-NDCLE-disk01 is not mapped
2018/11/01 16:07:40.300120 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-NDCLE-disk01 is not mapped
2018/11/01 16:07:40.300123 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED01.
2018/11/01 16:07:40.300126 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-STEFAN-test01 is not mapped
2018/11/01 16:07:40.300143 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED02.
2018/11/01 16:07:40.300146 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED03.
2018/11/01 16:07:40.300149 ctdb-eventd[25980]: 70.iscsi: Deleted Target iqn.2003-01.org.linux-iscsi.iscsi-filer.x8664:sn.4aab30bb6db6.
2018/11/01 16:07:40.322599 ctdb-eventd[25980]: 10.interface: Unable to parse connections (Invalid argument)
2018/11/01 16:07:40.322660 ctdb-eventd[25980]: 10.interface: Failed to kill TCP connections
2018/11/01 16:07:42.062631 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:07:47.063474 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:07:52.064526 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:07:57.065725 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:08:01.532241 ctdb-eventd[25980]: 60.nfs: Releaseip 172.17.243.140
2018/11/01 16:08:01.532261 ctdb-eventd[25980]: 60.nfs: killing PID 28491
2018/11/01 16:08:01.532266 ctdb-eventd[25980]: 60.nfs: .
2018/11/01 16:08:01.532269 ctdb-eventd[25980]: 60.nfs: Cleaned up /mnt/ganesha/172.17.243.140
2018/11/01 16:08:01.532272 ctdb-eventd[25980]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk08 is not mapped
2018/11/01 16:08:01.532275 ctdb-eventd[25980]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk08 is not mapped
2018/11/01 16:08:01.555206 ctdb-eventd[25980]: 10.interface: Unable to parse connections (Invalid argument)
2018/11/01 16:08:01.555224 ctdb-eventd[25980]: 10.interface: Failed to kill TCP connections
2018/11/01 16:08:02.066569 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:08:05.625222 ctdb-eventd[25980]: 60.nfs: Releaseip 172.17.243.139
2018/11/01 16:08:05.625241 ctdb-eventd[25980]: 60.nfs: killing PID 28605
2018/11/01 16:08:05.625246 ctdb-eventd[25980]: 60.nfs: .
2018/11/01 16:08:05.625250 ctdb-eventd[25980]: 60.nfs: Cleaned up /mnt/ganesha/172.17.243.139
2018/11/01 16:08:05.625253 ctdb-eventd[25980]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk09 is not mapped
2018/11/01 16:08:05.625257 ctdb-eventd[25980]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk09 is not mapped
2018/11/01 16:08:05.946512 ctdb-eventd[25980]: 70.iscsi: Releaseip 172.17.243.133
2018/11/01 16:08:05.946537 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk02 is not mapped
2018/11/01 16:08:05.946540 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk02 is not mapped
2018/11/01 16:08:05.946543 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED06.
2018/11/01 16:08:05.946547 ctdb-eventd[25980]: 70.iscsi: Deleted Target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:08:06.388132 ctdb-eventd[25980]: 70.iscsi: Releaseip 172.17.243.9
2018/11/01 16:08:06.388172 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-BITED-MSSQL-DATA-disk01 is not mapped
2018/11/01 16:08:06.388194 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-BITED-MSSQL-DATA-disk01 is not mapped
2018/11/01 16:08:06.388198 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED13.
2018/11/01 16:08:06.388201 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-BITED-MSSQL-WITNESS-disk01 is not mapped
2018/11/01 16:08:06.388204 ctdb-eventd[25980]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-target-BITED-MSSQL-WITNESS-disk01 is not mapped
2018/11/01 16:08:06.388207 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED14.
2018/11/01 16:08:06.388209 ctdb-eventd[25980]: 70.iscsi: Deleted Target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.9.
2018/11/01 16:08:06.403365 ctdb-recoverd[26083]: Takeover run completed successfully
2018/11/01 16:08:06.403395 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:08:06.403399 ctdb-recoverd[26083]: Resetting ban count to 0 for all nodes
2018/11/01 16:08:06.403403 ctdb-recoverd[26083]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:08:06.403406 ctdb-recoverd[26083]: Disabling recoveries for 10 seconds
2018/11/01 16:08:16.405326 ctdb-recoverd[26083]: Reenabling recoveries after timeout
2018/11/01 16:14:26.596345 ctdb-recoverd[26083]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:27.783895 ctdbd[25978]: Takeover of IP 172.17.243.139/24 on interface vlan243
2018/11/01 16:14:27.855213 ctdb-eventd[25980]: 06.nfs: Takeip 172.17.243.139
2018/11/01 16:14:27.855228 ctdb-eventd[25980]: 06.nfs: Mounting /dev/nbd0
2018/11/01 16:14:27.855232 ctdb-eventd[25980]: 06.nfs: Starting Ganesha for 172.17.243.139
2018/11/01 16:14:27.855237 ctdb-eventd[25980]: 06.nfs: done
2018/11/01 16:14:30.613001 ctdb-recoverd[26083]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:35.519522 ctdbd[25978]: Takeover of IP 172.17.243.133/24 on interface vlan243
2018/11/01 16:14:36.362403 ctdb-eventd[25980]: 07.iscsi: Takeip 172.17.243.133
2018/11/01 16:14:36.362425 ctdb-eventd[25980]: 07.iscsi: Created target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:14:36.362429 ctdb-eventd[25980]: 07.iscsi: Created TPG 1.
2018/11/01 16:14:36.362433 ctdb-eventd[25980]: 07.iscsi: Global pref auto_add_default_portal=true
2018/11/01 16:14:36.362436 ctdb-eventd[25980]: 07.iscsi: Created default portal listening on all IPs (0.0.0.0), port 3260.
2018/11/01 16:14:36.362439 ctdb-eventd[25980]: 07.iscsi: Created block storage object BITED06 using /dev/nbd1.
2018/11/01 16:14:36.362443 ctdb-eventd[25980]: 07.iscsi: Created LUN 1.
2018/11/01 16:14:36.362446 ctdb-eventd[25980]: 07.iscsi: Deleted network portal 0.0.0.0:3260
2018/11/01 16:14:36.362449 ctdb-eventd[25980]: 07.iscsi: Using default IP port 3260
2018/11/01 16:14:36.362452 ctdb-eventd[25980]: 07.iscsi: Created network portal 172.17.243.133:3260.
2018/11/01 16:14:36.362456 ctdb-eventd[25980]: 07.iscsi: Created Node ACL for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:14:36.362459 ctdb-eventd[25980]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:14:36.362462 ctdb-eventd[25980]: 07.iscsi: set auth for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:14:36.362465 ctdb-eventd[25980]: 07.iscsi: Parameter authentication is now '1'.
2018/11/01 16:14:36.362469 ctdb-eventd[25980]: 07.iscsi: Parameter userid is now 'iqn.1993-08.org.debian:01:c3a88bdaed0'.
2018/11/01 16:14:36.362472 ctdb-eventd[25980]: 07.iscsi: Parameter password is now 'ethi1At0pei5'.
2018/11/01 16:14:36.362476 ctdb-eventd[25980]: 07.iscsi: done
2018/11/01 16:14:50.614052 ctdb-recoverd[26083]: Reenabling takeover runs after timeout
2018/11/01 16:15:12.284517 ctdbd[25978]: dead count reached for node 1
2018/11/01 16:15:12.284547 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: node 2001:7b8:80:3:0:1:9:2:4379 is dead: 1 connected
2018/11/01 16:15:12.284575 ctdbd[25978]: Tearing down connection to dead node :1
2018/11/01 16:15:12.284650 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.284676 ctdbd[25978]: dead count reached for node 2
2018/11/01 16:15:12.284678 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.284682 ctdbd[25978]: 2001:7b8:80:1:0:1:9:1:4379: node 2001:7b8:80:2:0:1:9:3:4379 is dead: 0 connected
2018/11/01 16:15:12.284685 ctdb-recoverd[26083]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:15:12.284691 ctdbd[25978]: Tearing down connection to dead node :2
2018/11/01 16:15:12.284694 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.284698 ctdb-recoverd[26083]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.284703 ctdb-recoverd[26083]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:15:12.284707 ctdb-recoverd[26083]: Async wait failed - fail_count=2
2018/11/01 16:15:12.284712 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:15:12.284717 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:15:12.284721 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:15:12.286690 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:2953 The vnnmap count is different from the number of active lmaster nodes: 3 vs 1
2018/11/01 16:15:12.286705 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:15:12.286709 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:15:12.286713 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 0
2018/11/01 16:15:12.286804 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:15:12.287002 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:15:12.288852 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:15:12.288876 ctdb-recovery[14560]: Set recovery mode to ACTIVE
2018/11/01 16:15:12.288924 ctdbd[25978]: Recovery has started
2018/11/01 16:15:12.303503 ctdb-recovery[14560]: start_recovery event finished
2018/11/01 16:15:12.303639 ctdb-recovery[14560]: updated VNNMAP
2018/11/01 16:15:12.303651 ctdb-recovery[14560]: recover database 0x6645c6c4
2018/11/01 16:15:12.303801 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:15:12.329511 ctdbd[25978]: Thaw db: ctdb.tdb generation 344363911
2018/11/01 16:15:12.329536 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:15:12.329704 ctdb-recovery[14560]: 1 of 1 databases recovered
2018/11/01 16:15:12.329728 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:15:17.330122 ctdbd[25978]: ../ctdb/server/ctdb_recover.c:787Time out getting recovery lock, allowing recmode set anyway
2018/11/01 16:15:17.330270 ctdb-recovery[14560]: Set recovery mode to NORMAL
2018/11/01 16:15:17.330299 ctdbd[25978]: Recovery has finished
2018/11/01 16:15:17.344646 ctdb-recovery[14560]: recovered event finished
2018/11/01 16:15:17.344709 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:15:17.346457 ctdbd[25978]: Takeover of IP 172.31.254.9/24 on interface vlan88
2018/11/01 16:15:17.346518 ctdbd[25978]: Takeover of IP 172.17.243.140/24 on interface vlan243
2018/11/01 16:15:17.346554 ctdbd[25978]: Takeover of IP 172.17.243.9/24 on interface vlan243
2018/11/01 16:15:19.960814 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:24.961756 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:29.962313 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:34.962991 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:39.963783 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:44.373782 ctdb-takeover[14779]: TAKEOVER_IP 172.31.254.9 failed to node 0, ret=110
2018/11/01 16:15:44.373832 ctdb-takeover[14779]: TAKEOVER_IP 172.17.243.140 failed to node 0, ret=110
2018/11/01 16:15:44.373838 ctdb-takeover[14779]: TAKEOVER_IP 172.17.243.9 failed to node 0, ret=110
2018/11/01 16:15:44.373849 ctdb-takeover[14779]: Assigning banning credits to node 0
2018/11/01 16:15:44.373870 ctdb-takeover[14779]: takeover run failed, ret=110
2018/11/01 16:15:44.373956 ctdb-recoverd[26083]: ctdb_takeover_run() failed
2018/11/01 16:15:44.373970 ctdb-recoverd[26083]: Takeover run unsuccessful
2018/11/01 16:15:44.373979 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:15:44.373997 ctdb-recoverd[26083]: Resetting ban count to 0 for all nodes
2018/11/01 16:15:44.374002 ctdb-recoverd[26083]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:15:44.374006 ctdb-recoverd[26083]: Disabling recoveries for 10 seconds
2018/11/01 16:15:44.374424 ctdb-recoverd[26083]: Assigned IP 172.17.243.140 not on an interface
2018/11/01 16:15:44.374445 ctdb-recoverd[26083]: Assigned IP 172.17.243.9 not on an interface
2018/11/01 16:15:44.374454 ctdb-recoverd[26083]: Assigned IP 172.31.254.9 not on an interface
2018/11/01 16:15:44.374457 ctdb-recoverd[26083]: Trigger takeoverrun
2018/11/01 16:15:44.374491 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:15:44.376513 ctdbd[25978]: Takeover of IP 172.31.254.9/24 rejected update for this IP already in flight
2018/11/01 16:15:44.376532 ctdbd[25978]: Takeover of IP 172.17.243.140/24 rejected update for this IP already in flight
2018/11/01 16:15:44.376549 ctdbd[25978]: Takeover of IP 172.17.243.9/24 rejected update for this IP already in flight
2018/11/01 16:15:44.376600 ctdb-takeover[15539]: TAKEOVER_IP 172.31.254.9 failed on node 0, ret=-1
2018/11/01 16:15:44.376625 ctdb-takeover[15539]: TAKEOVER_IP 172.17.243.140 failed on node 0, ret=-1
2018/11/01 16:15:44.376653 ctdb-takeover[15539]: TAKEOVER_IP 172.17.243.9 failed on node 0, ret=-1
2018/11/01 16:15:44.376657 ctdb-takeover[15539]: Assigning banning credits to node 0
2018/11/01 16:15:44.376669 ctdb-takeover[15539]: takeover run failed, ret=-1
2018/11/01 16:15:44.376688 ctdb-recoverd[26083]: ctdb_takeover_run() failed
2018/11/01 16:15:44.376695 ctdb-recoverd[26083]: Takeover run unsuccessful
2018/11/01 16:15:44.964750 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:15:45.376716 ctdb-recoverd[26083]: Node 0 reached 6 banning credits - banning it for 10 seconds
2018/11/01 16:15:45.376737 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:15:45.376814 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:15:45.376824 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:15:47.349107 ctdb-eventd[25980]: 07.iscsi: Takeip 172.31.254.9
2018/11/01 16:15:47.349140 ctdb-eventd[25980]: 07.iscsi: Created target iqn.2003-01.org.linux-iscsi.iscsi-filer.x8664:sn.4aab30bb6db6.
2018/11/01 16:15:47.349144 ctdb-eventd[25980]: 07.iscsi: Created TPG 1.
2018/11/01 16:15:47.349148 ctdb-eventd[25980]: 07.iscsi: Default portal not created, TPGs within a target cannot share ip:port.
2018/11/01 16:15:47.349919 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:15:47.349945 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:15:47.349967 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:15:47.362704 ctdbd[25978]: pnn 0 Invalid reqid 82573 in ctdb_reply_control
2018/11/01 16:15:47.364157 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:15:47.369670 ctdb-eventd[25980]: 10.interface: Unable to parse connections (Invalid argument)
2018/11/01 16:15:47.369685 ctdb-eventd[25980]: 10.interface: Failed to kill TCP connections
2018/11/01 16:15:47.382779 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:15:47.382813 ctdbd[25978]: pnn 0 Invalid reqid 82574 in ctdb_reply_control
2018/11/01 16:15:47.398668 ctdbd[25978]: pnn 0 Invalid reqid 82577 in ctdb_reply_control
2018/11/01 16:15:48.484239 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:15:48.514801 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:15:49.616347 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:15:49.630760 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:15:54.383928 ctdb-recoverd[26083]: Reenabling recoveries after timeout
2018/11/01 16:15:55.377155 ctdbd[25978]: Banning timed out, but not all databases frozen yet - banning this node again.
2018/11/01 16:15:55.377185 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:16:05.378139 ctdbd[25978]: Banning timed out, but not all databases frozen yet - banning this node again.
2018/11/01 16:16:05.378170 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:16:15.378657 ctdbd[25978]: Banning timed out, but not all databases frozen yet - banning this node again.
2018/11/01 16:16:15.378713 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:16:15.378717 ctdb-eventd[25980]: 60.nfs: Releaseip 172.17.243.139
2018/11/01 16:16:15.378732 ctdb-eventd[25980]: 60.nfs: killing PID 12760
2018/11/01 16:16:15.378737 ctdb-eventd[25980]: 60.nfs: .
2018/11/01 16:16:15.378853 ctdbd[25978]: Ignoring hung script for releaseip event
2018/11/01 16:16:15.394683 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:678 control timed out. reqid:46869 opcode:109 dstnode:0
2018/11/01 16:16:15.394702 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:791 ctdb_control_recv failed
2018/11/01 16:16:15.394708 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:2012 ctdb_control for set ban state failed
2018/11/01 16:16:15.394724 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:296 Failed to ban node 0
2018/11/01 16:16:15.394728 ctdb-recoverd[26083]: This node was banned, restart main_loop
2018/11/01 16:16:15.395416 ctdb-recoverd[26083]: Node is stopped or banned but recovery mode is not active. Activate recovery mode and lock databases
2018/11/01 16:16:15.395437 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:16:15.395502 ctdbd[25978]: Freeze all
2018/11/01 16:16:15.395508 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:16:15.689374 ctdb-eventd[25980]: 70.iscsi: Releaseip 172.17.243.133
2018/11/01 16:16:15.689398 ctdb-eventd[25980]: 70.iscsi: Deleted storage object BITED06.
2018/11/01 16:16:15.689402 ctdb-eventd[25980]: 70.iscsi: Deleted Target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:16:15.689525 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 2 public IPs
2018/11/01 16:16:15.689592 ctdb-recoverd[26083]: ../ctdb/server/ctdb_client.c:623 reqid 46869 not found
2018/11/01 16:16:25.379586 ctdbd[25978]: Banning timed out
2018/11/01 16:16:25.415174 ctdb-recoverd[26083]: Node:0 was in recovery mode. Start recovery process
2018/11/01 16:16:25.415211 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:16:25.415216 ctdb-recoverd[26083]: Already holding recovery lock
2018/11/01 16:16:25.415219 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 0
2018/11/01 16:16:25.415236 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:16:25.415290 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:16:25.416745 ctdb-recovery[16774]: Set recovery mode to ACTIVE
2018/11/01 16:16:25.416779 ctdbd[25978]: Recovery has started
2018/11/01 16:16:25.431076 ctdb-recovery[16774]: start_recovery event finished
2018/11/01 16:16:25.431224 ctdb-recovery[16774]: updated VNNMAP
2018/11/01 16:16:25.431248 ctdb-recovery[16774]: recover database 0x6645c6c4
2018/11/01 16:16:25.431614 ctdbd[25978]: Freeze db: ctdb.tdb frozen
2018/11/01 16:16:25.456249 ctdbd[25978]: Thaw db: ctdb.tdb generation 1564370322
2018/11/01 16:16:25.456259 ctdbd[25978]: Release freeze handle for db ctdb.tdb
2018/11/01 16:16:25.456412 ctdb-recovery[16774]: 1 of 1 databases recovered
2018/11/01 16:16:25.456447 ctdbd[25978]: Recovery mode set to NORMAL
2018/11/01 16:16:25.456456 ctdbd[25978]: Thawing all
2018/11/01 16:16:25.456460 ctdbd[25978]: Release freeze handle
2018/11/01 16:16:30.457479 ctdbd[25978]: ../ctdb/server/ctdb_recover.c:787Time out getting recovery lock, allowing recmode set anyway
2018/11/01 16:16:30.457619 ctdb-recovery[16774]: Set recovery mode to NORMAL
2018/11/01 16:16:30.457649 ctdbd[25978]: Recovery has finished
2018/11/01 16:16:30.472121 ctdb-recovery[16774]: recovered event finished
2018/11/01 16:16:30.472186 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:16:30.473831 ctdbd[25978]: Takeover of IP 172.31.254.9/24 on interface vlan88
2018/11/01 16:16:30.473879 ctdbd[25978]: Takeover of IP 172.17.243.140/24 on interface vlan243
2018/11/01 16:16:30.473893 ctdbd[25978]: Takeover of IP 172.17.243.139/24 on interface vlan243
2018/11/01 16:16:30.473918 ctdbd[25978]: Takeover of IP 172.17.243.133/24 on interface vlan243
2018/11/01 16:16:30.473929 ctdbd[25978]: Takeover of IP 172.17.243.9/24 on interface vlan243
2018/11/01 16:16:34.971260 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:16:39.972580 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:16:44.973146 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:16:49.973984 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:16:54.975073 ctdbd[25978]: Monitoring event was cancelled
2018/11/01 16:16:57.501013 ctdb-takeover[16972]: TAKEOVER_IP 172.31.254.9 failed to node 0, ret=110
2018/11/01 16:16:57.560270 ctdb-takeover[16972]: TAKEOVER_IP 172.17.243.140 failed to node 0, ret=110
2018/11/01 16:16:57.560304 ctdb-takeover[16972]: TAKEOVER_IP 172.17.243.139 failed to node 0, ret=110
2018/11/01 16:16:57.560310 ctdb-takeover[16972]: TAKEOVER_IP 172.17.243.133 failed to node 0, ret=110
2018/11/01 16:16:57.560315 ctdb-takeover[16972]: TAKEOVER_IP 172.17.243.9 failed to node 0, ret=110
2018/11/01 16:16:57.560319 ctdb-takeover[16972]: Assigning banning credits to node 0
2018/11/01 16:16:57.560343 ctdb-takeover[16972]: takeover run failed, ret=110
2018/11/01 16:16:57.560460 ctdb-recoverd[26083]: ctdb_takeover_run() failed
2018/11/01 16:16:57.560479 ctdb-recoverd[26083]: Takeover run unsuccessful
2018/11/01 16:16:57.560488 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:16:57.560492 ctdb-recoverd[26083]: Resetting ban count to 0 for all nodes
2018/11/01 16:16:57.560497 ctdb-recoverd[26083]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:16:57.560501 ctdb-recoverd[26083]: Disabling recoveries for 10 seconds
2018/11/01 16:16:57.561038 ctdb-recoverd[26083]: Assigned IP 172.17.243.140 not on an interface
2018/11/01 16:16:57.561060 ctdb-recoverd[26083]: Assigned IP 172.17.243.139 not on an interface
2018/11/01 16:16:57.561067 ctdb-recoverd[26083]: Assigned IP 172.17.243.9 not on an interface
2018/11/01 16:16:57.561073 ctdb-recoverd[26083]: Assigned IP 172.17.243.133 not on an interface
2018/11/01 16:16:57.561079 ctdb-recoverd[26083]: Assigned IP 172.31.254.9 not on an interface
2018/11/01 16:16:57.561083 ctdb-recoverd[26083]: Trigger takeoverrun
2018/11/01 16:16:57.561120 ctdb-recoverd[26083]: Takeover run starting
2018/11/01 16:16:57.562713 ctdbd[25978]: Takeover of IP 172.31.254.9/24 rejected update for this IP already in flight
2018/11/01 16:16:57.562745 ctdbd[25978]: Takeover of IP 172.17.243.140/24 rejected update for this IP already in flight
2018/11/01 16:16:57.562765 ctdb-takeover[17770]: TAKEOVER_IP 172.31.254.9 failed on node 0, ret=-1
2018/11/01 16:16:57.562799 ctdbd[25978]: Takeover of IP 172.17.243.139/24 rejected update for this IP already in flight
2018/11/01 16:16:57.562823 ctdbd[25978]: Takeover of IP 172.17.243.133/24 rejected update for this IP already in flight
2018/11/01 16:16:57.562838 ctdb-takeover[17770]: TAKEOVER_IP 172.17.243.140 failed on node 0, ret=-1
2018/11/01 16:16:57.562854 ctdbd[25978]: Takeover of IP 172.17.243.9/24 rejected update for this IP already in flight
2018/11/01 16:16:57.562869 ctdb-takeover[17770]: TAKEOVER_IP 172.17.243.139 failed on node 0, ret=-1
2018/11/01 16:16:57.562884 ctdb-takeover[17770]: TAKEOVER_IP 172.17.243.133 failed on node 0, ret=-1
2018/11/01 16:16:57.562891 ctdb-takeover[17770]: TAKEOVER_IP 172.17.243.9 failed on node 0, ret=-1
2018/11/01 16:16:57.562895 ctdb-takeover[17770]: Assigning banning credits to node 0
2018/11/01 16:16:57.562917 ctdb-takeover[17770]: takeover run failed, ret=-1
2018/11/01 16:16:57.562932 ctdb-recoverd[26083]: ctdb_takeover_run() failed
2018/11/01 16:16:57.562938 ctdb-recoverd[26083]: Takeover run unsuccessful
2018/11/01 16:16:58.562885 ctdb-recoverd[26083]: Node 0 reached 6 banning credits - banning it for 10 seconds
2018/11/01 16:16:58.562922 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:16:58.562972 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:16:58.562984 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:16:58.563015 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 0 public IPs
2018/11/01 16:16:58.563030 ctdb-recoverd[26083]: This node was banned, restart main_loop
2018/11/01 16:16:59.564991 ctdb-recoverd[26083]: Node is stopped or banned but recovery mode is not active. Activate recovery mode and lock databases
2018/11/01 16:16:59.565072 ctdbd[25978]: Recovery mode set to ACTIVE
2018/11/01 16:16:59.565100 ctdbd[25978]: Freeze all
2018/11/01 16:16:59.565106 ctdbd[25978]: Freeze db: ctdb.tdb
2018/11/01 16:17:00.477875 ctdb-eventd[25980]: 07.iscsi: Takeip 172.31.254.9
2018/11/01 16:17:00.477906 ctdb-eventd[25980]: 07.iscsi: This Target already exists in configFS
2018/11/01 16:17:00.478946 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:17:00.480323 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:00.530771 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:00.530805 ctdbd[25978]: pnn 0 Invalid reqid 83559 in ctdb_reply_control
2018/11/01 16:17:00.530821 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:17:00.530830 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:17:00.530836 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:17:00.530843 ctdbd[25978]: Ignoring hung script for takeip event
2018/11/01 16:17:00.532244 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:00.546807 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:00.546857 ctdbd[25978]: pnn 0 Invalid reqid 83560 in ctdb_reply_control
2018/11/01 16:17:00.548275 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:00.562804 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:00.562840 ctdbd[25978]: pnn 0 Invalid reqid 83561 in ctdb_reply_control
2018/11/01 16:17:00.564249 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:00.578710 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:00.578744 ctdbd[25978]: pnn 0 Invalid reqid 83562 in ctdb_reply_control
2018/11/01 16:17:00.580178 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:00.594817 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:00.594865 ctdbd[25978]: pnn 0 Invalid reqid 83563 in ctdb_reply_control
2018/11/01 16:17:01.632374 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:01.646746 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:01.649100 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:01.678809 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:01.680247 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:01.694776 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:01.696200 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:01.714805 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:01.716252 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:01.730736 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:02.748471 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:03.186737 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:03.188347 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:03.202706 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:03.203846 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:03.218733 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:03.219903 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:03.234676 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:03.235935 ctdbd[25978]: ctdb_sys_send_arp: Interface '__none__' not found
2018/11/01 16:17:03.250653 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:378 sending of arp failed on iface '__none__' (No such device)
2018/11/01 16:17:04.227602 ctdb-eventd[25980]: Bad talloc magic value - unknown value
2018/11/01 16:17:05.593699 ctdbd[25978]: Lock contention during renew: -16
2018/11/01 16:17:05.594918 ctdbd[25978]: /usr/local/bin/ctdb_mutex_ceph_rados_helper: Failed to drop lock on RADOS object 'ctdb_sgw_recovery_lock' - (No such file or directory)
2018/11/01 16:17:05.597389 ctdb-recoverd[26083]: Recovery lock helper terminated unexpectedly - trying to retake recovery lock
2018/11/01 16:17:05.612360 ctdb-recoverd[26083]: Unable to take recovery lock - contention
2018/11/01 16:17:05.612414 ctdb-recoverd[26083]: Failed to take recovery lock
2018/11/01 16:17:07.562144 ctdb-recoverd[26083]: Reenabling recoveries after timeout
2018/11/01 16:17:08.563283 ctdbd[25978]: Banning timed out
2018/11/01 16:17:08.578468 ctdb-recoverd[26083]: Node:0 was in recovery mode. Start recovery process
2018/11/01 16:17:08.578504 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:17:08.578509 ctdb-recoverd[26083]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:17:08.594772 ctdb-recoverd[26083]: Unable to take recovery lock - contention
2018/11/01 16:17:08.594832 ctdb-recoverd[26083]: Unable to take recovery lock
2018/11/01 16:17:08.594837 ctdb-recoverd[26083]: Abort recovery, ban this node for 10 seconds
2018/11/01 16:17:08.594840 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:17:08.594888 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:17:08.594914 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:17:08.594962 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 0 public IPs
2018/11/01 16:17:09.580304 ctdbd[25978]: Freeze all: frozen
2018/11/01 16:17:18.594975 ctdbd[25978]: Banning timed out
2018/11/01 16:17:18.597619 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:17:18.597632 ctdb-recoverd[26083]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:17:18.623397 ctdb-recoverd[26083]: Unable to take recovery lock - contention
2018/11/01 16:17:18.623527 ctdb-recoverd[26083]: Unable to take recovery lock
2018/11/01 16:17:18.623532 ctdb-recoverd[26083]: Abort recovery, ban this node for 10 seconds
2018/11/01 16:17:18.623556 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:17:18.628327 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:17:18.628340 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:17:18.628363 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 0 public IPs
2018/11/01 16:17:19.599321 ctdbd[25978]: Freeze all: frozen
2018/11/01 16:17:28.628486 ctdbd[25978]: Banning timed out
2018/11/01 16:17:29.618587 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:17:29.618634 ctdb-recoverd[26083]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:17:29.635073 ctdb-recoverd[26083]: Unable to take recovery lock - contention
2018/11/01 16:17:29.635144 ctdb-recoverd[26083]: Unable to take recovery lock
2018/11/01 16:17:29.635149 ctdb-recoverd[26083]: Abort recovery, ban this node for 10 seconds
2018/11/01 16:17:29.635165 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:17:29.635225 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:17:29.635251 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:17:29.635298 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 0 public IPs
2018/11/01 16:17:30.620056 ctdbd[25978]: Freeze all: frozen
2018/11/01 16:17:39.635271 ctdbd[25978]: Banning timed out
2018/11/01 16:17:40.637234 ctdb-recoverd[26083]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:17:40.637260 ctdb-recoverd[26083]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:17:40.654049 ctdb-recoverd[26083]: Unable to take recovery lock - contention
2018/11/01 16:17:40.654107 ctdb-recoverd[26083]: Unable to take recovery lock
2018/11/01 16:17:40.654112 ctdb-recoverd[26083]: Abort recovery, ban this node for 10 seconds
2018/11/01 16:17:40.654116 ctdb-recoverd[26083]: Banning node 0 for 10 seconds
2018/11/01 16:17:40.654197 ctdbd[25978]: Banning this node for 10 seconds
2018/11/01 16:17:40.654208 ctdbd[25978]: This node has been banned - releasing all public IPs and setting the generation to INVALID.
2018/11/01 16:17:40.654243 ctdbd[25978]: ../ctdb/server/ctdb_takeover.c:1638 Released 0 public IPs
2018/11/01 16:17:41.639632 ctdbd[25978]: Freeze all: frozen
2018/11/01 16:17:50.654852 ctdbd[25978]: Banning timed out
...
-------------- next part --------------
2018/11/01 16:04:10.111228 ctdbd[22358]: Recovery mode set to ACTIVE
2018/11/01 16:04:10.612554 ctdbd[22358]: This node (1) is now the recovery master
2018/11/01 16:04:11.900176 ctdbd[22358]: dead count reached for node 0
2018/11/01 16:04:11.900204 ctdbd[22358]: 2001:7b8:80:3:0:1:9:2:4379: node 2001:7b8:80:1:0:1:9:1:4379 is dead: 1 connected
2018/11/01 16:04:11.900235 ctdbd[22358]: Tearing down connection to dead node :0
2018/11/01 16:04:11.900271 ctdb-recoverd[22505]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:11.900285 ctdb-recoverd[22505]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:11.900291 ctdb-recoverd[22505]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:04:11.900295 ctdb-recoverd[22505]: Async wait failed - fail_count=1
2018/11/01 16:04:11.900299 ctdb-recoverd[22505]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:04:11.900304 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:04:11.900308 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:04:13.111978 ctdb-recoverd[22505]: Election period ended
2018/11/01 16:04:13.905092 ctdb-recoverd[22505]: Node:1 was in recovery mode. Start recovery process
2018/11/01 16:04:13.905269 ctdb-recoverd[22505]: Node:2 was in recovery mode. Start recovery process
2018/11/01 16:04:13.905278 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:04:13.905283 ctdb-recoverd[22505]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081
-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:04:13.921337 ctdb-recoverd[22505]: Recovery lock taken successfully
2018/11/01 16:04:13.921372 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 0
2018/11/01 16:04:13.922186 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:04:13.922817 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:04:13.922830 ctdb-recoverd[22505]: Set recovery_helper to "/usr/lib/x86_64-linux-gnu/ctdb/ctdb_recovery_helper"
2018/11/01 16:04:13.924572 ctdb-recovery[14479]: Set recovery mode to ACTIVE
2018/11/01 16:04:13.924627 ctdbd[22358]: Recovery has started
2018/11/01 16:04:13.938511 ctdb-recovery[14479]: start_recovery event finished
2018/11/01 16:04:13.938929 ctdb-recovery[14479]: updated VNNMAP
2018/11/01 16:04:13.938941 ctdb-recovery[14479]: recover database 0x6645c6c4
2018/11/01 16:04:13.939111 ctdbd[22358]: Freeze db: ctdb.tdb
2018/11/01 16:04:14.746854 ctdbd[22358]: Thaw db: ctdb.tdb generation 711836444
2018/11/01 16:04:14.746878 ctdbd[22358]: Release freeze handle for db ctdb.tdb
2018/11/01 16:04:14.747456 ctdb-recovery[14479]: 1 of 1 databases recovered
2018/11/01 16:04:14.747514 ctdbd[22358]: Recovery mode set to NORMAL
2018/11/01 16:04:14.763334 ctdb-recovery[14479]: Set recovery mode to NORMAL
2018/11/01 16:04:14.763410 ctdbd[22358]: Recovery has finished
2018/11/01 16:04:14.784267 ctdb-recovery[14479]: recovered event finished
2018/11/01 16:04:14.784319 ctdb-recoverd[22505]: Takeover run starting
2018/11/01 16:04:14.784341 ctdb-recoverd[22505]: Set takeover_helper to "/usr/lib/x86_64-linux-gnu/ctdb/ctdb_takeover_helper"
2018/11/01 16:04:14.786338 ctdbd[22358]: Takeover of IP 172.31.254.9/24 on interface vlan88
2018/11/01 16:04:14.786400 ctdbd[22358]: Takeover of IP 172.17.243.139/24 on interface vlan243
2018/11/01 16:04:14.786425 ctdbd[22358]: Takeover of IP 172.17.243.133/24 on interface vlan243
2018/11/01 16:04:16.472695 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:04:21.472985 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:04:26.473429 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:04:30.869728 ctdb-eventd[22360]: 07.iscsi: Takeip 172.31.254.9
2018/11/01 16:04:30.869751 ctdb-eventd[22360]: 07.iscsi: Created target iqn.2003-01.org.linux-iscsi.iscsi-filer.x8664:sn.4aab30bb6db6.
2018/11/01 16:04:30.869755 ctdb-eventd[22360]: 07.iscsi: Created TPG 1.
2018/11/01 16:04:30.869759 ctdb-eventd[22360]: 07.iscsi: Global pref auto_add_default_portal=true
2018/11/01 16:04:30.869762 ctdb-eventd[22360]: 07.iscsi: Created default portal listening on all IPs (0.0.0.0), port 3260.
2018/11/01 16:04:30.869765 ctdb-eventd[22360]: 07.iscsi: Created block storage object BITED01 using /dev/nbd1.
2018/11/01 16:04:30.869768 ctdb-eventd[22360]: 07.iscsi: Created LUN 1.
2018/11/01 16:04:30.869771 ctdb-eventd[22360]: 07.iscsi: Created block storage object BITED02 using /dev/nbd2.
2018/11/01 16:04:30.869774 ctdb-eventd[22360]: 07.iscsi: Created LUN 2.
2018/11/01 16:04:30.869777 ctdb-eventd[22360]: 07.iscsi: Created block storage object BITED03 using /dev/nbd3.
2018/11/01 16:04:30.869780 ctdb-eventd[22360]: 07.iscsi: Created LUN 3.
2018/11/01 16:04:30.869783 ctdb-eventd[22360]: 07.iscsi: Deleted network portal 0.0.0.0:3260
2018/11/01 16:04:30.869786 ctdb-eventd[22360]: 07.iscsi: Using default IP port 3260
2018/11/01 16:04:30.869789 ctdb-eventd[22360]: 07.iscsi: Created network portal 172.31.254.9:3260.
2018/11/01 16:04:30.869792 ctdb-eventd[22360]: 07.iscsi: Created Node ACL for iqn.1998-01.com.vmware:ndc-esx8-1cc6dd78
2018/11/01 16:04:30.869795 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 3.
2018/11/01 16:04:30.869798 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 2.
2018/11/01 16:04:30.869801 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:30.869804 ctdb-eventd[22360]: 07.iscsi: no auth for iqn.1998-01.com.vmware:ndc-esx8-1cc6dd78
2018/11/01 16:04:30.869807 ctdb-eventd[22360]: 07.iscsi: Created Node ACL for iqn.1998-01.com.vmware:ndc-esx9-2a09a6ff
2018/11/01 16:04:30.869810 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 3.
2018/11/01 16:04:30.869813 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 2.
2018/11/01 16:04:30.869816 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:30.869819 ctdb-eventd[22360]: 07.iscsi: no auth for iqn.1998-01.com.vmware:ndc-esx9-2a09a6ff
2018/11/01 16:04:30.869823 ctdb-eventd[22360]: 07.iscsi: Created Node ACL for iqn.1993-08.org.debian:01:e1fe6fd2c243
2018/11/01 16:04:30.869826 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 3.
2018/11/01 16:04:30.869829 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 2.
2018/11/01 16:04:30.869832 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:30.869835 ctdb-eventd[22360]: 07.iscsi: no auth for iqn.1993-08.org.debian:01:e1fe6fd2c243
2018/11/01 16:04:30.869838 ctdb-eventd[22360]: 07.iscsi: done
2018/11/01 16:04:31.462132 ctdb-eventd[22360]: 06.nfs: Takeip 172.17.243.139
2018/11/01 16:04:31.462149 ctdb-eventd[22360]: 06.nfs: Mounting /dev/nbd6
2018/11/01 16:04:31.462154 ctdb-eventd[22360]: 06.nfs: Starting Ganesha for 172.17.243.139
2018/11/01 16:04:31.462158 ctdb-eventd[22360]: 06.nfs: done
2018/11/01 16:04:31.473716 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:04:33.015703 ctdb-eventd[22360]: 07.iscsi: Takeip 172.17.243.133
2018/11/01 16:04:33.015728 ctdb-eventd[22360]: 07.iscsi: Created target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:04:33.015732 ctdb-eventd[22360]: 07.iscsi: Created TPG 1.
2018/11/01 16:04:33.015736 ctdb-eventd[22360]: 07.iscsi: Default portal not created, TPGs within a target cannot share ip:port.
2018/11/01 16:04:33.015739 ctdb-eventd[22360]: 07.iscsi: Created block storage object BITED06 using /dev/nbd7.
2018/11/01 16:04:33.015742 ctdb-eventd[22360]: 07.iscsi: Created LUN 1.
2018/11/01 16:04:33.015745 ctdb-eventd[22360]: 07.iscsi: No such NetworkPortal in configfs: /sys/kernel/config/target/iscsi/iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133/tpgt_1/np/0.0.0.0:3260
2018/11/01 16:04:33.015748 ctdb-eventd[22360]: 07.iscsi: Using default IP port 3260
2018/11/01 16:04:33.015751 ctdb-eventd[22360]: 07.iscsi: Created network portal 172.17.243.133:3260.
2018/11/01 16:04:33.015762 ctdb-eventd[22360]: 07.iscsi: Created Node ACL for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:04:33.015765 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:33.015768 ctdb-eventd[22360]: 07.iscsi: set auth for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:04:33.015771 ctdb-eventd[22360]: 07.iscsi: Parameter authentication is now '1'.
2018/11/01 16:04:33.015774 ctdb-eventd[22360]: 07.iscsi: Parameter userid is now 'iqn.1993-08.org.debian:01:c3a88bdaed0'.
2018/11/01 16:04:33.015777 ctdb-eventd[22360]: 07.iscsi: Parameter password is now 'ethi1At0pei5'.
2018/11/01 16:04:33.015781 ctdb-eventd[22360]: 07.iscsi: done
2018/11/01 16:04:33.056531 ctdb-recoverd[22505]: Takeover run completed successfully
2018/11/01 16:04:33.056568 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:04:33.056573 ctdb-recoverd[22505]: Resetting ban count to 0 for all nodes
2018/11/01 16:04:33.056577 ctdb-recoverd[22505]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:04:33.056582 ctdb-recoverd[22505]: Disabling recoveries for 10 seconds
2018/11/01 16:04:43.058072 ctdb-recoverd[22505]: Reenabling recoveries after timeout
2018/11/01 16:07:34.202931 ctdbd[22358]: pnn 1 Invalid reqid 41848 in ctdb_reply_control
2018/11/01 16:07:34.203119 ctdbd[22358]: pnn 1 Invalid reqid 41857 in ctdb_reply_control
2018/11/01 16:07:34.705088 ctdb-recoverd[22505]: Releasing recovery lock
2018/11/01 16:07:34.705200 ctdbd[22358]: Remote node (0) is now the recovery master
2018/11/01 16:07:35.967928 ctdbd[22358]: 2001:7b8:80:3:0:1:9:2:4379: connected to 2001:7b8:80:1:0:1:9:1:4379 - 2 connected
2018/11/01 16:07:37.418395 ctdb-recoverd[22505]: Node 1 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.420492 ctdb-recoverd[22505]: Node 0 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.424399 ctdbd[22358]: Recovery mode set to ACTIVE
2018/11/01 16:07:37.424777 ctdbd[22358]: Recovery has started
2018/11/01 16:07:37.440396 ctdbd[22358]: Freeze db: ctdb.tdb
2018/11/01 16:07:37.493991 ctdbd[22358]: Thaw db: ctdb.tdb generation 656678117
2018/11/01 16:07:37.494006 ctdbd[22358]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:37.494821 ctdbd[22358]: Recovery mode set to NORMAL
2018/11/01 16:07:37.705969 ctdb-recoverd[22505]: Election period ended
2018/11/01 16:07:38.424090 ctdbd[22358]: Recovery mode set to ACTIVE
2018/11/01 16:07:38.424327 ctdbd[22358]: Recovery has started
2018/11/01 16:07:38.439377 ctdbd[22358]: Freeze db: ctdb.tdb
2018/11/01 16:07:38.492755 ctdbd[22358]: Thaw db: ctdb.tdb generation 568703777
2018/11/01 16:07:38.492767 ctdbd[22358]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:38.493674 ctdbd[22358]: Recovery mode set to NORMAL
2018/11/01 16:07:39.424196 ctdbd[22358]: Recovery mode set to ACTIVE
2018/11/01 16:07:39.424406 ctdbd[22358]: Recovery has started
2018/11/01 16:07:39.441275 ctdbd[22358]: Freeze db: ctdb.tdb
2018/11/01 16:07:39.496260 ctdbd[22358]: Thaw db: ctdb.tdb generation 2125763535
2018/11/01 16:07:39.496274 ctdbd[22358]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:39.497167 ctdbd[22358]: Recovery mode set to NORMAL
2018/11/01 16:07:39.512870 ctdbd[22358]: Recovery has finished
2018/11/01 16:07:39.527927 ctdb-recoverd[22505]: Disabling takeover runs for 60 seconds
2018/11/01 16:08:06.403607 ctdb-recoverd[22505]: Reenabling takeover runs
2018/11/01 16:14:26.596596 ctdb-recoverd[22505]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:26.596618 ctdbd[22358]: Release of IP 172.17.243.139/24 on interface vlan243  node:0
2018/11/01 16:14:26.615983 ctdb-eventd[22360]: 10.interface: Unable to parse connections (Invalid argument)
2018/11/01 16:14:26.616001 ctdb-eventd[22360]: 10.interface: Failed to kill TCP connections
2018/11/01 16:14:27.779954 ctdb-eventd[22360]: 60.nfs: Releaseip 172.17.243.139
2018/11/01 16:14:27.779975 ctdb-eventd[22360]: 60.nfs: killing PID 15322
2018/11/01 16:14:27.779986 ctdb-eventd[22360]: 60.nfs: .
2018/11/01 16:14:27.779989 ctdb-eventd[22360]: 60.nfs: Cleaned up /mnt/ganesha/172.17.243.139
2018/11/01 16:14:27.779992 ctdb-eventd[22360]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk09 is not mapped
2018/11/01 16:14:27.779995 ctdb-eventd[22360]: 60.nfs: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk09 is not mapped
2018/11/01 16:14:30.613202 ctdb-recoverd[22505]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:30.613305 ctdbd[22358]: Release of IP 172.17.243.133/24 on interface vlan243  node:0
2018/11/01 16:14:35.175915 ctdb-eventd[22360]: 10.interface: Killed 1/1 TCP connections to released IP 172.17.243.133
2018/11/01 16:14:35.328906 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:14:35.518887 ctdb-eventd[22360]: 70.iscsi: Releaseip 172.17.243.133
2018/11/01 16:14:35.518910 ctdb-eventd[22360]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk02 is not mapped
2018/11/01 16:14:35.518914 ctdb-eventd[22360]: 70.iscsi: rbd-nbd: BITED-152623-BITCLOUD/iscsi-filer-disk02 is not mapped
2018/11/01 16:14:35.518918 ctdb-eventd[22360]: 70.iscsi: Deleted storage object BITED06.
2018/11/01 16:14:35.518921 ctdb-eventd[22360]: 70.iscsi: Deleted Target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:14:50.614581 ctdb-recoverd[22505]: Reenabling takeover runs after timeout
2018/11/01 16:15:10.957887 ctdbd[22358]: Recovery mode set to ACTIVE
2018/11/01 16:15:11.459190 ctdbd[22358]: This node (1) is now the recovery master
2018/11/01 16:15:12.122462 ctdbd[22358]: dead count reached for node 0
2018/11/01 16:15:12.122490 ctdbd[22358]: 2001:7b8:80:3:0:1:9:2:4379: node 2001:7b8:80:1:0:1:9:1:4379 is dead: 1 connected
2018/11/01 16:15:12.122525 ctdbd[22358]: Tearing down connection to dead node :0
2018/11/01 16:15:12.122589 ctdb-recoverd[22505]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.122619 ctdb-recoverd[22505]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:12.122641 ctdb-recoverd[22505]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:15:12.122647 ctdb-recoverd[22505]: Async wait failed - fail_count=1
2018/11/01 16:15:12.122652 ctdb-recoverd[22505]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:15:12.122656 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:15:12.122660 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:15:13.960043 ctdb-recoverd[22505]: Election period ended
2018/11/01 16:15:14.126314 ctdb-recoverd[22505]: Node:1 was in recovery mode. Start recovery process
2018/11/01 16:15:14.126516 ctdb-recoverd[22505]: Node:2 was in recovery mode. Start recovery process
2018/11/01 16:15:14.126526 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1317 Starting do_recovery
2018/11/01 16:15:14.126531 ctdb-recoverd[22505]: Attempting to take recovery lock (!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10)
2018/11/01 16:15:14.142068 ctdb-recoverd[22505]: Recovery lock taken successfully
2018/11/01 16:15:14.142103 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1392 Recovery initiated due to problem with node 0
2018/11/01 16:15:14.142642 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1417 Recovery - created remote databases
2018/11/01 16:15:14.143461 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1446 Recovery - updated flags
2018/11/01 16:15:14.145271 ctdb-recovery[8016]: Set recovery mode to ACTIVE
2018/11/01 16:15:14.145355 ctdbd[22358]: Recovery has started
2018/11/01 16:15:14.159242 ctdb-recovery[8016]: start_recovery event finished
2018/11/01 16:15:14.159530 ctdb-recovery[8016]: updated VNNMAP
2018/11/01 16:15:14.159551 ctdb-recovery[8016]: recover database 0x6645c6c4
2018/11/01 16:15:14.159610 ctdbd[22358]: Freeze db: ctdb.tdb
2018/11/01 16:15:14.207142 ctdbd[22358]: Thaw db: ctdb.tdb generation 31943707
2018/11/01 16:15:14.207165 ctdbd[22358]: Release freeze handle for db ctdb.tdb
2018/11/01 16:15:14.207543 ctdb-recovery[8016]: 1 of 1 databases recovered
2018/11/01 16:15:14.207611 ctdbd[22358]: Recovery mode set to NORMAL
2018/11/01 16:15:14.223834 ctdb-recovery[8016]: Set recovery mode to NORMAL
2018/11/01 16:15:14.223883 ctdbd[22358]: Recovery has finished
2018/11/01 16:15:14.237847 ctdb-recovery[8016]: recovered event finished
2018/11/01 16:15:14.237910 ctdb-recoverd[22505]: Takeover run starting
2018/11/01 16:15:14.240476 ctdbd[22358]: Takeover of IP 172.17.243.139/24 on interface vlan243
2018/11/01 16:15:14.240521 ctdbd[22358]: Takeover of IP 172.17.243.133/24 on interface vlan243
2018/11/01 16:15:15.550486 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:15:20.551015 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:15:25.552055 ctdbd[22358]: Monitoring event was cancelled
2018/11/01 16:15:28.002025 ctdb-eventd[22360]: 06.nfs: Takeip 172.17.243.139
2018/11/01 16:15:28.002040 ctdb-eventd[22360]: 06.nfs: Mounting /dev/nbd6
2018/11/01 16:15:28.002044 ctdb-eventd[22360]: 06.nfs: Starting Ganesha for 172.17.243.139
2018/11/01 16:15:28.002048 ctdb-eventd[22360]: 06.nfs: done
2018/11/01 16:15:29.735717 ctdb-eventd[22360]: 07.iscsi: Takeip 172.17.243.133
2018/11/01 16:15:29.735742 ctdb-eventd[22360]: 07.iscsi: Created target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133.
2018/11/01 16:15:29.735746 ctdb-eventd[22360]: 07.iscsi: Created TPG 1.
2018/11/01 16:15:29.735749 ctdb-eventd[22360]: 07.iscsi: Default portal not created, TPGs within a target cannot share ip:port.
2018/11/01 16:15:29.735752 ctdb-eventd[22360]: 07.iscsi: Created block storage object BITED06 using /dev/nbd7.
2018/11/01 16:15:29.735755 ctdb-eventd[22360]: 07.iscsi: Created LUN 1.
2018/11/01 16:15:29.735760 ctdb-eventd[22360]: 07.iscsi: No such NetworkPortal in configfs: /sys/kernel/config/target/iscsi/iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.133/tpgt_1/np/0.0.0.0:3260
2018/11/01 16:15:29.735763 ctdb-eventd[22360]: 07.iscsi: Using default IP port 3260
2018/11/01 16:15:29.735766 ctdb-eventd[22360]: 07.iscsi: Created network portal 172.17.243.133:3260.
2018/11/01 16:15:29.735770 ctdb-eventd[22360]: 07.iscsi: Created Node ACL for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:15:29.735774 ctdb-eventd[22360]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:15:29.735776 ctdb-eventd[22360]: 07.iscsi: set auth for iqn.1993-08.org.debian:01:c3a88bdaed0
2018/11/01 16:15:29.735779 ctdb-eventd[22360]: 07.iscsi: Parameter authentication is now '1'.
2018/11/01 16:15:29.735782 ctdb-eventd[22360]: 07.iscsi: Parameter userid is now 'iqn.1993-08.org.debian:01:c3a88bdaed0'.
2018/11/01 16:15:29.735785 ctdb-eventd[22360]: 07.iscsi: Parameter password is now 'ethi1At0pei5'.
2018/11/01 16:15:29.735789 ctdb-eventd[22360]: 07.iscsi: done
2018/11/01 16:15:29.780510 ctdb-recoverd[22505]: Takeover run completed successfully
2018/11/01 16:15:29.780540 ctdb-recoverd[22505]: ../ctdb/server/ctdb_recoverd.c:1464 Recovery complete
2018/11/01 16:15:29.780543 ctdb-recoverd[22505]: Resetting ban count to 0 for all nodes
2018/11/01 16:15:29.780548 ctdb-recoverd[22505]: Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (10 seconds)
2018/11/01 16:15:29.780551 ctdb-recoverd[22505]: Disabling recoveries for 10 seconds
2018/11/01 16:15:39.781800 ctdb-recoverd[22505]: Reenabling recoveries after timeout
-------------- next part --------------
2018/11/01 16:04:10.110594 ctdbd[2727]: dead count reached for node 0
2018/11/01 16:04:10.110620 ctdbd[2727]: 2001:7b8:80:2:0:1:9:3:4379: node 2001:7b8:80:1:0:1:9:1:4379 is dead: 1 connected
2018/11/01 16:04:10.110640 ctdbd[2727]: Tearing down connection to dead node :0
2018/11/01 16:04:10.110675 ctdb-recoverd[2833]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:10.110687 ctdb-recoverd[2833]: ctdb_control error: 'node is disconnected'
2018/11/01 16:04:10.110693 ctdb-recoverd[2833]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:04:10.110697 ctdb-recoverd[2833]: Async wait failed - fail_count=1
2018/11/01 16:04:10.110715 ctdb-recoverd[2833]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:04:10.110718 ctdb-recoverd[2833]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:04:10.110722 ctdb-recoverd[2833]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:04:10.111085 ctdb-recoverd[2833]:  Current recmaster node 0 does not have CAP_RECMASTER, but we (node 2) have - force an election
2018/11/01 16:04:10.111124 ctdbd[2727]: Recovery mode set to ACTIVE
2018/11/01 16:04:10.111342 ctdbd[2727]: This node (2) is now the recovery master
2018/11/01 16:04:10.613020 ctdbd[2727]: Remote node (1) is now the recovery master
2018/11/01 16:04:13.616214 ctdb-recoverd[2833]: Election period ended
2018/11/01 16:04:13.924729 ctdbd[2727]: Recovery has started
2018/11/01 16:04:13.939402 ctdbd[2727]: Freeze db: ctdb.tdb
2018/11/01 16:04:14.746937 ctdbd[2727]: Thaw db: ctdb.tdb generation 711836444
2018/11/01 16:04:14.746991 ctdbd[2727]: Release freeze handle for db ctdb.tdb
2018/11/01 16:04:14.747545 ctdbd[2727]: Recovery mode set to NORMAL
2018/11/01 16:04:14.763521 ctdbd[2727]: Recovery has finished
2018/11/01 16:04:14.784524 ctdb-recoverd[2833]: Disabling takeover runs for 60 seconds
2018/11/01 16:04:14.786431 ctdbd[2727]: Takeover of IP 172.17.243.140/24 on interface vlan243
2018/11/01 16:04:14.786467 ctdbd[2727]: Takeover of IP 172.17.243.9/24 on interface vlan243
2018/11/01 16:04:15.159996 ctdbd[2727]: Monitoring event was cancelled
2018/11/01 16:04:20.160586 ctdbd[2727]: Monitoring event was cancelled
2018/11/01 16:04:25.161243 ctdbd[2727]: Monitoring event was cancelled
2018/11/01 16:04:29.314444 ctdb-eventd[2729]: 06.nfs: Takeip 172.17.243.140
2018/11/01 16:04:29.314457 ctdb-eventd[2729]: 06.nfs: Mounting /dev/nbd0
2018/11/01 16:04:29.314461 ctdb-eventd[2729]: 06.nfs: Starting Ganesha for 172.17.243.140
2018/11/01 16:04:29.314465 ctdb-eventd[2729]: 06.nfs: done
2018/11/01 16:04:30.161752 ctdbd[2727]: Monitoring event was cancelled
2018/11/01 16:04:31.809834 ctdb-eventd[2729]: 07.iscsi: Takeip 172.17.243.9
2018/11/01 16:04:31.809862 ctdb-eventd[2729]: 07.iscsi: Created target iqn.1996-03.nl.bit.linux-iscsi.iscsi-filer:sn.172.17.243.9.
2018/11/01 16:04:31.809867 ctdb-eventd[2729]: 07.iscsi: Created TPG 1.
2018/11/01 16:04:31.809883 ctdb-eventd[2729]: 07.iscsi: Global pref auto_add_default_portal=true
2018/11/01 16:04:31.809886 ctdb-eventd[2729]: 07.iscsi: Created default portal listening on all IPs (0.0.0.0), port 3260.
2018/11/01 16:04:31.809889 ctdb-eventd[2729]: 07.iscsi: Created block storage object BITED13 using /dev/nbd1.
2018/11/01 16:04:31.809891 ctdb-eventd[2729]: 07.iscsi: Created LUN 1.
2018/11/01 16:04:31.809894 ctdb-eventd[2729]: 07.iscsi: Created block storage object BITED14 using /dev/nbd2.
2018/11/01 16:04:31.809897 ctdb-eventd[2729]: 07.iscsi: Created LUN 2.
2018/11/01 16:04:31.809899 ctdb-eventd[2729]: 07.iscsi: Deleted network portal 0.0.0.0:3260
2018/11/01 16:04:31.809902 ctdb-eventd[2729]: 07.iscsi: Using default IP port 3260
2018/11/01 16:04:31.809905 ctdb-eventd[2729]: 07.iscsi: Created network portal 172.17.243.9:3260.
2018/11/01 16:04:31.809908 ctdb-eventd[2729]: 07.iscsi: Created Node ACL for iqn.1991-05.com.microsoft:test-cln2.http-windows.dmz.bit.nl
2018/11/01 16:04:31.809910 ctdb-eventd[2729]: 07.iscsi: Created mapped LUN 2.
2018/11/01 16:04:31.809913 ctdb-eventd[2729]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:31.809916 ctdb-eventd[2729]: 07.iscsi: no auth for iqn.1991-05.com.microsoft:test-cln2.http-windows.dmz.bit.nl
2018/11/01 16:04:31.809919 ctdb-eventd[2729]: 07.iscsi: Created Node ACL for iqn.1991-05.com.microsoft:test-cln1.http-windows.dmz.bit.nl
2018/11/01 16:04:31.809921 ctdb-eventd[2729]: 07.iscsi: Created mapped LUN 2.
2018/11/01 16:04:31.809924 ctdb-eventd[2729]: 07.iscsi: Created mapped LUN 1.
2018/11/01 16:04:31.809926 ctdb-eventd[2729]: 07.iscsi: no auth for iqn.1991-05.com.microsoft:test-cln1.http-windows.dmz.bit.nl
2018/11/01 16:04:31.809929 ctdb-eventd[2729]: 07.iscsi: done
2018/11/01 16:04:33.056638 ctdb-recoverd[2833]: Reenabling takeover runs
2018/11/01 16:07:34.302343 ctdbd[2727]: pnn 2 Invalid reqid 21566 in ctdb_reply_control
2018/11/01 16:07:34.302374 ctdbd[2727]: pnn 2 Invalid reqid 21573 in ctdb_reply_control
2018/11/01 16:07:34.810333 ctdbd[2727]: 2001:7b8:80:2:0:1:9:3:4379: connected to 2001:7b8:80:1:0:1:9:1:4379 - 2 connected
2018/11/01 16:07:37.419160 ctdb-recoverd[2833]: Node 2 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.420549 ctdb-recoverd[2833]: Node 0 has changed flags - now 0x0  was 0x2
2018/11/01 16:07:37.424513 ctdbd[2727]: Recovery mode set to ACTIVE
2018/11/01 16:07:37.424859 ctdbd[2727]: Recovery has started
2018/11/01 16:07:37.440545 ctdbd[2727]: Freeze db: ctdb.tdb
2018/11/01 16:07:37.494144 ctdbd[2727]: Thaw db: ctdb.tdb generation 656678117
2018/11/01 16:07:37.494169 ctdbd[2727]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:37.495003 ctdbd[2727]: Recovery mode set to NORMAL
2018/11/01 16:07:37.509126 ctdbd[2727]: ERROR: Daemon able to take recovery lock on "!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10" during recovery
2018/11/01 16:07:37.705906 ctdb-recoverd[2833]: Election period ended
2018/11/01 16:07:38.424377 ctdbd[2727]: Recovery has started
2018/11/01 16:07:38.439577 ctdbd[2727]: Freeze db: ctdb.tdb
2018/11/01 16:07:38.493031 ctdbd[2727]: Thaw db: ctdb.tdb generation 568703777
2018/11/01 16:07:38.493053 ctdbd[2727]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:38.493838 ctdbd[2727]: Recovery mode set to NORMAL
2018/11/01 16:07:38.508534 ctdbd[2727]: ERROR: Daemon able to take recovery lock on "!/usr/local/bin/ctdb_mutex_ceph_rados_helper ceph client.BITED-158081-rbd BITED-158081-RADOSMISC ctdb_sgw_recovery_lock 10" during recovery
2018/11/01 16:07:39.424590 ctdbd[2727]: Recovery has started
2018/11/01 16:07:39.441291 ctdbd[2727]: Freeze db: ctdb.tdb
2018/11/01 16:07:39.496538 ctdbd[2727]: Thaw db: ctdb.tdb generation 2125763535
2018/11/01 16:07:39.496561 ctdbd[2727]: Release freeze handle for db ctdb.tdb
2018/11/01 16:07:39.497343 ctdbd[2727]: Recovery mode set to NORMAL
2018/11/01 16:07:39.512968 ctdbd[2727]: Recovery has finished
2018/11/01 16:07:39.528195 ctdb-recoverd[2833]: Disabling takeover runs for 60 seconds
2018/11/01 16:08:06.403762 ctdb-recoverd[2833]: Reenabling takeover runs
2018/11/01 16:14:26.596380 ctdb-recoverd[2833]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:30.613156 ctdb-recoverd[2833]: Disabling takeover runs for 20 seconds
2018/11/01 16:14:50.613492 ctdb-recoverd[2833]: Reenabling takeover runs after timeout
2018/11/01 16:15:10.956498 ctdbd[2727]: dead count reached for node 0
2018/11/01 16:15:10.956527 ctdbd[2727]: 2001:7b8:80:2:0:1:9:3:4379: node 2001:7b8:80:1:0:1:9:1:4379 is dead: 1 connected
2018/11/01 16:15:10.956550 ctdbd[2727]: Tearing down connection to dead node :0
2018/11/01 16:15:10.956590 ctdb-recoverd[2833]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:10.956604 ctdb-recoverd[2833]: ctdb_control error: 'node is disconnected'
2018/11/01 16:15:10.956610 ctdb-recoverd[2833]: Async operation failed with ret=-1 res=-1 opcode=80
2018/11/01 16:15:10.956631 ctdb-recoverd[2833]: Async wait failed - fail_count=1
2018/11/01 16:15:10.956636 ctdb-recoverd[2833]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities.
2018/11/01 16:15:10.956641 ctdb-recoverd[2833]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities
2018/11/01 16:15:10.956645 ctdb-recoverd[2833]: ../ctdb/server/ctdb_recoverd.c:2731 Unable to update node capabilities.
2018/11/01 16:15:10.957666 ctdb-recoverd[2833]:  Current recmaster node 0 does not have CAP_RECMASTER, but we (node 2) have - force an election
2018/11/01 16:15:10.957712 ctdbd[2727]: Recovery mode set to ACTIVE
2018/11/01 16:15:10.957971 ctdbd[2727]: This node (2) is now the recovery master
2018/11/01 16:15:11.459831 ctdbd[2727]: Remote node (1) is now the recovery master
2018/11/01 16:15:14.145606 ctdbd[2727]: Recovery has started
2018/11/01 16:15:14.159917 ctdbd[2727]: Freeze db: ctdb.tdb
2018/11/01 16:15:14.207196 ctdbd[2727]: Thaw db: ctdb.tdb generation 31943707
2018/11/01 16:15:14.207218 ctdbd[2727]: Release freeze handle for db ctdb.tdb
2018/11/01 16:15:14.207986 ctdbd[2727]: Recovery mode set to NORMAL
2018/11/01 16:15:14.224003 ctdbd[2727]: Recovery has finished
2018/11/01 16:15:14.238100 ctdb-recoverd[2833]: Disabling takeover runs for 60 seconds
2018/11/01 16:15:14.460382 ctdb-recoverd[2833]: Election period ended
2018/11/01 16:15:29.780863 ctdb-recoverd[2833]: Reenabling takeover runs



More information about the samba-technical mailing list