[Samba] ctdb tcp kill: remaining connections
Ulrich Sibiller
ulrich.sibiller at atos.net
Fri Feb 17 18:07:17 UTC 2023
> From "Takeover run starting" to "Takeover run completed successfully"
> in the logs.
Feb 13 12:27:36 serverpnn0 ctdb-recoverd[24951]: Disabling takeover runs for 60 seconds
Feb 13 12:27:36 serverpnn1 ctdb-recoverd[85688]: Takeover run starting
Feb 13 12:27:36 serverpnn2 ctdb-recoverd[29692]: Disabling takeover runs for 60 seconds
Feb 13 12:27:36 serverpnn3 ctdb-recoverd[3482705]: Disabling takeover runs for 60 seconds
Feb 13 12:27:46 serverpnn0 ctdb-recoverd[24951]: Reenabling takeover runs
Feb 13 12:27:46 serverpnn1 ctdb-recoverd[85688]: Takeover run completed successfully
-> 10s
Feb 13 12:27:46 serverpnn2 ctdb-recoverd[29692]: Reenabling takeover runs
Feb 13 12:27:46 serverpnn3 ctdb-recoverd[3482705]: Reenabling takeover runs
Feb 13 12:27:50 serverpnn0 ctdb-recoverd[24951]: Disabling takeover runs for 60 seconds
Feb 13 12:27:50 serverpnn1 ctdb-recoverd[85688]: Takeover run starting
Feb 13 12:27:50 serverpnn2 ctdb-recoverd[29692]: Disabling takeover runs for 60 seconds
Feb 13 12:27:50 serverpnn3 ctdb-recoverd[3482705]: Disabling takeover runs for 60 seconds
Feb 13 12:27:56 serverpnn0 ctdb-recoverd[24951]: Reenabling takeover runs
Feb 13 12:27:56 serverpnn1 ctdb-recoverd[85688]: Takeover run completed successfully
-> 6s
Feb 13 12:27:56 serverpnn2 ctdb-recoverd[29692]: Reenabling takeover runs
Feb 13 12:27:56 serverpnn3 ctdb-recoverd[3482705]: Reenabling takeover runs
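The "-> 10s" and "-> 6s" markers above are the elapsed time between "Takeover run starting" and "Takeover run completed successfully" on the recovery master. For reference, that delta can be computed from a syslog-style file like this (a sketch only: it assumes GNU date and the exact "Mon DD HH:MM:SS" timestamp format shown above, and the function name is mine):

```shell
# takeover_duration: seconds between the first "Takeover run starting"
# and the first "Takeover run completed successfully" in a syslog file.
# Assumes GNU date and single-space "Mon DD HH:MM:SS" timestamps as in
# the log excerpt above (days < 10 are padded differently and would
# need extra handling).
takeover_duration() {
    local log="$1" start end
    start=$(grep -m1 'Takeover run starting' "$log" | cut -d' ' -f1-3)
    end=$(grep -m1 'Takeover run completed successfully' "$log" | cut -d' ' -f1-3)
    echo $(( $(date -d "$end" +%s) - $(date -d "$start" +%s) ))
}
```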
> This means the problem must be one of 2 things:
>
> 2. ctdb_killtcp is not terminating all connections.
>
> So, I think it is (2). :-)
Yes, I think so, too.
> support capture on InfiniBand networks. It might be worth building a
> static (to avoid shared library issues) ctdb_killtcp binary from 4.18.x
> source (after configuring with --enable-pcap) and installing that in
> place of the existing ctdb_killtcp binary. The version in 4.18 also has
> slightly better debugging for unknown packets.
Sounds promising, I will try that! I haven't found enough time today to do the compile; I will do that next week.
> > We are using two teaming-Interfaces (one for each LAN). Each
> > team-Interface has 2x10Gbit LACP.
>
> I'm not sure how this would affect the packet capture. I would guess
> that the frame format would be one of the ones that is now supported.
It should be transparent to the consumers.
> Fair enough... but a whole minute is a long time to be running
> ctdb_killtcp during failover...
$ grep "Feb 13 17:36:08" /var/log/messages | grep -i ignor | wc -l
7293
I should have written "one run" or similar: all of these lines were logged (with additional debug lines added by me) with the same timestamp, and they show up between:
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: team0: blocking [z.z.139.15] on interface [team0]
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: ip_block z.z.139.15 team0
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: team0: killing tcp connections for ip [z.z.139.15] on interface [team0]
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: killcount [22] twoway: [22]
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: reset_connections_send: Adding 44 connections to hash
and
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: Killed 22/22 TCP connections to released IP z.z.139.15
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: team0: deleting ip [z.z.139.15] from interface [team0] maskbits [16]
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: team0: unblocking [z.z.139.15] on interface [team0]
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: ip_unblock z.z.139.15 team0
Feb 13 17:36:08 serverpnn2 ctdb-eventd[29607]: 10.interface.debug: team0: flushing route cache
Feb 13 17:36:08 serverpnn2 ctdbd[29605]: Sending RELEASE_IP message for z.z.139.15
In that very same second we see these message counts on the node running ctdb_killtcp:
$ grep "Feb 13 17:36:08" /var/log/messages | cut -d: -f4- | sed 's/192.*$//' | sort | uniq -c
1 10.interface.debug: ctdb_sys_open_capture_socket: Created RAW SOCKET FD:5 for tcp tickle
1808 10.interface.debug: Ignoring packet:
1 10.interface.debug: ip_block <<< DEBUG output added by me
1 10.interface.debug: ip_unblock <<< DEBUG output added by me
1 10.interface.debug: killcount [22] twoway: [22] <<< DEBUG output added by me
1 10.interface.debug: Killed 22/22 TCP connections to released IP
5485 10.interface.debug: reset_connections_capture_tcp_handler: Ignoring packet for unknown connection:
1 10.interface.debug: reset_connections_send: Adding 44 connections to hash
44 10.interface.debug: reset_connections_send: Adding connection to hash:
905 10.interface.debug: reset_connections_tickle_connection: Sending tickle ACK for connection '
1 10.interface.debug: running /etc/ctdb/events/legacy/10.interface.debug.script with arguments: releaseip team0 <<< DEBUG output added by me
28 10.interface.debug: Sending a TCP RST to for connection
1 10.interface.debug: team0: blocking [ <<< DEBUG output added by me
1 10.interface.debug: team0: deleting ip [ <<< DEBUG output added by me
1 10.interface.debug: team0: flushing route cache <<< DEBUG output added by me
1 10.interface.debug: team0: killing tcp connections for ip [ <<< DEBUG output added by me
1 10.interface.debug: team0: unblocking [ <<< DEBUG output added by me
1 ../../ctdb/server/ctdb_takeover.c:295 public address '
1 Sending RELEASE_IP message for
So in this case killing only 22 connections (two-way kill) caused more than 7000 lines of debug output...
(BTW, there's a typo in ctdb_killtcp: "Sending a TCP RST to for connection" - one of "to" and "for" is redundant.)
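Incidentally, the 7293 lines matched by the earlier "grep -i ignor" pipeline are exactly the sum of the two "Ignoring" message classes in the uniq -c breakdown (1808 "Ignoring packet:" plus 5485 "Ignoring packet for unknown connection:"), i.e. essentially all of the noise is the capture handler logging packets it is not interested in:

```shell
# The two "Ignoring" classes from the uniq -c breakdown add up to the
# 7293 lines matched by the earlier grep -i ignor | wc -l pipeline.
echo $(( 1808 + 5485 ))
```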
> In preparation for takeover, for NFS we only remember connections on
> 2049. The other ports, such as lockd, would be handled on reboot
> (rather than crash) by the releasing side... but I doubt anyone has
> thought about this! Perhaps we've seen that part handled by UDP in the
> past? Please feel free to open a bug to remind me to look at this.
Ok, I will.
Your comment is interesting because it suggests ctdb might have been designed with the assumption that a node always reboots after a failover. We also use "ctdb moveip" or "systemctl stop ctdb" to perform updates without a reboot (e.g. for GPFS). So are we "holding it wrong"?
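For reference, the no-reboot drain we use before an update looks roughly like this (a sketch only: the function name is mine, the IP and target PNN are placeholders, and "ctdb moveip" requires DeterministicIPs-compatible settings on some setups):

```shell
# Rough sketch of draining a node before a software update without a
# reboot, as described above. The IP address and target PNN passed in
# are placeholders for real cluster values.
drain_node_for_update() {
    local ip="$1" target_pnn="$2"
    ctdb moveip "$ip" "$target_pnn" || return 1  # move the public IP away first
    systemctl stop ctdb                          # then stop ctdb for the update
}
```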
> I suspect this could be improved to say "Never mount the same NFS
> directory ...". I think you are OK with mounting the subdirectories
> for each user.
Good!
Thanks,
Uli