[Samba] ctdb tcp kill: remaining connections
Martin Schwenke
martin at meltin.net
Thu May 29 00:40:31 UTC 2025
Hi Uli,
On Wed, 28 May 2025 13:12:29 +0000, Ulrich Sibiller
<ulrich.sibiller at eviden.com> wrote:
> We are exporting GPFS filesystems with NFSv3 via ctdb. Today I have
> stopped ctdb on one node, and the IPs got automatically moved to
> another node. This is something that always works like a charm.
> However, many NFS clients started complaining very soon, a phenomenon
> that we see very often:
> [Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
> [Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
> [Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
>
> As the IP had already been taken over this time, I manually ran ctdb
> gratarp 192.168.xxx.yy on the server that had taken the IP and almost
> immediately the clients were fine again.
> So I checked the 10.interface script and found that gratarp is only
> done in the updateip case, but not in the takeip case. As we have
> extended that script with some (more) logging I can say that we never
> see updateip being called. We only see "monitor" and "takeip" and
> "releaseip". The thing is that takeip never sends the gratarp. Even
> in the most current ctdb it does not (see
> https://gitlab.com/samba-team/samba/-/blame/master/ctdb/config/events/legacy/10.interface.script?ref_type=heads#L137).
> So I am wondering why not?
In the "takeip" case the gratarp is sent by the daemon. The relevant
code is in ctdb_announce_vnn_iface().
I was going to ask questions about firewalls and forwarding rules on
routers between the server nodes and the clients. However, you can run
"ctdb gratarp ..." and it fixes the problem, so it doesn't sound like
the packets are being filtered somewhere. The "ctdb gratarp" command
sends a control to the daemon to send the ARPs and, for the control, the
daemon runs the same low-level code as in ctdb_announce_vnn_iface(),
which is run from the callback when the "takeip" event succeeds.
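To replay what the daemon does after a successful "takeip", you can send the gratuitous ARPs by hand. A sketch, assuming the public IP landed on eth0 (substitute the interface "ctdb ip -v" reports, and your real takeover address for the placeholder):

```shell
# Send gratuitous ARPs via the ctdb daemon, same low-level path as the
# "takeip" callback. <public-ip> and eth0 are placeholders.
ctdb gratarp <public-ip> eth0

# To rule out ctdb itself, iputils arping can send unsolicited ARP
# replies directly (requires root):
arping -U -c 3 -I eth0 <public-ip>
```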
There are two possible differences:
* The interface:
The "takeip" callback automatically determines the interface from
which to send the ARPs. This is simply based on the interface to
which the IP is assigned. I doubt you would be running a manual
command that specifies a different interface.
* Timing/routing:
Not that it should matter, but are you using the 13.per_ip_routing
event script to add source-based routing? If so, I'm wondering if
perhaps something is going wrong there during "takeip" and is being
fixed later in "ipreallocated".
  Is anything strange about your routing? In fact, are the clients on
  the same subnet as the server nodes? It shouldn't matter if
  everything is set up sanely.
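If 13.per_ip_routing is in play, one way to check whether "takeip" left the routing half-done is to inspect the per-IP rule and table on the node that took over. A sketch (the table numbers come from CTDB_PER_IP_ROUTING_TABLE_ID_LOW/HIGH in your configuration, and <public-ip> is a placeholder):

```shell
# After "takeip", the new node should have a source-based rule for the
# public IP pointing at its own routing table...
ip rule show           # expect: "from <public-ip> lookup <table-id>"

# ...and that table should be populated:
ip route show table all | grep '<public-ip>'
```

If the rule or table only appears after "ipreallocated", that would point at the timing problem described above.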
One other thing I notice in the relevant (lockd client) kernel code is
that it calls:
rpc_force_rebind(clnt);
after logging the "not responding, still trying" message. Without
digging very deep, that looks like it should be forcing the client to
reconnect. So, in that case too, we need to be sure the ARPs are
making it through to the client.
It would be good if you could tcpdump on the server nodes and on a
client to determine if the ARPs are being sent... and what is happening
to the lockd connections before you intervene manually. You should be
able to construct a filter that captures only relevant gratuitous ARPs
and TCP SYN packets - if so, you could leave that running in the
background.
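One possible capture filter, assuming the standard ARP-on-Ethernet layout (sender IP at offset 14, target IP at offset 24, so sender == target matches gratuitous ARPs) and NFS on port 2049; lockd may sit on a different port, which "rpcinfo -p <server>" will show:

```shell
# Background capture of gratuitous ARPs plus TCP SYNs to NFS.
# eth0 and port 2049 are assumptions -- adjust for your setup.
tcpdump -ni eth0 -w /tmp/failover.pcap \
  'arp[14:4] = arp[24:4] or (tcp port 2049 and tcp[tcpflags] & tcp-syn != 0)'
```

Running this on both a server node and a client should show whether the ARPs leave the node and whether they arrive.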
Thanks...
peace & happiness,
martin