[Samba] ctdb tcp kill: remaining connections
Ulrich Sibiller
ulrich.sibiller at eviden.com
Wed May 28 13:12:29 UTC 2025
Martin Schwenke wrote on 17.10.2024 13:00:
>> Thanks! I hope to being able to use a current version soon.
>
> Of course, I meant the next minor version (e.g. 4.22.x), since none of
> this is really bug fixes...
Unfortunately I am still unable to run the current version, but that should not matter for this problem because the current code is unchanged in that regard.
We are exporting GPFS filesystems with NFSv3 via ctdb. Today I have stopped ctdb on one node, and the IPs got automatically moved to another node. This is something that always works like a charm. However, many NFS clients started complaining very soon, a phenomenon that we see very often:
[Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
[Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
[Mi Mai 28 13:55:24 2025] lockd: server 192.168.xxx.yy not responding, still trying
As the IP had already been taken over this time, I manually ran ctdb gratarp 192.168.xxx.yy on the server that had taken the IP, and almost immediately the clients were fine again.
So I checked the 10.interface script and found that gratarp is only done in the updateip case, not in the takeip case. Since we have extended that script with some additional logging, I can say that we never see updateip being called; we only see "monitor", "takeip" and "releaseip". The thing is that takeip never sends the gratarp, not even in the most current ctdb (see https://gitlab.com/samba-team/samba/-/blame/master/ctdb/config/events/legacy/10.interface.script?ref_type=heads#L137). So I am wondering: why not?
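To make the manual workaround concrete, here is a minimal sketch of a helper that sends the gratuitous ARP whenever a takeip event arrives. It is not part of ctdb and not a proposed patch. The function name send_gratarp_on_takeip and the CTDB_CMD override are hypothetical. The sketch also assumes that event scripts receive "takeip <interface> <ip> <maskbits>" and that ctdb gratarp takes the IP followed by the interface, which should be checked against the installed version.

```shell
#!/bin/sh
# Sketch only: automate the manual "ctdb gratarp" workaround for takeip.
# Assumptions (verify against your ctdb version):
#   - event scripts are called as: takeip <interface> <ip> <maskbits>
#   - "ctdb gratarp" is invoked as: ctdb gratarp <ip> <interface>
# send_gratarp_on_takeip and CTDB_CMD are hypothetical names.

send_gratarp_on_takeip() {
    event="${1:-}"
    iface="${2:-}"
    ip="${3:-}"

    # Ignore every event except takeip (monitor, releaseip, ...).
    [ "$event" = "takeip" ] || return 0

    # Refuse to run with incomplete arguments.
    if [ -z "$iface" ] || [ -z "$ip" ]; then
        echo "takeip: missing interface or IP" >&2
        return 1
    fi

    # CTDB_CMD can be set to "echo" for a dry run.
    ${CTDB_CMD:-ctdb} gratarp "$ip" "$iface"
}

# Possible use from a custom event script:
#   send_gratarp_on_takeip "$@"
```

Setting CTDB_CMD=echo prints the command that would be run instead of executing it, which makes it possible to try the helper on a machine without ctdb.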
Best regards,
Ulrich Sibiller
--
Dipl.-Inf. Ulrich Sibiller science + computing ag
System Administration Hagellocher Weg 73
Hotline +49 7071 9457 681 72070 Tuebingen, Germany
https://atos.net/de/deutschland/sc