[Samba] samba failover with ctdb and client-visible errors
Sage Weil
sage at newdream.net
Fri May 3 21:17:45 UTC 2024
Hi everyone,
I'm setting up clustered Samba+CTDB in front of CephFS and am
running into an issue during failover. For the most part everything
seems to work: the IP moves quickly, smbd is started on the right
node, etc. But if a client is generating I/O load during failover
(e.g., copying a big directory full of files in File Explorer), the
copy pauses for a couple of seconds and then pops up an error dialog box.
If I hit 'Try Again' everything continues without problems.
However... I assume that a client-visible error like this will cause
problems for most applications, which may not be persistent enough to
retry everything. I did a Google search and the only thing I found
was a suggestion to pass xcopy a flag that forces a retry
on error.
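To make "persistent enough to retry" concrete, here is a minimal Python sketch of what a client application would have to do itself to ride out the failover pause. It is only an illustration under assumptions: the helper name copy_with_retry and its parameters (attempts, delay, and the injectable copy and sleep callables) are hypothetical, and I am assuming the failure reaches the application as an OSError from an ordinary file copy.

```python
import shutil
import time


def copy_with_retry(src, dst, attempts=5, delay=2.0,
                    copy=shutil.copyfile, sleep=time.sleep):
    """Copy src to dst, retrying when the copy raises OSError.

    Hypothetical helper: the retry count and delay are guesses, not
    values taken from Samba, CTDB, or Windows documentation.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # A failover would surface here as an OSError from the copy.
            return copy(src, dst)
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                # Give the public IP time to move to another gateway.
                sleep(delay)
    # Every attempt failed; re-raise the most recent error.
    raise last_error
```

An application would call something like copy_with_retry(source_path, share_path) in place of a bare copy. Most applications do nothing of the sort, which is why the error dialog worries me.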
Here's what the dialog looks like when I reboot one of the gateway nodes:
https://i.ibb.co/kh4fFPW/tryagain.png
If I click 'Try Again' everything proceeds.
Here's my smb.conf:
root@smbgw2:/etc/samba# cat smb.conf
[global]
clustering = yes
include = registry
root@smbgw2:/etc/samba# net conf list
[global]
netbios name = smbgw
clustering = yes
idmap config * : backend = tdb2
passdb backend = tdbsam
load printers = no
smbd: backgroundqueue = no
[Audio]
path = /mnt/audio
read only = no
oplocks = no
kernel share modes = no
CTDB config looks like so:
# See ctdb.conf(5) for documentation
#
# See ctdb-script.options(5) for documentation about event script
# options
[logging]
# Enable logging to syslog
location = syslog
# Default log level
log level = NOTICE
[cluster]
# Shared recovery lock file to avoid split brain. Daemon
# default is no recovery lock. Do NOT run CTDB without a
# recovery lock file unless you know exactly what you are
# doing.
#
# Please see the RECOVERY LOCK section in ctdb(7) for more
# details.
#
# recovery lock = !/bin/false RECOVERY LOCK NOT CONFIGURED
recovery lock = /mnt/audio/.ctdb/recovery_lock
^ /mnt/audio is the CephFS mount I am reexporting.
CTDB has a single IP in public_addresses that is moving around between
the gateway nodes as expected--from what I can tell that is all
working well.
The only other issue I've identified is that I seem to have to create
the user (and set the password with smbpasswd) on each of the
gateways, even though I expected that the 'passdb backend = tdbsam'
line would keep user and password info in ctdb somewhere. Am I
missing something there?
Thanks!
sage