[Samba] samba failover with ctdb and client-visible errors

Sage Weil sage at newdream.net
Fri May 3 21:17:45 UTC 2024


Hi everyone,

I'm setting up a clustered Samba+CTDB setup in front of CephFS and am
running into an issue during failover.  For the most part everything
seems to work: the IP moves quickly, smbd is started on the right
node, etc.  But if a client is generating I/O load during failover
(e.g., copying a big directory full of files in File Explorer), the
copy pauses for a couple of seconds and then pops up an error dialog
box.  If I hit 'Try Again', everything continues without problems.
However, I assume that a client-visible error like this will cause
problems for most applications, which may not be persistent enough to
retry everything.  I did a Google search, and the only thing I found
was a suggestion to pass a flag to xcopy that forces a retry on error.
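
To make the concern concrete, here is a minimal sketch (in Python,
purely illustrative) of the kind of retry loop an application would
need in order to ride out the failover on its own.  The helper name
copy_with_retry and its attempts/delay parameters are made up for this
example; none of this is part of Samba, CTDB, or xcopy.

```python
import shutil
import time


def copy_with_retry(src, dst, attempts=5, delay=2.0,
                    copy=shutil.copyfile, sleep=time.sleep):
    """Copy src to dst, retrying when the copy raises OSError.

    Hypothetical helper for illustration only; the defaults are guesses
    and are not tuned to any real failover timing.
    """
    for attempt in range(1, attempts + 1):
        try:
            # shutil.copyfile returns the destination path on success.
            return copy(src, dst)
        except OSError:
            if attempt == attempts:
                # Out of retries: let the caller see the error.
                raise
            # Wait before retrying, e.g. while the public IP moves
            # to another gateway node.
            sleep(delay)
```

Each retry restarts the copy of that file from the beginning, so this
only helps for operations that are safe to repeat.  Most applications
do not wrap their I/O like this, which is why I'd like to avoid the
client-visible error in the first place.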

Here's what the dialog looks like when I reboot one of the gateway nodes:
  https://i.ibb.co/kh4fFPW/tryagain.png
If I click 'Try Again' everything proceeds.

Here's my smb.conf:

root@smbgw2:/etc/samba# cat smb.conf
[global]
  clustering = yes
  include = registry
root@smbgw2:/etc/samba# net conf list
[global]
netbios name = smbgw
clustering = yes
idmap config * : backend = tdb2
passdb backend = tdbsam
load printers = no
smbd: backgroundqueue = no

[Audio]
path = /mnt/audio
read only = no
oplocks = no
kernel share modes = no



The CTDB config looks like this:

# See ctdb.conf(5) for documentation
#
# See ctdb-script.options(5) for documentation about event script
# options

[logging]
# Enable logging to syslog
location = syslog

# Default log level
log level = NOTICE

[cluster]
# Shared recovery lock file to avoid split brain.  Daemon
# default is no recovery lock.  Do NOT run CTDB without a
# recovery lock file unless you know exactly what you are
# doing.
#
# Please see the RECOVERY LOCK section in ctdb(7) for more
# details.
#
# recovery lock = !/bin/false RECOVERY LOCK NOT CONFIGURED
recovery lock = /mnt/audio/.ctdb/recovery_lock

^ /mnt/audio is the CephFS mount I am reexporting.

CTDB has a single IP in public_addresses that moves between the
gateway nodes as expected; from what I can tell, that part is all
working well.

The only other issue I've identified is that I seem to have to create
the user (and set the password with smbpasswd) on each of the
gateways, even though I expected that the 'passdb backend = tdbsam'
line would keep user and password info in ctdb somewhere.  Am I
missing something there?

Thanks!
sage


