[Samba] gpfs + sernet samba + ctdb + transparent failover confusion

Michael Adam obnox at samba.org
Sat Jan 25 09:55:12 MST 2014


Hi,

On 2014-01-24 at 17:10 -0600, Sabuj Pattanayek wrote:
> 
> The behavior of the ctdb init script, e.g. /etc/init.d/ctdb doesn't seem to
> follow the directives in /etc/sysconfig/ctdb about not managing samba or
> winbind but only when the service is stopped, i.e. when I run "service ctdb
> stop" it kills winbind and samba even if I set :
> 
> CTDB_MANAGES_SAMBA=no
> 
> A "service ctdb start" however does not start samba if it sees the above in
> the config file. Is that the intended behavior of the init script?

"service ctdb stop" does not stop samba in that case.
But samba stops itself because it can no longer access
its databases.

> However :
> 
> ctdb disable
> 
> Does seem to follow the directives and doesn't kill smbd, however smbd
> can't be restarted (at least "service sernet-samba-smbd restart") on the
> node unless CTDB is enabled again (with clustering = yes in smb.conf), i.e.
> if I run ctdb disable I get a few defunct smbd processes.

Right: ctdb is still running, but does not let clients attach to
databases. Hence Samba can't successfully start.
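
As a rough sketch of the order that follows from this: re-enable the
node in ctdb first, and only then restart smbd. The service name
"sernet-samba-smbd" and the DRY_RUN switch are assumptions made for
illustration, not something taken from the init scripts; adjust them
to your installation.

```shell
#!/bin/sh
# Sketch only: re-enable the node in ctdb before restarting smbd.
# SMBD_SERVICE is an assumed init script name; adjust to your packaging.
SMBD_SERVICE="${SMBD_SERVICE:-sernet-samba-smbd}"

# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restart_smbd_on_node: enable the node again so that smbd can attach
# to the ctdb databases, then restart smbd.
restart_smbd_on_node() {
    run ctdb enable || return 1
    run service "$SMBD_SERVICE" restart
}
```

Running "DRY_RUN=1 restart_smbd_on_node" only prints the two commands,
so the order can be checked without touching the cluster.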

> However, you mentioned "If you do "ctdb disable" on the active node though,
> you will see the I/O continuing on the other node!", but this is not the
> behavior I'm seeing, either with durable file handles enabled :
> 
> gpfs:sharemodes = no
> gpfs:leases = no
> posix locking = no
> kernel oplocks = no
> kernel share modes = no
> 
> ... or durable file handles *not* enabled :
> 
> gpfs:sharemodes = yes
> gpfs:leases = yes
> posix locking = yes
> kernel oplocks = yes
> kernel share modes = yes
> 
> Perhaps I misunderstood, but I guess the idea here was that the running
> "ctdb disable" on the node I want to "down" is supposed to migrate the IP
> to the other server and that all I/O should continue,

That was the intention, with durable handles enabled.
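
To rule out a configuration slip, a small check along these lines
could confirm that the settings you listed for the durable handles
case are all "no" in the smb.conf actually in use. This is a naive
sketch under stated assumptions: the function name is made up, it
scans line by line, ignores sections and includes, accepts only the
literal value "no", and treats a missing parameter as a problem
because it does not know the defaults.

```shell
#!/bin/sh
# check_durable_prereqs FILE
# Prints every parameter from the durable handles list in this thread
# whose value in FILE is not the literal "no". Returns 1 if any is found.
# Naive scan: no section handling, no includes, no default values.
check_durable_prereqs() {
    conf="$1"
    status=0
    for param in "gpfs:sharemodes" "gpfs:leases" "posix locking" \
                 "kernel oplocks" "kernel share modes"; do
        value=$(awk -F= -v want="$param" '
            /^[ \t]*[#;]/ { next }    # skip comment lines
            {
                key = $1; val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim whitespace
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                if (tolower(key) == want) found = tolower(val)
            }
            END { print found }       # last occurrence wins
        ' "$conf")
        if [ "$value" != "no" ]; then
            echo "$param = ${value:-<unset>} (expected no)"
            status=1
        fi
    done
    return $status
}
```

For example, "check_durable_prereqs /etc/samba/smb.conf" (the path is
an assumption) prints nothing and returns 0 when all five settings are
"no".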

> but with both reads
> and writes a re-try window is thrown immediately or after the meter hits 0
> bytes per second by the windows client when ctdb is disabled (using "ctdb
> disable") on the server that the client is initially connected to.

That should not happen. I assume you are using a recent enough
Windows (i.e. Vista or newer)?

Maybe I need to re-test with the software versions that you
mentioned.

> The only
> operation which seems to do "transparent failover" are multiple file/dir
> deletes, i.e. select a directory, hold down shift + delete button to
> permanently delete a folder full of lots of files. This operation also does
> not seem to care about durable file handles and only pauses for a few
> seconds until ip failover completes.  I was trying to look for the
> existence of any modifiable timeouts on the client side but couldn't find
> any that were by default < 30 seconds that would seem to be the cause for
> not waiting long enough on reads or writes even after the meter hits 0
> bytes per second.
> 
> http://blogs.msdn.com/b/openspecification/archive/2013/03/27/smb-2-x-and-smb-3-0-timeouts-in-windows.aspx
> 
> I'm also still confused about the difference between durable file handles
> and persistent file handles (or are these the same?).

Durable file handles have existed since SMB 2.0 / Windows Vista.
For those, the server attempts to keep the file open for a time
window when a client is disconnected, e.g. by a short network
outage. The client can reconnect to the server and re-acquire
its open handle with all its state. If a different client opens
the file in between, the state is lost, though. This is a
best-effort concept.

Persistent file handles were introduced with SMB 3.0 / Windows 8.
They are like durable file handles, but with strong guarantees.
They are supposed to survive even a complete server outage.
And a disconnected persistent handle is not given up for a different
client that wants to open the file.

Durable handles should be enough for I/O to survive a ctdb disable.
As said above: I might need to re-test with the latest versions...  :-/
There might of course be a regression.

Cheers - Michael
