[Samba] gpfs + sernet samba + ctdb + transparent failover confusion

Sabuj Pattanayek sabujp at gmail.com
Fri Jan 24 16:10:31 MST 2014


Hi,

The ctdb init script (/etc/init.d/ctdb) doesn't seem to follow the
directives in /etc/sysconfig/ctdb about not managing samba or winbind, but
only when the service is stopped: when I run "service ctdb stop" it kills
winbind and samba even if I set:

CTDB_MANAGES_SAMBA=no

A "service ctdb start" however does not start samba if it sees the above in
the config file. Is that the intended behavior of the init script?
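
To rule out a simple quoting or typo problem in the config file, the values can be read back independently of the init script. The following is only a sketch: it assumes /etc/sysconfig/ctdb consists of plain KEY=value shell assignments, the helper names read_sysconfig and manages are hypothetical, and the CTDB_MANAGES_WINBIND line in the sample is an assumed companion setting that is not shown above.

```python
import shlex

def read_sysconfig(text):
    """Parse simple KEY=value shell assignments, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split removes surrounding quotes and any trailing comment
        parts = shlex.split(raw, comments=True)
        settings[key.strip()] = parts[0] if parts else ""
    return settings

def manages(settings, service):
    """True only if CTDB_MANAGES_<SERVICE> is explicitly set to 'yes'."""
    value = settings.get("CTDB_MANAGES_" + service.upper(), "no")
    return value.lower() == "yes"

# Sample content; the WINBIND line is an assumption for illustration.
sample = """
# /etc/sysconfig/ctdb (excerpt)
CTDB_MANAGES_SAMBA=no
CTDB_MANAGES_WINBIND="no"
"""

cfg = read_sysconfig(sample)
print("samba managed:", manages(cfg, "samba"))
print("winbind managed:", manages(cfg, "winbind"))
```

This only confirms what is written in the file; it says nothing about how the init script's stop action actually uses those values.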

However :

ctdb disable

Does seem to follow the directives and doesn't kill smbd. However, smbd
can't be restarted on the node (at least not with "service
sernet-samba-smbd restart") unless CTDB is enabled again (with
clustering = yes in smb.conf), i.e. if I run ctdb disable I get a few
defunct smbd processes.

However, you mentioned "If you do "ctdb disable" on the active node though,
you will see the I/O continuing on the other node!", but this is not the
behavior I'm seeing, either with durable file handles enabled :

gpfs:sharemodes = no
gpfs:leases = no
posix locking = no
kernel oplocks = no
kernel share modes = no

... or durable file handles *not* enabled :

gpfs:sharemodes = yes
gpfs:leases = yes
posix locking = yes
kernel oplocks = yes
kernel share modes = yes
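
For reference, the smb.conf manual page notes that durable handles only take effect when kernel oplocks, kernel share modes and posix locking are all set to no. A minimal Python sketch of that check follows. Treating the two gpfs: options as additional blockers is an assumption based on the two sets of settings above, the defaults table is an assumption to be checked against the installed man pages, and the helper names (parse_share, durable_blockers) are hypothetical and not part of Samba.

```python
# Options smb.conf(5) requires to be "no" before durable handles are granted.
SMB_BLOCKERS = ("kernel oplocks", "kernel share modes", "posix locking")
# Assumption: on GPFS the vfs_gpfs options below must be off as well.
GPFS_BLOCKERS = ("gpfs:sharemodes", "gpfs:leases")

# Assumed defaults when an option is absent (verify for your Samba version).
DEFAULTS = {
    "kernel oplocks": "no",
    "kernel share modes": "yes",
    "posix locking": "yes",
    "gpfs:sharemodes": "yes",
    "gpfs:leases": "yes",
}

def parse_share(text):
    """Turn 'key = value' lines into a dict with normalized lower-case keys."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[" ".join(key.lower().split())] = value.strip()
    return options

def is_yes(value):
    """Interpret a boolean option value the way smb.conf spells 'true'."""
    return value.strip().lower() in ("yes", "true", "on", "1")

def durable_blockers(options):
    """Return the options that are effectively 'yes' and so block durable handles."""
    return [name for name in SMB_BLOCKERS + GPFS_BLOCKERS
            if is_yes(options.get(name, DEFAULTS[name]))]

first = parse_share("""
gpfs:sharemodes = no
gpfs:leases = no
posix locking = no
kernel oplocks = no
kernel share modes = no
""")
second = parse_share("""
gpfs:sharemodes = yes
gpfs:leases = yes
posix locking = yes
kernel oplocks = yes
kernel share modes = yes
""")

print("first set blockers:", durable_blockers(first))
print("second set blockers:", durable_blockers(second))
```

Run against the two sets of settings above, the first reports no blockers and the second reports all five options.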

Perhaps I misunderstood, but I guess the idea was that running "ctdb
disable" on the node I want to "down" is supposed to migrate the IP to the
other server, and that all I/O should continue. That is not what happens:
for both reads and writes, the Windows client throws up a retry dialog,
either immediately or after its transfer meter hits 0 bytes per second,
when ctdb is disabled (using "ctdb disable") on the server that the client
is initially connected to.

The only operation which seems to do "transparent failover" is a
multi-file/dir delete, i.e. select a directory and press shift + delete to
permanently delete a folder full of lots of files. This operation also does
not seem to care about durable file handles and only pauses for a few
seconds until IP failover completes.

I looked for modifiable timeouts on the client side, but couldn't find any
with a default < 30 seconds that would explain the client not waiting long
enough on reads or writes, even after the meter hits 0 bytes per second.

http://blogs.msdn.com/b/openspecification/archive/2013/03/27/smb-2-x-and-smb-3-0-timeouts-in-windows.aspx
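
To separate what the SMB client itself does from what the Explorer copy dialog does, the stall could be measured with a small script run on the client against the mapped drive. The sketch below is only an illustration under assumptions: the function name timed_write is hypothetical, it writes zero-filled chunks and forces each one out with fsync, and it simply records the longest gap between two completed writes and the first error, if any. It does not retry and does not know anything about durable handles.

```python
import os
import time

def timed_write(path, total_bytes, chunk=1 << 20):
    """Write total_bytes of zeros to path, one fsync'd chunk at a time.

    Returns (bytes_written, longest_gap_seconds, error), where error is the
    OSError that stopped the run, or None if everything was written.
    """
    buf = b"\0" * chunk
    written, longest, error = 0, 0.0, None
    f = open(path, "wb", buffering=0)
    try:
        last = time.monotonic()
        while written < total_bytes:
            try:
                n = f.write(buf[: total_bytes - written])
                os.fsync(f.fileno())  # push the chunk to the server
            except OSError as exc:
                error = exc
                break
            now = time.monotonic()
            longest = max(longest, now - last)  # a failover shows up as one long gap
            last = now
            written += n
    finally:
        try:
            f.close()
        except OSError:
            pass  # the handle may already be dead after a failed failover
    return written, longest, error
```

Calling timed_write on a file on the share, running "ctdb disable" on the connected node part-way through, and then comparing the longest gap with the IP failover time would show whether the write stalled and resumed or failed outright.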

I'm also still confused about the difference between durable file handles
and persistent file handles (or are these the same?).

Thanks,
Sabuj

On Fri, Jan 24, 2014 at 10:46 AM, Michael Adam <obnox at samba.org> wrote:

> Hi Sabuj,
>
> SMB transparent failover is a new feature of SMB version 3.0
> that is (cum grano salis) also known as Continuously Available
> shares or persistent file handles.
>
> This is a feature that is not yet implemented in Samba.
> We are working on it, but there is a way to go...
>
> Tridge's movies created the impression of a transparent
> failover, but this is in fact not the case: in those
> demos, tridge copied very small files in a loop. After a
> node failure, the client simply reconnected to a different
> node (same IP), and potentially repeated the lost copy.
> There were no long-running I/O ops involved.
>
> The concept of durable file handles (from SMB2) can
> provide transparent failover without interruption of I/O
> in some cases. Durable file handles are implemented
> starting with Samba 4.0. (See the smb.conf manual page
> for how to activate them.)
>
> You won't get transparent failover, though, if you kill
> smbd or the node. If you do "ctdb disable" on
> the active node though, you will see the I/O continuing
> on the other node!
>
> Cheers - Michael
>
> On 2014-01-23 at 16:19 -0600, Sabuj Pattanayek wrote:
> > Hi all,
> >
> > We're running gpfs 3.5.0.12 (5 total nsds & quorum servers, 2 nsds
> > running samba), sernet-samba 4.1.4-7, and ctdb 1.0.114.7-1 and trying
> > to get transparent failover to work from a windows 8 client. We have
> > ctdb failover working, i.e. if I run mmshutdown on one of the nodes the
> > IPs failover in a few seconds after the GPFS mount is unmounted. For our
> > transparent failover test, I open up firefox and start downloading a
> > large file, e.g. a centos 6.5 iso into the mapped network drive being
> > served up by one of the samba servers. Then I run an "mmshutdown" on the
> > samba server the client is connected to and as soon as the mount
> > disappears on the server the download stops and firefox throws an error.
> > I was expecting that with transparent failover that the download would
> > "hang" until the IPs had a chance to failover and the download/writes
> > would continue but that didn't seem to happen.
> >
> > Any idea on how to get this to work? Do we need to use ctdb 2.5.1?
> > What's the difference between the sernet-samba provided ctdb 1.0.114
> > and the ctdb 2.5.1 line ? I tried to install the samba.org provided
> > ctdb 2.5.1 RPMs but it failed with :
> >
> > ctdb >= 1.0.115 conflicts with sernet-samba-4.1.4-7.el6.x86_64
> >
> > I also created RPMs using the spec file from the ctdb-2.5.1 sources and
> > got the same error. Looks like using "make install" would be the only way
> > to get this installed, but even then why is sernet-samba requiring the
> > 1.0.x line? Again, does it even matter for the original issue with
> > transparent failover?
> >
> > Thanks,
> > Sabuj