NLM and CTDB recovery master node failure
Volker.Lendecke at SerNet.DE
Thu Oct 29 16:32:58 MDT 2009
On Thu, Oct 29, 2009 at 09:20:30PM +0200, Sergey Kleyman wrote:
> We have our internal APIs that are implemented on top of Spread Toolkit
> (http://www.spread.org/) but our goal is to make as few changes to
> Samba as possible, so changing the election code to use our API is not the
> optimal solution. I guess it'll be easier to adhere to Samba's
> assumptions about NLM and provide automatic lock clean-up in case of
> node failure. Are you sure that GPFS and/or GFS have this capability?
I haven't tested it myself, but this is a basic assumption
in ctdb. Tridge might answer this authoritatively.
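For illustration only: the assumption is the cluster-wide analog of what
POSIX fcntl locks already do on a single machine, where the kernel drops a
process's locks when the process dies. A minimal local sketch of that
behaviour follows. It does not exercise GPFS, GFS, NLM or ctdb, and the
helper names (try_lock, demo_lock_release_on_kill) are made up for this
example.

```python
import fcntl
import os
import signal
import tempfile
import time


def try_lock(path):
    """Try to take a non-blocking exclusive fcntl lock on the whole file."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        # Another process holds a conflicting lock.
        return False
    finally:
        # Closing the descriptor also drops any lock we just took.
        os.close(fd)


def demo_lock_release_on_kill():
    """Return (held, freed): lock state before and after killing the holder."""
    with tempfile.NamedTemporaryFile() as tmp:
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: take the lock, tell the parent, then wait to be killed.
            try:
                os.close(r)
                fd = os.open(tmp.name, os.O_RDWR)
                fcntl.lockf(fd, fcntl.LOCK_EX)
                os.write(w, b"x")
                time.sleep(60)
            finally:
                os._exit(1)
        os.close(w)
        os.read(r, 1)                 # wait until the child holds the lock
        os.close(r)
        held = not try_lock(tmp.name)  # we must be refused while it lives
        os.kill(pid, signal.SIGKILL)   # simulate an abnormal exit
        os.waitpid(pid, 0)
        freed = try_lock(tmp.name)     # the kernel has released the lock
        return held, freed
```

Whether a given cluster file system gives the same guarantee when a whole
node fails is exactly the open question here; the sketch only shows the
single-node behaviour.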
> As a side note: if I understand you correctly, CTDB is assumed to be
> running on the same machines as the underlying file system. I was under the
> impression that it's possible to run the file system on machines A and B,
> while Samba+CTDB runs on different machines C and D that see the
> clustered file system through NFS mounts, in which case C and D are just
> NLM clients to the file system.
Why would you want to do that? Going through the network
twice is a very bad idea for performance. And as I said, the
fcntl locking problems, plus the very frequent client lockups
caused by buggy NFS clients under CIFS load, tell us that you
are asking for more trouble than you will appreciate.
> One more point I wanted to inquire about: if an smbd daemon dies for some
> reason (abnormal exit - panic, etc.), what happens to the CIFS locks it was
> holding? Are those locks automatically cleaned up?
They are cleaned up. Look for example at the for-loop in
source3/locking/locking.c:650ff in current master. We also
send immediate retry messages to all processes in case the
parent smbd detects a child has died.
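The general idea behind that kind of clean-up can be sketched as follows.
This is not the code from locking.c, just a rough illustration of the
technique: each lock entry records the pid of its owner, and entries whose
owner no longer exists are dropped when the table is read. The entry layout
and the names process_exists and prune_stale_entries are assumptions made
for this example.

```python
import os


def process_exists(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def prune_stale_entries(entries):
    """Keep only the lock entries whose owning process is still alive."""
    return [entry for entry in entries if process_exists(entry["pid"])]
```

A caller would run prune_stale_entries over the entries for a file before
deciding whether a new lock request conflicts, so a lock left behind by a
crashed process never blocks anyone.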