PDC acceptance criteria

Jeffry Smith smith at mclinux.com
Tue Oct 3 21:20:51 GMT 2000


On Tue, 3 Oct 2000 gcarter at valinux.com wrote:

> I do think it is important that Samba has the 
> ability to be fault tolerant (or fail over capabilities).
> This is easy though when you are dealing with
> multiple Samba servers.  Where does the effort need 
> to be focused?
> 

I've been working on giving Samba cluster failover capability with
Kimberlite (http://oss.missioncriticallinux.com/kimberlite).  Failover
isn't built into Samba; instead, Samba rides on top of Kimberlite.  I've
got the configuration doing most of what I want, but there is one issue
left to solve.  It may be simple (a tunable parameter I can't find), or
it may be an SMB, Samba, Linux, or Kimberlite issue; I don't know yet.

First, the configuration: Samba 2.0.7 running on Linux 2.2.16, using
Kimberlite 1.1, on two servers with shared storage:
(forgive the ASCII art)

              apparent Samba server SAMBACLU, 10.1.1.4
                   |           
       -----------------------  network
       |                     |
    -------              -------
    |node1| 10.1.1.2     |node2|  10.1.1.3 clustered servers
    -------              -------
       |                     |
       ----------------------
                   |
              ------------
              | SCSI RAID|
              ------------

Samba is configured to listen only on its own interface (10.1.1.4), and
all the SMB exported files (/var/samba, shared as samba_clu) are on a
partition that fails over with Samba.  I've also put the lock files on a
failover partition (/var/samba_prv), along with the smb password file
(/var/samba_prv/smbpasswd) and the log files (/var/samba_prv/log.%m).
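
For reference, the settings described above might correspond to an
smb.conf roughly like the following.  This is a reconstruction, not the
actual file: the netmask, the use of "netbios name", and the exact lock
directory are assumptions; only the address, the paths, and the share
name come from the description.

```ini
[global]
   ; cluster service name and address (netmask is an assumption)
   netbios name = SAMBACLU
   interfaces = 10.1.1.4/255.255.255.0
   bind interfaces only = yes

   ; state kept on the private failover partition
   ; (exact lock directory is an assumption)
   lock directory = /var/samba_prv
   smb passwd file = /var/samba_prv/smbpasswd
   log file = /var/samba_prv/log.%m

[samba_clu]
   ; exported files on the shared failover partition
   path = /var/samba
```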

Assume Samba is running on node1, with three clients: clienta, clientb,
and clientc.

With that setup in place, here is the problem scenario:
Both clienta and clientb log into Samba and access files.  Then node1
fails, causing Kimberlite to fail the service over to node2 (unmount the
partitions from node1, fsck them on node2, mount them on node2, bring
Samba up on node2).  During the failover window (node1 down, node2 not
yet up), clienta attempts to access the files; clientb does not.
clientc first tries to access the share after failover completes.

Results:
clientb sees the files before failover, and after.
clienta gets an error message "share or folder not available"
(understandable since node2 is not up).  However, clienta keeps
getting the message after node2 comes up.
clientc sees SAMBACLU on node2 just fine.

The problem is the same whether the clients are Windows 98 or Linux
(mount -t smbfs).

I assume it's due to a cache somewhere, but I can't figure out where.
Any hints as to where to look would be appreciated.
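
One way to narrow this down might be to timestamp exactly when the
service address stops and resumes accepting connections, and compare
that with when clienta's errors start and (fail to) stop.  Below is a
minimal sketch in Python, assuming the NetBIOS session service on TCP
port 139 of 10.1.1.4; the function names port_open and watch are made
up for illustration and are not part of Samba or Kimberlite.

```python
import socket
import time


def port_open(host, port=139, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port=139, interval=1.0, checks=None, probe=port_open):
    """Poll host:port and return a list of (timestamp, is_up) state changes.

    checks=None polls forever; probe can be swapped out for testing.
    """
    changes = []
    last = None
    n = 0
    while checks is None or n < checks:
        up = probe(host, port)
        if up != last:
            # Record and print only transitions, not every poll.
            changes.append((time.time(), up))
            print(time.strftime("%H:%M:%S"), "UP" if up else "DOWN")
            last = up
        n += 1
        if checks is None or n < checks:
            time.sleep(interval)
    return changes

# Example (runs until interrupted): watch("10.1.1.4")
```

Run from a machine other than the affected client, this only shows when
the server side is reachable again; if clienta still fails after the
"UP" line, the stale state is presumably on the client side.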

Other than this, it appears to work fine (although I haven't tested
with PDC yet; that's the next thing after I track this issue down).


------------------------------------------------------------------------
Jeffry Smith      Technical Sales Consultant     Mission Critical Linux
smith at missioncriticallinux.com   phone:603.930.9739   fax:978.446.9470
------------------------------------------------------------------------
Thought for today:  win big vi. 

 To experience serendipity.  "I went shopping
   and won big; there was a 2-for-1 sale."  See big win.






