Corrupt database and no connections possible

Christian Seip cseip at cse-consult.de
Wed Oct 3 23:56:01 GMT 2001


Hi!

I just searched the archive for a solution to our problem:

We have two machines clustered with shared storage, so one cluster
node is always active while the other waits for it to fail (hot
stand-by configuration). Both nodes have the same configuration:
Samba 2.2.1a on Red Hat Linux 7.1, with / (the node's local disk) and
/shares (on the shared storage) as ReiserFS, and kernel 2.4.9, since
we occasionally ran into a memory-management bug in Red Hat's stock
2.4.2-2 kernel.

  [root@stan logrotate.d]# uname -a
  Linux stan 2.4.9 #1 SMP Thu Sep 27 15:59:08 CEST 2001 i686 unknown

We have about 500 WinNT 4.0 SP6a clients connecting to the cluster.

Samba 2.2.1a and 2.2.0a ran fine for a couple of weeks, but for the
past few days clients have been unable to connect. This happens
regardless of which cluster node is active, so I think we can assume
that it's *not* faulty hardware.

We found the messages below in the client log files (smb.conf:
log file = /shares/sambalog/log.%m). I also took a look at the file
/var/lock/samba/sessionid.tdb, and it looks strange to me. Excerpt
from our /var/lock/samba/sessionid.tdb:

[2001/10/04 07:30:37, 0] tdb/tdbutil.c:tdb_log(342)
  tdb(/var/lock/samba/sessionid.tdb): tdb_oob len 875767935 beyond eof at 797816
[2001/10/04 07:30:37, 0] tdb/tdbutil.c:tdb_log(342)
  tdb(/var/lock/samba/sessionid.tdb): tdb_oob len 808465011 beyond eof at 797816
[2001/10/04 07:30:37, 0] tdb/tdbutil.c:tdb_log(342)
  tdb(/var/lock/samba/sessionid.tdb): tdb_oob len 909195343 beyond eof at 797816
[2001/10/04 07:30:37, 0] tdb/tdbutil.c:tdb_log(342)
  tdb(/var/lock/samba/sessionid.tdb): tdb_oob len 808465011 beyond eof at 797816

I'd say this looks like a logfile, not like a database. Am I right?
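
One way to probe that suspicion is to look at the reported lengths
themselves. The Python sketch below treats each "len" value from the
excerpt as a 32-bit little-endian integer (the machine is i686) and
prints the four bytes it corresponds to. It rests on two assumptions
that nothing in this post confirms: that the logged value is an offset
read from the file plus a fixed record-header size, and that this
header size is 24 bytes. The names HEADER_SIZE and bytes_behind are
made up for the illustration.

```python
import struct

# Lengths copied from the tdb_oob messages in the excerpt above.
REPORTED_LENGTHS = [875767935, 808465011, 909195343]

# Assumption (not confirmed by the post): size of the record header that
# is added to an offset before the bounds check is logged.
HEADER_SIZE = 24


def bytes_behind(value, header_size=0):
    """Return the 4 little-endian bytes of (value - header_size)."""
    return struct.pack("<I", value - header_size)


if __name__ == "__main__":
    for length in REPORTED_LENGTHS:
        raw = bytes_behind(length)
        adjusted = bytes_behind(length, HEADER_SIZE)
        print(length, raw, adjusted)
```

Under these assumptions the adjusted values decode to b'g(34',
b'[200' and b'7816', fragments that also occur in the log lines
themselves ("tdb_log(342)", "[2001/...", "797816"). That would be
consistent with log text having been written into the .tdb file, but
it is an indication, not proof.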

Stopping Samba and deleting the contents of /var/lock/samba helps for
a while but doesn't cure the problem. Sooner or later we are back in
the same situation.
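
Before deleting the files it may be worth recording which of them
still look like databases at all. The Python sketch below is a rough
check, assuming that tdb files begin with the magic string
"TDB file\n"; that constant and the helper names looks_like_tdb and
scan_lock_dir are assumptions for the illustration, not something
taken from this setup. A file that passes can still be damaged further
in, so this only catches an overwritten header.

```python
import os
import sys

# Assumption: tdb files start with this magic string.
TDB_MAGIC = b"TDB file\n"


def looks_like_tdb(path):
    """Return True if the file at `path` starts with the assumed tdb magic."""
    with open(path, "rb") as handle:
        return handle.read(len(TDB_MAGIC)) == TDB_MAGIC


def scan_lock_dir(lock_dir):
    """Map each *.tdb file name in `lock_dir` to the result of looks_like_tdb."""
    results = {}
    for name in sorted(os.listdir(lock_dir)):
        if name.endswith(".tdb"):
            results[name] = looks_like_tdb(os.path.join(lock_dir, name))
    return results


if __name__ == "__main__":
    # Default to the lock directory named in the post.
    lock_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/lock/samba"
    if os.path.isdir(lock_dir):
        for name, ok in scan_lock_dir(lock_dir).items():
            print(name, "header ok" if ok else "header NOT recognised")
    else:
        print("no such directory:", lock_dir)
```

Run with Samba stopped, for example as "python check_tdb.py
/var/lock/samba" (the script name is arbitrary), so that no smbd
process is writing to the files while they are read.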

As I said, I searched the archive and found similar situations with
different setups (Samba on HP-UX, Win2K clients, etc.). Our Samba runs
on Intel hardware and our clients are WinNT. There may be some Win9x
machines among them, but no Win2K that we know of; I will check again
to make sure.

Samba has been compiled with the Red Hat gcc:

  [root@stan logrotate.d]# gcc -v
  Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
  gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)


Here are my configure options:

  ./configure \
         --prefix=/usr/local/samba \
         --exec-prefix=/usr/local/samba \
         --bindir=/usr/local/bin \
         --sbindir=/usr/local/sbin \
         --libexecdir=/usr/local/libexec \
         --datadir=/usr/local/samba/share \
         --sysconfdir=/shares/etcsmb \
         --sharedstatedir=/usr/local/samba \
         --localstatedir=/shares/sambalog \
         --libdir=/usr/local/lib \
         --includedir=/usr/include \
         --infodir=/usr/local/samba/info \
         --mandir=/usr/local/man \
         --with-pam \
         --with-privatedir=/shares/etcsmb \
         --with-lockdir=/var/lock/samba \
         --with-swatdir=/usr/local/samba/swat \
         --without-sambabook

This is smb.conf:

  [global]
     workgroup            = SR
     netbios name         = USERHOMES
     server string        = Samba-Cluster %v
     load printers        = no
     log file             = /shares/sambalog/log.%m
     max log size         = 0
     security             = domain
     encrypt passwords    = yes
     name resolve order   = wins host lmhosts bcast
     password server      = 192.168.1.3
  #   password server     = 192.168.1.2
  #   socket options      = TCP_NODELAY
     interfaces           = 192.168.0.69/255.255.255.0
     bind interfaces only = yes
     local master         = no
     os level             = 33
     domain master        = no
     preferred master     = no
     wins server          = 192.168.1.2
     dns proxy            = no 
     create mask          = 0777
     directory mask       = 0777
     add user script      = /shares/etcsmb/smb_useradd.pl %u
     kernel oplocks       = no
     oplocks              = no
  
  #============================ Share Definitions ==============================
  
  [homes]
     comment              = Home-Verzeichnis %u
     browseable           = no
     writable             = yes
     guest ok             = no
     valid users          = %S administrator

I have absolutely no idea what causes this error.

As this is a server already in production, any hints (use a different
gcc?), workarounds (some configure options?) and other solutions
(downgrade to Samba 2.2.0a?) will be gratefully accepted.

Best regards,

Christian Seip

----------------------------------------------------------------------
*** Note: new address! *** Note: new address! ***

Christian Seip                         Tel            (06386)  404 304
Diplom-Informatiker (FH)               Fax            (06386)  404 305
Petschwiesen 19                        Mobil          (0171) 61 70 886
66904 Brücken/Pfalz                    E-Mail     cseip at cse-consult.de
                                       WWW  http://www.cse-consult.de/




