[Samba] 2500 smbd processes for 30 users. tdb_oob len beyond eof.

ard at waikato.ac.nz
Thu Jul 18 16:46:02 GMT 2002


Yesterday my Samba server stopped answering mount requests for the first
time in four months.  There were 2500 smbd processes running, consuming
both CPUs, a gig of RAM, and a gig of swap.  Sysstat provided interesting
graphs.  "smbstatus" showed about 30 connected users.  Normally I have
about ten times that.
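
The mismatch between the number of smbd processes and the number of
connected users can be checked with something like the following minimal
sketch.  The function name count_smbd is hypothetical; it only counts
lines that are exactly "smbd" in a list of command names fed to it on
standard input.

```shell
#!/bin/sh
# Sketch: count smbd processes from a list of command names on stdin.
# count_smbd is an illustrative name, not part of Samba.
count_smbd() {
    # n + 0 forces a numeric 0 when no line matched
    awk '$1 == "smbd" { n++ } END { print n + 0 }'
}
```

Typical use would be "ps -e -o comm= | count_smbd", with the result
compared by eye against the user count that smbstatus reports.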

New mount requests were being refused, or timing out:

        # mount -tsmbfs //server/share /mnt -ousername=etc,uid=etc
        Password: 
        15438: session setup failed: SUCCESS - 0
        SMB connection failed

Windows users were unable to log in because this Samba box serves their
home (H:) drives.

I fixed it with "killall smbd" and now everything is back to normal (smbd
is started by inetd).  I did not delete any log files or the connection
database, as others on this list have had to do.

Occasionally in the past smbstatus has reported "tdb_oob len beyond eof"
but I have always ignored it because smbd continued working.  Only now,
after searching the mailing list archives, do I see that it has been fatal
for some users.

The system is a dual P3 Compaq running Linux 2.4.18-ac1 (-ac1 to get proper
quotas) and Samba-2.2.3a.  The original installation was Slackware-8.0, if I
remember correctly, but all apps are compiled from source rather than taken
from packages.  Load peaks at around 300-400 concurrent connections each
day, with deadtime set to 30 minutes.  This peak happens between 12pm and
1pm every weekday.  The "crisis" happened at 12:35pm Thursday.

Logging went into orbit:

# grep smbd daemon.log | cut -f 1 -d: | uniq -c
    304 Jul 18 00
    120 Jul 18 01
    122 Jul 18 02
     98 Jul 18 03
    102 Jul 18 04
    102 Jul 18 05
    118 Jul 18 06
    170 Jul 18 07
   2012 Jul 18 08
   4896 Jul 18 09
   8262 Jul 18 10
   8348 Jul 18 11
1886813 Jul 18 12
 618309 Jul 18 13
  78478 Jul 18 14
    323 Jul 18 15
    342 Jul 18 16
[...continues around 300 for the rest of the day...]

# grep tdb_oob daemon.log |cut -f 1 -d: |uniq -c
 937028 Jul 18 12
 300362 Jul 18 13
  35386 Jul 18 14

# grep tdb_oob daemon.log | cut -f 1,2 -d: | uniq -c
   9007 Jul 18 12:04
  18203 Jul 18 12:05
  21734 Jul 18 12:06
[...about 20000 *every* minute until 12:30, then linear dropoff to 3000 per
   minute at 14:10, ...]
   3041 Jul 18 14:11
   2134 Jul 18 14:12
   1659 Jul 18 14:13      <---- killall smbd here
      6 Jul 18 14:14      <---- ...taking several minutes to complete
      6 Jul 18 14:15
      5 Jul 18 14:16
      6 Jul 18 14:17
      5 Jul 18 14:18
      2 Jul 18 14:19
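
The two hourly counts above can also be produced side by side in a single
pass.  This is only a sketch: the function name count_by_hour is
hypothetical, and it assumes syslog-style lines whose text before the first
colon is the "Mon DD HH" stamp, which is the same assumption the cut
commands above rely on.

```shell
#!/bin/sh
# Sketch: per-hour totals of smbd lines and tdb_oob lines in one pass.
# count_by_hour is an illustrative name. Output columns:
#   <Mon DD HH> <smbd lines> <tdb_oob lines>
count_by_hour() {
    awk '
        /smbd/ || /tdb_oob/ {
            split($0, f, ":")            # f[1] is e.g. "Jul 18 12"
            hour = f[1]
            if (!(hour in seen)) {       # remember first-seen order
                seen[hour] = 1
                order[++n] = hour
            }
        }
        /smbd/    { total[hour]++ }
        /tdb_oob/ { oob[hour]++ }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d %d\n", order[i], total[order[i]], oob[order[i]]
        }
    ' "$@"
}
```

It would be run as "count_by_hour daemon.log".  Unlike uniq -c, it does not
depend on lines for the same hour being adjacent in the file.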

Next week I will upgrade to 2.2.5, but with a large user base I have to
take some care.  In the meantime, can anybody offer any suggestions?


_________________________________________________________________________
Andrew Donkin                  Waikato University, Hamilton,  New Zealand


P.S. does anybody else think that splitting log lines in two, the way Samba
does, is madness?



