[Samba] Samba Storage Server getting extremely slow!!!

Narayanan Subramaniam narayanan.s at focuzinfotech.com
Sun Apr 9 10:52:53 GMT 2006


Dear all,

I am running a clustering setup with a Linux Samba server acting as the
output file server on an RHEL 3 Workstation machine. All of the cluster
output data is written to this Linux Samba server, which is shared out to
more than 40 machines on the network.

All of the client machines are Windows XP, logging in to a Windows 2000
domain server. The Linux Samba server is a workgroup member of this
Windows 2000 domain. The entire network is Gigabit Ethernet. All of the
cluster nodes act as SMB clients and mount the output data shares using
Samba, roughly as sketched below.
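
For reference, each Linux cluster node mounts the output shares with smbfs,
along these lines (the server, share, mount point and credentials below are
only placeholders, not our real values):

   mount -t smbfs -o username=clustadm,password=xxxx \
       //storage/share1 /data/share1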

The storage server is an IBM server connected to a 3 TB Fibre Channel
external storage array. The storage is almost used up, but there are still
some 4 to 5 GB free in each partition.

Now the problem is that the storage server has become very slow, and
writing data to it has become extremely difficult.


Following are the relevant log messages from /var/log/samba/smbd.log:

 
------------------------------------------------------------------------  
Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
param/loadparm.c:map_parameter(2462)

Mar 29 15:22:29 storage winbindd[1305]: Unknown parameter encountered:
"revalidate"

Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
param/loadparm.c:lp_do_parameter(3144)

Mar 29 15:22:29 storage winbindd[1305]: Ignoring unknown parameter
"revalidate"

Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
nsswitch/winbindd_util.c:winbindd_param_init(555)

Mar 29 15:22:29 storage winbindd[1305]: winbindd: idmap uid range
missing or invalid

Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
nsswitch/winbindd_util.c:winbindd_param_init(556)

Mar 29 15:22:29 storage winbindd[1305]: winbindd: cannot continue,
exiting.

Mar 29 15:22:29 storage smb: winbindd startup succeeded

Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:write_socket_data(430)

Mar 29 15:25:10 storage smbd[1319]: write_socket_data: write failure.
Error = Connection reset by peer

Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:write_socket(455)

Mar 29 15:25:10 storage smbd[1319]: write_socket: Error writing 4 bytes
to socket 25: ERRNO = Connection reset by peer

Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:send_smb(647)

Mar 29 15:25:10 storage smbd[1319]: Error writing 4 bytes to client. -1.
(Connection reset by peer)

Mar 29 15:25:13 storage smbd[1323]: [2006/03/29 15:25:13, 0]
lib/util_sock.c:get_peer_addr(1150)

Mar 29 15:25:13 storage smbd[1323]: getpeername failed. Error was
Transport endpoint is not connected

Mar 29 15:25:13 storage smbd[1323]: [2006/03/29 15:25:13, 0]
lib/util_sock.c:write_socket_data(430)

Mar 29 15:25:13 storage smbd[1323]: write_socket_data: write failure.
Error = 

----------------------------------------------------------------------------
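
If I read the winbindd messages above correctly, our smb.conf still carries
the obsolete "revalidate" parameter and has no idmap ranges, which is why
winbindd exits. Below is a sketch of the kind of [global] settings those
messages seem to call for (the ranges are only example values, not what we
actually run):

   [global]
      # the old "revalidate" line would be removed entirely
      idmap uid = 10000-20000
      idmap gid = 10000-20000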

Following is the output of df -h
----------------------------------
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda6             2.9G  1.9G  936M  67% /
/dev/sda1              99M   15M   79M  17% /boot
none                  504M     0  504M   0% /dev/shm
/dev/sda3             3.9G  2.4G  1.3G  66% /usr
/dev/sda5             2.9G  189M  2.6G   7% /var
/dev/sdb1             596G  510G   56G  91% /share/share1
/dev/sdb5             459G  425G   11G  98% /share/share2
/dev/sdb6             230G  205G   13G  95% /share/share3
/dev/sdb7             547G  516G  3.4G 100% /share/backup


Following is the output of ps aux. Here we can see that a lot of the smbd
PIDs created between March 29/30 and April 4 are still present.
--------------------------------------


root      1298  0.0  0.0  9704  800 ?        S    Mar29   0:01 smbd -D
root      1306  0.0  0.0  9700  472 ?        S    Mar29   0:00 smbd -D
10052     1312  0.0  0.1 10260 1360 ?        S    Mar29   0:45 smbd -D
10052     1313  0.0  0.1 10268 1364 ?        S    Mar29   0:41 smbd -D
root      1315  0.0  0.1 10328 1352 ?        S    Mar29   1:17 smbd -D
root      1316  0.0  0.1 10692 1840 ?        S    Mar29   1:05 smbd -D
root      1338  0.0  0.1 10316 1156 ?        S    Mar29   0:02 smbd -D
10052     1418  0.0  0.1 10328 1332 ?        S    Mar29   0:40 smbd -D
10052     1419  0.0  0.1 10392 1384 ?        S    Mar29   0:43 smbd -D
10052     1420  0.0  0.1 10384 1316 ?        S    Mar29   0:40 smbd -D
10052     1421  0.0  0.1 10384 1316 ?        S    Mar29   0:40 smbd -D
10052     1422  0.0  0.1 10376 1292 ?        S    Mar29   0:46 smbd -D
10052     1703  0.0  0.1 10412 1504 ?        S    Mar29   0:41 smbd -D
10052     1746  0.0  0.1 10460 1676 ?        S    Mar29   0:39 smbd -D
10052     1747  0.0  0.1 10460 1648 ?        S    Mar29   0:37 smbd -D
root      2360  0.0  0.7 16580 7512 ?        S    Mar29   3:43 smbd -D
root      4145  0.0  0.1 10516 1448 ?        S    Mar30   1:05 smbd -D
root     13913  0.0  0.1 11096 2024 ?        S    Apr02   0:06 smbd -D
root     14339  0.0  0.2 11140 2152 ?        S    Apr03   0:51 smbd -D
root     15549  0.0  0.3 12452 3468 ?        S    Apr04   1:05 smbd -D
root     15559  0.0  0.1 10660 1520 ?        S    Apr04   0:02 smbd -D
root     16166  0.0  0.3 12332 3204 ?        S    08:53   0:31 smbd -D
root     16214  0.1  0.3 12408 3636 ?        S    09:04   0:58 smbd -D
10052    16290  0.0  0.2 10696 2508 ?        R    09:43   0:07 smbd -D
root     16325  0.0  0.1 11088 2020 ?        S    10:02   0:14 smbd -D
root     16341  0.2  0.3 12364 3624 ?        S    10:07   1:09 smbd -D
root     16384  0.1  0.3 12844 3948 ?        S    10:27   0:59 smbd -D
root     16428  0.0  0.2 11528 2484 ?        S    11:06   0:15 smbd -D
root     16430  0.0  0.1 10432 1256 ?        S    11:19   0:00 smbd -D
root     16491  0.0  0.1 11076 2040 ?        S    13:49   0:17 smbd -D
root     16511  0.0  0.1 10320 1124 ?        S    14:11   0:00 smbd -D
root     16540  0.0  0.1 10564 1500 ?        S    14:29   0:01 smbd -D
root     16544  0.0  0.1 10440 1176 ?        S    14:41   0:01 smbd -D
root     16666  0.0  0.1 10448 1300 ?        S    16:43   0:01 smbd -D
root     16713  0.1  0.7 17100 8124 ?        S    17:25   0:09 smbd -D
10052    16740  1.6  0.1 10448 1708 ?        S    17:42   1:18 smbd -D
root     16761  0.0  0.1 10560 1404 ?        S    18:05   0:01 smbd -D
10052    16778  0.0  0.1 10448 1452 ?        S    18:17   0:01 smbd -D
10052    16788  0.9  0.1 10452 1672 ?        S    18:29   0:18 smbd -D
10052    16791  0.2  0.1 10436 1616 ?        S    18:30   0:04 smbd -D
root     16792  0.2  0.1 10788 1900 ?        S    18:30   0:05 smbd -D
10052    16850  0.1  0.1 10436 1604 ?        S    18:32   0:03 smbd -D
10052    16852  0.0  0.1 10320 1508 ?        S    18:32   0:00 smbd -D
root     16855  0.1  0.1 10604 1832 ?        S    18:32   0:03 smbd -D
10052    16857  0.2  0.1 10436 1604 ?        S    18:33   0:03 smbd -D
10052    16868  1.2  0.1 10452 1692 ?        S    18:35   0:19 smbd -D
10052    16873  0.3  0.2 10816 2372 ?        S    18:36   0:06 smbd -D
root     16883  0.0  0.1 10204 1092 ?        S    18:38   0:00 smbd -D
root     16886  0.0  0.1 10320 1120 ?        S    18:38   0:00 smbd -D
root     16889  0.3  0.1 10784 1924 ?        S    18:38   0:05 smbd -D
root     16890  0.0  0.1 10320 1216 ?        S    18:38   0:00 smbd -D
10052    16941  0.9  0.1 10448 1692 ?        S    18:39   0:13 smbd -D
10052    16942  0.6  0.1 10448 1668 ?        S    18:39   0:10 smbd -D
10052    16943  1.2  0.1 10448 1684 ?        S    18:39   0:17 smbd -D
10052    16946  1.0  0.1 10452 1656 ?        S    18:40   0:15 smbd -D
10052    16947  0.1  0.1 10436 1612 ?        S    18:40   0:02 smbd -D
10052    16949  0.4  0.2 10848 2764 ?        S    18:41   0:05 smbd -D
10052    17013  0.4  0.2 10468 2636 ?        S    19:01   0:00 smbd -D
root     17021  0.0  0.0  3672  636 pts/0    D    19:03   0:00 grep smbd
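
To see which of these long-running smbd processes still hold live client
connections, the check I intend to run is smbstatus, for example:

   smbstatus -b    # brief listing of connected users and machines
   smbstatus -p    # print just the list of smbd processes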


Following is the output of service smb status
--------------------------------------------------

smbd (pid 17013 16949 16947 16946 16943 16942 16941 16890 16889 16886
16883 16873 16868 16857 16855 16852 
16850 16792 16791 16788 16778 16761 16740 16713 16666 16544 16540 16511
16491 16430 16428 16384 16341 16325 
16290 16214 16166 15559 15549 14339 13913 4145 2360 1747 1746 1703 1422
1421 1420 1419 1418 1338 1316 1315 
1313 1312 1306 1298) is running...
nmbd (pid 1302) is running...
winbindd (pid 1308 1307) is running...


Following is the output of top
---------------------------------

 19:02:50  up 7 days,  4:15,  2 users,  load average: 6.61, 8.09, 9.57
117 processes: 116 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    2.9%    0.0%   12.7%   9.8%    10.7%   63.7%    0.0%
Mem:  1030408k av, 1021836k used,    8572k free,       0k shrd,   27200k buff
                    776876k actv,   96396k in_d,   13936k in_c
Swap: 4192956k av,    4504k used, 4188452k free                  876572k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
16946 clustadm  16   0  1624 1624  1036 S     9.8  0.1   0:14   0 smbd
16943 clustadm  15   0  1664 1664  1044 D     3.9  0.1   0:17   0 smbd
16740 clustadm  15   0  1696 1696  1056 D     0.9  0.1   1:18   0 smbd
17019 root      20   0  1220 1220   904 R     0.9  0.1   0:00   0 top
    1 root      15   0   500  500   448 S     0.0  0.0   0:03   0 init
    2 root      15   0     0    0     0 SW    0.0  0.0   0:07   0 keventd
    3 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
    6 root      15   0     0    0     0 SW    0.0  0.0   0:04   0 bdflush
    4 root      15   0     0    0     0 SW    0.0  0.0   5:54   0 kswapd
    5 root      15   0     0    0     0 SW    0.0  0.0   0:21   0 kscand
    7 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kupdated
    8 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
   14 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 ahd_dv_0
   15 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 ahd_dv_1
   16 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 scsi_eh_0
   17 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 scsi_eh_1
   20 root      21   0     0    0     0 SW    0.0  0.0   0:00   0 qla2300_dpc2
   21 root      20   0     0    0     0 SW    0.0  0.0   0:00   0 qla2300_dpc3
   22 root      20   0     0    0     0 SW    0.0  0.0   0:00   0 scsi_eh_2
   23 root      20   0     0    0     0 SW    0.0  0.0   0:00   0 scsi_eh_3
   26 root      15   0     0    0     0 SW    0.0  0.0   0:04   0 kjournald
   98 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 khubd
  295 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  296 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  297 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  298 root      15   0     0    0     0 DW    0.0  0.0   4:34   0 kjournald
  299 root      15   0     0    0     0 SW    0.0  0.0   0:09   0 kjournald
  300 root      15   0     0    0     0 SW    0.0  0.0   0:12   0 kjournald
  301 root      15   0     0    0     0 SW    0.0  0.0   0:04   0 kjournald



In order to fine-tune Samba, I changed the following line in smb.conf, but
the problem still continues:

/etc/samba/smb.conf

name resolve order = lmhosts wins bcast
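
For completeness, that line sits in the [global] section of our smb.conf
roughly as below (the workgroup and WINS server values here are
placeholders, not our actual settings):

   [global]
      workgroup = OURDOMAIN
      wins server = 192.168.1.1
      name resolve order = lmhosts wins bcast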



Please advise me on how to solve this problem.


Thanks for any help


Narayanan S




