[Samba] Samba Storage Server getting extremely slow!!!
Narayanan Subramaniam
narayanan.s at focuzinfotech.com
Sun Apr 9 10:52:53 GMT 2006
Dear all,
I am running a clustering setup with a Linux Samba server as the output
file server, on RHEL 3 Workstation. All the clustering output data is
written to this Linux Samba server, which is shared to more than 40
machines on the network.
All the machines run Windows XP and log on to a Windows 2000 domain
server.
The Linux Samba server is a workgroup member of this Windows 2000
server.
The complete network is Gigabit Ethernet. All the cluster nodes act
as SMB clients and mount the output data shares using Samba.
The storage server is an IBM server connected to 3 TB of external
Fibre Channel storage. The storage is almost used up, but
some 4 to 5 GB is still free in each partition.
Now the problem is that the storage server is getting very slow,
and writing data has become extremely difficult.
Following is the log output from /var/log/samba/smbd.log
------------------------------------------------------------------------
Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
param/loadparm.c:map_parameter(2462)
Mar 29 15:22:29 storage winbindd[1305]: Unknown parameter encountered:
"revalidate"
Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
param/loadparm.c:lp_do_parameter(3144)
Mar 29 15:22:29 storage winbindd[1305]: Ignoring unknown parameter
"revalidate"
Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
nsswitch/winbindd_util.c:winbindd_param_init(555)
Mar 29 15:22:29 storage winbindd[1305]: winbindd: idmap uid range
missing or invalid
Mar 29 15:22:29 storage winbindd[1305]: [2006/03/29 15:22:29, 0]
nsswitch/winbindd_util.c:winbindd_param_init(556)
Mar 29 15:22:29 storage winbindd[1305]: winbindd: cannot continue,
exiting.
Mar 29 15:22:29 storage smb: winbindd startup succeeded
Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:write_socket_data(430)
Mar 29 15:25:10 storage smbd[1319]: write_socket_data: write failure.
Error = Connection reset by peer
Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:write_socket(455)
Mar 29 15:25:10 storage smbd[1319]: write_socket: Error writing 4 bytes
to socket 25: ERRNO = Connection reset by peer
Mar 29 15:25:10 storage smbd[1319]: [2006/03/29 15:25:10, 0]
lib/util_sock.c:send_smb(647)
Mar 29 15:25:10 storage smbd[1319]: Error writing 4 bytes to client. -1.
(Connection reset by peer)
Mar 29 15:25:13 storage smbd[1323]: [2006/03/29 15:25:13, 0]
lib/util_sock.c:get_peer_addr(1150)
Mar 29 15:25:13 storage smbd[1323]: getpeername failed. Error was
Transport endpoint is not connected
Mar 29 15:25:13 storage smbd[1323]: [2006/03/29 15:25:13, 0]
lib/util_sock.c:write_socket_data(430)
Mar 29 15:25:13 storage smbd[1323]: write_socket_data: write failure.
Error =
----------------------------------------------------------------------------
Following is the output of df -h
----------------------------------
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda6   2.9G  1.9G  936M   67%   /
/dev/sda1   99M   15M   79M    17%   /boot
none        504M  0     504M   0%    /dev/shm
/dev/sda3   3.9G  2.4G  1.3G   66%   /usr
/dev/sda5   2.9G  189M  2.6G   7%    /var
/dev/sdb1   596G  510G  56G    91%   /share/share1
/dev/sdb5   459G  425G  11G    98%   /share/share2
/dev/sdb6   230G  205G  13G    95%   /share/share3
/dev/sdb7   547G  516G  3.4G   100%  /share/backup
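For anyone who wants to check the same thing on their own server, the nearly full partitions can be picked out of the df -h output with a short script. This is only a sketch: the function name nearly_full and the 90% threshold are assumptions for illustration, and it assumes the usual six-column df -h layout with each entry on a single line.

```python
def nearly_full(df_lines, threshold=90):
    """Return (mount point, use %, available) for filesystems at or above threshold.

    Hypothetical helper: expects lines in the usual `df -h` layout
    (Filesystem, Size, Used, Avail, Use%, Mounted on).
    """
    flagged = []
    for line in df_lines:
        # Split into at most six fields so a mount point with spaces stays whole
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        try:
            percent = int(fields[4].rstrip("%"))
        except ValueError:
            continue  # header line ("Use%") or anything else unparseable
        if percent >= threshold:
            flagged.append((fields[5], percent, fields[3]))
    return flagged


# A few lines taken from the df -h output above
sample_df = [
    "Filesystem Size Used Avail Use% Mounted on",
    "/dev/sda6 2.9G 1.9G 936M 67% /",
    "/dev/sdb1 596G 510G 56G 91% /share/share1",
    "/dev/sdb5 459G 425G 11G 98% /share/share2",
    "/dev/sdb6 230G 205G 13G 95% /share/share3",
    "/dev/sdb7 547G 516G 3.4G 100% /share/backup",
]

for mount, percent, avail in nearly_full(sample_df):
    print(f"{mount}: {percent}% used, {avail} available")
```

With the default threshold this lists the four /share partitions and skips the root filesystem.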
Following is the output of ps aux. Here we can see that a lot of
smbd processes created between March 29 and April 4 are still running.
--------------------------------------
root 1298 0.0 0.0 9704 800 ? S Mar29 0:01 smbd -D
root 1306 0.0 0.0 9700 472 ? S Mar29 0:00 smbd -D
10052 1312 0.0 0.1 10260 1360 ? S Mar29 0:45 smbd -D
10052 1313 0.0 0.1 10268 1364 ? S Mar29 0:41 smbd -D
root 1315 0.0 0.1 10328 1352 ? S Mar29 1:17 smbd -D
root 1316 0.0 0.1 10692 1840 ? S Mar29 1:05 smbd -D
root 1338 0.0 0.1 10316 1156 ? S Mar29 0:02 smbd -D
10052 1418 0.0 0.1 10328 1332 ? S Mar29 0:40 smbd -D
10052 1419 0.0 0.1 10392 1384 ? S Mar29 0:43 smbd -D
10052 1420 0.0 0.1 10384 1316 ? S Mar29 0:40 smbd -D
10052 1421 0.0 0.1 10384 1316 ? S Mar29 0:40 smbd -D
10052 1422 0.0 0.1 10376 1292 ? S Mar29 0:46 smbd -D
10052 1703 0.0 0.1 10412 1504 ? S Mar29 0:41 smbd -D
10052 1746 0.0 0.1 10460 1676 ? S Mar29 0:39 smbd -D
10052 1747 0.0 0.1 10460 1648 ? S Mar29 0:37 smbd -D
root 2360 0.0 0.7 16580 7512 ? S Mar29 3:43 smbd -D
root 4145 0.0 0.1 10516 1448 ? S Mar30 1:05 smbd -D
root 13913 0.0 0.1 11096 2024 ? S Apr02 0:06 smbd -D
root 14339 0.0 0.2 11140 2152 ? S Apr03 0:51 smbd -D
root 15549 0.0 0.3 12452 3468 ? S Apr04 1:05 smbd -D
root 15559 0.0 0.1 10660 1520 ? S Apr04 0:02 smbd -D
root 16166 0.0 0.3 12332 3204 ? S 08:53 0:31 smbd -D
root 16214 0.1 0.3 12408 3636 ? S 09:04 0:58 smbd -D
10052 16290 0.0 0.2 10696 2508 ? R 09:43 0:07 smbd -D
root 16325 0.0 0.1 11088 2020 ? S 10:02 0:14 smbd -D
root 16341 0.2 0.3 12364 3624 ? S 10:07 1:09 smbd -D
root 16384 0.1 0.3 12844 3948 ? S 10:27 0:59 smbd -D
root 16428 0.0 0.2 11528 2484 ? S 11:06 0:15 smbd -D
root 16430 0.0 0.1 10432 1256 ? S 11:19 0:00 smbd -D
root 16491 0.0 0.1 11076 2040 ? S 13:49 0:17 smbd -D
root 16511 0.0 0.1 10320 1124 ? S 14:11 0:00 smbd -D
root 16540 0.0 0.1 10564 1500 ? S 14:29 0:01 smbd -D
root 16544 0.0 0.1 10440 1176 ? S 14:41 0:01 smbd -D
root 16666 0.0 0.1 10448 1300 ? S 16:43 0:01 smbd -D
root 16713 0.1 0.7 17100 8124 ? S 17:25 0:09 smbd -D
10052 16740 1.6 0.1 10448 1708 ? S 17:42 1:18 smbd -D
root 16761 0.0 0.1 10560 1404 ? S 18:05 0:01 smbd -D
10052 16778 0.0 0.1 10448 1452 ? S 18:17 0:01 smbd -D
10052 16788 0.9 0.1 10452 1672 ? S 18:29 0:18 smbd -D
10052 16791 0.2 0.1 10436 1616 ? S 18:30 0:04 smbd -D
root 16792 0.2 0.1 10788 1900 ? S 18:30 0:05 smbd -D
10052 16850 0.1 0.1 10436 1604 ? S 18:32 0:03 smbd -D
10052 16852 0.0 0.1 10320 1508 ? S 18:32 0:00 smbd -D
root 16855 0.1 0.1 10604 1832 ? S 18:32 0:03 smbd -D
10052 16857 0.2 0.1 10436 1604 ? S 18:33 0:03 smbd -D
10052 16868 1.2 0.1 10452 1692 ? S 18:35 0:19 smbd -D
10052 16873 0.3 0.2 10816 2372 ? S 18:36 0:06 smbd -D
root 16883 0.0 0.1 10204 1092 ? S 18:38 0:00 smbd -D
root 16886 0.0 0.1 10320 1120 ? S 18:38 0:00 smbd -D
root 16889 0.3 0.1 10784 1924 ? S 18:38 0:05 smbd -D
root 16890 0.0 0.1 10320 1216 ? S 18:38 0:00 smbd -D
10052 16941 0.9 0.1 10448 1692 ? S 18:39 0:13 smbd -D
10052 16942 0.6 0.1 10448 1668 ? S 18:39 0:10 smbd -D
10052 16943 1.2 0.1 10448 1684 ? S 18:39 0:17 smbd -D
10052 16946 1.0 0.1 10452 1656 ? S 18:40 0:15 smbd -D
10052 16947 0.1 0.1 10436 1612 ? S 18:40 0:02 smbd -D
10052 16949 0.4 0.2 10848 2764 ? S 18:41 0:05 smbd -D
10052 17013 0.4 0.2 10468 2636 ? S 19:01 0:00 smbd -D
root 17021 0.0 0.0 3672 636 pts/0 D 19:03 0:00 grep smbd
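To put a number on how many smbd processes have lingered since earlier days, the listing above can be tallied by its START column. This is a minimal sketch, assuming the standard ps aux column order; the function name count_by_start and the "today" label are made up for illustration.

```python
from collections import Counter


def count_by_start(ps_lines, command="smbd"):
    """Tally processes by the START column (9th field) of `ps aux` output.

    Hypothetical helper: assumes the standard column order
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND.
    """
    counts = Counter()
    for line in ps_lines:
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        # Skip unrelated entries such as the "grep smbd" line itself
        if not fields[10].startswith(command):
            continue
        start = fields[8]
        # ps prints HH:MM for processes started today and a date otherwise
        counts["today" if ":" in start else start] += 1
    return counts


# A few lines taken from the ps aux output above
sample_ps = [
    "root 1298 0.0 0.0 9704 800 ? S Mar29 0:01 smbd -D",
    "10052 1312 0.0 0.1 10260 1360 ? S Mar29 0:45 smbd -D",
    "root 4145 0.0 0.1 10516 1448 ? S Mar30 1:05 smbd -D",
    "root 16166 0.0 0.3 12332 3204 ? S 08:53 0:31 smbd -D",
    "root 17021 0.0 0.0 3672 636 pts/0 D 19:03 0:00 grep smbd",
]

for start, count in count_by_start(sample_ps).items():
    print(f"{start}: {count} smbd process(es)")
```

Feeding it the full listing instead of the sample gives the count per start date, which makes the old processes from March 29 onward easy to spot.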
Following is the output of service smb status
--------------------------------------------------
smbd (pid 17013 16949 16947 16946 16943 16942 16941 16890 16889 16886
16883 16873 16868 16857 16855 16852
16850 16792 16791 16788 16778 16761 16740 16713 16666 16544 16540 16511
16491 16430 16428 16384 16341 16325
16290 16214 16166 15559 15549 14339 13913 4145 2360 1747 1746 1703 1422
1421 1420 1419 1418 1338 1316 1315
1313 1312 1306 1298) is running...
nmbd (pid 1302) is running...
winbindd (pid 1308 1307) is running...
Following is the output of top
---------------------------------
19:02:50 up 7 days, 4:15, 2 users, load average: 6.61, 8.09, 9.57
117 processes: 116 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 2.9% 0.0% 12.7% 9.8% 10.7% 63.7% 0.0%
Mem: 1030408k av, 1021836k used, 8572k free, 0k shrd, 27200k buff
776876k actv, 96396k in_d, 13936k in_c
Swap: 4192956k av, 4504k used, 4188452k free 876572k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
16946 clustadm 16 0 1624 1624 1036 S 9.8 0.1 0:14 0 smbd
16943 clustadm 15 0 1664 1664 1044 D 3.9 0.1 0:17 0 smbd
16740 clustadm 15 0 1696 1696 1056 D 0.9 0.1 1:18 0 smbd
17019 root 20 0 1220 1220 904 R 0.9 0.1 0:00 0 top
1 root 15 0 500 500 448 S 0.0 0.0 0:03 0 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:07 0 keventd
3 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
6 root 15 0 0 0 0 SW 0.0 0.0 0:04 0 bdflush
4 root 15 0 0 0 0 SW 0.0 0.0 5:54 0 kswapd
5 root 15 0 0 0 0 SW 0.0 0.0 0:21 0 kscand
7 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
8 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
14 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahd_dv_0
15 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahd_dv_1
16 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 scsi_eh_0
17 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 scsi_eh_1
20 root 21 0 0 0 0 SW 0.0 0.0 0:00 0 qla2300_dpc2
21 root 20 0 0 0 0 SW 0.0 0.0 0:00 0 qla2300_dpc3
22 root 20 0 0 0 0 SW 0.0 0.0 0:00 0 scsi_eh_2
23 root 20 0 0 0 0 SW 0.0 0.0 0:00 0 scsi_eh_3
26 root 15 0 0 0 0 SW 0.0 0.0 0:04 0 kjournald
98 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
295 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
296 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
297 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
298 root 15 0 0 0 0 DW 0.0 0.0 4:34 0 kjournald
299 root 15 0 0 0 0 SW 0.0 0.0 0:09 0 kjournald
300 root 15 0 0 0 0 SW 0.0 0.0 0:12 0 kjournald
301 root 15 0 0 0 0 SW 0.0 0.0 0:04 0 kjournald
In order to fine-tune Samba, I changed the following line in
/etc/samba/smb.conf, but the problem still continues:
name resolve order = lmhosts wins bcast
Please advise me how to solve this problem.
Thanks for any help
Narayanan S