Samba 2.2.0 locking problem (long)
Tony Jago
T.Jago at its.uq.edu.au
Wed May 30 01:01:50 GMT 2001
Hi all,
<executive summary>
I would like some help solving some problems we are having with
the current SAMBA_2_2 cvs tree (and with the 2.2.0 release). Currently we are
running samba 2.2 cvs as of about 10pm 29/5/2001 on a Solaris 8 machine.
We are seeing samba processes get stuck calling fcntl over and over.
</executive summary>
I compiled the code with the following options:
configure --with-quotas --with-profile --with-pam --prefix=/usr/local
We run a fairly complex setup with 4 virtual samba machines:
staff and student: basically both of these are the same. They use
encrypt passwords = no and authenticate using a pam module to either
an ldap or kerberos server (this bit works fine).
soefile: is part of a domain called UQ. It uses security = domain to pass
authentication through to an NT PDC (this works fine).
adminfs: is part of the ADMIN domain. Authenticates against an NT server.
(this works fine).
my config files are basically as follows:
smb.conf:
[global]
workgroup = UQ
log file = /var/log/samba/smb.%m
log level = 1
max log size = 5000
netbios name = staff
netbios aliases = student soefile adminfs
server string = UQ Central File Server
username map = /usr/local/etc/users.map
hosts allow = 127. 192.168. 130.102.
interfaces = 130.102.5.50/24
bind interfaces only = true
wins server = 130.102.2.112
include = /usr/local/etc/smb.conf.%L
include = /usr/local/etc/smbshares.conf
smb.conf.soefile:
netbios name = soefile
security = domain
password server = *
encrypt passwords = yes
smb.conf.adminfs:
workgroup = ADMIN
netbios name = adminfs
security = domain
password server = SOCRATES
encrypt passwords = yes
This setup basically works but I should let you know about a few other
weird things we have here. Nearly all of the clients to the soefile
machine are NT servers running Citrix Metaframe. This means that we are
seeing many many many different users coming from a single NT client and
therefore hitting a single smbd process. This wasn't so much of a problem
in 2.0.7 of samba except that it overflowed the maximum number of open
files. In Solaris I can increase the number of file descriptors to a very
large number, but descriptors above 255 can't be used with the fopen
library call, so samba needs to use the open system call instead. This is
fine for the most part, as file access does seem to use open, but there are
a few situations, like opening the machine account file, where 2.0.7 was
using fopen and this was failing. I haven't seen this exact problem with
2.2.0, but I have given you all this information to help you understand
our complex setup.
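To illustrate the fopen workaround described above, here is a minimal sketch of reading a small file with open() and read() only, so that no stdio stream is involved and a descriptor number above 255 does not matter. The helper name read_small_file is hypothetical and is not taken from the Samba source; it only shows the general approach.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to bufsize-1 bytes of a small file using open()/read() instead
 * of fopen(), then NUL-terminate the buffer.
 * Returns the number of bytes read, or -1 on error.
 * Hypothetical helper for illustration; not a Samba function. */
ssize_t read_small_file(const char *path, char *buf, size_t bufsize)
{
    size_t total = 0;
    int fd;

    if (bufsize == 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    while (total < bufsize - 1) {
        ssize_t n = read(fd, buf + total, bufsize - 1 - total);
        if (n == -1) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            close(fd);
            return -1;
        }
        if (n == 0)               /* end of file */
            break;
        total += (size_t)n;
    }

    buf[total] = '\0';
    close(fd);
    return (ssize_t)total;
}
```

A caller would pass a path and a buffer, for example read_small_file(path, buf, sizeof buf), and check for a return value of -1.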
Problem 1: A minor problem is that I need to specify a password server for
the adminfs machine (i.e. I can't use password server = *). I think it may
be using the UQ domain's PDC when I have * selected.
Problem 2: The major problem is that smbd processes are starting to go into
spins looking for a lock.
This is the top of a top display. As you can see, there are currently 3
samba processes in a death spin, each consuming nearly 100% of a cpu (it's
a 16-way box, so 100 / 16 = 6.25).
load averages: 11.00, 11.13, 10.64 10:37:33
487 processes: 476 sleeping, 2 running, 2 zombie, 7 on cpu
CPU states: 33.2% idle, 25.1% user, 33.8% kernel, 8.0% iowait, 0.0% swap
Memory: 16G real, 6457M free, 4678M swap in use, 6935M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
9575 nsslapd 18 0 0 2492M 1646M cpu22 687:20 7.28% ns-slapd
8751 root 1 0 0 5000K 4136K cpu24 15:51 5.33% smbd
29211 root 1 0 0 4824K 3680K cpu17 34:50 5.32% smbd
8857 root 1 0 0 4696K 3728K cpu31 38:46 5.26% smbd
18199 root 133 0 0 490M 489M run 93.4H 4.62% squid
13890 root 1 0 0 3584K 2792K sleep 493:37 0.90% top
If I run truss on one of these processes, it produces this:
# truss -p 8751
*** SUID: ruid/euid/suid = 0 / 12444 / 12444 ***
*** SGID: rgid/egid/sgid = 0 / 50 / 50 ***
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
fcntl(13, F_SETLKW, 0xFFBEF218) = 0
(continues to scroll rapidly up the screen)
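For readers unfamiliar with the call in the trace above, the following is a minimal sketch of what a single fcntl(fd, F_SETLKW, ...) does: it sets or clears an advisory byte-range lock and waits if another process holds a conflicting one. The helper lock_byte is hypothetical and is not the tdb or smbd code. The trace shows only the address of the struct flock, so it does not say whether each call is taking or releasing a lock; both cases return 0 on success.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Set or clear an advisory lock on a single byte of fd, blocking until
 * any conflicting lock held by another process is released.
 * type is F_RDLCK, F_WRLCK or F_UNLCK.
 * Returns 0 on success and -1 on error, like fcntl() itself.
 * Hypothetical helper for illustration; not taken from Samba. */
int lock_byte(int fd, off_t offset, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;   /* offset is measured from start of file */
    fl.l_start = offset;
    fl.l_len = 1;             /* lock exactly one byte */

    return fcntl(fd, F_SETLKW, &fl);
}
```

A lock followed by an unlock, lock_byte(fd, 0, F_WRLCK) then lock_byte(fd, 0, F_UNLCK), would show up in truss as two successive F_SETLKW calls returning 0.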
smbstatus is as below (dates trimmed off to fit in 80 char width):
# smbstatus | grep 8751
INFO: Debug class all level = 1 (pid 9319 from pid 9319)
uqbmosed uqbmosed staff 8751 ps02 (130.102.5.81)
repository uqschin staff 8751 ps02 (130.102.5.81)
uqacurra uqacurra staff 8751 ps02 (130.102.5.81)
repository vajmarsh staff 8751 ps02 (130.102.5.81)
uqsway uqsway staff 8751 ps02 (130.102.5.81)
repository iosdenga staff 8751 ps02 (130.102.5.81)
uqiholme uqiholme staff 8751 ps02 (130.102.5.81)
IPC$ iosdenga staff 8751 ps02 (130.102.5.81)
repository uqlsharm staff 8751 ps02 (130.102.5.81)
gammarsh gammarsh staff 8751 ps02 (130.102.5.81)
mlwchris mlwchris staff 8751 ps02 (130.102.5.81)
repository gammarsh staff 8751 ps02 (130.102.5.81)
uqmschmi uqmschmi staff 8751 ps02 (130.102.5.81)
uqchoney uqchoney staff 8751 ps02 (130.102.5.81)
vajmarsh vajmarsh staff 8751 ps02 (130.102.5.81)
repository uqbmosed staff 8751 ps02 (130.102.5.81)
uqschin uqschin staff 8751 ps02 (130.102.5.81)
IPC$ nobody nobody 8751 ps02 (130.102.5.81)
uqgbugal uqgbugal staff 8751 ps02 (130.102.5.81)
iosdenga iosdenga staff 8751 ps02 (130.102.5.81)
repository mlwchris staff 8751 ps02 (130.102.5.81)
8751 DENY_ALL WRONLY EXCLUSIVE+BATCH
/export/home/staff/io/iosdenga/BusinessObjects5/UserLibs/extfunct.txt
output of pfiles:
# pfiles 8751
8751: /usr/local/sbin/smbd
Current rlimit: 10010 file descriptors
0: S_IFCHR mode:0666 dev:85,10 ino:223907 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
1: S_IFCHR mode:0666 dev:85,10 ino:223907 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
2: S_IFCHR mode:0666 dev:85,10 ino:223907 uid:0 gid:3 rdev:13,2
O_RDWR|O_LARGEFILE
3: S_IFDOOR mode:0444 dev:230,0 ino:38738 uid:0 gid:0 size:0
O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[853]
4: S_IFSOCK mode:0666 dev:225,0 ino:8245 uid:0 gid:0 size:0
O_RDWR
sockname: AF_INET 130.102.5.50 port: 139
peername: AF_INET 130.102.5.81 port: 1306
5: S_IFREG mode:0600 dev:75,29001 ino:14069 uid:0 gid:0 size:8192
O_RDWR
6: S_IFREG mode:0644 dev:75,29001 ino:14070 uid:0 gid:0 size:20
O_WRONLY|O_NONBLOCK|O_LARGEFILE
advisory write lock set by process 26646
7: S_IFREG mode:0600 dev:75,29001 ino:14065 uid:0 gid:0 size:696
O_RDWR
advisory read lock set by process 26646
8: S_IFREG mode:0644 dev:75,29001 ino:189762 uid:0 gid:0 size:565248
O_RDWR
advisory read lock set by process 26646
9: S_IFREG mode:0644 dev:75,29001 ino:194155 uid:0 gid:0 size:8192
O_RDWR
advisory read lock set by process 13970
10: S_IFIFO mode:0000 dev:226,0 ino:12873147 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
11: S_IFIFO mode:0000 dev:226,0 ino:12873147 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
12: S_IFSOCK mode:0666 dev:225,0 ino:49209 uid:0 gid:0 size:0
O_RDWR
sockname: AF_INET 127.0.0.1 port: 44843
13: S_IFREG mode:0644 dev:75,29001 ino:194170 uid:0 gid:0 size:90112
O_RDWR
advisory read lock set by process 13970
14: S_IFREG mode:0600 dev:75,29001 ino:194178 uid:0 gid:0 size:8192
O_RDWR
advisory read lock set by process 13970
15: S_IFREG mode:0600 dev:75,29001 ino:194199 uid:0 gid:0 size:8192
O_RDWR
advisory read lock set by process 13970
16: S_IFREG mode:0600 dev:75,29001 ino:194292 uid:0 gid:0 size:8192
O_RDWR
advisory read lock set by process 13970
18: S_IFIFO mode:0000 dev:226,0 ino:13031393 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
19: S_IFIFO mode:0000 dev:226,0 ino:13031393 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
20: S_IFREG mode:0644 dev:75,29001 ino:194573 uid:0 gid:0 size:57344
O_RDWR
advisory read lock set by process 13970
21: S_IFREG mode:0644 dev:75,29000 ino:825 uid:0 gid:0 size:1110905
O_WRONLY|O_APPEND|O_LARGEFILE
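The "advisory read lock set by process ..." lines in the pfiles output can also be checked from inside a program with F_GETLK. The sketch below asks whether a write lock on the whole file would be blocked and, if so, by which process. The function name write_lock_blocker is hypothetical. Note that F_GETLK never reports locks held by the calling process itself, only locks held by other processes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel whether a write lock covering the whole file could be
 * placed on fd right now.
 * Returns the pid of one process holding a conflicting lock,
 * 0 if nothing would block the lock, or -1 on error.
 * Hypothetical helper for illustration; not taken from Samba. */
pid_t write_lock_blocker(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "through end of file" */

    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;

    if (fl.l_type == F_UNLCK) /* kernel found no conflicting lock */
        return 0;

    return fl.l_pid;          /* owner of the first conflicting lock */
}
```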
# lsof -p 8751
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
smbd 8751 iosdenga cwd VDIR 75,43000 8192 51934 /export/home/staff/io/iosdenga
smbd 8751 iosdenga txt VREG 75,29001 1714140 222289 /usr/local/sbin/smbd
smbd 8751 iosdenga txt VREG 85,30 44892 3943 /usr/lib/nss_files.so.1
smbd 8751 iosdenga txt VREG 75,29001 565248 189762 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga txt-r VREG 75,29001 57344 194573 /usr/local -- sessionid.tdb
smbd 8751 iosdenga txt-r VREG 75,29001 8192 194292 /usr/local -- share_info.tdb
smbd 8751 iosdenga txt-r VREG 75,29001 8192 194199 /usr/local -- ntdrivers.tdb
smbd 8751 iosdenga txt-r VREG 75,29001 8192 194178 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga txt-r VREG 75,29001 90112 194170 /usr/local -- locking.tdb
smbd 8751 iosdenga txt-r VREG 75,29001 8192 194155 /usr/local -- brlock.tdb
smbd 8751 iosdenga txt VREG 75,29001 696 14065 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga txt VREG 85,30 17096 247619 /usr/platform/sun4u/lib/libc_psr.so.1
smbd 8751 iosdenga txt VREG 85,30 24968 3578 /usr/lib/libmp.so.2
smbd 8751 iosdenga txt VREG 85,30 1129948 3705 /usr/lib/libc.so.1
smbd 8751 iosdenga txt VREG 85,30 36844 3582 /usr/lib/libpam.so.1
smbd 8751 iosdenga txt VREG 75,29001 8192 14069 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga txt VREG 85,30 888508 3677 /usr/lib/libnsl.so.1
smbd 8751 iosdenga txt VREG 85,30 70260 3599 /usr/lib/libsocket.so.1
smbd 8751 iosdenga txt VREG 85,30 42184 3563 /usr/lib/libgen.so.1
smbd 8751 iosdenga txt VREG 85,30 22964 3596 /usr/lib/libsec.so.1
smbd 8751 iosdenga txt VREG 85,30 4624 3535 /usr/lib/libdl.so.1
smbd 8751 iosdenga txt VREG 85,30 195104 3594 /usr/lib/ld.so.1
smbd 8751 iosdenga 0u VCHR 13,2 0t0 223907 /devices/pseudo/mm@0:null
smbd 8751 iosdenga 1u VCHR 13,2 0t0 223907 /devices/pseudo/mm@0:null
smbd 8751 iosdenga 2u VCHR 13,2 0t0 223907 /devices/pseudo/mm@0:null
smbd 8751 iosdenga 3r DOOR 0,3563983 0t0 38738 door to nscd[853]
smbd 8751 iosdenga 4u IPv4 0x300fcfa9228 0t38776288 TCP matrix1.cc.uq.edu.au:netbios-ssn->ps02.soe.uq.edu.au:1306 (CLOSE_WAIT)
smbd 8751 iosdenga 5u VREG 75,29001 8192 14069 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga 6w VREG 75,29001 20 14070 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga 7u VREG 75,29001 696 14065 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga 8u VREG 75,29001 565248 189762 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga 9ur VREG 75,29001 8192 194155 /usr/local -- brlock.tdb
smbd 8751 iosdenga 10u FIFO 0x3001b1c2a20 0t0 12873147 PIPE->0x3001b1c2b08
smbd 8751 iosdenga 11u FIFO 0x3001b1c2b08 0t0 12873147 PIPE->0x3001b1c2a20
smbd 8751 iosdenga 12u IPv4 0x3010fce0198 0t0 UDP localhost:44843 (Idle)
smbd 8751 iosdenga 13ur VREG 75,29001 90112 194170 /usr/local -- locking.tdb
smbd 8751 iosdenga 14ur VREG 75,29001 8192 194178 /usr/local (/dev/vx/dsk/soetrindg/soetrinlocal)
smbd 8751 iosdenga 15ur VREG 75,29001 8192 194199 /usr/local -- ntdrivers.tdb
smbd 8751 iosdenga 16ur VREG 75,29001 8192 194292 /usr/local -- share_info.tdb
smbd 8751 iosdenga 18u FIFO 0x3010a071e20 0t0 13031393 PIPE->0x3010a071f08
smbd 8751 iosdenga 19u FIFO 0x3010a071f08 0t0 13031393 PIPE->0x3010a071e20
smbd 8751 iosdenga 20ur VREG 75,29001 57344 194573 /usr/local -- sessionid.tdb
smbd 8751 iosdenga 21w VREG 75,29000 1113425 825 /var/log/samba/smb.ps02
interesting bits from the output of smb.ps02:
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:33, 0] smbd/posix_acls.c:create_canon_ace_lists(702)
create_canon_ace_lists: unable to map SID S-1-5-32-544 to uid or gid.
[2001/05/30 10:52:35, 1] smbd/service.c:make_connection(633)
ps02 (130.102.5.81) connect to service edlcowan as user edlcowan (uid=12677, gid=50) (pid 26937)
[2001/05/30 10:52:36, 1] smbd/service.c:make_connection(633)
ps02 (130.102.5.81) connect to service repository as user edlcowan (uid=12677, gid=5
[2001/05/30 10:53:41, 0] smbd/oplock.c:request_oplock_break(995)
request_oplock_break: no response received to oplock break request to pid 8751 on port 44843 for dev = 12ca7f8, inode = 1000248
for dev = 12ca7f8, inode = 1000248, tv_sec = 3b143c9c, tv_usec = d4f38
[2001/05/30 10:54:13, 0] smbd/oplock.c:request_oplock_break(995)
request_oplock_break: no response received to oplock break request to pid 8751 on port 44843 for dev = 12ca7f8, inode = 1000248
for dev = 12ca7f8, inode = 1000248, tv_sec = 3b143c9c, tv_usec = d4f38
[2001/05/30 10:54:45, 0] smbd/oplock.c:request_oplock_break(995)
request_oplock_break: no response received to oplock break request to pid 8751 on port 44843 for dev = 12ca7f8, inode = 1000248
for dev = 12ca7f8, inode = 1000248, tv_sec = 3b143c9c, tv_usec = d4f38
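The tv_sec value in these oplock messages can be turned into a readable time. The sketch below assumes that tv_sec is printed as a hexadecimal count of seconds since the Unix epoch; that is an assumption about the log format, not something confirmed here. The function name hex_secs_to_utc is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Convert a hexadecimal seconds-since-epoch string (for example the
 * tv_sec field in the log lines above) into "YYYY-MM-DD HH:MM:SS" in UTC.
 * Returns 0 on success and -1 if the input cannot be parsed or formatted.
 * Hypothetical helper for illustration; not taken from Samba. */
int hex_secs_to_utc(const char *hex, char *out, size_t outsize)
{
    char *end;
    unsigned long secs;
    time_t t;
    struct tm *tm_utc;

    errno = 0;
    secs = strtoul(hex, &end, 16);
    if (end == hex || *end != '\0' || errno != 0)
        return -1;

    t = (time_t)secs;
    tm_utc = gmtime(&t);      /* break the time down in UTC */
    if (tm_utc == NULL)
        return -1;

    if (strftime(out, outsize, "%Y-%m-%d %H:%M:%S", tm_utc) == 0)
        return -1;

    return 0;
}
```

Under that assumption, 3b143c9c decodes to 2001-05-30 00:19:40 UTC, which is 10:19:40 in Brisbane (UTC+10), roughly half an hour before the first break request logged above.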
Any assistance at all with this problem would be greatly appreciated. And
if you are still looking for 2.2.1 release candidate problems, I would class
this as a major one.
Thanks in advance for your help,
Tony
---
Tony Jago, System Administrator, E-Mail: T.Jago at its.uq.edu.au
Server and Security Group, Phone: +61 7 33654078
Information Technology Services, Fax: +61 7 33654065
The University of Queensland. Brisbane, Australia. 4072.