PR#21625: Smbd processes get stuck and consume 100% CPU.
Michael Gerdts
Michael.Gerdts at usa.alcatel.com
Tue Aug 21 13:16:10 GMT 2001
I have captured the following info that is likely related to this problem.
This is on a box running SPARC Solaris 8 MU4 + recommended patches.
The really interesting part of this is that smbstatus and lsof disagree
about which host this smbd process is talking to: smbstatus reports
ra070143, while lsof shows the TCP connection going to RA001962 (lsof was
not compiled on this kernel patch rev, so its output may be unreliable).
The process that is spinning is continuously locking locking.tdb at a
rate of 100 times or more per second.
Process 28077 is spinning. truss tells me:
# truss -p 28077
*** SUID: ruid/euid/suid = 0 / 12651 / 12651 ***
*** SGID: rgid/egid/sgid = 0 / 620 / 620 ***
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
fcntl(13, F_SETLKW, 0xFFBEF1A8) = 0
...
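
For anyone who wants to reproduce the syscall pattern above outside of smbd,
here is a minimal sketch in Python (not Samba code). It assumes that
fcntl.lockf() without LOCK_NB is issued as a blocking F_SETLKW request, for
unlocks as well as locks; the function name lock_unlock_cycle and the
one-byte range are made up for illustration.

```python
import fcntl
import os
import tempfile


def lock_unlock_cycle(path, offset, count):
    """Take and release an exclusive one-byte lock `count` times.

    Returns the number of locking calls issued (two per cycle).
    """
    calls = 0
    fd = os.open(path, os.O_RDWR)
    try:
        for _ in range(count):
            # Blocking exclusive lock on one byte at `offset`.
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, offset, os.SEEK_SET)
            calls += 1
            # Release the same byte range.
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, offset, os.SEEK_SET)
            calls += 1
    finally:
        os.close(fd)
    return calls


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(lock_unlock_cycle(tmp.name, 0, 100))
```

If nobody else holds the byte range, every call returns immediately. That
would fit the truss output above, where each fcntl() comes back with 0
instead of sleeping, so the spinning process does not appear to be waiting
on another lock holder.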
# smbstatus | grep 28077
INFO: Debug class all level = 1 (pid 2695 from pid 2695)
userabcd cderinge groupa 28077 ra070143 (143.209.70.143) Wed Aug 15 15:46:08 2001
IPC$ nobody nobody 28077 ra070143 (143.209.70.143) Wed Aug 15 16:54:41 2001
a__share userabcd groupa 28077 ra070143 (143.209.70.143) Wed Aug 15 15:36:26 2001
[ locks omitted ]
# lsof -p 28077
lsof: WARNING: access /.lsof_smbserver: No such file or directory
lsof: WARNING: created device cache file: /.lsof_smbserver
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
smbd 28077 userabcd cwd VDIR 247,9587 1536 2 /automount/a__share
smbd 28077 userabcd txt VREG 85,0 1711852 329184 / (/dev/md/dsk/d0)
smbd 28077 userabcd txt-r VREG 85,0 65536 180050 /opt/samba/var/locks/locking.tdb
smbd 28077 userabcd txt VREG 85,0 204800 180046 / (/dev/md/dsk/d0)
smbd 28077 userabcd txt VREG 85,0 45816 44316 /usr/lib/nss_nis.so.1
smbd 28077 userabcd txt VREG 85,0 44892 44561 /usr/lib/nss_files.so.1
smbd 28077 userabcd txt-r VREG 85,0 16384 180055 /opt/samba/var/locks/sessionid.tdb
smbd 28077 userabcd txt-r VREG 85,0 8192 180053 /opt/samba/var/locks/share_info.tdb
smbd 28077 userabcd txt-r VREG 85,0 8192 180052 /opt/samba/var/locks/ntdrivers.tdb
smbd 28077 userabcd txt-r VREG 85,0 8192 180051 /opt/samba/var/locks/printing.tdb
smbd 28077 userabcd txt-r VREG 85,0 16384 180049 /opt/samba/var/locks/brlock.tdb
smbd 28077 userabcd txt VREG 85,0 696 180043 /opt/samba/var/locks/messages.tdb
smbd 28077 userabcd txt VREG 85,0 17096 529155 / -- sun4u/lib/libc_psr.so.1
smbd 28077 userabcd txt VREG 85,0 1136608 44410 /usr/lib/libc.so.1
smbd 28077 userabcd txt VREG 85,0 24968 44282 /usr/lib/libmp.so.2
smbd 28077 userabcd txt VREG 85,0 8192 288411 / (/dev/md/dsk/d0)
smbd 28077 userabcd txt VREG 85,0 884548 44600 /usr/lib/libnsl.so.1
smbd 28077 userabcd txt VREG 85,0 70260 44303 /usr/lib/libsocket.so.1
smbd 28077 userabcd txt VREG 85,0 42184 44267 /usr/lib/libgen.so.1
smbd 28077 userabcd txt VREG 85,0 22964 44300 /usr/lib/libsec.so.1
smbd 28077 userabcd txt VREG 85,0 4624 44253 /usr/lib/libdl.so.1
smbd 28077 userabcd txt VREG 85,0 195104 44148 /usr/lib/ld.so.1
smbd 28077 userabcd 0u VCHR 13,2 0t0 549537 /devices/pseudo/mm@0:null
smbd 28077 userabcd 1u VCHR 13,2 0t0 549537 /devices/pseudo/mm@0:null
smbd 28077 userabcd 2u VCHR 13,2 0t0 549537 /devices/pseudo/mm@0:null
smbd 28077 userabcd 3r DOOR 244,0 0t0 40919 /etc/.name_service_door (door to nscd[212])
smbd 28077 userabcd 4u VREG 85,0 8192 288411 / (/dev/md/dsk/d0)
smbd 28077 userabcd 5u IPv4 0x30002834920 0t0 UDP localhost:44112 (Idle)
smbd 28077 userabcd 6w VREG 85,0 20 179925 / (/dev/md/dsk/d0)
smbd 28077 userabcd 7u VREG 85,0 696 180043 /opt/samba/var/locks/messages.tdb
smbd 28077 userabcd 8u VREG 85,0 204800 180046 / (/dev/md/dsk/d0)
smbd 28077 userabcd 9ur VREG 85,0 16384 180049 /opt/samba/var/locks/brlock.tdb
smbd 28077 userabcd 10u FIFO 0x3000c014220 0t0 20661 (fifofs) PIPE->0x3000c014308
smbd 28077 userabcd 11u FIFO 0x3000c014308 0t0 20661 (fifofs) PIPE->0x3000c014220
smbd 28077 userabcd 12u IPv4 0x30001ea6030 0xb3c0cea TCP smbserver:*->RA001962:* (IDLE)
smbd 28077 userabcd 13ur VREG 85,0 65536 180050 /opt/samba/var/locks/locking.tdb
smbd 28077 userabcd 14ur VREG 85,0 8192 180051 /opt/samba/var/locks/printing.tdb
smbd 28077 userabcd 15ur VREG 85,0 8192 180052 /opt/samba/var/locks/ntdrivers.tdb
smbd 28077 userabcd 16ur VREG 85,0 8192 180053 /opt/samba/var/locks/share_info.tdb
smbd 28077 userabcd 17u IPv4 0x3000279b308 0t0 UDP *:44121 (Idle)
smbd 28077 userabcd 18u FIFO 0x300027e6420 0t0 20674 (fifofs) PIPE->0x300027e6508
smbd 28077 userabcd 19u FIFO 0x300027e6508 0t0 20674 (fifofs) PIPE->0x300027e6420
smbd 28077 userabcd 20r VREG 247,9587 10633 6941507 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 21ur VREG 85,0 16384 180055 /opt/samba/var/locks/sessionid.tdb
smbd 28077 userabcd 22w VREG 85,0 47933281 492386 / (/dev/md/dsk/d0)
smbd 28077 userabcd 23u VREG 247,9587 1868800 5213085 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 24r VREG 247,9587 4270 6941506 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 25u VREG 247,9587 0 6941504 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 26r VREG 247,9587 27378 6293442 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 28r VREG 247,9587 81397 6347403 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 29r VREG 247,9587 8885 6347418 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 30r VREG 247,9587 15608 6414908 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 31r VREG 247,9587 38988 6414907 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 32r VREG 247,9587 32096 6414906 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 33r VREG 247,9587 22028 6414905 /automount/a__share (nfsserver:/export/a__share)
smbd 28077 userabcd 34r VREG 247,9587 34420 6414904 /automount/a__share (nfsserver:/export/a__share)
Meanwhile, a different process (pid 729) is trying to get pid 28077 to give up its oplock:
[2001/08/20 14:06:35, 0, pid=729] smbd/oplock.c:request_oplock_break(997)
request_oplock_break: no response received to oplock break request to pid 28077 on port 44112 for dev = 3dc2573, inode = 6347403
for dev = 3dc2573, inode = 6347403, tv_sec = 3b7d5ec8, tv_usec = 366bd
[2001/08/20 14:07:07, 0, pid=729] smbd/oplock.c:request_oplock_break(997)
request_oplock_break: no response received to oplock break request to pid 28077 on port 44112 for dev = 3dc2573, inode = 6347403
for dev = 3dc2573, inode = 6347403, tv_sec = 3b7d5ec8, tv_usec = 366bd
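
The dev and tv_sec/tv_usec fields in these log lines can be matched against
the lsof output with a little arithmetic. The sketch below assumes that dev,
tv_sec and tv_usec are printed in hex, and that dev is a 32-bit Solaris dev_t
with a 14-bit major and an 18-bit minor number. The helper names decode_dev
and decode_timeval are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 32-bit Solaris dev_t, 14-bit major / 18-bit minor.
MINOR_BITS = 18


def decode_dev(dev_hex):
    """Split a hex dev value into (major, minor)."""
    dev = int(dev_hex, 16)
    return dev >> MINOR_BITS, dev & ((1 << MINOR_BITS) - 1)


def decode_timeval(sec_hex, usec_hex):
    """Turn hex tv_sec / tv_usec fields into a UTC datetime."""
    sec = int(sec_hex, 16)
    usec = int(usec_hex, 16)
    return datetime.fromtimestamp(sec, tz=timezone.utc) + timedelta(microseconds=usec)


if __name__ == "__main__":
    print(decode_dev("3dc2573"))
    print(decode_timeval("3b7d5ec8", "366bd"))
```

Under those assumptions dev = 3dc2573 decodes to 247,9587, which is the
DEVICE that lsof shows for /automount/a__share, and inode 6347403 is the
NODE of fd 28 (28r) in the lsof listing. The timestamp decodes to
2001-08-17 UTC.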
(kill -USR1 does not seem to raise the debug level any more...)
When I use gcore to get a core file, I see:
# gdb /opt/samba/sbin/smbd core.28077 (this is the spinning process)
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.8"...
Core was generated by `/opt/samba/sbin/smbd -D -d 1 -l /var/opt/samba/log.smb'.
Reading symbols from /usr/lib/libsec.so.1...done.
Reading symbols from /usr/lib/libgen.so.1...done.
Reading symbols from /usr/lib/libsocket.so.1...done.
Reading symbols from /usr/lib/libnsl.so.1...done.
Reading symbols from /usr/lib/libdl.so.1...done.
Reading symbols from /usr/lib/libc.so.1...done.
Reading symbols from /usr/lib/libmp.so.2...done.
Reading symbols from /usr/platform/SUNW,UltraSPARC-IIi-cEngine/lib/libc_psr.so.1...done.
Reading symbols from /usr/lib/nss_files.so.1...done.
Reading symbols from /usr/lib/nss_nis.so.1...done.
#0 0xff217d00 in __fcntl () from /usr/lib/libc.so.1
(gdb) where
#0 0xff217d00 in __fcntl () from /usr/lib/libc.so.1
#1 0xff21261c in s_fcntl () from /usr/lib/libc.so.1
#2 0x10fa5c in tdb_brlock ()
#3 0x10fca8 in tdb_unlock ()
#4 0x11213c in tdb_chainunlock ()
#5 0xe48a4 in unlock_share_entry ()
#6 0x5eaa0 in open_mode_check ()
#7 0x5f140 in open_file_shared ()
#8 0x4570c in reply_ntcreate_and_X ()
#9 0x695cc in switch_message ()
#10 0x69658 in construct_reply ()
#11 0x698ac in process_smb ()
#12 0x6a250 in smbd_process ()
#13 0x2e3bc in main ()
(gdb)