100% cpu utilization

Scott Moomaw scott at bridgewater.edu
Wed Oct 31 14:06:21 GMT 2001


We continue to experience problems with the latest Samba CVS on Solaris 8
consuming 100% of the CPU.  The problem seems to occur when the system is
under heavy load.

Here's what we see of the problem.

Using top, we note 100% CPU utilization.  There are approximately 200 smb
processes, with a handful of them in a runnable state using up to 2% of
CPU each.  It's hard to get details on one of the problematic processes
because they disappear as quickly as we can identify them.  Using truss,
I find most processes in the expected poll state, but when I can catch
one of the problem processes I see many fcntl calls with calls like
kill(20759, SIG#0) interspersed.  I did manage to grab a core of one of
these processes and have included a stack backtrace below.

__fcntl(0xd,0x23,0x80474c4) + c
fcntl(0xd,0x23,0x80474c4,0x2c13c) + 1f
tdb_brlock(0x81ec8f8,0xa8,0x2,0x23,0x0,0xdfa65b2b,0xd,0x23) + 68
tdb_lock(0x81ec8f8,0x0,0x2,0x16,0x1,0x81e9088) + a2
tdb_chainlock(0x81ec8f8,0x81e8f70,0xc,0x810d0e8,0x81e8f70,0xc) + 2a
delete_fn(0x81ec8f8,0x81e8f70,0xc,0x81e8f7c,0x8a,0x804764c,0x2,0x83) + 3d
tdb_traverse(0x81ec8f8,0x810d0d0,0x804764c,0x806e467) + 9b
locking_end(0xdfa83000,0x80476c0,0x0,0x0,0x0,0x804768c,0x8047690,0x806da8b) + 47
exit_server(0x8144c80,0x0,0x0,0x0,0x0,0x8047abc) + 160
dflt_sig(0xf,0x0,0x80476c0) + 13
sigacthandler() + 25
dbg_mask(0x15,0x8047a3c,0x0,0x0,0x8047a34,0x5) + 2044f887
sys_select(0xd,0x8047a3c,0x8047a34,0x80a12f9) + c7
receive_message_or_smb(0x8210991,0x10040,0xea60,0x80a25e0) + 169
smbd_process(0xdfbed1e8,0x8047b10,0x8047bf8,0x210,0xdfa0d67f,0xdfa0d6a3,0xdfbe137f,0x8047b10,0x8b,0x1,0x8047b48,0x806d94f,0x2,0x8047b54,0x8047b60,0x8144c30) + 11e
main(0x2,0x8047b54,0x8047b60) + 6d9
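
For anyone decoding the truss pattern alongside the backtrace above: the
fcntl calls under tdb_brlock are byte-range lock requests on the tdb file,
and kill(pid, SIG#0) sends no signal at all; signal 0 only asks the kernel
whether the pid still exists.  The following is a minimal sketch of those
two system-call patterns, not Samba's actual code.  The helper names
byte_lock and process_exists are hypothetical, and the choice of a blocking
F_SETLKW request is an assumption on my part.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Samba's tdb_brlock): take or release a
 * one-byte lock at `offset`, blocking until the request is granted.
 * rw_type is F_RDLCK, F_WRLCK or F_UNLCK. */
static int byte_lock(int fd, off_t offset, short rw_type)
{
    struct flock fl;
    int ret;

    fl.l_type = rw_type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = 1;
    fl.l_pid = 0;

    /* Retry if a signal interrupts the blocking wait. */
    do {
        ret = fcntl(fd, F_SETLKW, &fl);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical helper: signal 0 performs only the existence and
 * permission checks.  ESRCH means the pid is gone; EPERM means it
 * exists but belongs to another user. */
static int process_exists(pid_t pid)
{
    return kill(pid, 0) == 0 || errno != ESRCH;
}
```

If every exiting process walks the whole database and issues one lock
request per record, as the tdb_traverse and delete_fn frames suggest, then
many processes exiting at once would each repeat that walk, which would be
consistent with the load we see.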

Here's a snippet from log.smbd covering the period leading up to the
problem, in case it is useful:

[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:get_socket_addr(1038)
  getpeername failed. Error was Transport endpoint is not connected
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:get_socket_addr(1038)
  getpeername failed. Error was Transport endpoint is not connected
[2001/10/31 11:49:20, 0, pid=414] lib/access.c:check_access(322)
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:get_socket_addr(1038)
  getpeername failed. Error was Transport endpoint is not connected
  Denied connection from  (0.0.0.0)
[2001/10/31 11:49:20, 1, pid=414] smbd/process.c:process_smb(850)
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:get_socket_addr(1038)
  getpeername failed. Error was Transport endpoint is not connected
  Connection denied from 0.0.0.0
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:write_socket_data(542)
  write_socket_data: write failure. Error = Broken pipe
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:write_socket(566)
  write_socket: Error writing 5 bytes to socket 5: ERRNO = Broken pipe
[2001/10/31 11:49:20, 0, pid=414] lib/util_sock.c:send_smb(730)
  Error writing 5 bytes to client. -1. (Broken pipe)
[2001/10/31 11:49:29, 0, pid=423] lib/util_sock.c:get_socket_addr(1038)
  getpeername failed. Error was Transport endpoint is not connected

Any insight into this problem?

Scott

------------------------------------------------------------------------
 Scott Moomaw, Network Administrator              Scott at Bridgewater.edu
 Bridgewater College, IT Center
 Bridgewater, VA  22812
 Phone (540) 828 - 8000  x5437              FAX:  (540) 828 - 5493





