smbd process block

Goran Lowkrantz goran.lowkrantz at ismobile.com
Tue Apr 7 16:52:01 GMT 2009


Hi Sam,

I have tested accessing the directory with windows explorer but that too 
goes Not responding. I have touched both an old and a new file but that 
didn't change anything.

here is the lsof of the unresponsive smbd:
> lsof -p 71672
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
smbd    71672 root  cwd   VDIR       17,910622967       10     3 
/usr/data/glz/imatch
smbd    71672 root  rtd   VDIR              0,104      512     2 /
smbd    71672 root  txt                                          can't read 
vnode at 0x006c96c7
smbd    71672 root  txt                                          can't read 
vnode at 0x00030230
smbd    71672 root  txt                                          can't read 
vnode at 0x00020000
smbd    71672 root  txt                                          can't read 
vnode at 0x00010000
smbd    71672 root  txt                                          can't read 
vnode at 0x000002b8
smbd    71672 root  txt                                          can't read 
vnode at 0x00003000
smbd    71672 root  txt                                          can't read 
vnode at 0x00016000
smbd    71672 root  txt                                          can't read 
vnode at 0x00009d08
smbd    71672 root  txt                                          can't read 
vnode at 0x00007000
smbd    71672 root  txt                                          can't read 
vnode at 0x00002000
smbd    71672 root  txt                                          can't read 
vnode at 0x00028000
smbd    71672 root  txt                                          can't read 
vnode at 0x00025000
smbd    71672 root  txt                                          can't read 
vnode at 0x00043a61
smbd    71672 root  txt                                          can't read 
vnode at 0x00011118
smbd    71672 root  txt                                          can't read 
vnode at 0x000c035c
smbd    71672 root  txt                                          can't read 
vnode at 0x0004ab40
smbd    71672 root  txt                                          can't read 
vnode at 0x00193d40
smbd    71672 root  txt                                          can't read 
vnode at 0x00014900
smbd    71672 root  txt                                          can't read 
vnode at 0x0001c1e0
smbd    71672 root  txt                                          can't read 
vnode at 0x00008070
smbd    71672 root  txt                                          can't read 
vnode at 0x00008bf0
smbd    71672 root  txt                                          can't read 
vnode at 0x0000e698
smbd    71672 root  txt                                          can't read 
vnode at 0x001077f4
smbd    71672 root  txt                                          can't read 
vnode at 0x00005290
smbd    71672 root  txt                                          can't read 
vnode at 0x00008ced
smbd    71672 root  txt                                          can't read 
vnode at 0x0011f688
smbd    71672 root  txt                                          can't read 
vnode at 0x0001bf2b
smbd    71672 root  txt                                          can't read 
vnode at 0x0000b7b1
smbd    71672 root  txt                                          can't read 
vnode at 0x00014058
smbd    71672 root  txt                                          can't read 
vnode at 0x00001dd0
smbd    71672 root    0u  VCHR               0,20      0t0    20 /dev/null
smbd    71672 root    1u  VCHR               0,20      0t0    20 /dev/null
smbd    71672 root    2w  VREG     129,4254531705    31595   630 
/var/log/log.tor
smbd    71672 root    3r  VCHR               0,27      0t9    27 /dev/random
smbd    71672 root    4u  VREG     205,2015822061    12288 47036 /usr/local 
-- etc/samba/secrets.tdb
smbd    71672 root    5u  VREG     203,3580952781      696 11659 /var 
(system/var)
smbd    71672 root    6u  PIPE 0xffffff010c6972e8    16384 
->0xffffff010c697440
smbd    71672 root    7w  VREG     203,3580952781        6 11664 /var 
(system/var)
smbd    71672 root    8u  IPv4 0xffffff0065943b60      0t0   TCP 
midgard:microsoft-ds->tor:53116 (ESTABLISHED)
smbd    71672 root    9u  VREG     203,3580952781   163840 11665 /var 
(system/var)
smbd    71672 root   10u  VREG     203,3580952781    90112 11661 /var 
(system/var)
smbd    71672 root   11u  VREG     203,3580952781    40200 11666 /var 
(system/var)
smbd    71672 root   12u  VREG     203,3580952781   151552 11667 /var 
(system/var)
smbd    71672 root   13u  VREG     203,3580952781     8192 11662 /var 
(system/var)
smbd    71672 root   14u  VREG     203,3580952781     8192  1052 /var 
(system/var)
smbd    71672 root   15u  VREG     203,3580952781     8192  1013 /var 
(system/var)
smbd    71672 root   16u  VREG     203,3580952781    12288  1053 /var 
(system/var)
smbd    71672 root   17u  VREG     203,3580952781      696  1054 /var 
(system/var)
smbd    71672 root   18u  PIPE 0xffffff00096c3000    16384 
->0xffffff00096c3158
smbd    71672 root   20u  PIPE 0xffffff010c697440        0 
->0xffffff010c6972e8
smbd    71672 root   21u  VREG     203,3580952781     8192  1056 /var 
(system/var)
smbd    71672 root   22u  VREG     203,3580952781      696   960 /var 
(system/var)
smbd    71672 root   23u  PIPE 0xffffff010c6ff8b8    16384 
->0xffffff010c6ffa10
smbd    71672 root   24u  PIPE 0xffffff010c6ffa10        0 
->0xffffff010c6ff8b8
smbd    71672 root   25w  VREG     129,4254531705    31595   630 
/var/log/log.tor
smbd    71672 root   26ur VREG     203,3580952781    28672   973 /var 
(system/var)
smbd    71672 root   27u  unix 0xffffff01982e5000      0t0 
->0xffffff0164d145a0

As you can see the process in owned by root, not the logged in user.

/glz

--On Tuesday, April 07, 2009 3:13 PM +0100 Sam Liddicott 
<sam at liddicott.com> wrote:

> * Goran Lowkrantz wrote, On 07/04/09 14:57:
>> Hi Sam,
>>
>> Each of the calls look exactly the same except the multiplex id, i.e.
>> it's the same FID and path, which is a leaf directory containing 202
>> files. And there is only 48 notify calls before it stops, so I can't
>> see how that maps to the total of about 400 directories in the watched
>> tree. just looking at leaf directories give at least 300 directories
>> to watch.
> My mistake, I thought you said 245 calls, but I see now that you said
> 245 seconds.
>>
>> The thing I see is that the application issuing these notify requests
>> go unresponsive after about 10 to 15 min, the frame marked Not
>> responding. I assume this is because it is waiting for some response
>> from the network, as killing the smbd service releases the application.
> Try running some other application on the same share and see if the
> share is responding - even windows explorer.
>>
>> And these notify setup messages are the only one I can find that do
>> not have an SMB response.
> This is quite normal for notify requests.
>
> Try adding another file to a watched folder from another windows machine
> and see if the application comes to life when it gets the notification
> response back.
>
> Sam
>>
>> /glz
>>
>> --On April 7, 2009 14:33:54 +0100 Sam Liddicott <sam at liddicott.com>
>> wrote:
>>
>>> * Goran Lowkrantz wrote, On 07/04/09 14:08:
>>>> Hi Sam,
>>>>
>>>> Looking further in the dump, I find that a series of these:
>>>> NT Trans Request (0xa0)
>>>> NT NOTIFY Setup
>>>> Completion Filter: 0x00000003
>>>> FID: 0x32f6 (\Input\2006\200607\20060729)
>>>>
>>>> but no response from the Samba server, only IP ACK. The FID look OK
>>>> and direct queries on the FID about the directory is answered earlier.
>>>>
>>>> These messages are sent from the client every 5s for 245s. Then they
>>>> stop.
>>> I guess this may be once for every folder of 245 monitored folders.
>>>
>>> A response is not supposed to be given until the request is cancelled or
>>> until a monitored file or folder changes; so this seems quite normal.
>>>
>>> What behaviour or action (or lack of) on the windows client makes you
>>> think that samba has hung?
>>>
>>> Sam
>>>>
>>>> /glz
>>>>
>>>> --On April 6, 2009 15:28:31 +0100 Sam Liddicott <sam at liddicott.com>
>>>> wrote:
>>>>
>>>>> As far as I can tell, samba is not hanging, it's issuing responses to
>>>>> all the requests it gets.
>>>>>
>>>>> I think that the windows client thinks that samba has hung, because
>>>>> the
>>>>> sending of Echo Request is a sign that the client is getting fed up of
>>>>> waiting and is checking to see if the link is still working.
>>>>>
>>>>> So it looks like the client is waiting for some kind of response,
>>>>> but I
>>>>> haven't been able to spot a response that wasn't answered.
>>>>>
>>>>> You load the pcap into wireshark and see if you can spot a request
>>>>> that
>>>>> doesn't get a response...
>>>>>
>>>>> Sam
>>>>>
>>>>> * Goran Lowkrantz wrote, On 16/02/09 22:42:
>>>>>>
>>>>>> I have few Samba servers running FreeBSD 7.1 were we have a problem
>>>>>> with smbd process blocking for a few Vista systems that run a program
>>>>>> that watch directories and files on the samba shares.
>>>>>>
>>>>>> On my test setup I have managed to get a hang in less than 30 min.
>>>>>>
>>>>>> Samba 3.2.7 is built with minimum functions and full debug. Options
>>>>>> don't seems to have any impact on the problem.
>>>>>>
>>>>>> The PC uses Vista Business SP1 and all patches, I run a DAM program
>>>>>> called IMatch ver 3.6.076 that watches for changes in the photo
>>>>>> database. No other application I have tested has the same problems.
>>>>>> For example, Adobe Lightroom 2.2 works without problems when setup
>>>>>> with a watched folder.
>>>>>>
>>>>>> I have attached logfiles from samba with the following extra
>>>>>> settings:
>>>>>> debug pid = yes
>>>>>> debug timestamp = no
>>>>>> debug prefix timestamp = yes
>>>>>> debug uid = yes
>>>>>> log level = 10
>>>>>>
>>>>>> The PID of the server that hangs is 29162.
>>>>>>
>>>>>> The FreeBSD server is an up-to-date quad AMD server with 8GB running
>>>>>> 7.1-STABLE. In normal operation, I see the following:
>>>>>>
>>>>>> # sockstat | grep 445
>>>>>> glz      smbd       7828  23 tcp4   10.255.253.1:445
>>>>>> 10.255.253.2:57355
>>>>>> root     smbd       76917 19 tcp4   127.0.0.1:445         *:*
>>>>>> root     smbd       76917 20 tcp4   10.255.253.1:445      *:*
>>>>>>
>>>>>> When I get the hang, it looks like this:
>>>>>> # sockstat | grep 445
>>>>>> root     smbd       7828  23 tcp4   10.255.253.1:445
>>>>>> 10.255.253.2:57355
>>>>>> root     smbd       76917 19 tcp4   127.0.0.1:445         *:*
>>>>>> root     smbd       76917 20 tcp4   10.255.253.1:445      *:*
>>>>>>
>>>>>> and the GDB session:
>>>>>> # gdb /usr/local/sbin/smbd 7828
>>>>>> GNU gdb 6.1.1 [FreeBSD]
>>>>>> Copyright 2004 Free Software Foundation, Inc.
>>>>>> GDB is free software, covered by the GNU General Public License, and
>>>>>> you are
>>>>>> welcome to change it and/or distribute copies of it under certain
>>>>>> conditions.
>>>>>> Type "show copying" to see the conditions.
>>>>>> There is absolutely no warranty for GDB.  Type "show warranty" for
>>>>>> details.
>>>>>> This GDB was configured as "amd64-marcel-freebsd"...
>>>>>> Attaching to program: /usr/local/sbin/smbd, process 7828
>>>>>> Reading symbols from /usr/local/lib/libldap-2.3.so.2...done.
>>>>>> Loaded symbols for /usr/local/lib/libldap-2.3.so.2
>>>>>> Reading symbols from /usr/local/lib/liblber-2.3.so.2...done.
>>>>>> Loaded symbols for /usr/local/lib/liblber-2.3.so.2
>>>>>> Reading symbols from /usr/local/lib/libcups.so.2...done.
>>>>>> Loaded symbols for /usr/local/lib/libcups.so.2
>>>>>> Reading symbols from /usr/lib/libssl.so.5...done.
>>>>>> Loaded symbols for /usr/lib/libssl.so.5
>>>>>> Reading symbols from /lib/libcrypto.so.5...done.
>>>>>> Loaded symbols for /lib/libcrypto.so.5
>>>>>> Reading symbols from /lib/libz.so.4...done.
>>>>>> Loaded symbols for /lib/libz.so.4
>>>>>> Reading symbols from /lib/libm.so.5...done.
>>>>>> Loaded symbols for /lib/libm.so.5
>>>>>> Reading symbols from /lib/libcrypt.so.4...done.
>>>>>> Loaded symbols for /lib/libcrypt.so.4
>>>>>> Reading symbols from /usr/lib/libpam.so.4...done.
>>>>>> Loaded symbols for /usr/lib/libpam.so.4
>>>>>> Reading symbols from /usr/local/lib/libexecinfo.so.1...done.
>>>>>> Loaded symbols for /usr/local/lib/libexecinfo.so.1
>>>>>> Reading symbols from /usr/local/lib/libiconv.so.3...done.
>>>>>> Loaded symbols for /usr/local/lib/libiconv.so.3
>>>>>> Reading symbols from /usr/local/lib/libdmalloc.so.1...done.
>>>>>> Loaded symbols for /usr/local/lib/libdmalloc.so.1
>>>>>> Reading symbols from /usr/local/lib/libpopt.so.0...done.
>>>>>> Loaded symbols for /usr/local/lib/libpopt.so.0
>>>>>> Reading symbols from /lib/libthr.so.3...done.
>>>>>> [New Thread 0x800a62e00 (LWP 100076)]
>>>>>> Loaded symbols for /lib/libthr.so.3
>>>>>> Reading symbols from /lib/libc.so.7...done.
>>>>>> Loaded symbols for /lib/libc.so.7
>>>>>> Reading symbols from /usr/local/lib/libsasl2.so.2...done.
>>>>>> Loaded symbols for /usr/local/lib/libsasl2.so.2
>>>>>> Reading symbols from /usr/local/lib/libintl.so.8...done.
>>>>>> Loaded symbols for /usr/local/lib/libintl.so.8
>>>>>> Reading symbols from /usr/local/lib/nss_ldap.so.1...done.
>>>>>> Loaded symbols for /usr/local/lib/nss_ldap.so.1
>>>>>> Reading symbols from /usr/lib/libcom_err.so.4...done.
>>>>>> Loaded symbols for /usr/lib/libcom_err.so.4
>>>>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>>>>> Loaded symbols for /libexec/ld-elf.so.1
>>>>>> [Switching to Thread 0x800a62e00 (LWP 100076)]
>>>>>> 0x0000000801f01d6c in select () from /lib/libc.so.7
>>>>>> (gdb) directory /usr/ports/net/samba32-devel/work/samba-3.2.7/source/
>>>>>> Source directories searched:
>>>>>> /usr/ports/net/samba32-devel/work/samba-3.2.7/source:$cdir:$cwd
>>>>>> (gdb) bt
>>>>>> # 0  0x0000000801f01d6c in select () from /lib/libc.so.7
>>>>>> # 1  0x0000000801d0f4d4 in select () from /lib/libthr.so.3
>>>>>> # 2  0x00000000006749fe in sys_select (maxfd=24,
>>>>>> readfds=0x7fffffffd420, writefds=0x7fffffffd3a0, errorfds=0x0,
>>>>>> tval=0x7fffffffd500)
>>>>>>    at lib/select.c:93
>>>>>> # 3  0x00000000004df64c in smbd_process () at smbd/process.c:839
>>>>>> # 4  0x0000000000854074 in main (argc=2, argv=0x7fffffffd638) at
>>>>>> smbd/server.c:1450
>>>>>> (gdb) frame 2
>>>>>> # 2  0x00000000006749fe in sys_select (maxfd=24,
>>>>>> readfds=0x7fffffffd420, writefds=0x7fffffffd3a0, errorfds=0x0,
>>>>>> tval=0x7fffffffd500)
>>>>>>    at lib/select.c:93
>>>>>> 93              ret = select(maxfd,readfds2,writefds,errorfds,tval);
>>>>>> (gdb) print tval
>>>>>> $1 = (struct timeval *) 0x7fffffffd500
>>>>>> (gdb) print *tval
>>>>>> $2 = {tv_sec = 59, tv_usec = 999977}
>>>>>> (gdb) The program is running.  Quit anyway (and detach it)? (y or
>>>>>> n) y
>>>>>> Detaching from program: /usr/local/sbin/smbd, process 7828
>>>>>>
>>>>>> The following is a truss of the process until I have seen the switch
>>>>>> to root as owner:
>>>>>> # time truss -p 8307
>>>>>> gettimeofday({1234648077.989004 },0x0)           = 0 (0x0)
>>>>>> gettimeofday({1234648077.989081 },0x0)           = 0 (0x0)
>>>>>> select(24,{6 23},{},0x0,{21.288167 })            = 0 (0x0)
>>>>>> gettimeofday({1234648099.279293 },0x0)           = 0 (0x0)
>>>>>> gettimeofday({1234648099.279370 },0x0)           = 0 (0x0)
>>>>>> gettimeofday({1234648099.279417 },0x0)           = 0 (0x0)
>>>>>> select(24,{6 23},{},0x0,{59.989982 })            = 1 (0x1)
>>>>>> gettimeofday({1234648102.286493 },0x0)           = 0 (0x0)
>>>>>> read(23,"\0\0\0r",4)                             = 4 (0x4)
>>>>>> read(23,"\M^?SMB2\0\0\0\0\^X\a\M-H\0\0\0"...,114) = 114 (0x72)
>>>>>> geteuid(0x3e8,0x3e8,0x2,0x800adf750,0x2,0x800adf750) = 0 (0x0)
>>>>>> getegid(0x3e8,0x3e8,0x2,0x801eadb8c,0xffffff006cf16a50,0x7fffffffd13
>>>>>> 8)
>>>>>>
>>>>>> = 0 (0x0)
>>>>>> __sysctl(0x7fffffffd0a0,0x2,0x7fffffffd0bc,0x7fffffffd0b0,0x0,0x0)
>>>>>> = 0
>>>>>> (0x0)
>>>>>> 0.000u 0.001s 2:36.56 0.0%    0+0k 0+0io 0pf+0w
>>>>>>
>>>>>> # sockstat | grep 445
>>>>>> glz      smbd       8307  23 tcp4   10.255.253.1:445
>>>>>> 10.255.253.2:57438
>>>>>> root     smbd       76917 19 tcp4   127.0.0.1:445         *:*
>>>>>> root     smbd       76917 20 tcp4   10.255.253.1:445      *:*
>>>>>> # ps -awxl | grep 8307
>>>>>> 1000  8307  8556   0  44  0 34672  7984 select IX    ??    0:04.57
>>>>>> /usr/local/sbin/smbd -D -s /usr/local/etc/smb.conf
>>>>>>    0  8556  3273   0   8  0  4600  1204 wait   I+    p0    0:00.00
>>>>>> truss -p 8307
>>>>>> # sockstat | grep 445
>>>>>> root     smbd       8307  23 tcp4   10.255.253.1:445
>>>>>> 10.255.253.2:57438
>>>>>> root     smbd       76917 19 tcp4   127.0.0.1:445         *:*
>>>>>> root     smbd       76917 20 tcp4   10.255.253.1:445      *:*
>>>>>> # ps -awxl | grep 8307
>>>>>>    0  8307  8556   0  44  0 34672  7984 select SX    ??    0:04.57
>>>>>> /usr/local/sbin/smbd -D -s /usr/local/etc/smb.conf
>>>>>>    0  8556  3273   0   8  0  4600  1204 wait   I+    p0    0:00.00
>>>>>> truss -p 8307
>>>>>>
>>>>>>
>>>>>> I can recreate this at any time and the condition by killing the
>>>>>> offending smbd process and the PC reconnects just fine.
>>>>>>
>>>>>> Hope this helps to pin this down. As I can recreate the hang, please
>>>>>> let me know if there is any more information I can supply.
>>>>>>
>>>>>> /glz
>>>>>>
>>>>>> ................................................... the future
>>>>>> isMobile
>>>>>>
>>>>>>  Goran Lowkrantz <goran.lowkrantz at ismobile.com>
>>>>>>  System Architect, isMobile, Aurorum 2, S-977 75 Luleå, Sweden
>>>>>>  Phone: +46(0)920-75559
>>>>>>  Mobile: +46(0)70-587 87 82 Fax: +46(0)70-615 87 82
>>>>>>
>>>>>> http://www.ismobile.com
>>>>>> ...............................................
>>>>
>>>>
>>>>
>>>> ................................................... the future isMobile
>>>>
>>>>  Goran Lowkrantz <goran.lowkrantz at ismobile.com>
>>>>  System Architect, isMobile AB
>>>>  Sandviksgatan 81, PO Box 58, S-971 03 Luleå, Sweden
>>>>  Mobile: +46(0)70-587 87 82
>>>> http://www.ismobile.com ...............................................
>>
>>
>>
>> ................................................... the future isMobile
>>
>>  Goran Lowkrantz <goran.lowkrantz at ismobile.com>
>>  System Architect, isMobile AB
>>  Sandviksgatan 81, PO Box 58, S-971 03 Luleå, Sweden
>>  Mobile: +46(0)70-587 87 82
>> http://www.ismobile.com ...............................................



................................................... the future isMobile

  Goran Lowkrantz <goran.lowkrantz at ismobile.com>
  System Architect, isMobile, Aurorum 2, S-977 75 Luleå, Sweden
  Phone: +46(0)920-75559
  Mobile: +46(0)70-587 87 82 Fax: +46(0)70-615 87 82

http://www.ismobile.com ...............................................


More information about the samba-technical mailing list