[Samba] client hangs

Doug Tucker tuckerd at lyle.smu.edu
Thu Oct 3 13:13:28 MDT 2013


Additionally, the following has shown up in the kernel log from time to
time. I don't know exactly what it means, and it doesn't necessarily
correlate with when users are seeing the hang.  Any idea if this is fatal?

Oct  3 08:31:57 agentsmith2 kernel: INFO: task smbd:26597 blocked for 
more than 120 seconds.
Oct  3 08:31:57 agentsmith2 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 08:31:57 agentsmith2 kernel: smbd          D ffffffff80157f0a     
0 26597   6359         26677 26482 (NOTLB)
Oct  3 08:31:57 agentsmith2 kernel:  ffff81172b963af8 0000000000000082 
ffff81183f0db400 ffffffff884cfe7a
Oct  3 08:31:57 agentsmith2 kernel:  ffff8115f9621888 0000000000000009 
ffff81183fbd70c0 ffff810c3ff110c0
Oct  3 08:31:57 agentsmith2 kernel:  0001fafb7157cdc0 0000000000000909 
ffff81183fbd72a8 000000113fb24bf8
Oct  3 08:31:57 agentsmith2 kernel: Call Trace:
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff884cfe7a>] 
:sunrpc:xprt_end_transmit+0x2c/0x39
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8006ed98>] 
do_gettimeofday+0x40/0x90
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff80029172>] sync_page+0x0/0x43
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800637de>] 
io_schedule+0x3f/0x67
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800291b0>] 
sync_page+0x3e/0x43
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff80063a0a>] 
__wait_on_bit+0x40/0x6e
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800355f7>] 
wait_on_page_bit+0x6c/0x72
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800a3cfd>] 
wake_bit_function+0x0/0x23
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800482e8>] 
pagevec_lookup_tag+0x1a/0x21
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8004a2d0>] 
wait_on_page_writeback_range+0x62/0x133
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800ca3ee>] 
filemap_write_and_wait+0x26/0x31
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8852cc9c>] 
:nfs:nfs_setattr+0x8e/0xfc
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000d01d>] 
do_lookup+0x8f/0x24b
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000d57f>] dput+0x2c/0x114
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000a7b9>] 
__link_path_walk+0xf10/0xf39
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8002d0f6>] 
mntput_no_expire+0x19/0x89
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000e4a2>] 
current_fs_time+0x3b/0x40
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000ec03>] 
link_path_walk+0xac/0xb8
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8002cf2d>] 
notify_change+0x145/0x2f5
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800e401f>] 
do_utimes+0x106/0x129
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000d73b>] 
inotify_inode_queue_event+0xad/0xe8
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff80016bbf>] 
vfs_write+0x13f/0x174
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff800e407e>] 
sys_futimesat+0x3c/0x4b
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8005d116>] 
system_call+0x7e/0x83
Oct  3 08:31:57 agentsmith2 kernel:
Oct  3 08:31:57 agentsmith2 kernel: INFO: task smbd:29945 blocked for 
more than 120 seconds.
Oct  3 08:31:57 agentsmith2 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  3 08:31:57 agentsmith2 kernel: smbd          D ffffffff80157f0a     
0 29945   6359         29946 29942 (NOTLB)
Oct  3 08:31:57 agentsmith2 kernel:  ffff8115f260fd98 0000000000000082 
ffff8115f260fd48 ffffffff8000d01d
Oct  3 08:31:57 agentsmith2 kernel:  ffff8115f260fd58 000000000000000a 
ffff8102ae715040 ffff810c3fea7040
Oct  3 08:31:57 agentsmith2 kernel:  0001fb0e52713c02 000000000002b635 
ffff8102ae715228 0000000fca752ca8
Oct  3 08:31:57 agentsmith2 kernel: Call Trace:
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000d01d>] 
do_lookup+0x8f/0x24b
Oct  3 08:31:57 agentsmith2 kernel:  [<ffffffff8000a7b9>] 
__link_path_walk+0xf10/0xf39
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff80063c63>] 
__mutex_lock_slowpath+0x60/0x9b
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff80063cad>] 
.text.lock.mutex+0xf/0x14
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff8852c9bb>] 
:nfs:nfs_getattr+0x45/0xd9
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff80028f4a>] 
vfs_stat_fd+0x32/0x4a
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff800671cf>] 
do_page_fault+0x4cc/0x842
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff80023cc3>] 
sys_newstat+0x19/0x31
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff8005ddf9>] 
error_exit+0x0/0x84
Oct  3 08:31:58 agentsmith2 kernel:  [<ffffffff8005d116>] 
system_call+0x7e/0x83
Oct  3 08:31:58 agentsmith2 kernel:
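
For anyone triaging similar output, here is a minimal sketch (not a
definitive tool) that pulls the blocked tasks and their stack frames out
of hung-task messages like the ones above. It assumes each syslog entry
is on a single line, i.e. not wrapped the way the mail client wrapped
them here. The function name summarize_hung_tasks and the dictionary
layout are made up for illustration.

```python
import re

# Header printed by the kernel's hung-task detector, e.g.
# "kernel: INFO: task smbd:26597 blocked for more than 120 seconds."
TASK_RE = re.compile(
    r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds"
)

# A stack frame such as "[<ffffffff8852cc9c>] :nfs:nfs_setattr+0x8e/0xfc".
# The ":module:" prefix only appears on frames inside a loadable module.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?([\w.]+)\+0x")


def summarize_hung_tasks(lines):
    """Return one dict per blocked task with its (module, symbol) frames.

    Assumes unwrapped syslog lines; 'module' is None for core-kernel frames.
    """
    tasks = []
    current = None
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            current = {
                "comm": header.group(1),
                "pid": int(header.group(2)),
                "timeout_secs": int(header.group(3)),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append((frame.group(1), frame.group(2)))
    return tasks


def modules_in_trace(task):
    """List the distinct kernel modules seen in a task's trace, in order."""
    seen = []
    for module, _symbol in task["frames"]:
        if module is not None and module not in seen:
            seen.append(module)
    return seen
```

Run over the two traces above, this would report sunrpc and nfs for
pid 26597 and nfs for pid 29945, since both smbd processes are sitting in
:nfs: frames (nfs_setattr and nfs_getattr).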

Sincerely,

Doug Tucker

On 10/03/2013 12:11 PM, Jeremy Allison wrote:
> On Thu, Oct 03, 2013 at 12:03:39PM -0500, Doug Tucker wrote:
>> I see a lot of this in the logs, but can't determine if it really
>> means anything:
>>
>> Oct  2 09:45:28 agentsmith2 smbd[21954]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:28 agentsmith2 smbd[25948]:   write_data: write failure
>> in writing to client 129.119.104.44. Error Connection reset by peer
>> Oct  2 09:45:28 agentsmith2 smbd[25971]:   write_data: write failure
>> in writing to client 129.119.105.246. Error Connection reset by peer
>> Oct  2 09:45:28 agentsmith2 smbd[25883]:   write_data: write failure
>> in writing to client 129.119.103.96. Error Connection reset by peer
>> Oct  2 09:45:28 agentsmith2 smbd[25987]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:28 agentsmith2 smbd[25988]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:28 agentsmith2 smbd[25986]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:29 agentsmith2 smbd[25985]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:29 agentsmith2 smbd[25989]:   getpeername failed. Error
>> was Transport endpoint is not connected
>> Oct  2 09:45:29 agentsmith2 smbd[25704]:   write_data: write failure
>> in writing to client 129.119.105.119. Error Broken pipe
>> Oct  2 09:45:29 agentsmith2 smbd[21702]:   write_data: write failure
>> in writing to client 129.119.105.139. Error Connection reset by peer
> All this is saying is that the client disconnected - smbd doesn't
> know why. I'd start suspecting a network failure somewhere. Check
> switches, cables and other hardware.
>
> Jeremy.
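
One rough way to follow up on the smbd messages quoted above is to tally
the write failures per client and error, to see whether the disconnects
cluster on particular machines. This is only a sketch: it assumes each
syslog entry is on one line (unwrapped) and that the message text matches
the quoted lines exactly. The name count_write_failures is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "write_data: write failure in writing to client
# 129.119.104.44. Error Connection reset by peer" on a single line.
WRITE_FAIL_RE = re.compile(
    r"write_data: write failure in writing to client "
    r"(\d+\.\d+\.\d+\.\d+)\. Error (.+)$"
)


def count_write_failures(lines):
    """Count write failures keyed by (client IP, error text)."""
    counts = Counter()
    for line in lines:
        match = WRITE_FAIL_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts
```

A handful of clients dominating the counts would point at those machines
or their network path; an even spread across many clients would be more
consistent with a problem on the server side.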


