Increasing response times for byte range unlock requests.

Jeremy Allison jra at samba.org
Tue Jun 24 18:13:29 MDT 2014


On Mon, Jun 23, 2014 at 07:09:13PM +0530, Hemanth Thummala wrote:
> Hi All,
> 
> We are running the Samba 3.6.12 stack and are seeing a strange issue where
> the response time for byte-range unlock requests increases when running a
> specific test. With each run, the response time grows exponentially.
> 
> The script used for the test is at:
> http://msdn.microsoft.com/en-us/library/windows/desktop/aa365204%28v=vs.85%29.aspx
> It's the second test.
> 
> This is basically a multi-threaded byte-range lock test in which all
> threads request a specific range, in either exclusive or shared mode,
> to access a particular record. All of these requests are blocking (no
> fail-immediately flag in the request).
> 
> From a code walkthrough and the debug logs, I found that these blocking
> requests are added to the pending queue and are supposed to be retried at
> some retry interval. These requests also register callbacks to be notified
> when the specific byte range becomes available.
> 
> Here I could see two problems.
> 
> 1) There is no check for whether a lock request is already in the pending
> queue. Because of this, a lot of duplicate requests are added. And in
> brl_unlock_windows_default(), we go through each of these entries and
> send the unlock messages to the PIDs in each entry. This delays
> completion of the byte-range unlock requests.

Windows lock requests do stack, so I think merging them is incorrect.
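
As a rough illustration of the cost described in point 1, here is a toy
Python model. It is not Samba code: PendingLock, overlaps() and
notify_on_unlock() are hypothetical names, and the fields are borrowed
from the debug output quoted in this thread. It assumes only what the
report states, namely that an unlock walks the pending entries and sends
one message per entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingLock:
    # Hypothetical record; field names are taken from the quoted debug log.
    smblctx: int
    pid: int
    start: int
    size: int


def overlaps(entry, start, size):
    """True if the entry's byte range intersects [start, start + size)."""
    return entry.start < start + size and start < entry.start + entry.size


def notify_on_unlock(pending, start, size):
    """Return the PIDs that would be messaged: one per overlapping entry."""
    return [e.pid for e in pending if overlaps(e, start, size)]


# Six identical pending entries, as in the quoted log.
pending = [PendingLock(10438, 44535, 0, 768) for _ in range(6)]
messages = notify_on_unlock(pending, 0, 768)
print(len(messages))  # 6
```

In this model the work per unlock is linear in the number of pending
entries, so a queue that keeps growing makes every unlock slower.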

> I could see the following pending lock entries from the debug logs:
> 
>   [0]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.686555, 10] locking/brlock.c:58(print_lock_struct)
> 
>   [4]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.686555, 10] locking/brlock.c:58(print_lock_struct)
> 
>   [6]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.686555, 10] locking/brlock.c:58(print_lock_struct)
> 
>   [8]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.686555, 10] locking/brlock.c:58(print_lock_struct)
> 
>   [10]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.686555, 10] locking/brlock.c:58(print_lock_struct)
> 
>   [14]: smblctx = 10438, tid = 48718, pid = 44535, start = 0, size = 768,
> fnum = 10438, PENDING_WRITE WINDOWS_LOCK
> [2014/06/22 21:21:50.696556, 10] locking/brlock.c:58(print_lock_struct)
> 
> All these entries look the same.

> 2) I could see that the pending wait queue is never cleaned up, even after
> the test runs to completion. On every run this queue piles up further,
> adding more time to finish processing the unlock requests.

That looks like a bug, let me look closely at it. The original
design and implementation for this certainly removed pending records
on completion, but this code has been refactored many times.
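
A minimal sketch of that intended behaviour, again as a toy Python model
and not the actual brlock.c logic: PendingQueue and its methods are
hypothetical names. It assumes that completing a request removes exactly
one matching pending record, so stacked duplicates stay in place until
each one completes.

```python
class PendingQueue:
    """Hypothetical stand-in for the per-file list of pending lock records."""

    def __init__(self):
        self._records = []

    def add(self, record):
        # Duplicates are allowed on purpose: requests stack, they do not merge.
        self._records.append(record)

    def complete(self, record):
        # list.remove() drops only the first match and raises ValueError
        # if the record is not present.
        self._records.remove(record)

    def __len__(self):
        return len(self._records)


queue = PendingQueue()
request = (10438, 44535, 0, 768)  # (smblctx, pid, start, size)
queue.add(request)
queue.add(request)
queue.complete(request)
print(len(queue))  # 1
queue.complete(request)
print(len(queue))  # 0
```

If the removal step is skipped, the queue length never returns to zero,
which matches the pile-up reported in point 2.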

Can you log this as a bugzilla bug so I can track it?

Jeremy.

