[jcifs] OutOfMemoryError unable to create new native thread

Michael B Allen ioplex at gmail.com
Mon Mar 8 10:56:05 MST 2010


On Sun, Mar 7, 2010 at 3:03 PM, Peter <peter_zavadsky at symantec.com> wrote:
> Michael B Allen <ioplex <at> gmail.com> writes:
>
>>
>> Hi Peter,
>>
>> The whole point of the QueryThread is to allow simultaneous lookups
>> such that the first valid response is used immediately without
>> waiting. So by adding join you basically completely defeat the purpose
>> of using QueryThreads at all. You might as well just dump the
>> QueryThreads entirely and perform two synchronous lookups.
>>
>> First, try playing with the name service properties. Try setting
>> jcifs.resolveOrder=DNS. I think that might stop the NetBIOS lookups
>> entirely. Of course you'll then need to use only fully qualified DNS
>> hostnames.
>>
>> Otherwise, you'll need to look at why these lookups are being
>> generated at such a high frequency in the first place. You must be
>> using an awful lot of threads if you eventually end up spawning 1900
>> lookups in which case you might want to analyze and determine if using
>> that many threads is really increasing overall throughput. Or prevent
>> all of the threads from starting at the same time so that the result
>> of the lookups have a chance to be cached.
>>
>> Regarding the cache, if it turns out that increasing
>> jcifs.netbios.cachePolicy reduces the frequency of the lookups, then
>> that is actually a good solution. These names do not change
>> frequently. The default value is low only because in almost all cases
>> it doesn't need to be higher. It's only when someone is trying to do
>> 100 lookups at the same exact instant that it matters. This is why we
>> provide for changing these properties.
>>
>> Mike
>
> Hi Michael,
>
> Thanks for responding. And for pointing to the possible tuning.
>
> However, probably I wasn't clear enough in the description of the problem.
> Let me try to clarify that.
>
> The problem is not that our app creates too many threads. Our app doesn't do
> that. The problem is that the JCIFS library creates too many threads. In
> our case when we hit the OutOfMemoryError, there are more than 1800 JCIFS
> QueryThreadS alive, causing the problem.
> The tuning might push the problem further out (or hide it in some cases), but
> not fix it.

Hi Peter,

I understood what you said. My point was simply that to trigger JCIFS
to create 1900 QueryThreads it would require multiple threads of your
own.

Just try what I recommended and see how far you get with that.
Unfortunately I just don't have time to try your example (or even
really read your message completely). If you find something clearly
wrong with JCIFS then I am interested in at least diagnosing the issue
but right now I just don't see any obvious evidence of a problem.
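
For reference, a minimal sketch of the tuning recommended above. It assumes the properties are supplied as JVM system properties before any JCIFS class is loaded, which is how JCIFS is understood to pick up jcifs.* settings. The class name JcifsTuning and the value 600 are illustrative only, and the cache policy is assumed to be in seconds.

```java
// Hypothetical helper, not part of JCIFS: sets the two name service
// properties discussed above as JVM system properties.
final class JcifsTuning {
    private JcifsTuning() {}

    /** Call this before the first use of any JCIFS class. */
    static void apply() {
        // Ask for DNS-only resolution; hostnames must then be fully qualified.
        System.setProperty("jcifs.resolveOrder", "DNS");
        // Assumed to be in seconds; 600 is an arbitrary example value.
        System.setProperty("jcifs.netbios.cachePolicy", "600");
    }

    public static void main(String[] args) {
        apply();
        System.out.println(System.getProperty("jcifs.resolveOrder"));
        System.out.println(System.getProperty("jcifs.netbios.cachePolicy"));
    }
}
```

The same settings can be passed on the command line instead, e.g. -Djcifs.resolveOrder=DNS -Djcifs.netbios.cachePolicy=600.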

Mike

>
> Please, check the test case I've sent. It's very simple and it should be easy
> to understand, and if you hit the issue it dumps the threads alive, so you can
> see that those threads are JCIFS QueryThreadS.
> Please, also remove the property setting the cache to 0 - it seems I was wrong
> in the assumption it could be tuned based on that property.
>
> As you can see, the test itself doesn't create any new threads, it only sends
> many lookup requests to the JCIFS. Those threads, which are getting created, are
> JCIFS QueryThreads.
> I hope you are able to reproduce the problem, it should be only a matter
> of increasing the amount of the requests (and providing some DNS name).
>
> I was debugging the test, and here is a rough description of what happens, and
> what goes wrong:
>
> 1) The app sends many requests to resolve the same DNS name (from the same
> thread), which should be a perfectly valid thing to do.
> 2) For the first request R1, the JCIFS creates two QueryThreadS (R1-Q1, R1-Q2),
> where each resolves for specific flavor of the DNS name creating two Name
> instances -> Name1, and Name2.
> 3) One of the QueryThreads (R1-Q1) resolves the Name1 successfully, puts it
> into cache, and provides the result to the caller.
> 4) The caller of the request gets the result and continues.
> 5) However at the same time, the second QueryThread (R1-Q2) is resolving the
> Name2, and it gets stuck in the resolving call (CLIENT.getName or similar call).
> At that point the lookupTable marks Name2 as being resolved, and the thread is
> alive.
> 6) Then the next lookup request R2 comes to resolve the same DNS name.
> 7) Again two new QueryThreads (R2-Q1, R2-Q2) are created to resolve the Name
> instance types, i.e. the same Name1 and Name2.
> 8) The QueryThread (R2-Q1) resolving Name1, gets the name resolved
> successfully, either from cache, or repeating the steps from 3).
> 9) The caller gets the results and continues.
> 10) However the second QueryThread (R2-Q2) resolving the Name2, gets into wait
> state, because there is still another QueryThread (R1-Q2) from the first request
> R1 trying to resolve the same Name2, and keeps blocking QueryThread (R2-Q2)
> because the lookupTable for the Name2 is set.
> 11) If you repeat this process (steps 6)-10)) many times (up to Rn, where n is
> large enough), and the R1-Q2 is still trying to resolve the Name2, which
> sometimes happens in our app, then you eventually get so many QueryThreads
> (Rx-Q2) blocked and lingering around that you hit the OutOfMemoryError (no new
> native thread could be created).
>
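
The pile-up described in steps 5) to 11) can be illustrated with a toy model. This is only a sketch under stated assumptions: PileUpDemo, inProgress and networkReply are hypothetical names that do not mirror JCIFS internals, the Q1 side of each request is left out, and a latch stands in for the resolving call that never gets an answer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

// Toy model only; hypothetical names, not JCIFS code.
final class PileUpDemo {
    // Names with a lookup in progress (stands in for the lookup table).
    private final Set<String> inProgress = new HashSet<>();
    // Stays closed while the first Name2 query is "stuck on the network".
    private final CountDownLatch networkReply = new CountDownLatch(1);

    // One Q2-style query: wait until nobody else resolves the name, then resolve it.
    private void query(String name) {
        try {
            synchronized (inProgress) {
                while (inProgress.contains(name)) {
                    inProgress.wait();        // step 10: blocked behind the stuck query
                }
                inProgress.add(name);
            }
            try {
                networkReply.await();         // step 5: the slow resolving call
            } finally {
                synchronized (inProgress) {
                    inProgress.remove(name);
                    inProgress.notifyAll();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Issues `requests` lookups from one caller thread and reports how many
    // query threads are still alive once the caller has moved on.
    static int lingeringThreads(int requests) {
        PileUpDemo demo = new PileUpDemo();
        List<Thread> threads = new ArrayList<>();
        for (int i = 1; i <= requests; i++) {
            Thread t = new Thread(() -> demo.query("Name2"), "R" + i + "-Q2");
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        int alive = 0;
        for (Thread t : threads) {
            if (t.isAlive()) {
                alive++;
            }
        }
        demo.networkReply.countDown();        // finally let the stuck query finish
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return alive;
    }

    public static void main(String[] args) {
        System.out.println(lingeringThreads(50));
    }
}
```

In this model one thread per request stays alive for as long as the first query is stuck, so the number of lingering threads grows with the number of requests rather than with the number of caller threads.
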
>
> To the fix:
> You'd be correct in dismissing the fix if it only added the join calls for
> the threads.
> The request would then be blocked for a longer time, because even when the
> first QueryThread Q1 has already provided a result, you'd need to wait for the
> second Q2, which is still trying to resolve the Name2, which will not be
> successful, and which usually seems to take much longer (probably because of
> some timeout).
>
> However before calling the join on such thread, there is also an interrupt call
> on that thread, which in our case interrupts the unsuccessful attempts to
> resolve the Name2.
> That works really well on my machine.
> I experienced no measurable performance penalty, and on top of that no
> unnecessary threads lingering around doing wasteful work, and no
> OutOfMemoryError either.
>
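
The interrupt-then-join idea described above can be sketched as follows. This is a toy model, not the actual patch: FirstAnswerWins and its thread bodies are hypothetical, and Thread.sleep stands in for a query that would otherwise only end on a long timeout. It relies on the blocked call responding to interruption, which Thread.sleep does; whether the real resolving call does so is something to verify separately.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model only; hypothetical names, not the actual patch or JCIFS code.
final class FirstAnswerWins {
    static String lookup() {
        BlockingQueue<String> answers = new LinkedBlockingQueue<>();

        // Q1 answers immediately.
        Thread q1 = new Thread(() -> answers.add("Name1"), "Q1");
        // Q2 stands in for a query that would only end on a long timeout.
        Thread q2 = new Thread(() -> {
            try {
                Thread.sleep(60_000);
                answers.add("Name2");
            } catch (InterruptedException e) {
                // Interrupted by the caller: stop without producing an answer.
            }
        }, "Q2");

        q1.start();
        q2.start();
        try {
            String first = answers.take();   // return as soon as any query answers
            // Interrupt first, then join, so the join does not wait out the timeout.
            q1.interrupt();
            q2.interrupt();
            q1.join();
            q2.join();
            return first;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

Without the interrupt calls, the joins in this model would hold the caller for the full 60 seconds, which is the objection raised against a join-only fix.
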
> Please, reconsider it.
>
> Thanks,
> -Peter
>



-- 
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/

