[jcifs] OutOfMemoryError unable to create new native thread

Mon Mar 8 12:32:15 MST 2010

On Mon, Mar 8, 2010 at 1:59 PM, Peter <peter_zavadsky at symantec.com> wrote:
> Michael B Allen <ioplex <at> gmail.com> writes:
>>
>> Hi Peter,
>>
>> I understood what you said. My point was simply that to trigger Jespa
>> to create 1900 QueryThreads it would require multiple threads of your
>> own.
>>
>> Just try what I recommended and see how far you get with that.
>> Unfortunately I just don't have time to try your example (or even
>> really read your message completely). If you find something clearly
>> wrong with JCIFS then I am interested in at least diagnosing the issue
>> but right now I just don't see any obvious evidence of a problem.
>>
>> Mike
>
> Mike,
>
> I tried your suggestion. Unfortunately it doesn't work for us, because
> a lot of our customers use non-fully-qualified DFS names.

You mean DNS, not DFS.

> Sorry, I’ll try to be more brief. Thanks so much for your patience.
>
> I just have ONE SINGLE THREAD that is calling the JCIFS library at least 10,000
> times in a loop.
>
> The essence of the problem is that one of the two QueryThreads is returning
> with success very quickly, about 5 ms, but the other QueryThread is taking a
> very long time to complete, about 5 seconds.
>
> So, when I call the JCIFS library 10,000 times in less than 5 seconds, I end up
> with 10,000 threads that are all still waiting to complete.  Unfortunately, I
> only have enough memory for about 1,800 threads, so I run out of memory long
> before I get to 10,000 calls.
>
> I don’t know why sometimes the first QueryThread in a lookup takes 5 ms and the
> second QueryThread in a lookup takes 5 seconds, but this happens at some of our
> customer sites as well as on one of our test machines.

The reason is that if you have a URL like:

  smb://somename/

JCIFS has no way of knowing whether 'somename' is a workgroup name, in
which case it should list the servers in that workgroup, or a server
name, in which case it should list shares. So we use two threads
(QueryThread) to try both lookups simultaneously. Whichever one
returns first tells us whether the name was a server or a workgroup.

Your problem is that the workgroup query tries various methods,
including broadcasting for a NetBIOS master browser. JCIFS retries
that broadcast three times at two-second intervals before it gives
up, so it takes about 6 seconds.
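
To make the thread build-up concrete, here is a minimal sketch in plain
Java. It is not JCIFS code: the class LookupRaceSketch, its methods and
the delays are all hypothetical, and Thread.sleep stands in for the
network lookups. It only illustrates how a single caller thread can
leave one slow query thread behind per call when the fast lookup
answers first.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class LookupRaceSketch {
    // Number of simulated query threads that are still running.
    static final AtomicInteger liveThreads = new AtomicInteger();

    // Starts one simulated query that "answers" after delayMs.
    static void startQuery(String kind, long delayMs,
                           AtomicReference<String> winner,
                           CountDownLatch firstAnswer) {
        liveThreads.incrementAndGet();
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(delayMs);            // stands in for the network lookup
                winner.compareAndSet(null, kind); // only the first answer is kept
                firstAnswer.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                liveThreads.decrementAndGet();
            }
        });
        t.setDaemon(true); // do not keep the JVM alive for leftover queries
        t.start();
    }

    // Races both queries and returns as soon as either one answers.
    static String resolve(long serverMs, long workgroupMs)
            throws InterruptedException {
        AtomicReference<String> winner = new AtomicReference<>();
        CountDownLatch firstAnswer = new CountDownLatch(1);
        startQuery("server", serverMs, winner, firstAnswer);
        startQuery("workgroup", workgroupMs, winner, firstAnswer);
        firstAnswer.await();
        return winner.get();
    }

    // Calls resolve() in a loop from ONE thread and reports how many
    // simulated query threads are still alive right after the last call.
    static int lingeringAfter(int calls, long serverMs, long workgroupMs)
            throws InterruptedException {
        for (int i = 0; i < calls; i++) {
            resolve(serverMs, workgroupMs);
        }
        return liveThreads.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 50 calls, fast lookup 5 ms, slow lookup 6000 ms (assumed values).
        System.out.println("still running: " + lingeringAfter(50, 5, 6000));
    }
}
```

With these assumed delays every call returns after a few milliseconds,
but its slow query thread lives on for the full timeout, so the number
of live threads grows with the call rate times that timeout.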

See this page:

  http://jcifs.samba.org/src/docs/resolver.html

In particular the broadcasting is referred to as the 'BCAST' method. So
again, set netbios.resolveOrder=DNS. If your customer is actually
using WINS then use netbios.resolveOrder=WINS,DNS. Whatever
the case, just leave BCAST out of it.
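
As a rough sketch of applying that setting from code: the class
ResolveOrderSketch and its methods are hypothetical. The key is written
here as jcifs.resolveOrder, which is an assumption about the plain JCIFS
system-property spelling (the text above writes netbios.resolveOrder),
so check the resolver page for the exact name in your setup. It is also
an assumption that the property has to be set before any JCIFS class is
loaded for it to be picked up.

```java
public class ResolveOrderSketch {
    // ASSUMPTION: system-property spelling used by plain JCIFS.
    // The message above writes the key as netbios.resolveOrder.
    static final String RESOLVE_ORDER_KEY = "jcifs.resolveOrder";

    // Builds a resolve order that never contains BCAST.
    static String resolveOrder(boolean siteUsesWins) {
        return siteUsesWins ? "WINS,DNS" : "DNS";
    }

    // Sets the property; intended to run before JCIFS is first used.
    static void apply(boolean siteUsesWins) {
        System.setProperty(RESOLVE_ORDER_KEY, resolveOrder(siteUsesWins));
    }

    public static void main(String[] args) {
        apply(false); // DNS only
        System.out.println(RESOLVE_ORDER_KEY + "="
                + System.getProperty(RESOLVE_ORDER_KEY));
    }
}
```

The same value could equally be passed on the command line with -D
instead of being set in code.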

Basically you just have a name service problem.

> So, my simplistic solution to the problem was to simply interrupt the second
> QueryThread when the first one finishes successfully so that both threads
> finish as soon as the first one succeeds.  This prevents orphan threads from
> hanging around.

I think if you read the above link carefully, you should be able to
solve your problem. Interrupting those QueryThreads is a very ugly
solution to what is effectively a configuration issue. You just need
to understand better which name service lookups are timing out and
adjust the configuration to stop them.

Mike

-- 
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/