[jcifs] too much timeout when multithreaded

Dan Dumont Dan at canofsleep.com
Wed Apr 9 05:52:13 EST 2003


Hi, I am sending this to the list in the hope that it will help both me and
anyone else interested. My question is for Mike:
You said that thread creation is costly. I have yet to take the operating
systems course at my university (I will be doing so next semester) and I am
interested in workarounds for this problem.

Is there an algorithm you can suggest, or some documentation that you can
point me to that would explain the benefits and drawbacks of various
implementations?

-----Original Message-----
From: jcifs-bounces+dan=canofsleep.com at lists.samba.org
[mailto:jcifs-bounces+dan=canofsleep.com at lists.samba.org] On Behalf Of
Michael B.Allen
Sent: Tuesday, April 08, 2003 3:30 PM
To: Benoit Daviaud
Cc: jcifs at lists.samba.org
Subject: Re: [jcifs] too much timeout when multithreaded

On Tue, 8 Apr 2003 14:11:17 +0200
"Benoit Daviaud" <benda472 at student.liu.se> wrote:

> Hej,
> 
> Thank you for your answer.
> I've been trying your ThreadedSmbCrawler, but I could only find the version

The website is not up to date. We're making some changes. The latest
version of jCIFS is here:

  http://users.erols.com/mballen/jcifs/

It has the ThreadedSmbCrawler with listFiles(). The list() version is
broken with 0.7 and will generate nothing but errors.

> using list in the example directory. Your approach is more to have one thread
> for every directory. My crawler is designed to assign one thread for each

Well once it starts on a directory it drills down from there until it
runs out of children. I don't remember the algorithm exactly but I
think it's pretty good.

> host. This thread performs a recursive crawl of the host. It is not
> possible that more than one thread would work on the same host. When a

Ok. That's a good way too.

> thread is finished with a host, it dies and a new one is created to crawl
> the next host on the list. I check the number of living threads using

That's not good. Thread creation is costly.
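
One common way around the cost is to start a small, fixed set of worker threads once and hand them hosts from a queue, rather than creating a new thread per host. The following is only a minimal sketch of that idea, not the ThreadedSmbCrawler algorithm: the class HostCrawlPool and the method crawlHost are hypothetical stand-ins, and it uses java.util.concurrent, which arrived with Java 5 (on an older JVM a hand-written queue with wait/notify would play the same role).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HostCrawlPool {

    // Hypothetical stand-in for the real per-host crawl (e.g. a recursive
    // listFiles() walk). Here it only records which worker thread ran it.
    static void crawlHost(String host, Set<String> workerNames) {
        workerNames.add(Thread.currentThread().getName());
    }

    // Crawls every host using at most poolSize long-lived worker threads.
    // Returns the names of the distinct threads that did the work.
    static Set<String> crawlAll(List<String> hosts, int poolSize)
            throws InterruptedException {
        Set<String> workerNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (String host : hosts) {
            // Each host becomes a queued task; idle workers pick up the next one.
            pool.submit(() -> crawlHost(host, workerNames));
        }
        pool.shutdown();                             // accept no new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for queued hosts
        return workerNames;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            hosts.add("host" + i);
        }
        Set<String> workers = crawlAll(hosts, 8);
        System.out.println(hosts.size() + " hosts crawled by "
                + workers.size() + " threads");
    }
}
```

The pool size bounds the number of crawler threads no matter how long the host list is, so the thread count no longer has to be tracked with activeCount().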

> activeCount() in ThreadGroup.
> About the BCAST option: it was in the list, but after WINS, and we are actually
> using a WINS server. But I followed your advice and removed it, even though it
> didn't change the problem.

If your WINS query fails (and some will because Network Neighborhood
isn't in sync with WINS) it will try BCAST and that takes 6 seconds to
try and fail. Leave it out. In fact use resolveOrder=WINS only so you're
not wasting time even trying the other methods.
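
As a rough illustration of the setting above, the properties can be set from Java before any jCIFS class is used. This is a sketch under assumptions: it assumes jCIFS picks up JVM system properties when it initialises (the same values can be passed as -D options), the class JcifsResolveOrder is hypothetical, and the WINS address shown is a placeholder.

```java
public class JcifsResolveOrder {

    // Restricts name resolution to WINS only and points jCIFS at a WINS server.
    // Must run before any jCIFS class is loaded, since the properties are
    // assumed to be read once at initialisation.
    static void configure(String winsAddress) {
        System.setProperty("jcifs.resolveOrder", "WINS");
        System.setProperty("jcifs.netbios.wins", winsAddress);
    }

    public static void main(String[] args) {
        configure("192.0.2.10"); // placeholder address, replace with your WINS server
        System.out.println("resolveOrder = "
                + System.getProperty("jcifs.resolveOrder"));
    }
}
```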

> It is very clear that with 100 threads working in parallel on 100 different
> hosts, I get almost no output. It seems that almost all calls to listFiles()
> on the directories result in a timeout exception. Even when the directory is

I'm not surprised. 100 threads is a lot considering each host is going
to require its own transport and each transport requires a thread of
its own. So your 100 threads is really 200 threads. I suspect you're
maxing out some resource and just timing out due to natural causes
(soTimeout). If you're going to use 1 thread per host then keep the
number of threads low. The ThreadedSmbCrawler algorithm would perform
much better however. Having 2 or 3 threads per host would probably be
ideal but the ThreadedSmbCrawler doesn't do that either.

> After some experimentation, it looks like the more threads I use, the more
> files I lose. And there is no clear threshold.

I think you're just exhausting the machine's resources so everything is
timing out.

Mike

-- 
A  program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes  the  potential  for it to be applied to tasks that are
conceptually  similar and, more important, to tasks that have not
yet been conceived. 




