[jcifs] too much timeout when multithreaded

Michael B.Allen mba2000 at ioplex.com
Wed Apr 9 05:30:00 EST 2003


On Tue, 8 Apr 2003 14:11:17 +0200
"Benoit Daviaud" <benda472 at student.liu.se> wrote:

> Hej,
> 
> Thank you for your answer.
> I've been trying your ThreadedSmbCrawler, but I could only find the version

The website is not up to date. We're making some changes. The latest
version of jCIFS is here:

  http://users.erols.com/mballen/jcifs/

It has the ThreadedSmbCrawler with listFiles(). The list() version is
broken with 0.7 and will generate nothing but errors.

> using list() in the example directory. Your approach is more to have one thread
> for every directory. My crawler is designed to assign one thread to each

Well once it starts on a directory it drills down from there until it
runs out of children. I don't remember the algorithm exactly but I
think it's pretty good.

> host. This thread performs a recursive crawl of the host. It is not
> possible for more than one thread to work on the same host. When a

Ok. That's a good way too.

> thread is finished with a host, it dies and a new one is created to crawl
> the next host on the list. I check the number of living threads using

That's not good. Thread creation is costly.
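
One way to avoid creating a new thread per host is to keep a small, fixed
set of worker threads and hand them hosts from a queue. The following is
only a rough sketch, assuming a JDK that ships java.util.concurrent (it is
not part of jCIFS). The class HostCrawlerPool, the method crawlAll and the
crawlHost callback are hypothetical names; the callback stands in for
whatever recursive listFiles() crawl you run against a single host.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class HostCrawlerPool {

    /**
     * Runs crawlHost once per host on a fixed number of reusable worker
     * threads. Returns the hosts whose crawl was attempted (finished or
     * failed), in completion order.
     */
    public static List<String> crawlAll(List<String> hosts, int workers,
                                        Consumer<String> crawlHost)
            throws InterruptedException {
        // The pool creates at most 'workers' threads and reuses them,
        // instead of starting a fresh thread for every host.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        ConcurrentLinkedQueue<String> attempted = new ConcurrentLinkedQueue<>();

        for (String host : hosts) {
            pool.submit(() -> {
                try {
                    crawlHost.accept(host); // hypothetical per-host crawl
                } finally {
                    attempted.add(host);    // recorded even if the crawl threw
                }
            });
        }

        pool.shutdown();                            // accept no new hosts
        pool.awaitTermination(1, TimeUnit.HOURS);   // wait for queued hosts
        return new ArrayList<>(attempted);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> hosts = Arrays.asList("hostA", "hostB", "hostC", "hostD");
        List<String> attempted = crawlAll(hosts, 2, host ->
                System.out.println(Thread.currentThread().getName()
                        + " crawling " + host));
        System.out.println("attempted " + attempted.size() + " hosts");
    }
}
```

With a pool like this there is no need to poll activeCount(); the pool
size is the upper bound on concurrent crawls.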

> activeCount() in ThreadGroup.
> About the BCAST option: it was in the list, but after WINS, and we are actually
> using a WINS server. But I followed your advice and removed it, even though it
> didn't change the problem.

If your WINS query fails (and some will because Network Neighborhood
isn't in sync with WINS) it will try BCAST and that takes 6 seconds to
try and fail. Leave it out. In fact use resolveOrder=WINS only so you're
not wasting time even trying the other methods.
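
For illustration, a WINS-only setup could be applied from Java roughly
as below. This is a sketch under assumptions: the full property names
jcifs.resolveOrder and jcifs.netbios.wins, and the idea that jCIFS picks
them up from the System properties when its classes first load, should
be checked against the documentation for your jCIFS version. The class
ResolverSettings and the address 192.0.2.10 are placeholders.

```java
import java.util.Properties;

public class ResolverSettings {

    /**
     * Builds properties that restrict name resolution to WINS only.
     * The property names are assumed; verify them for your jCIFS version.
     */
    public static Properties winsOnly(String winsServer) {
        Properties props = new Properties();
        props.setProperty("jcifs.resolveOrder", "WINS");   // no BCAST fallback
        props.setProperty("jcifs.netbios.wins", winsServer);
        return props;
    }

    public static void main(String[] args) {
        Properties props = winsOnly("192.0.2.10"); // placeholder WINS address
        // Copy into the System properties before any jCIFS class is used
        // (assumed to be where jCIFS reads its configuration from).
        for (String key : props.stringPropertyNames()) {
            System.setProperty(key, props.getProperty(key));
        }
        System.out.println("resolveOrder = "
                + System.getProperty("jcifs.resolveOrder"));
    }
}
```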

> It is very clear that with 100 threads working in parallel on 100 different
> hosts, I get almost no output. It seems that almost all calls to listFiles()
> on the directories result in a timeout exception. Even when the directory is

I'm not surprised. 100 threads is a lot considering each host is going
to require its own transport and each transport requires a thread of
its own. So your 100 threads are really 200 threads. I suspect you're
maxing out some resource and just timing out due to natural causes
(soTimeout). If you're going to use 1 thread per host then keep the
number of threads low. The ThreadedSmbCrawler algorithm would perform
much better however. Having 2 or 3 threads per host would probably be
ideal but the ThreadedSmbCrawler doesn't do that either.

> After some experimentation, it looks like the more threads I use, the more
> files I lose. And there is no clear threshold.

I think you're just exhausting the machine's resources so everything is
timing out.

Mike

-- 
A  program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes  the  potential  for it to be applied to tasks that are
conceptually  similar and, more important, to tasks that have not
yet been conceived. 

