[jcifs] Re: jcifs-1.2.4 stability fixes

Lars Heete hel at admin.de
Fri Oct 7 09:07:42 GMT 2005


Hello,
> Lars heete <hel at admin.de> wrote:
> > > I get the feeling you're not particularly interesting in persuing this
> > > further. That's ok with me. To be honest, I don't actually use JCIFS
> > > anymore so I'm doing this 100% for other people at this point.
> >
> > Im not using jcifs for own projects, but a customer had asked me to
> > resolve the threading issues. My changes to jcifs seem to work quite well
> > for his application so he is interested in getting them merged back.
> > I also have a new version pending, that simplifies my changes and alters
> > the response wakeup-behavior to be more like jcifs-1.1, which was
> > actually much faster on transports with many active requests. (in jcifs
> > 1.2 the notify on response_map then causes many context switches).
> > I also have some ideas to resolve the tree/session/transport locking
> > issues you a trying to address, but this is definitly more for a 2.0
> > version than a 1.x.
>
> One of the main problems with your original patch was that it was mostly
> irrelevant stuff. If you resubmit it with only the changes necessary to
> make the client behave the way you want then I will reconsider it. Aside
> from the bug regarding adding the session back to transport.sessions
> introduced in 1.2.5 (and your original "live lock" issue), that code
There is also the deadlock between sessionSetup and send introduced in 1.2.4
(doSend0/sendrecv calling disconnect on an I/O error). In any case, I think
disconnecting on errors anywhere other than the transport thread is bogus,
because on a busy transport it produces a queue of threads waiting on the
transport monitor, each trying to call disconnect. That alone would be no
problem, but other threads may also be waiting on the transport monitor for
sessionSetup or connect, so you get something like this:

disconnect
connect
disconnect
disconnect
connect
disconnect
connect

until all threads with send errors are done.
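
To make the interleaving above concrete, here is a minimal sketch of one way to keep late error threads from tearing down a connection that another thread has already re-established. It is not jcifs code and not the change proposed in this thread (which is to disconnect only in the transport thread); it swaps in a simpler generation-counter guard for illustration. The class name GuardedTransport and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not part of jcifs: a generation counter filters stale disconnects. */
public class GuardedTransport {
    private final List<String> log = new ArrayList<>();
    private boolean connected;
    private long generation; // incremented on every new connection

    /** Connects if necessary and returns the generation of the live connection. */
    public synchronized long connect() {
        if (!connected) {
            connected = true;
            generation++;
            log.add("connect");
        }
        return generation;
    }

    /**
     * Disconnects only if the failure was observed on the connection that is
     * still current. A thread whose error belongs to an older connection
     * returns false and leaves the new connection alone.
     */
    public synchronized boolean disconnect(long seenGeneration) {
        if (!connected || seenGeneration != generation) {
            return false; // stale error: already torn down or replaced
        }
        connected = false;
        log.add("disconnect");
        return true;
    }

    /** Returns a copy of the connect/disconnect history. */
    public synchronized List<String> log() {
        return new ArrayList<>(log);
    }
}
```

With this guard, several senders failing on the same connection produce a single disconnect followed by a single connect, rather than the alternating sequence shown above. Each sender would have to remember the generation returned by connect() before sending.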

Here is an actual example using JCIFS-1.2.5 with my Crawler test after a
single server-side transport close (which is actually not simple to trigger
without generating deadlocks, even when using only one session). I inserted a
log message in connect and disconnect.

disconnecting
smb://194.39.184.7/test/test/CopyTo/DOMEditor/CVS/DOMEditor/domeditor/view/:
jcifs.smb.SmbException:
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at jcifs.smb.SmbTransport.doSend(SmbTransport.java:398)
        at jcifs.util.transport.Transport.sendrecv(Transport.java:68)
        at jcifs.smb.SmbTransport.send(SmbTransport.java:580)
        at jcifs.smb.SmbSession.send(SmbSession.java:231)
        at jcifs.smb.SmbTree.send(SmbTree.java:102)
        at jcifs.smb.SmbFile.send(SmbFile.java:688)
        at jcifs.smb.SmbFile.doFindFirstNext(SmbFile.java:1730)
        at jcifs.smb.SmbFile.listFiles(SmbFile.java:1575)
        at jcifs.smb.SmbFile.listFiles(SmbFile.java:1483)
        at SmbThreadTest.traverse(SmbThreadTest.java:42)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.run(SmbThreadTest.java:97)

        at jcifs.smb.SmbTransport.send(SmbTransport.java:585)
        at jcifs.smb.SmbSession.send(SmbSession.java:231)
        at jcifs.smb.SmbTree.send(SmbTree.java:102)
        at jcifs.smb.SmbFile.send(SmbFile.java:688)
        at jcifs.smb.SmbFile.doFindFirstNext(SmbFile.java:1730)
        at jcifs.smb.SmbFile.listFiles(SmbFile.java:1575)
        at jcifs.smb.SmbFile.listFiles(SmbFile.java:1483)
        at SmbThreadTest.traverse(SmbThreadTest.java:42)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.traverse(SmbThreadTest.java:63)
        at SmbThreadTest.run(SmbThreadTest.java:97)
connecting
disconnecting
connecting

This is the behavior I found when analyzing the problems in jcifs 1.1, which
never worked stably for the app I am trying to support. In jcifs-1.2 this was
changed to disconnect only in the transport thread, but it was reintroduced in
1.2.5 in order to implement error handling when the server drops an open
transport. Actually it may no longer be needed even without my changes, since
the transport now handles peekKey()==null (I have to check this).
But cancelling active requests on a hard disconnect is definitely needed for
correct error reporting (even with locking done right).
My transport changes may be debatable, but they are in no way irrelevant.
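
As an illustration of cancelling active requests on a hard disconnect, here is a minimal sketch. It is not the jcifs response_map code and not my patch; the class PendingRequests, its method names, and the use of java.util.concurrent.CompletableFuture (which is much newer than jcifs 1.2) are assumptions made for the example. The point is only that every waiter gets the I/O error that killed the transport, instead of timing out or hanging.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not the jcifs implementation: outstanding requests keyed by mid. */
public class PendingRequests {
    private final Map<Integer, CompletableFuture<byte[]>> pending = new ConcurrentHashMap<>();

    /** Registers a request before it is sent; the caller waits on the returned future. */
    public CompletableFuture<byte[]> register(int mid) {
        CompletableFuture<byte[]> f = new CompletableFuture<>();
        pending.put(mid, f);
        return f;
    }

    /** Called by the transport thread when the response for one mid arrives. */
    public boolean complete(int mid, byte[] response) {
        CompletableFuture<byte[]> f = pending.remove(mid);
        return f != null && f.complete(response);
    }

    /**
     * Called on a hard disconnect: fails every request that is still
     * outstanding with the given cause and returns how many were cancelled.
     */
    public int cancelAll(IOException cause) {
        int cancelled = 0;
        for (Integer mid : pending.keySet()) {
            CompletableFuture<byte[]> f = pending.remove(mid);
            if (f != null && f.completeExceptionally(cause)) {
                cancelled++;
            }
        }
        return cancelled;
    }
}
```

A sender would call register(mid) before writing the request and then block on the future. Requests that were already answered are unaffected by cancelAll, while the rest fail with the original exception.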

> is relatively stable so make your patch against that. At this point I'd
> say the "test" branch is on the shelf.
>
> As for changing response wakeup-behavior [1] or your 2.0 ideas, please be
> aware that my priority is stability and simplicity. Java synchronization
> and I/O multiplexing is so pathetic I want to try to keep this thing
> "as simple as possible, but not simpler". The list has been relatively
> quite (until you came along :-) and I like it that way.
Sorry about that...

> So if your solution is not clear and clean it will probably not be
> accepted. I got the feeling from your first patch that you just moved
> code around until it suited your application. That's ok but the changes
> have to be clean enough to see that locking is correct and that they
> will not affect the majority use cases. Note that by "simple" I don't
> necessarily mean that the number of lines changed needs to be low but
> it has to be clear to me that locking is correct.
>
> Mike
>
> [1] Note that lowering jcifs.smb.client.ssnLimit would probably suffice
>     to resolve this issue. It was not really anticipated that the number
>     of outstanding requests on a transport at one time would be more
>     than say 2 or 3.
Not really an option if you may have hundreds of sessions per server.

Lars

