[jcifs] Another fix for Transport deadlock issue ?

Sat Apr 29 01:16:59 GMT 2006

On Fri, 28 Apr 2006 16:38:09 -0500
"Dalton, Tim" <Daltontf at AGEDWARDS.com> wrote:

> We tried using the 1.2.9 release of jcifs, but noticed throughput is significantly lower due to the new Transport.setupDiscoLock monitor. We were only experiencing the deadlock when a Transport experiences an IOException:
> 
> Found one Java-level deadlock:
> =============================
> "DniDeviceReaderJms-TDMS1_SAMBA49":
>   waiting to lock monitor 0x080b89ec (object 0x47320060, a java.util.HashMap),
>   which is held by "DniDeviceReaderJms-TDMS1_SAMBA27"
> "DniDeviceReaderJms-TDMS1_SAMBA27":
>   waiting to lock monitor 0x5190724c (object 0x47320088, a jcifs.smb.SmbTransport),
>   which is held by "DniDeviceReaderJms-TDMS1_SAMBA49"
> 
> Java stack information for the threads listed above:
> ===================================================
> "DniDeviceReaderJms-TDMS1_SAMBA49":
>         at jcifs.util.transport.Transport.sendrecv(Transport.java:63)
>         - waiting to lock <0x47320060> (a java.util.HashMap)
>         at jcifs.smb.SmbTransport.send(SmbTransport.java:595)
>         at jcifs.smb.SmbSession.sessionSetup(SmbSession.java:264)
>         - locked <0x47320088> (a jcifs.smb.SmbTransport)
>         at jcifs.smb.SmbSession.send(SmbSession.java:223)
>         at jcifs.smb.SmbTree.treeConnect(SmbTree.java:144)
>         - locked <0x47320088> (a jcifs.smb.SmbTransport)
>         at jcifs.smb.SmbFile.connect(SmbFile.java:792)
> "DniDeviceReaderJms-TDMS1_SAMBA27":
>         at jcifs.util.transport.Transport.disconnect(Transport.java:192)
>         - waiting to lock <0x47320088> (a jcifs.smb.SmbTransport)
>         at jcifs.util.transport.Transport.sendrecv(Transport.java:83)
>         - locked <0x47320060> (a java.util.HashMap)
>         at jcifs.smb.SmbTransport.send(SmbTransport.java:595)
>         at jcifs.smb.SmbSession.send(SmbSession.java:229)
> 
> Found 1 deadlock.
>
> 	I allow the thread to release the monitor on the response_map while performing the disconnect and re-acquire it when the request needs to be removed from it. In my testing, I have not been able to recreate the above deadlock nor have I experienced any significant performance degradation.

This is a different deadlock, albeit a very easy one to solve - your
code should work fine. It's pretty much the same thing - one thread gets
the map and tries and fails to get the transport while another gets the
transport and tries and fails to get the map. This case is also covered
by the "disco lock" in 1.2.9.
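
To make the idea concrete, here is a minimal sketch of the "release the
map before disconnecting" approach. It is not the jcifs source: the class
SketchTransport, its fields, and the failIo flag are hypothetical
stand-ins for illustration only.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a transport; names are illustrative, not jcifs code.
class SketchTransport {
    private final Map<Integer, String> responseMap = new HashMap<>();
    boolean connected = true;

    // Takes the transport monitor, like a synchronized disconnect method.
    synchronized void disconnect() {
        connected = false;
    }

    void sendrecv(int mid, boolean failIo) {
        boolean ioFailed = false;
        synchronized (responseMap) {
            responseMap.put(mid, "pending");
            try {
                if (failIo) {
                    throw new IOException("simulated I/O failure");
                }
            } catch (IOException e) {
                // Remember the failure, but do not disconnect while holding the map.
                ioFailed = true;
            }
        }
        if (ioFailed) {
            // The map monitor has been released, so taking the transport
            // monitor here cannot form a map -> transport lock cycle.
            disconnect();
        }
        synchronized (responseMap) {
            // Re-acquire the map only to remove the request.
            responseMap.remove(mid);
        }
    }

    int pending() {
        synchronized (responseMap) {
            return responseMap.size();
        }
    }
}
```

The point is only that no thread ever holds the map while waiting for the
transport, so the cycle shown in the thread dump cannot form on this path.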

The real case you have to watch out for is when Transport.loop() calls
disconnect. The disconnect method is synchronized so it locks the
transport. Then the transport thread calls the subclass doDisconnect
which, in the case of SMB, tries to send a logoff message and such. It
tries and fails to lock the response map because another thread calling
sendrecv has it locked but is trying and failing to lock the transport.

The problem is really kind of a design flaw. The transport thread probably
shouldn't call the subclass disconnect.

One solution here might be to always call disconnect with the parameter
set to 'true' to mean, "don't try to use the transport or you may
deadlock". This will indicate to the SMB layer that it should not try
to send messages to logoff.
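
Roughly, and only as a sketch: the classes SketchBaseTransport and
SketchSmbTransport, the parameter name "hard", and the log list below are
all made up for illustration and are not the actual jcifs classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class; names are illustrative, not the jcifs API.
abstract class SketchBaseTransport {
    // hard == true means "do not try to use the transport during teardown".
    synchronized void disconnect(boolean hard) {
        doDisconnect(hard);
    }

    protected abstract void doDisconnect(boolean hard);
}

// Hypothetical SMB-like subclass that records what teardown would do.
class SketchSmbTransport extends SketchBaseTransport {
    final List<String> log = new ArrayList<>();

    @Override
    protected void doDisconnect(boolean hard) {
        if (!hard) {
            // A polite logoff goes through the send path, which needs the
            // response map and is therefore where the deadlock can occur.
            log.add("logoff sent");
        }
        // Closing the connection needs no other lock.
        log.add("socket closed");
    }
}
```

With the flag set to true the subclass skips the logoff and never touches
the response map while the transport is locked.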

Mike

PS: Actually the REAL problem with all of this is that Java thread locking
sucks like a hoover. If we had a way to unlock a lock or just try to
lock without actually blocking, the solution would be trivial.
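
For illustration, this is what "try to lock without blocking" could look
like on a runtime that has java.util.concurrent.locks (Java 5 and later),
using ReentrantLock.tryLock(). It is a sketch only, assuming explicit
locks replace the synchronized monitors; the class TryLockSketch and its
field and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; explicit locks stand in for the two monitors.
class TryLockSketch {
    final ReentrantLock transportLock = new ReentrantLock();
    final ReentrantLock mapLock = new ReentrantLock();

    // Returns true if the disconnect ran, false if the caller should retry later.
    boolean disconnectFromSendrecv() {
        mapLock.lock();
        try {
            if (!transportLock.tryLock()) {
                // Another thread holds the transport: back off instead of
                // blocking while we still hold the map.
                return false;
            }
            try {
                // ... tear down the connection here ...
                return true;
            } finally {
                transportLock.unlock();
            }
        } finally {
            mapLock.unlock();
        }
    }
}
```

A caller that gets false back can release everything and retry, which is
exactly the escape hatch that plain synchronized blocks do not offer.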