[jcifs] Multiple Transport threads with same transportId

Adam Morgan adam.morgan at Q1Labs.com
Thu Jan 13 11:44:10 MST 2011


Hi Mike

First off, Happy New Year!  Now to business...

While tracking down an issue on one of our clients' systems, I discovered multiple SmbTransport (Transport) threads with the same 'transportId'.  I believe this shouldn't happen, so I'm trying to work out how it could.  Snippets from a stack dump are below.  You will notice that the thread names reflect the changes I submitted a few weeks ago.  The second number (added as part of those changes) is the number of threads that have been created by Transport class instances over the course of the JVM's uptime.

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon prio=10 tid=0x00002aab343fb800 nid=0x1f74 runnable [0x000000007466d000..0x000000007466db10]
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: RUNNABLE
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.net.SocketInputStream.socketRead0(Native Method)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.net.SocketInputStream.read(SocketInputStream.java:129)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.util.transport.Transport.readn(Transport.java:31)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked <0x00002aaac7470140> (a [B)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.util.transport.Transport.loop(Transport.java:108)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked <0x00002aaac84ab020> (a jcifs.smb.SmbTransport)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.util.transport.Transport.run(Transport.java:261)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.lang.Thread.run(Thread.java:619)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2257" daemon prio=10 tid=0x00002aab26dbec00 nid=0x7d38 runnable [0x00002aab27276000..0x00002aab27276d90]
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: RUNNABLE
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.net.SocketInputStream.socketRead0(Native Method)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.net.SocketInputStream.read(SocketInputStream.java:129)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.net.SocketInputStream.read(SocketInputStream.java:182)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.smb.SmbTransport.peekKey(SmbTransport.java:421)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.util.transport.Transport.loop(Transport.java:98)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at jcifs.util.transport.Transport.run(Transport.java:261)
Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at java.lang.Thread.run(Thread.java:619)
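
For anyone wanting to check a running JVM for the same symptom, here is a minimal sketch that groups live thread names by transportId.  It assumes the name layout seen in the dump above (jcifs-transport<transportId>-<creationCount>); the class and method names are made up for illustration and are not part of jCIFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not jCIFS code.
class TransportThreadAudit {
    // Assumed layout, inferred from the dump: jcifs-transport<transportId>-<creationCount>
    private static final Pattern NAME = Pattern.compile("jcifs-transport(\\d+)-(\\d+)");

    /** Groups names by transportId and keeps only ids owned by more than one thread. */
    static Map<Integer, List<String>> findDuplicates(Iterable<String> threadNames) {
        Map<Integer, List<String>> byId = new TreeMap<>();
        for (String name : threadNames) {
            Matcher m = NAME.matcher(name);
            if (m.matches()) {
                int transportId = Integer.parseInt(m.group(1));
                byId.computeIfAbsent(transportId, k -> new ArrayList<>()).add(name);
            }
        }
        // Drop ids that have a single thread; those are the healthy case.
        byId.values().removeIf(names -> names.size() < 2);
        return byId;
    }

    /** Collects the names of all live threads in this JVM. */
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // An empty map means no transportId is shared by two live threads.
        System.out.println(findDuplicates(liveThreadNames()));
    }
}
```

Fed the two names from the dump, findDuplicates() would report transportId 19 with both threads listed.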

The system we saw this on was regularly hitting the responseTimeout threshold due to network load spikes, but would recover as the network load dropped.  I'm able to reproduce the response timeouts by inducing delay on packet delivery from the SMB server (using the Linux tc command), but so far I have not seen multiple threads with the same transportId value.

My best guess at this point is that the first thread is being orphaned by a 'thread = null;' assignment while it is blocked in a socketRead0() call.  If the thread is blocked on the socket read, it might never return and would linger until JVM shutdown.  Perhaps a thread.interrupt() call should precede each 'thread = null;' execution?  Granted, without being able to reproduce the issue I'm just speculating, but I'd be interested in your thoughts/comments.
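
One caveat on the interrupt() idea: as far as I know, Thread.interrupt() does not wake an ordinary thread that is blocked in a classic java.net.Socket read, whereas closing the socket from another thread does.  The sketch below is a standalone loopback experiment to check that on a given JVM.  It is not jCIFS code, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical standalone experiment, not jCIFS code.
class BlockedReadDemo {
    /** Returns {aliveAfterInterrupt, aliveAfterClose} for a thread blocked in a socket read. */
    static boolean[] run() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {
            // 'peer' never writes, so the read below blocks indefinitely.
            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read();
                } catch (IOException e) {
                    // Expected once the socket is closed underneath the read.
                }
            }, "demo-reader");
            reader.setDaemon(true);
            reader.start();
            Thread.sleep(200);            // give the reader time to block in read()

            reader.interrupt();           // the fix proposed above
            reader.join(500);
            boolean aliveAfterInterrupt = reader.isAlive();

            client.close();               // alternative: close the socket under the reader
            reader.join(2000);
            boolean aliveAfterClose = reader.isAlive();

            return new boolean[] { aliveAfterInterrupt, aliveAfterClose };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("alive after interrupt: " + r[0]);
        System.out.println("alive after close:     " + r[1]);
    }
}
```

If the reader is still alive after the interrupt but exits after the close, then closing the transport's socket before 'thread = null;' would be the more reliable way to release a thread stuck in socketRead0().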