[jcifs] Multiple Transport threads with same transportId
Adam Morgan
adam.morgan at Q1Labs.com
Thu Jan 13 11:44:10 MST 2011
Hi Mike
First off, Happy New Year! Now to business...
While tracking down an issue on one of our clients' systems, I discovered that multiple SmbTransport(Transport) threads existed with the same 'transportId'. I think I'm correct when I say this shouldn't happen, and so I'm trying to track down how this could happen. The snippets from a stack dump below. You will notice that the thread names reflect the changes I submitted a few weeks ago. The second number (added as part of those changes) reflects the number of threads that have been created by the Transport class instances over the course of a jvm up time.
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon prio=10 tid=0x00002aab343fb800 nid=0x1f74 runnable [0x000000007466d000..0x000000007466db10]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: RUNNABLE
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.net.SocketInputStream.socketRead0(Native Method)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.net.SocketInputStream.read(SocketInputStream.java:129)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.util.transport.Transport.readn(Transport.java:31)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac7470140> (a [B)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.util.transport.Transport.loop(Transport.java:108)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac84ab020> (a jcifs.smb.SmbTransport)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.util.transport.Transport.run(Transport.java:261)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.lang.Thread.run(Thread.java:619)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2257" daemon prio=10 tid=0x00002aab26dbec00 nid=0x7d38 runnable [0x00002aab27276000..0x00002aab27276d90]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: RUNNABLE
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.net.SocketInputStream.socketRead0(Native Method)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.net.SocketInputStream.read(SocketInputStream.java:129)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.net.SocketInputStream.read(SocketInputStream.java:182)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.smb.SmbTransport.peekKey(SmbTransport.java:421)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.util.transport.Transport.loop(Transport.java:98)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at jcifs.util.transport.Transport.run(Transport.java:261)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at java.lang.Thread.run(Thread.java:619)
The system we have seen this on was regularly hitting the responseTimeout threshold due to network load spikes, but would recover in time as the network load dropped. I'm able to reproduce the response timeouts by inducing delay on packet delivery from the smb server (using the linux tc command), but so far have not seen multiple threads with the same transportId value.
My best guess at this point is that the first thread is being orphaned by a 'thread = null;' call while it is blocked on a socketRead0() call. If the thread were blocking on the socket read, then potentially it would never return and remain until jvm shutdown. Perhaps a thread.interrupt() call should precede each 'thread = null;' execution? Granted, without being to reproduce the issue I'm just postulating but I was interested in your thoughts/comments.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.samba.org/pipermail/jcifs/attachments/20110113/12b6d2eb/attachment.html>
More information about the jCIFS
mailing list