[jcifs] Multiple Transport threads with same transportId

Michael B Allen ioplex at gmail.com
Mon Jan 17 20:23:33 MST 2011


On Mon, Jan 17, 2011 at 1:16 PM, Adam Morgan <adam.morgan at q1labs.com> wrote:
> I need to clear up a little confusion here.  I'm not saying there are two instances of SmbTransport object with the same id, rather that there are two Threads running with the same id.
>
> If you check the patch I sent you late last year, you will see that I've changed the code that names the thread so that it now has a format of:
>
> jcifs-transport<transport-instance-count>-<thread-count>
>
> so, when a new SmbTransport object is instantiated, the static 'transport-instance-count' value is incremented (as it always was).  However, now when connect() is called on Transport, it will increment the static thread-count when instantiating the Thread object.
>
> So, based on this, I see potential for multiple Threads being created with the same 'transport-instance-count' value.  Especially since 'thread = null;' is done in several places without a respective 'thread.interrupt();' to kick existing threads out of blocking reads etc as per my initial email in this thread.
>
> Does that make it any clearer?

Hi Adam,

Yes! It does make more sense that multiple threads could be created
with the same transport id. But the extra threads should not
ultimately do any work. As soon as doConnect returns, Transport.run
should see that run_thread != thread and it will exit.

Note that I do not think thread.interrupt() should every be used. At
least not internally within the library.

I see your InetSocketAddress patch on the TODO list. Does the
duplicate thread name condition occur with that patch?

I assume the Threads with the duplicate IDs are just waiting for
connect to timeout (your thread dump fragment shows the one thread
sitting on peekKey.

Mike

-- 
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/


> -----Original Message-----
> From: Michael B Allen [mailto:ioplex at gmail.com]
> Sent: Monday, January 17, 2011 1:55 PM
> To: Adam Morgan
> Cc: jcifs at lists.samba.org
> Subject: Re: [jcifs] Multiple Transport threads with same transportId
>
> Well again, the 3 lines of code used to set the transport id speaks
> for itself. I don't see how it's theoretically possible for duplicate
> IDs to occur other than the two cases described - context switching in
> the middle of the transport id assignment or transports loaded in
> different ClassLoaders. It has to be one of those two cases. You may
> just not realize it (for example Servlet containers frequently use
> different ClassLoaders for things).
>
> Mike
>
> On Mon, Jan 17, 2011 at 12:47 PM, Adam Morgan <adam.morgan at q1labs.com> wrote:
>> Mike
>>
>> No, I'm not using any custom classloading... just using the new operator.
>>
>>
>> -----Original Message-----
>> From: Michael B Allen [mailto:ioplex at gmail.com]
>> Sent: Monday, January 17, 2011 1:45 PM
>> To: Adam Morgan
>> Cc: jcifs at lists.samba.org
>> Subject: Re: [jcifs] Multiple Transport threads with same transportId
>>
>> Hi Adam,
>>
>> Are you using multiple ClassLoaders? If yes, that could certainly
>> explain the behavior. Otherwise, the code is very simple and therefore
>> I think it speaks for itself.
>>
>> Mike
>>
>> On Mon, Jan 17, 2011 at 10:48 AM, Adam Morgan <adam.morgan at q1labs.com> wrote:
>>> Hi Mike
>>>
>>> I guess I'm just not convinced this is the reason it is happening, as I saw the situation multiple times.  There would have been about 80 transport threads on the box that was exhibiting the behaviour, and they'd all have been getting initialized around the same time.  Still, a long shot that they'd context switch at the exact same place, multiple times over a few jvm restarts.
>>>
>>> That being said, you may very well be correct.  Adding the synchronized block would allow us to know this for sure.  IMHO it would be worth inclusion in next release.
>>>
>>> Cheers
>>> Adam
>>>
>>> -----Original Message-----
>>> From: Michael B Allen [mailto:ioplex at gmail.com]
>>> Sent: Saturday, January 15, 2011 3:01 PM
>>> To: Adam Morgan
>>> Cc: jcifs at lists.samba.org
>>> Subject: Re: [jcifs] Multiple Transport threads with same transportId
>>>
>>> Hi Adam,
>>>
>>> The code that sets the id looks like this:
>>>
>>>    public abstract class Transport implements Runnable {
>>>        static int id = 0;
>>>        String name = "Transport" + id++;
>>>
>>> Because the name assignment is not synchronized, it is possible
>>> (albeit unlikely) for the name to be set to the same value if a
>>> context switch occurs after the name is set but before the id is
>>> incremented.
>>>
>>> The proper fix would be to move the assignment into a static block and
>>> synchronize on a static object like:
>>>
>>> static {
>>>    synchronized (log) {
>>>        name = "Transport" + id++;
>>>    }
>>> }
>>>
>>> This seems a little overboard though. It's not clear to me at the
>>> moment how important it is to have a truly unique transport id. It is
>>> really just used for debugging I think.
>>>
>>> Mike
>>>
>>> --
>>> Michael B Allen
>>> Java Active Directory Integration
>>> http://www.ioplex.com/
>>>
>>> On Thu, Jan 13, 2011 at 1:44 PM, Adam Morgan <adam.morgan at q1labs.com> wrote:
>>>> Hi Mike
>>>>
>>>>
>>>>
>>>> First off, Happy New Year!  Now to business.
>>>>
>>>>
>>>>
>>>> While tracking down an issue on one of our clients' systems, I discovered
>>>> that multiple SmbTransport(Transport) threads existed with the same
>>>> 'transportId'.  I think I'm correct when I say this shouldn't happen, and so
>>>> I'm trying to track down how this could happen.  The snippets from a stack
>>>> dump below.  You will notice that the thread names reflect the changes I
>>>> submitted a few weeks ago.  The second number (added as part of those
>>>> changes) reflects the number of threads that have been created by the
>>>> Transport class instances over the course of a jvm up time.
>>>>
>>>>
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon
>>>> prio=10 tid=0x00002aab343fb800 nid=0x1f74 runnable
>>>> [0x000000007466d000..0x000000007466db10]
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State:
>>>> RUNNABLE
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.net.SocketInputStream.socketRead0(Native Method)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.util.transport.Transport.readn(Transport.java:31)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked
>>>> <0x00002aaac7470140> (a [B)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.util.transport.Transport.loop(Transport.java:108)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked
>>>> <0x00002aaac84ab020> (a jcifs.smb.SmbTransport)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.util.transport.Transport.run(Transport.java:261)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.lang.Thread.run(Thread.java:619)
>>>>
>>>>
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2257" daemon
>>>> prio=10 tid=0x00002aab26dbec00 nid=0x7d38 runnable
>>>> [0x00002aab27276000..0x00002aab27276d90]
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State:
>>>> RUNNABLE
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.net.SocketInputStream.socketRead0(Native Method)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.net.SocketInputStream.read(SocketInputStream.java:182)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.smb.SmbTransport.peekKey(SmbTransport.java:421)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.util.transport.Transport.loop(Transport.java:98)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> jcifs.util.transport.Transport.run(Transport.java:261)
>>>>
>>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>>> java.lang.Thread.run(Thread.java:619)
>>>>
>>>>
>>>>
>>>> The system we have seen this on was regularly hitting the responseTimeout
>>>> threshold due to network load spikes, but would recover in time as the
>>>> network load dropped.  I'm able to reproduce the response timeouts by
>>>> inducing delay on packet delivery from the smb server (using the linux tc
>>>> command), but so far have not seen multiple threads with the same
>>>> transportId value.
>>>>
>>>>
>>>>
>>>> My best guess at this point is that the first thread is being orphaned by a
>>>> 'thread = null;' call while it is blocked on a socketRead0() call.  If the
>>>> thread were blocking on the socket read, then potentially it would never
>>>> return and remain until jvm shutdown.  Perhaps a thread.interrupt() call
>>>> should precede each 'thread = null;' execution?  Granted, without being to
>>>> reproduce the issue I'm just postulating but I was interested in your
>>>> thoughts/comments.
>>>
>>
>>
>>
>> --
>> Michael B Allen
>> Java Active Directory Integration
>> http://www.ioplex.com/
>>
>
>
>
> --
> Michael B Allen
> Java Active Directory Integration
> http://www.ioplex.com/
>


More information about the jCIFS mailing list