[jcifs] Multiple Transport threads with same transportId

Michael B Allen ioplex at gmail.com
Mon Jan 17 10:55:00 MST 2011


Well again, the 3 lines of code used to set the transport id speaks
for itself. I don't see how it's theoretically possible for duplicate
IDs to occur other than the two cases described - context switching in
the middle of the transport id assignment or transports loaded in
different ClassLoaders. It has to be one of those two cases. You may
just not realize it (for example Servlet containers frequently use
different ClassLoaders for things).

Mike

On Mon, Jan 17, 2011 at 12:47 PM, Adam Morgan <adam.morgan at q1labs.com> wrote:
> Mike
>
> No, I'm not using any custom classloading... just using the new operator.
>
>
> -----Original Message-----
> From: Michael B Allen [mailto:ioplex at gmail.com]
> Sent: Monday, January 17, 2011 1:45 PM
> To: Adam Morgan
> Cc: jcifs at lists.samba.org
> Subject: Re: [jcifs] Multiple Transport threads with same transportId
>
> Hi Adam,
>
> Are you using multiple ClassLoaders? If yes, that could certainly
> explain the behavior. Otherwise, the code is very simple and therefore
> I think it speaks for itself.
>
> Mike
>
> On Mon, Jan 17, 2011 at 10:48 AM, Adam Morgan <adam.morgan at q1labs.com> wrote:
>> Hi Mike
>>
>> I guess I'm just not convinced this is the reason it is happening, as I saw the situation multiple times.  There would have been about 80 transport threads on the box that was exhibiting the behaviour, and they'd all have been getting initialized around the same time.  Still, a long shot that they'd context switch at the exact same place, multiple times over a few jvm restarts.
>>
>> That being said, you may very well be correct.  Adding the synchronized block would allow us to know this for sure.  IMHO it would be worth inclusion in next release.
>>
>> Cheers
>> Adam
>>
>> -----Original Message-----
>> From: Michael B Allen [mailto:ioplex at gmail.com]
>> Sent: Saturday, January 15, 2011 3:01 PM
>> To: Adam Morgan
>> Cc: jcifs at lists.samba.org
>> Subject: Re: [jcifs] Multiple Transport threads with same transportId
>>
>> Hi Adam,
>>
>> The code that sets the id looks like this:
>>
>>    public abstract class Transport implements Runnable {
>>        static int id = 0;
>>        String name = "Transport" + id++;
>>
>> Because the name assignment is not synchronized, it is possible
>> (albeit unlikely) for the name to be set to the same value if a
>> context switch occurs after the name is set but before the id is
>> incremented.
>>
>> The proper fix would be to move the assignment into a static block and
>> synchronize on a static object like:
>>
>> static {
>>    synchronized (log) {
>>        name = "Transport" + id++;
>>    }
>> }
>>
>> This seems a little overboard though. It's not clear to me at the
>> moment how important it is to have a truly unique transport id. It is
>> really just used for debugging I think.
>>
>> Mike
>>
>> --
>> Michael B Allen
>> Java Active Directory Integration
>> http://www.ioplex.com/
>>
>> On Thu, Jan 13, 2011 at 1:44 PM, Adam Morgan <adam.morgan at q1labs.com> wrote:
>>> Hi Mike
>>>
>>>
>>>
>>> First off, Happy New Year!  Now to business.
>>>
>>>
>>>
>>> While tracking down an issue on one of our clients' systems, I discovered
>>> that multiple SmbTransport(Transport) threads existed with the same
>>> 'transportId'.  I think I'm correct when I say this shouldn't happen, and so
>>> I'm trying to track down how this could happen.  The snippets from a stack
>>> dump below.  You will notice that the thread names reflect the changes I
>>> submitted a few weeks ago.  The second number (added as part of those
>>> changes) reflects the number of threads that have been created by the
>>> Transport class instances over the course of a jvm up time.
>>>
>>>
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon
>>> prio=10 tid=0x00002aab343fb800 nid=0x1f74 runnable
>>> [0x000000007466d000..0x000000007466db10]
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State:
>>> RUNNABLE
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.net.SocketInputStream.socketRead0(Native Method)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.util.transport.Transport.readn(Transport.java:31)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked
>>> <0x00002aaac7470140> (a [B)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.util.transport.Transport.loop(Transport.java:108)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     - locked
>>> <0x00002aaac84ab020> (a jcifs.smb.SmbTransport)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.util.transport.Transport.run(Transport.java:261)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.lang.Thread.run(Thread.java:619)
>>>
>>>
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2257" daemon
>>> prio=10 tid=0x00002aab26dbec00 nid=0x7d38 runnable
>>> [0x00002aab27276000..0x00002aab27276d90]
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State:
>>> RUNNABLE
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.net.SocketInputStream.socketRead0(Native Method)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.net.SocketInputStream.read(SocketInputStream.java:129)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.net.SocketInputStream.read(SocketInputStream.java:182)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.smb.SmbTransport.peekKey(SmbTransport.java:421)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.util.transport.Transport.loop(Transport.java:98)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> jcifs.util.transport.Transport.run(Transport.java:261)
>>>
>>> Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary     at
>>> java.lang.Thread.run(Thread.java:619)
>>>
>>>
>>>
>>> The system we have seen this on was regularly hitting the responseTimeout
>>> threshold due to network load spikes, but would recover in time as the
>>> network load dropped.  I'm able to reproduce the response timeouts by
>>> inducing delay on packet delivery from the smb server (using the linux tc
>>> command), but so far have not seen multiple threads with the same
>>> transportId value.
>>>
>>>
>>>
>>> My best guess at this point is that the first thread is being orphaned by a
>>> 'thread = null;' call while it is blocked on a socketRead0() call.  If the
>>> thread were blocking on the socket read, then potentially it would never
>>> return and remain until jvm shutdown.  Perhaps a thread.interrupt() call
>>> should precede each 'thread = null;' execution?  Granted, without being to
>>> reproduce the issue I'm just postulating but I was interested in your
>>> thoughts/comments.
>>
>
>
>
> --
> Michael B Allen
> Java Active Directory Integration
> http://www.ioplex.com/
>



-- 
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/


More information about the jCIFS mailing list