[jcifs] Issues with connections to servers that reboot
Michael B Allen
ioplex at gmail.com
Wed Jun 22 11:29:15 MDT 2011
Hi Sean,
I was not able to reproduce this issue. Windows Server 2008r2 produces
the "timedout waiting for response" error (as opposed to Windows
Server 2003 which produces "connection reset") but after about a
minute, the program recovered and correctly listed the target
directory.
I have applied Simon's try / catch anyway (but in
util/transport/Transport.java) since an exception from doDisconnect is
clearly bad for the Transport.java state machine. I don't know if it
will help with your issue but I recommend trying the
soon-to-be-released 1.3.16.
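
The shape of that guard, as a minimal sketch rather than the actual jCIFS source: `SketchTransport`, its two state constants, `lastDisconnectError` and `FailingTransport` are hypothetical names made up for illustration; only `doDisconnect` is named in this thread. The sketch assumes the goal is simply that a failing `doDisconnect` can never leave the state machine stuck in a connected state.

```java
import java.io.IOException;

// Hypothetical stand-in for a transport state machine; NOT the jCIFS class.
abstract class SketchTransport {
    static final int DISCONNECTED = 0; // illustrative state values
    static final int CONNECTED = 3;

    int state = DISCONNECTED;
    Exception lastDisconnectError; // kept only for diagnostics

    protected abstract void doConnect() throws IOException;
    protected abstract void doDisconnect(boolean hard) throws IOException;

    synchronized void connect() throws IOException {
        if (state == CONNECTED) return;
        doConnect();
        state = CONNECTED;
    }

    synchronized void disconnect(boolean hard) {
        if (state != CONNECTED) return;
        try {
            doDisconnect(hard);
        } catch (IOException | RuntimeException e) {
            // The peer may already be gone (e.g. rebooted); record and move on.
            lastDisconnectError = e;
        } finally {
            // Always reset, so the next connect() starts from a clean state.
            state = DISCONNECTED;
        }
    }
}

// Simulates a server that vanished: every disconnect attempt fails.
final class FailingTransport extends SketchTransport {
    int connects = 0;

    @Override protected void doConnect() { connects++; }

    @Override protected void doDisconnect(boolean hard) throws IOException {
        throw new IOException("connection reset");
    }
}

final class GuardedDisconnectDemo {
    public static void main(String[] args) throws IOException {
        FailingTransport t = new FailingTransport();
        t.connect();
        t.disconnect(true); // throws internally, but the state is still reset
        t.connect();        // so a fresh connect is attempted
        System.out.println("state=" + t.state + " connects=" + t.connects);
        // prints "state=3 connects=2"
    }
}
```

Without the `finally`, the exception would escape before the state was reset, and later callers would keep treating the dead transport as connected.
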
Mike
--
Michael B Allen
Java Active Directory Integration
http://www.ioplex.com/
On Tue, Jun 21, 2011 at 10:10 PM, Sean Daley <spdaley at gmail.com> wrote:
> I seem to be running into a random issue with JCIFS re-connecting to a
> server that is rebooted. I've attached a simple Java program which
> connects to the Admin$ share and calls listFiles on it. It then repeats
> this every second.
>
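
The attachment itself is not part of this archive. For readers who want to reproduce the test, a loop of the kind described might look like the sketch below. It is an assumption, not Sean's program: `ListPoller` and `poll` are hypothetical names, and the share operation is passed in as a `Callable<Integer>` so the sketch runs without a server. The log format imitates the lines quoted further down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical re-creation of the polling test described above.
final class ListPoller {

    // Runs op the given number of times, sleeping pauseMs between attempts,
    // and returns one log line per attempt.
    static List<String> poll(Callable<Integer> op, int iterations, long pauseMs)
            throws InterruptedException {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            long start = System.currentTimeMillis();
            String line;
            try {
                line = i + ": fileList returned " + op.call() + " and";
            } catch (Exception e) {
                line = i + ": fileList failed: " + e.getMessage();
            }
            long took = System.currentTimeMillis() - start;
            lines.add(line + " took " + took + "(ms).");
            Thread.sleep(pauseMs);
        }
        return lines;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake operation: fails on the third call, like a server mid-reboot.
        // Against a real host one would pass something along the lines of
        //   () -> new jcifs.smb.SmbFile("smb://host/ADMIN$/").listFiles().length
        // (credentials omitted; requires jCIFS on the classpath).
        final int[] calls = {0};
        Callable<Integer> fake = () -> {
            if (++calls[0] == 3) {
                throw new java.io.IOException("timedout waiting for response");
            }
            return 80;
        };
        for (String line : poll(fake, 4, 10)) {
            System.out.println(line);
        }
    }
}
```

The original test repeats every second, which corresponds to `pauseMs = 1000`.
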
> Sometimes, when I reboot a server and/or shut it down for a few minutes
> and re-start it, the JCIFS connection to that server never seems to
> recover. This doesn't seem to happen to all of my servers but it does
> happen to some of them.
>
> I'm currently using Fedora 14 x86_64 as the JCIFS client connecting to a
> wide variety of Windows boxes. The biggest Windows culprit I have seems
> to be a Windows 2008r2 box.
>
> For this particular box, I get the following logs from this test class:
> 0: fileList returned 80 and took 190(ms).
> 1: fileList returned 80 and took 11(ms).
> ...
> 33: fileList returned 80 and took 5(ms).
> 34: fileList failed: Transport1[testhost/10.20.14.15:445] timedout
> waiting for response to
> Trans2FindFirst2[command=SMB_COM_TRANSACTION2,received=false,errorCode=0,flags=0x0018,flags2=0xC803,signSeq=0,tid=2048,pid=63708,uid=2048,mid=73,wordCount=15,byteCount=19,totalParameterCount=18,totalDataCount=0,maxParameterCount=10,maxDataCount=65535,maxSetupCount=0,flags=0x00,timeout=0,parameterCount=18,parameterOffset=66,parameterDisplacement=0,dataCount=0,dataOffset=84,dataDisplacement=0,setupCount=1,pad=1,pad1=0,searchAttributes=0x16,searchCount=200,flags=0x00,informationLevel=0x104,searchStorageType=0,filename=\]
> took 30001(ms).
> 35: ... (repeats the exact same thing as 34: every 30 seconds).
>
> I've let it run for a while now and it will just continuously report the
> "timedout waiting for ..." error every 30 seconds.
>
> If I stop and re-start the program though it will re-connect just fine.
> If I enable
> jcifs.Config.setProperty("jcifs.smb.client.ssnLimit", "1");
> the problem also does not occur but I'd really rather not do that as I'm
> going to potentially be working with the same set of hosts many times and
> I rather like the caching that's being done here.
>
> I've played around with this program and differing target servers, as
> well as changing things around to do something other than a listFiles
> check (like an exists check), and I've received differing behaviors along
> the way. For some of my environment, with the exists check, I got similar
> timeout behavior but it was a more straightforward exception of
> "connection timed out". What was worse though was that each time I got
> that, I was left with a new Thread running with the following stack trace:
>
> #########
> Daemon Thread [Transport1] (Suspended)
> PlainSocketImpl.socketConnect(InetAddress, int, int) line: not available [native method]
> SocksSocketImpl(PlainSocketImpl).doConnect(InetAddress, int, int) line: 333
> SocksSocketImpl(PlainSocketImpl).connectToAddress(InetAddress, int, int) line: 195
> SocksSocketImpl(PlainSocketImpl).connect(SocketAddress, int) line: 182
> SocksSocketImpl.connect(SocketAddress, int) line: 366
> Socket.connect(SocketAddress, int) line: 529
> Socket.connect(SocketAddress) line: 478
> Socket.<init>(SocketAddress, SocketAddress, boolean) line: 375
> Socket.<init>(String, int) line: 189
> SmbTransport.ssn139() line: 185
> SmbTransport.negotiate(int, ServerMessageBlock) line: 240
> SmbTransport.doConnect() line: 302
> SmbTransport(Transport).run() line: 232
> Thread.run() line: 662
> #########
>
> So every 30 seconds, I'd get the "connection timed out" error, then we'd
> try to connect again and a new Daemon Thread Transport1 would start. These
> threads would take upwards of 4 - 5 minutes (at least) before they finally
> terminated. During that time though we'd keep on accumulating more and
> more of them as we tried to re-connect. Once again, if I stop and re-start
> the test program it works just fine again right away.
>
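
In the trace above the thread is blocked inside `Socket.<init>(String, int)`, which connects with no caller-supplied timeout, so how long each attempt lingers is up to the operating system. As a general illustration (not a jCIFS patch, and not a claim about how jCIFS should be changed), the standard `java.net.Socket.connect(SocketAddress, int)` call bounds a connect attempt. `BoundedConnect` and `open` are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative helper: connect with an explicit upper bound on the wait.
final class BoundedConnect {

    // Opens a TCP connection, giving up after timeoutMs milliseconds.
    // A timeout surfaces as java.net.SocketTimeoutException (an IOException).
    static Socket open(String host, int port, int timeoutMs) throws IOException {
        Socket s = new Socket(); // created unconnected
        try {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return s;
        } catch (IOException e) {
            s.close(); // do not leak the socket on failure
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Self-test against a throwaway local listener on an ephemeral port.
        try (ServerSocket server = new ServerSocket(0);
             Socket s = open("127.0.0.1", server.getLocalPort(), 2000)) {
            System.out.println("connected=" + s.isConnected());
        }
    }
}
```

With a bound like this, a connect thread aimed at a host that is still down would give up after `timeoutMs` rather than after whatever the OS default happens to be.
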
> Is there any way to force a new SmbTransport to get created without
> setting ssnLimit to 1? I briefly tried setting it to 1 but I have some
> concerns about doing that because we lose the benefit of caching, plus,
> unless I'm misreading the code, it looks like the CONNECTIONS LinkedList
> can grow unbounded. So with ssnLimit == 1, we're just constantly creating
> new SmbTransports and adding them to CONNECTIONS. I didn't find any place
> where we were removing them from the list though.
>
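
To make the concern concrete, here is a minimal sketch of one way a connection cache can stay bounded: prune entries that are no longer connected before looking one up. It does not describe how jCIFS manages `CONNECTIONS`; `TransportPool`, `Entry` and the `connected` flag are hypothetical names used only for illustration.

```java
import java.util.LinkedList;

// Hypothetical connection cache; NOT the jCIFS implementation.
final class TransportPool {

    static final class Entry {
        final String host;
        boolean connected = true; // flipped to false when the link drops
        Entry(String host) { this.host = host; }
    }

    private final LinkedList<Entry> connections = new LinkedList<>();

    // Returns a live entry for host, creating one if none is cached.
    synchronized Entry get(String host) {
        // Drop dead entries first so the list cannot grow without bound.
        connections.removeIf(e -> !e.connected);
        for (Entry e : connections) {
            if (e.host.equals(host)) return e;
        }
        Entry fresh = new Entry(host);
        connections.add(fresh);
        return fresh;
    }

    synchronized int size() { return connections.size(); }

    public static void main(String[] args) {
        TransportPool pool = new TransportPool();
        Entry first = pool.get("testhost");
        first.connected = false;             // simulate the server rebooting
        Entry second = pool.get("testhost"); // dead entry is pruned, new one made
        System.out.println((first != second) + " " + pool.size());
        // prints "true 1"
    }
}
```
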
> Any thoughts on this? Or is there any additional information I can get you?
> Any help would be greatly appreciated.
>
> Sean
>