[jcifs] Thread leak problem on connect timeout
Michael B Allen
ioplex at gmail.com
Wed Jul 1 18:15:13 GMT 2009
On Wed, Jul 1, 2009 at 1:50 PM, Data Shock <datashock at hotmail.com> wrote:
> I am using the JCIFS library to write files to a Samba server. In my implementation, the program checks a local directory for new files and writes them to the Samba server. If the connect fails, it tries again 30 seconds later.
> The problem is that if the server connect times out instead of failing fast, threads are orphaned and steadily pile up. The reason is that the connect itself doesn't time out after 30 seconds; only the thread waiting for the connect does. The connect itself doesn't time out for over 3 minutes. The waiting thread simply orphans the connecting thread and moves on. Additionally, the original connecting thread is still holding a lock, so all subsequent connect attempts must wait for the first to fail before they can continue. This results in a steadily increasing number of threads. Even though the original connecting thread will eventually time out and exit, the waiting threads pile up faster. This process will eventually exhaust the available system file descriptors and/or memory.
> Perhaps it was coded this way for backwards Java version compatibility, but it would seem that a few things could be done to address the problem.
> The simplest may be to interrupt the connecting thread and then join with it. However, I have read reports of Windows socket implementations not handling interrupts in blocking IO, and I don't know whether that problem still exists in more recent releases.
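A minimal sketch of the interrupt-then-join idea described above. It is not JCIFS code: the class InterruptAndJoin and the method stop are hypothetical names, and Thread.sleep stands in for the slow connect because it is guaranteed to respond to an interrupt. As the message notes, a thread blocked in a plain socket connect may not respond the same way, so the join is bounded and the caller checks whether the worker actually exited.

```java
public class InterruptAndJoin {

    /**
     * Interrupts the worker and waits up to joinMillis (must be > 0,
     * since join(0) waits forever) for it to exit.
     * Returns true if the worker has terminated, false if it is still running.
     */
    public static boolean stop(Thread worker, long joinMillis) throws InterruptedException {
        worker.interrupt();
        worker.join(joinMillis);
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the connecting thread: sleeps for 3 minutes unless interrupted.
        Thread connector = new Thread(() -> {
            try {
                Thread.sleep(180_000);
            } catch (InterruptedException e) {
                // Interrupted by the waiting thread: fall through and exit.
            }
        }, "connector");
        connector.start();

        boolean exited = stop(connector, 5_000);
        System.out.println("connector exited: " + exited);
    }
}
```

If stop returns false, the worker ignored the interrupt and is still alive, which is the case the quoted concern about blocking IO is describing.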
> Another approach may be to update any TCP socket connect attempts to use:
> Socket socket = new Socket();
> socket.connect(socketaddress, timeout);
> instead of the simple constructor approach. The connect with timeout has been available since Java 1.4. I suppose this may be a Java compatibility problem; however, I have not found any JCIFS documentation that specifies Java version compatibility.
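The two-line suggestion above can be expanded into a small self-contained sketch. The class TimedConnect and its connect helper are hypothetical names for illustration, not part of JCIFS; the only library calls used are the unconnected Socket constructor and Socket.connect(SocketAddress, int) from java.net. The demo in main connects to a local listener so it runs without a real server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class TimedConnect {

    /**
     * Opens a TCP connection, giving up after timeoutMillis.
     * If the timeout expires, connect throws SocketTimeoutException
     * (a subclass of IOException) in the calling thread, so no helper
     * thread is needed to enforce the limit.
     */
    public static Socket connect(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket(); // unconnected socket
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close(); // release the file descriptor on failure
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Local listener on an ephemeral port, used only for the demo.
        try (ServerSocket listener = new ServerSocket(0);
             Socket socket = connect("127.0.0.1", listener.getLocalPort(), 30_000)) {
            System.out.println("connected: " + socket.isConnected());
        }
    }
}
```

Because the timeout is enforced inside the connect call itself, the caller's thread is the only thread involved, and a failed attempt closes its socket before the next retry.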
OK, this looks like something worth fixing. I think using the 1.4
connect timeout is the right track, but I haven't really looked at the
problem so I don't know. Also, unfortunately, this is the sort of thing
that could take a really long time to make it into the code since I
don't really work on JCIFS for free or Free anymore. But I've added
this to the TODO.
Thanks for the report.
Michael B Allen
Java Active Directory Integration