[jcifs] Re: DFS-Links and jcifs.smb.client.soTimeout

Fri May 20 20:00:32 GMT 2005

On Fri, 20 May 2005 13:35:12 +0000 (UTC)
Kevin Tapperson <kevin.tapperson at hcahealthcare.com> wrote:

> We have been getting the "Invalid access to memory location" exception 
> intermittently, and I have finally tracked down why.  Here is the sequence of 
> events that can cause the exception to be thrown:
> 
<snip>
> 
> 
> So, it looks like there is a failure window here.  If the transport socket 
> times out after the type 2 message has been sent by jcifs, but before the type 
> 3 message has been received and the logon finished, then it will result in 
> the "Invalid access to memory location" exception.

Yes, this is true. If the socket is idle, times out, and closes the
transport after the HTTP client has been sent the NTLM type-2-message
but before the type-3-message is received and used, that error will
occur. But this should be EXTREMELY rare ...

> Is there a way to send a "keep alive" message over the transportA socket when 
> its encryption key is being used as a challenge (between steps 9-10 above)?  

Well that's not really the right way to fix it. Why not just set soTimeout to 0?
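
For reference, a minimal sketch of where that setting would go when the
NtlmHttpFilter is configured through web.xml. Only the soTimeout
init-param is the point here; the filter's other init-params are left
out, and the filter-name is just a label. That a value of 0 turns off
the idle timeout is the assumption behind the suggestion above.

```xml
<filter>
    <filter-name>NtlmHttpFilter</filter-name>
    <filter-class>jcifs.http.NtlmHttpFilter</filter-class>
    <!-- other jcifs.* init-params omitted from this sketch -->
    <init-param>
        <param-name>jcifs.smb.client.soTimeout</param-name>
        <!-- assumption: 0 means the idle transport is never timed out -->
        <param-value>0</param-value>
    </init-param>
</filter>
```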

> Some sort of no-op SMB message to keep the socket open?  This would keep the 
> transportA socket open and allow the client up to jcifs.smb.client.soTimeout ms 
> to respond with a type 3 message.  As the case is now, it doesn't really matter 
> how fast the client is in responding with the type 3 message, it is simply dumb 
> luck if the client hits the window where the socket could be closed and cause 
> this failure.

Actually, if the client (and server) are fast, the window is shorter,
which proportionally reduces how often the error can occur.

> I suppose that setting a higher value for jcifs.smb.client.soTimeout could 
> potentially help this problem, as long as the app server is processing at least 
> one type 3 message (and SMB logon) every jcifs.smb.client.soTimeout ms.  If 
> this were the case, it would essentially keep the transport socket open 
> forever.  However, any time a transport socket is closed, there is the 
> possibility for failure.

Right. Which is to say that I don't think there is any way to actually
*solve* this problem. If you send the challenge and a network glitch
causes a socket exception, what can the server do? Ans: nothing. You
can't just start over, because if you send a 403 error it will just
cause the Network Password Dialog to appear.

I didn't really think this could occur so frequently. I mean the
socket has to be totally idle and then time out and close in the instant
after the client is sent a type-2-message but before the server
receives the corresponding type-3-message and gets a lock on the
transport to do the logon. By default the NtlmHttpFilter sets
jcifs.smb.client.soTimeout to *5 minutes*, so I thought probability
would be in my favor. Are you setting soTimeout yourself?

As an aside -- the transport rewrite is going to have the potential to
deal with this a little better. This actually has a little to do with
the concurrency error described some time back. There are two phases to
the new transport. The first part was to make the locking simpler. That
is done and in jcifs-1.1.11trans2. The second part has to do with
coordinating threads trying to build objects that require multiple
messages over the network. For that I need to work out extended security
which needs SPNEGO, which needs GSS-API, which delves into LoginModules,
which needs Principals, which needs Sids, etc. So at the moment I'm
pushing more onto the stack than I'm popping off, but when I finally
unwind I'll be able to see where I might add ref-counting so I can
suppress the soTimeout behavior altogether if a multi-message
negotiation is taking place (or if there are open file descriptors).
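
To make the ref-counting idea concrete, here is a rough sketch. It is
not code from the transport rewrite: IdleGuard and all of its methods
are hypothetical names, and treating a timeout of 0 as "never close" is
an assumption. The only point is that the idle check refuses to close
the transport while something (a pending type-2/type-3 exchange, an
open file) still holds a reference.

```java
/** Hypothetical sketch of ref-counted idle-timeout suppression; not jCIFS code. */
public class IdleGuard {
    private final long soTimeoutMillis; // 0 is assumed to mean "never time out"
    private long lastActivity;          // time of the last traffic or release
    private int refs;                   // negotiations or handles in progress

    public IdleGuard(long soTimeoutMillis, long now) {
        this.soTimeoutMillis = soTimeoutMillis;
        this.lastActivity = now;
    }

    /** Call when a multi-message negotiation starts or a file is opened. */
    public synchronized void acquire() {
        refs++;
    }

    /** Call when the negotiation finishes or the file is closed. */
    public synchronized void release(long now) {
        if (refs == 0) {
            throw new IllegalStateException("release without acquire");
        }
        refs--;
        lastActivity = now; // restart the idle clock from the release
    }

    /** Call whenever a message is sent or received on the transport. */
    public synchronized void touch(long now) {
        lastActivity = now;
    }

    /** The idle checker asks this before closing the socket. */
    public synchronized boolean mayClose(long now) {
        if (refs > 0) {
            return false; // something still depends on this transport
        }
        if (soTimeoutMillis == 0) {
            return false; // timeout disabled
        }
        return now - lastActivity >= soTimeoutMillis;
    }
}
```

With something like this, the filter would acquire() when it hands out
the challenge and release() once the logon completes or fails, so the
transport could not be timed out in between.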

Don't worry, this is all part of my master plan! Bwhaahahahaa :->

Mike