[jcifs] NullPointerException in Dfs.resolve
Martin Kutter
martin.kutter at fen-net.de
Mon Jun 15 05:45:22 MDT 2015
Thanks for your reply.
I've tested 1.3.18b - it does not fix this specific error.
Regarding the fix suggested by Conrad: In the meantime, I've also
implemented a fix for the issue - I changed SmbFile.resolveDfs()
(lines 671 and following) to

    DfsReferral dr = null;
    // disconnect is synchronized to transport, too.
    // make sure our transport doesn't get disconnected while
    // we're inside the synchronized block
    synchronized (tree.session.transport) {
        if (tree.session.transport.tconHostName == null) {
            // disconnect properly if connection is lost
            tree.treeDisconnect(false);
        }
        tree.session.transport.connect();
        dr = dfs.resolve(tree.session.transport.tconHostName,
                         tree.share,
                         unc,
                         auth);
    }
    if (dr != null) {

As my knowledge of CIFS is quite limited, I have no idea whether this
is right (or has bad side effects).
The idea behind it is similar (but not identical) to Conrad's fix: if the
transport has disconnected, disconnect properly and reconnect.
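
To make the intent of that change concrete, here is a minimal, self-contained
Java sketch of the same check-and-reconnect pattern. It does not use jCIFS:
Transport, resolve() and resolveWithReconnect() are hypothetical stand-ins
invented for illustration, and the host name is made up. It only shows why
holding the transport's monitor across the null check, the reconnect and the
resolve call keeps a concurrent disconnect from clearing the host name in
between.

```java
public class ReconnectSketch {

    /** Hypothetical stand-in for a transport; NOT the jCIFS SmbTransport class. */
    public static class Transport {
        public String tconHostName;   // null while the transport is disconnected
        public int connectCalls = 0;  // counts real (re)connects, for illustration

        /** Connects only if currently disconnected (assumed behaviour). */
        public synchronized void connect() {
            if (tconHostName == null) {
                tconHostName = "fileserver.example"; // made-up host name
                connectCalls++;
            }
        }

        /** Synchronized on the same monitor as connect(). */
        public synchronized void disconnect() {
            tconHostName = null;
        }
    }

    /** Hypothetical resolver that throws a NullPointerException on a null host. */
    public static String resolve(String host, String share) {
        return host.toLowerCase() + "/" + share;
    }

    /** Check, reconnect and resolve while holding the transport's monitor. */
    public static String resolveWithReconnect(Transport t, String share) {
        synchronized (t) {
            // A real implementation would clean up stale tree state here
            // when t.tconHostName == null; this sketch only reconnects.
            t.connect();
            return resolve(t.tconHostName, share);
        }
    }

    public static void main(String[] args) {
        Transport t = new Transport();
        t.connect();
        t.disconnect(); // simulate the timeout-driven teardown
        System.out.println(resolveWithReconnect(t, "share"));
    }
}
```

Without the connect() call inside the synchronized block, the resolve() call in
this sketch would fail with a NullPointerException after the simulated
disconnect.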
Best regards,
Martin
On Sun, 14 Jun 2015 11:37:48 -0400, Michael B Allen <ioplex at gmail.com>
wrote:
> I have added this post to the TODO list, alongside the other reports of
> this problem, so that it can be considered when I get around to
> looking at this.
>
> Note that the 1.3.18b mentioned in the link cited is here:
>
> http://jcifs.samba.org/old/jcifs-1.3.18b.jar
>
> Although I cannot recall what it actually does anymore, it might be
> worth a try. We never received feedback about it.
>
> Mike
>
> On Sun, Jun 14, 2015 at 5:00 AM, Conrad Herrmann <conrad at primadesk.com>
> wrote:
>> Martin,
>>
>> I have recently run into the same problem.
>>
>> I think the problem is that SmbFile.resolveDfs() uses the currently
>> connected transport as the DFS resolver/domain server
>> (tree.session.transport.tconHostName), but it is very possible that
>> there is no currently connected transport. In your case, that happens
>> when the file server forces the TCP connection to close, and the
>> transport tears itself down.
>>
>> And, although the resolveDfs() method calls connect0(), in fact this
>> does nothing because doConnect() doesn't force creation of a new
>> connection if we are talking about a DFS resolved path.
>>
>> It seems to me that in that case, we have to start over again at the
>> top of the referral tree, with the Domain.
>>
>> So my solution is this: change the code in SmbFile.resolveDfs(), lines
>> 671 (or so), so that it says:
>>
>>     connect0();
>>>    String hostName = tree.session.transport.tconHostName;
>>>    String domainDfsServerName = getServerWithDfs();
>>>    if (hostName == null)
>>>        hostName = domainDfsServerName;
>>     DfsReferral dr = dfs.resolve(
>>>            hostName,
>>             tree.share,
>>             unc,
>>             auth);
>>
>> The code comes from the other use of tconHostName, in
>> SmbFile.doConnect():
>>
>>     String hostName = getServerWithDfs();
>>     tree.inDomainDfs = dfs.resolve(hostName, tree.share, null, auth) != null;
>>
>> In this code, we are getting the DFS resolver (which might be the
>> domain server) as the hostName, and asking it to resolve our share.
>>
>> Basically, what this new code is saying is that in the case where the
>> transport has closed (i.e., because of a timeout or TCP close on the
>> DFS server side), we reconnect to the DFS domain server in order to
>> resolve a share's DFS server.
>>
>> I can imagine a case where this doesn't work: if we have multiple
>> levels of DFS redirection, where the domain server cannot redirect the
>> client to a deep subdirectory. But I don't even know if this is
>> possible in DFS. If it is, then at least this solution removes the top
>> level case and identifies the problem, which would require walking
>> down the DFS resolution path to resolve the actual file server.
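
The host-name fallback described in the quoted text above comes down to a
single decision, sketched below in plain Java. This is only an illustration
under stated assumptions: chooseResolverHost() and the host names are
hypothetical, and nothing here calls jCIFS. It does not address the
multi-level referral case raised above.

```java
public class DfsHostFallbackSketch {

    /**
     * Hypothetical helper: prefer the host of the still-connected transport,
     * and fall back to the DFS-aware server when the transport has been torn
     * down and its host name is null.
     */
    public static String chooseResolverHost(String tconHostName, String serverWithDfs) {
        return (tconHostName != null) ? tconHostName : serverWithDfs;
    }

    public static void main(String[] args) {
        // Transport still connected: keep using its host.
        System.out.println(chooseResolverHost("fs01.example", "dc.example"));
        // Transport torn down (null host): fall back to the DFS-aware server.
        System.out.println(chooseResolverHost(null, "dc.example"));
    }
}
```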
>>
>> Conrad Herrmann
>> Primadesk, Inc.
>>
>>> Hi,
>>>
>>> I encountered a NullPointerException similar to
>>> https://lists.samba.org/archive/jcifs/2012-January/009856.html - at
>>> least the stack traces are similar.
>>>
>>> My environment (client side):
>>> - jcifs 1.3.18
>>> - IBM JDK 7
>>> - AIX 7.1
>>>
>>> The NPE occurred in an in-house plugin for the Jenkins build server.
>>> In this system, jCIFS is used to recursively copy files from a Windows
>>> share to an AIX machine.
>>>
>>> Re-running a build shortly after it finished triggered the NPE.
>>>
>>> After some debugging, it seems to me that the SmbFile's underlying
>>> transport is closed (by timeout), and when SmbFile.resolveDfs is called,
>>> the transport is not reconnected (unlike, for example, later in
>>> SmbFile.resolve, or in SmbSession.getChallenge).
>>>
>>> I was able to reproduce the NPE during debugging using the following
>>> steps:
>>> - Trigger a build (recursively copying from a CIFS DFS tree)
>>> - Wait until the transport objects disconnect by timeout (tracked by
>>>   breakpoint)
>>> - Retrigger the build (recursively copying the same directory structure)
>>>
>>> The Jenkins plugin usually runs the second jCIFS copy operation in the
>>> same thread as the first (though that's not guaranteed).
>>> Each run uses a new SmbFile object.
>>>
>>> Am I missing something (like some close operation on SmbFile)?
>>> Is this a known error?
>>> Can I do something to fix it?
>>>
>>> Best regards,
>>>
>>> Martin