[jcifs] NullPointerException in SmbFile.resolveDfs()

Brett Johnson brett.michael.johnson at gmail.com
Tue Nov 20 13:16:12 MST 2012


Here are the thread dumps from the deadlock in Dfs.
As I mentioned, they were nearly identical to those
posted by Xavier Roche.

Excerpt from 
http://code.google.com/p/google-enterprise-connector-file-system/source/detail?r=541
The diffs for the fix are visible from a link at 
the bottom of that page.

These are the deadlocked threads from a customer thread dump:

Thread 28 calls SmbSession.send(), which locks the SmbTransport that it uses.
That SmbTransport then tries to call Dfs.insert() a synchronized method, which
attempts to lock the Dfs instance.

Thread 37 calls Dfs.resolve() a synchronized method, which locks the Dfs
instance.
It then eventually calls SmbTransport.getSmbSession(), which attempts to lock
the SmbTransport instance.

Thread 28: (state = BLOCKED)
 - jcifs.smb.Dfs.insert(java.lang.String, jcifs.smb.DfsReferral) @bci=0,
line=289 (Compiled frame)
 - jcifs.smb.SmbTransport.checkStatus(jcifs.smb.ServerMessageBlock,
jcifs.smb.ServerMessageBlock) @bci=210, line=556 (Compiled frame)
 - jcifs.smb.SmbTransport.send(jcifs.smb.ServerMessageBlock,
jcifs.smb.ServerMessageBlock) @bci=297, line=640 (Compiled frame)
 - jcifs.smb.SmbSession.send(jcifs.smb.ServerMessageBlock,
jcifs.smb.ServerMessageBlock) @bci=135, line=238 (Compiled frame)
 - jcifs.smb.SmbTree.send(jcifs.smb.ServerMessageBlock,
jcifs.smb.ServerMessageBlock) @bci=405, line=119 (Compiled frame)
 - jcifs.smb.SmbFile.send(jcifs.smb.ServerMessageBlock,
jcifs.smb.ServerMessageBlock) @bci=11, line=777 (Compiled frame)
 - jcifs.smb.SmbFile.queryPath(java.lang.String, int) @bci=78, line=1365
(Compiled frame)
 - jcifs.smb.SmbFile.exists() @bci=143, line=1420 (Compiled frame)
 - jcifs.smb.SmbFile.isDirectory() @bci=14, line=1494 (Compiled frame)

Thread 37: (state = BLOCKED)
 - jcifs.smb.SmbTransport.getSmbSession(jcifs.smb.NtlmPasswordAuthentication)
@bci=0, line=126 (Compiled frame)
 - jcifs.smb.SmbTransport.getDfsReferrals(jcifs.smb.NtlmPasswordAuthentication,
java.lang.String, int) @bci=2, line=701 (Compiled frame)
 - jcifs.smb.Dfs.getReferral(jcifs.smb.SmbTransport, java.lang.String,
java.lang.String, java.lang.String, jcifs.smb.NtlmPasswordAuthentication)
@bci=71, line=143 (Interpreted frame)
 - jcifs.smb.Dfs.resolve(java.lang.String, java.lang.String, java.lang.String,
jcifs.smb.NtlmPasswordAuthentication) @bci=132, line=191 (Compiled frame)
 - jcifs.smb.SmbFile.doConnect() @bci=83, line=904 (Compiled frame)
 - jcifs.smb.SmbFile.connect() @bci=19, line=956 (Compiled frame)
 - jcifs.smb.SmbFile.connect0() @bci=1, line=882 (Compiled frame)
 - jcifs.smb.SmbFile.queryPath(java.lang.String, int) @bci=1, line=1337
(Compiled frame)
 - jcifs.smb.SmbFile.exists() @bci=143, line=1420 (Compiled frame)
 - jcifs.smb.SmbFile.isDirectory() @bci=14, line=1494 (Compiled frame)

This change resolves the deadlock by replacing Dfs' synchronized
methods with finer-grained locks on the two caches in Dfs.
Dfs.insert() only needed one of the caches and while Dfs.resolve()
uses both, it does so in two distinct blocks of code.

This change also makes Dfs.isTrustedDomain() thread-safe, where
it was not previously.  It reduces the scope of Dfs.getTrustedDomains()
from public to private, as it could not be make thread-safe without
switching to ConcurrentHashMap.

--
Brett M. Johnson


On Nov 20, 2012, at 7:39 AM, Michael B Allen <ioplex at gmail.com> wrote:

> On Wed, Nov 14, 2012 at 2:09 PM, Brett Johnson
> <brett.michael.johnson at gmail.com> wrote:
>> Mike,
>> 
>> We are using jCIFS v1.3.17.
>> 
>> I searched for "jcifs tconHostName NullPointerException"
>> and did not find it was fixed.  Several people had encountered
>> the problem.  Gabor Herr had a proposed patch that seemed to
>> work for him and another person, but you did not like it, and
>> left it as a TODO to fix it right.
>> 
>> Most users complained of encountering the NPE during listFiles.
>> We encountered it when closing an InputStream.  Our application
>> work-around was to catch the exception and ignore it, since we
>> had consumed the input stream and didn't care.
> 
> Hi Brett,
> 
> Ok, I have added your message as a reference to the TODO list item
> from Gabor Herr (although it's not crystal clear to me that it's the
> same problem).
> 
>> Coincidentally, the google search also returned this page:
>> http://blog.gmane.org/gmane.network.samba.java/month=20120301
>> 
>> We have also encountered this deadlock (with a different customer).
>> Although our app is multi-threaded, we were *not* using the same
>> SmbFile in different threads.  The deadlock occurred when accessing
>> two different files.  I fixed it in a manner nearly identical to
>> that proposed by Xavier Roche, and the customer reports no more
>> deadlock after a week of testing.
> 
> If you find what you think is a deadlock, it is customary to post a thread dump.
> 
> If someone posts "this is what I did and it fixes it so you should
> apply this patch" it will almost certainly not be applied. It will be
> treated simply as a data point to be considered in a real diagnosis. I
> do not blindly apply patches from users (no one should) and it is my
> experience that patches from users are almost always incorrect in some
> way. It might look like it fixes their particular problem and it may
> even be correct in this case but it doesn't prove that it is right. If
> I blindly applied patches from user's, JCIFS would slowly degenerate
> into a game of Whac-A-Mole like every other open source project out
> there.
> 
> Note that Xavier's deadlock post is already on the TODO. If you
> provide a thread dump I will add your post (or a reference to yours if
> it's the same deadlock).
> 
> Mike
> 
>> On Nov 14, 2012, at 7:08 AM, Michael B Allen <ioplex at gmail.com> wrote:
>> 
>>> Hi Brett,
>>> 
>>> I think this has been fixed long ago (Google for "jcifs tconHostName
>>> NullPointerException").
>>> 
>>> What version of JCIFS are you using?
>>> 
>>> Mike
>>> 
>>> On Mon, Oct 15, 2012 at 6:09 PM, Brett Johnson
>>> <brett.michael.johnson at gmail.com> wrote:
>>>> One of our customers have encountered a NullPointerException
>>>> thrown from deep within jCIFS while closing an input stream:
>>>> 
>>>> From a customer log:
>>>> Exception in thread "..." java.lang.NullPointerException
>>>> at jcifs.smb.SmbFile.resolveDfs(SmbFile.java:668)
>>>> at jcifs.smb.SmbFile.send(SmbFile.java:770)
>>>> at jcifs.smb.SmbFile.close(SmbFile.java:1018)
>>>> at jcifs.smb.SmbFile.close(SmbFile.java:1024)
>>>> at jcifs.smb.SmbFile.close(SmbFile.java:1028)
>>>> at jcifs.smb.SmbFileInputStream.close(SmbFileInputStream.java:104)
>>>> at java.io.BufferedInputStream.close(Unknown Source)
>>>> at com.google.enterprise.connector.filesystem.SmbInputStream.close(SmbInputStream.java:64)
>>>> at com.google.enterprise.connector.util.BasicChecksumGenerator.getDigest(BasicChecksumGenerator.java:117)
>>>> at com.google.enterprise.connector.util.BasicChecksumGenerator.getChecksum(BasicChecksumGenerator.java:131)
>>>> at com.google.enterprise.connector.filesystem.FileInfoCache.getChecksum(FileInfoCache.java:59)
>>>> at com.google.enterprise.connector.filesystem.FileDocumentSnapshot$GetUpdateSupport.getUpdate(FileDocumentSnapshot.java:384)
>>>> at com.google.enterprise.connector.filesystem.FileDocumentSnapshot.getUpdate(FileDocumentSnapshot.java:141)
>>>> at com.google.enterprise.connector.filesystem.FileDocumentSnapshot.getUpdate(FileDocumentSnapshot.java:35)
>>>> 
>>>> Looking more closely at SmbFile, the NPE is thrown from this statement:
>>>>  DfsReferral dr = dfs.resolve(
>>>>      tree.session.transport.tconHostName,
>>>>      tree.share,
>>>>      unc,
>>>>      auth);
>>>> 
>>>> I know dfs itself is not null, since it is statically initialized.
>>>> So the suspects are tree, tree.session, tree.session.transport.
>>>> Immediately before this statement is a call to connect0.  Examining
>>>> that code, tree starts out null but I don't think tree can be null after
>>>> a successful call to connect().
>>>> 
>>>> So that leaves tree.session and tree.session.transport.  Since
>>>> session is a constructor arg for SmbTree, that probably is not
>>>> null.  That leaves only SmbSession.transport.  When I read SmbSession,
>>>> transport is initialized lazily and is always fetched via the
>>>> SmbSession.transport() method.  I think that is the problem here.
>>>> I think JCIFS could be patched as follows (note transport() method call):
>>>> 
>>>>       DfsReferral dr = dfs.resolve(
>>>>                   tree.session.transport().tconHostName,
>>>>                   tree.share,
>>>>                   unc,
>>>>                   auth);
>>>> 
>>>> However, even after this, tconHostName is very likely to be null in the
>>>> current failure case, since it is populated by a call to transport.doConnect().
>>>> If domain is null, dfs.resolve() will throw NPE when it dereferences
>>>> domain (which it does several times).  At this point, I don't know
>>>> enough about DFS and JCIFS to know how to fix the bug, so I modified
>>>> com.google.enterprise.connector.filesystem.SmbInputStream.close() to
>>>> catch NullPointerException and rethrow as IOException.
>>>> 
>>>> --
>>>> Brett M. Johnson
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Michael B Allen
>>> Java Active Directory Integration
>>> http://www.ioplex.com/
>> 
> 
> 
> 
> -- 
> Michael B Allen
> Java Active Directory Integration
> http://www.ioplex.com/



More information about the jCIFS mailing list