[jcifs] JCIFS pagination

Tue Oct 28 19:52:13 MDT 2014

On Mon, Oct 27, 2014 at 10:37 AM, Sridhar Jonnalagadda
<jonnalagadda.sridhar at gmail.com> wrote:
> Hi Mike & Chris,
>
> Appreciate your response.
>
> I'm trying to consolidate both of your responses and provide more inputs on
> my requirement.
>
> Based on your feedback, I need to provide more inputs to make it more clear.
>
> Clarifications:
>
> 1. listFiles API will not be changed.
>
> 2. There will be new API(s) on SmbFile, which take a page size as input and
> return an array of that size. This means page size is not specific to my
> needs. This API can be called multiple times to retrieve a page worth of
> details until all elements are retrieved.
>
> 3. Yes, supporting pagination is specific to my application's needs. The
> server is handling requests from mobile devices. Mobile devices are
> requesting to view the contents of an SMB share. Since SMB share volumes keep
> growing and the server needs to handle multiple clients concurrently, there
> should be a way to handle thousands of clients by limiting memory usage. The
> only option for doing this is to support pagination.
>
> 4. Since JCIFS is blocking API, I took care in my application to make it
> look like asynchronous. I was thinking about users out there who might get
> benefit of making it as async library.
>
> 5. For my application sorting is not a concern. Does CIFS / SMB server
> protocol support sorting?
>
> 6. My concern is, let's say the share has 1,000 objects and 1,000 devices
> are trying to use the API, then all of them together might be using a
> significant amount of memory.
>
>
> I changed SmbFile to have three new public APIs. These APIs help me to
> provide iteration behavior and keep memory usage at minimum possible.
>
> 1.  public SmbFile[] getFirstPageContents(final SmbFileFilter
> fileFilter,final SmbFilenameFilter fileNameFilter, int pageSize) throws
> SmbException
> 2.  public SmbFile[] getNextPageContents() throws SmbException
> 3.  public void endPaginationRequest()
>
>
> There are a couple of additional constructors.
>
> 1.  Trans2FindFirst2( String filename, String wildcard, int
> searchAttributes, int pageSize )
> 2. Trans2FindNext2(final  int sid, final int pageSize, final int resumeKey,
> final String filename )

Hi Sridhar,

I don't think that would work. The Trans2Find{First,Next}2 commands
might look like they can be used for "paging" as you describe but that
is not the intent. The intent of those commands is simply to buffer
chunks of directory entries efficiently (because the disk is slow
compared to the client code). You cannot just leave the list handle
open for a long time and then expect to retrieve another "page" 10
minutes later. The connection would get closed, the list handle would
be invalidated because objects changed, the server would hang as it
struggles to resurrect the old list handle, etc. And you cannot use
the resumeKey to initiate a new request with an offset > 0. That
resumeKey has to be the resumeKey supplied in the previous command.

The only vaguely practical way to implement what you describe is to
just list everything and collect the desired segment of files using
FileFilter.accept(). And the implementation would be trivial. You
would just count and return false until you reach the start of the
"page", then return true until you reach the end of the page and then
return false for everything else. Then SmbFile.listFiles() will return
the desired "page" of SmbFile[]. It's so mind-numbingly elegant I
honestly don't know why you would want to do it any other way.
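
The counting filter described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: EntryFilter,
PageFilter and list() are hypothetical stand-ins written for this
example, not the JCIFS API, so that it runs without a server. With JCIFS
the same counting logic would go in the accept() method of whichever
filter you pass to the listing call.

```java
import java.util.ArrayList;
import java.util.List;

class PageFilterSketch {

    /** Hypothetical stand-in for a filter callback; NOT the real JCIFS interface. */
    interface EntryFilter {
        boolean accept(String name);
    }

    /** Accepts only entries whose position falls in [start, start + pageSize). */
    static final class PageFilter implements EntryFilter {
        private final int start;
        private final int end;
        private int seen = 0; // number of entries offered to this filter so far

        PageFilter(int pageIndex, int pageSize) {
            this.start = pageIndex * pageSize;
            this.end = start + pageSize;
        }

        @Override
        public boolean accept(String name) {
            int position = seen++;
            // false before the page, true inside it, false after it
            return position >= start && position < end;
        }
    }

    /** Stand-in for a full directory listing that consults the filter per entry. */
    static List<String> list(List<String> allEntries, EntryFilter filter) {
        List<String> accepted = new ArrayList<>();
        for (String name : allEntries) {
            if (filter.accept(name)) {
                accepted.add(name); // only accepted entries are kept
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> entries = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            entries.add("file" + i + ".txt");
        }
        // Second page (index 1) with 3 entries per page
        System.out.println(list(entries, new PageFilter(1, 3)));
        // prints [file3.txt, file4.txt, file5.txt]
    }
}
```

The filter is stateful, so a new instance is needed for every listing.
The sketch also assumes the entries come back in the same order each
time; if the directory changes between requests, page boundaries will
shift.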

You should also realize that the network and the server's disk are
going to be a lot slower than anything JCIFS is doing. Meaning JCIFS
is going to spend a lot of time just waiting for
Trans2Find{First,Next}2 responses. So your optimization would have
almost no practical effect.

But returning false from FileFilter.accept() would save memory because
JCIFS will not convert the Trans2Find{First,Next}2 response data to an
SmbFile when accept() returns false. Creating objects in Java is
expensive. So if you want to save memory because many mobile devices
are "paging" through objects, using FileFilter.accept() would be
worthwhile.

Unlike a lot of Java code, JCIFS is very efficient. Share a drive with
lots of files and run the multi-threaded crawler example against it.
Run it a few times so that the server cache is hot. I bet you could
list every object on the whole machine in under 30 seconds. When I
wrote that threaded crawler example (10 years ago), I listed every
object on my NT4 workstation in 10 seconds.

> At present I'm testing the changes and would like to contribute this change.
> Please let me know the process.

Create a new package like jcifs-paging-1.3.17.zip and put it on github
and then post a link here. Linux style development where everyone just
pushes their own complete stand-alone package that is ready-to-run is
vastly superior to mucking about with patches and version control
systems. If what you have done is really good, people will use your
package. Then maybe we'll have to consider what you have done.

Mike