[jcifs] Re: directory must end with '/'

Michael B Allen mba2000 at ioplex.com
Thu Dec 9 08:35:32 GMT 2004


On Thu, 9 Dec 2004 00:43:37 -0600
"Christopher R. Hertel" <crh at ubiqx.mn.org> wrote:

> The RFC's grammar includes the following:
> 
>   path_segments = segment *( "/" segment )
> 
> The thing to notice (the point that got this conversation going) is that 
> the slashes are at the *front* of the authority and of each segment.  
> Syntactically, that means that neither the authority section (the 
> user@hostname:port part) nor the path segments (the directory names) 
> require a trailing slash.

No, no, no, no, NO! The grammar does not apply to a fragment of a URL. The
grammar in 2396 and in your draft is correct. It just doesn't apply to an
incomplete URL. A complete URL is:

smb://server/share/path/file

The 2396 grammar is completely in line with this. The following is NOT a
valid 2396 URL:

smb://server/share/path/

It is merely a fragment of a URL, but it so happens that some applications
operate on URL fragments. JCIFS interprets such URLs as directories.

> I have a lot of comments inline below, but the results are...
> 
> 1) Note that the segment is made up of zero or more pchars followed by 
>    zero or more (";" param)'s.  That means... a segment can be empty.
>    The significance is that trailing slashes are permitted by the syntax.
>    That's important, because (as we all know) they're used all over the 
>    place.

Ok.

> 2) Web browsers commonly add a trailing slash after the authority section 
>    or a directory name if there isn't one there already.  This is the 
>    quirk we're trying to work out, I believe.  I don't really care whether
>    jCIFS does this or makes it easy for the calling application to do
>    this.  The point is, it needs to be done.  I *do* care that the user of
>    an application built on jCIFS shouldn't have to do this manually
>    (example programs don't count).

Negative. Web browsers do not add a trailing slash after anything. What you
see is the URL the server handed back in an HTTP redirect response. The
browser is simply displaying what it was given.

> 3) The point of adding the "missing" trailing slash appears to be twofold.
>    First, it deals with an oddity and/or ambiguity in RFC2396.  The RFC 
>    doesn't clearly explain how to deal with an authority section that
>    doesn't have a trailing slash.  Also, the RFC is dealing with URI 
>    strings in very general terms and doesn't distinguish between 
>    directories and files, so if there is no trailing slash it instructs 
>    us to remove the final segment.  That would give a very unexpected 
>    result, so that's not what we want to do.

Ok.

> > Web browsers
> > will interpret an HTTP redirect and display the URL to which the browser
> > has been redirected thus providing the appearance that a trailing slash
> > has been "added".
> 
> Well, the trailing slash *was* added... by the server in this case.  It 
> sends back the modified URI and the client retries without ever annoying 
> the user.

Not really. The server said "Error: the page you requested has been
permanently moved to <new URL>".

> I don't know why the trailing slash is required by HTTP.  Perhaps it is
> something in the HTTP specifications.  The SMB URI specifications are

Because if you didn't and you navigated to http://server/path/dir and
clicked on a link <a href="page.html">, you would get
http://server/path/page.html, whereas if you had http://server/path/dir/
you'll get what you really want, which is http://server/path/dir/page.html.
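
A minimal sketch of that relative-link behavior, using java.net.URI (which
resolves references by the RFC 2396 rules). The class and method names are
made up for illustration; only URI.create() and URI.resolve() are real API.

```java
import java.net.URI;

class RelativeLinkDemo {
    // Resolve a relative link against a base URL, as a browser would
    // when following <a href="page.html"> from the page at 'base'.
    static String resolve(String base, String link) {
        return URI.create(base).resolve(link).toString();
    }

    public static void main(String[] args) {
        // No trailing slash: the last segment "dir" is dropped first.
        System.out.println(resolve("http://server/path/dir", "page.html"));
        // prints http://server/path/page.html

        // Trailing slash: "dir" is kept as the parent of the link.
        System.out.println(resolve("http://server/path/dir/", "page.html"));
        // prints http://server/path/dir/page.html
    }
}
```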

> built from the RFCs mentioned above, and those RFCs clearly show leading
> slashes, not trailing slashes, in the grammar and in some of the examples.

Again, the grammar is only for complete URLs. This is the key thing to
understand.

> does it.  I think the only area where Mike and I disagree on this is in 
> how it gets handled by jCIFS.)

No, we disagree on this:

smb://[server/[share/[path/[file.txt]]]]
vs
smb://[server[/share[/path[/file.txt]]]]

which is what I thought we were *really* talking about. This representation
(that we made up) is intuitive but it is NOT a complete grammar and
therefore cannot be compared to the 2396 grammar. It is a condensed bastard
grammar that just shows optional parts of a URL that, if left out, yield a
parent fragment.

> > and the
> > client cannot unconditionally mask the redirect response because the
> > caller most likely needs to know where it has been redirected.
> 
> What, in this case, constitutes the caller?  Sorry, you were talking HTTP 
> here and I got a little lost...

The caller would be the web browser. I was just pointing out that the HTTP
client cannot transparently reinitiate the GET request to the new URL. That's
basically what you are suggesting I do with jCIFS and I want to make it
clear that HTTP does not exhibit that behavior.

> > > That in mind, I still feel that jCIFS could easily handle the semantic
> > > issues just as it handles the semantic differences between a server
> > > and a workgroup.
> > 
> > Actually the more I think about this it's totally impractical to do this
> > with JCIFS because URLs are immutable and are not resolved until they
> > are used.
> 
> One part that I was missing here, and have since figured out, is that 
> you're not even going over the wire in some cases.  For example, if I try 
> to .list() on an SmbFile that is based on an SMB URI that has no trailing 
> slash... then I get the exception before any network activity occurs at 
> all.

Well, that's because I know list() only applies to a directory, so it's a
good place to check for the '/'. If you try to do exists(), it's not possible
to tell without going to the wire.
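
Roughly, the kind of local check being described could be sketched like
this. This is NOT the jCIFS source; the class and helper names are
hypothetical, and the message text is simply borrowed from this thread's
subject line.

```java
class DirectoryUrlCheck {
    // Hypothetical helper: reject a URL that cannot name a directory
    // under the trailing-slash convention. Purely a string test, so no
    // network traffic is needed to fail.
    static void requireTrailingSlash(String url) {
        if (!url.endsWith("/")) {
            throw new IllegalArgumentException(
                url + " directory must end with '/'");
        }
    }

    public static void main(String[] args) {
        requireTrailingSlash("smb://server/share/path/"); // passes
        try {
            requireTrailingSlash("smb://server/share/path");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
            // prints smb://server/share/path directory must end with '/'
        }
    }
}
```

An exists()-style call has no such shortcut: a name without a trailing slash
could still be a file, so only the server can answer.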

> Makes sense.  If you assign semantic meaning to a trailing slash then the
> lack of the trailing slash would indicate a file, not a directory (or
> would indicate "ambiguous").  The .list() method isn't defined for a 
> non-directory, so I see why it throws an exception.

But again, the exception is not thrown universally, which is a bug in itself.
It should either be thrown consistently or not at all. Unfortunately neither
of those cases is acceptable.

> Perhaps the most important thing to do is to explain to coders using jCIFS
> exactly what is going on and why jCIFS behaves as it does.  To anyone 
> who's used a web browser for more than a week, modifications to the URI in
> their Location bar seem natural.  It's a surprise to get an error message 
> that tells you you've got to be picky about things like adding a trailing 
> slash.

JCIFS is a low-level client library. A low-level HTTP client library would
behave the same way (it would not automatically "add a trailing slash").

> > is a non-standard permutation of command line option definitions that
> > was made up by someone during the original SMB URL discussion on
> > samba.technical. It is not a real grammar and cannot be compared to the
> > said section in 2396.
> 
> Well, it can be.  It's a common short-hand, that's all.  In any case, 
> there is an RFC 2396-compliant grammar provided in the current Internet 
> Draft, and that grammar follows the convention and uses leading slashes, 
> not following slashes.

The RFC 2396 grammar is correct. But it only applies to complete URLs that
refer to leaf nodes. It does not apply to parent nodes (addressed by a parent
fragment of the complete URL). So the trailing slash is optional, in which
case both:

smb://[server/[share/[path/[file.txt]]]]
and
smb://[server[/share[/path[/file.txt]]]]

are legal, but which one makes more sense? Would you rather encourage users
to write:

smb://server/share/path
or
smb://server/share/path/

?

>   smb://foo.bar.biz/share  +  path/to/file
> 
> That makes it a little clearer.  The expected result is:
>   smb://foo.bar.biz/share/path/to/file
> 
> What you actually get (per RFC 2396's algorithms) is:
>   smb://foo.bar.biz/path/to/file

Right. And this is exactly what jcifs (actually the java.net.URL class)
would do.
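
For what it's worth, the same result can be reproduced without jCIFS. This
sketch uses java.net.URI as a stand-in, on the assumption that it applies the
same RFC 2396 resolution; java.net.URL itself would need an smb protocol
handler registered before it would accept these strings. The class and method
names are invented for the example.

```java
import java.net.URI;

class SmbResolveDemo {
    // Apply RFC 2396 relative resolution to an smb:// base.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Base without trailing slash: "share" is treated as a leaf
        // and is replaced by the relative path.
        System.out.println(resolve("smb://foo.bar.biz/share", "path/to/file"));
        // prints smb://foo.bar.biz/path/to/file

        // Base with trailing slash: "share" is kept as the parent.
        System.out.println(resolve("smb://foo.bar.biz/share/", "path/to/file"));
        // prints smb://foo.bar.biz/share/path/to/file
    }
}
```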

> So...   (Hoping you've stuck with me this far without blowing a gasket...)

I'm not mad, I'm just pulling my hair out trying to get you to see that our
goofy condensed syntactopath cannot be compared to the "path_segments =
segment *( "/" segment )" part in 2396.

> ...and we reach agreement.  Kewl.

Phew.

-- 
Greedo shoots first? Not in my Star Wars.

