[jcifs] Re: directory must end with '/'
Michael B Allen
mba2000 at ioplex.com
Thu Dec 9 08:35:32 GMT 2004
On Thu, 9 Dec 2004 00:43:37 -0600
"Christopher R. Hertel" <crh at ubiqx.mn.org> wrote:
> The RFC's grammar includes the following:
> path_segments = segment *( "/" segment )
> The thing to notice (the point that got this conversation going) is that
> the slashes are at the *front* of the authority and of each segment.
> Syntactically, that means that neither the authority section (the
> user@hostname:port part) nor the path segments (the directory names)
> require a trailing slash.
No, no, no, no, NO! The grammar does not apply to a fragment of a URL. The
grammar in 2396 and in your draft is correct. It just doesn't apply to an
incomplete URL. A complete URL is:
The 2396 grammar is completely in line with this. The following is NOT a
valid 2396 URL:
It is merely a fragment of a URL, but it so happens that some applications
operate on URL fragments. JCIFS interprets such URLs as a directory.
> I have a lot of comments inline below, but the results are...
> 1) Note that the segment is made up of zero or more pchars followed by
> zero or more (";" param)'s. That means... a segment can be empty.
> The significance is that trailing slashes are permitted by the syntax.
> That's important, because (as we all know) they're used all over the
> 2) Web browsers commonly add a trailing slash after the authority section
> or a directory name if there isn't one there already. This is the
> quirk we're trying to work out, I believe. I don't really care whether
> jCIFS does this or makes it easy for the calling application to do
> this. The point is, it needs to be done. I *do* care that the user of
> an application built on jCIFS shouldn't have to do this manually
> (example programs don't count).
Negative. Web browsers do not add a trailing slash after anything. What you
see is a URL given to the browser in an HTTP redirect response. The browser
is simply displaying what it was given.
> 3) The point of adding the "missing" trailing slash appears to be twofold.
> First, it deals with an oddity and/or ambiguity in RFC2396. The RFC
> doesn't clearly explain how to deal with an authority section that
> doesn't have a trailing slash. Also, the RFC is dealing with URI
> strings in very general terms and doesn't distinguish between
> directories and files, so if there is no trailing slash it instructs
> us to remove the final segment. That would give a very unexpected
> result, so that's not what we want to do.
> > Web browsers
> > will interpret an HTTP redirect and display the URL to which the browser
> > has been redirected thus providing the appearance that a trailing slash
> > has been "added".
> Well, the trailing slash *was* added... by the server in this case. It
> sends back the modified URI and the client retries without ever annoying
> the user.
Not really. The server said "Error: the page you requested has been
permanently moved to <new URL>".
> I don't know why the trailing slash is required by HTTP. Perhaps it is
> something in the HTTP specifications. The SMB URI specifications are
Because without it, if you navigated to http://server/path/dir and
clicked on a link <a href="page.html">, you would get
http://server/path/page.html, whereas if you had http://server/path/dir/
you'd get what you really want, which is http://server/path/dir/page.html.
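
A minimal sketch of the relative-link resolution described above, using the
standard java.net.URI class. The class name TrailingSlashDemo and the helper
resolve() are made up for illustration; only URI.create() and URI.resolve()
are real library calls.

```java
import java.net.URI;

// Illustrative only: shows how a relative reference resolves against a
// base URL with and without a trailing slash.
class TrailingSlashDemo {
    // Hypothetical helper: resolve a relative reference against a base URL.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        // No trailing slash: the last segment ("dir") is dropped before merging.
        System.out.println(resolve("http://server/path/dir", "page.html"));
        // prints http://server/path/page.html

        // Trailing slash: "dir/" is kept as the directory part of the base.
        System.out.println(resolve("http://server/path/dir/", "page.html"));
        // prints http://server/path/dir/page.html
    }
}
```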
> built from the RFCs mentioned above, and those RFCs clearly show leading
> slashes, not trailing slashes, in the grammar and in some of the examples.
Again, the grammar is only for complete URLs. This is the key thing to
> does it. I think the only area where Mike and I disagree on this is in
> how it gets handled by jCIFS.)
No, we disagree on this:
which is what I thought we were *really* talking about. This representation
(that we made up) is intuitive but it is NOT a complete grammar and
therefore cannot be compared to the 2396 grammar. It is a condensed bastard
grammar that just shows optional parts of a URL that if left out yield a
> > and the
> > client cannot unconditionally mask the redirect response because the
> > caller most likely needs to know where it has been redirected.
> What, in this case, constitutes the caller? Sorry, you were talking HTTP
> here and I got a little lost...
The caller would be the web browser. I was just pointing out that the HTTP
client cannot transparently reinitiate the GET request to the new URL. That's
basically what you are suggesting I do with jCIFS and I want to make it
clear that HTTP does not exhibit that behavior.
> > > That in mind, I still feel that jCIFS could easily handle the semantic
> > > issues just as it handles the semantic differences between a server
> > > and a workgroup.
> > Actually the more I think about this it's totally impractical to do this
> > with JCIFS because URLs are immutable and are not resolved until they
> > are used.
> One part that I was missing here, and have since figured out, is that
> you're not even going over the wire in some cases. For example, if I try
> to .list() on an SmbFile that is based on an SMB URI that has no trailing
> slash... then I get the exception before any network activity occurs at
Well that's because I know list() only applies to a directory so it's a good
place to check for the '/'. If you try to do exists() it's not possible to
tell without going to the wire.
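
To illustrate the distinction, here is a rough sketch of a purely syntactic
check that a list()-style method can make before any network activity. The
Resource class below is a hypothetical stand-in and is NOT the jCIFS SmbFile
implementation; the exception type and message are assumptions as well.

```java
import java.net.URI;

class DirCheckSketch {
    // Hypothetical stand-in for a remote resource; not the real SmbFile.
    static final class Resource {
        private final URI uri;

        Resource(String url) {
            this.uri = URI.create(url);
        }

        // Purely syntactic test on the URL text; nothing goes over the wire.
        boolean looksLikeDirectory() {
            String path = uri.getPath();
            return path != null && path.endsWith("/");
        }

        // list() only applies to a directory, so it can reject a URL
        // without a trailing slash before contacting the server.
        String[] list() {
            if (!looksLikeDirectory()) {
                throw new IllegalArgumentException(
                        uri + ": directory must end with '/'");
            }
            // Placeholder: a real client would query the server here.
            return new String[0];
        }
    }

    public static void main(String[] args) {
        System.out.println(new Resource("smb://server/share/").list().length);
        try {
            new Resource("smb://server/share").list();
        } catch (IllegalArgumentException e) {
            System.out.println("rejected before any network activity");
        }
    }
}
```

An exists()-style method has no such shortcut: the URL text alone does not
say whether the target is there, so the answer has to come from the server.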
> Makes sense. If you assign semantic meaning to a trailing slash then the
> lack of the trailing slash would indicate a file, not a directory (or
> would indicate "ambiguous"). The .list() method isn't defined for a
> non-directory, so I see why it throws an exception.
But again the exception is not thrown universally, which is a bug in itself.
It should either be thrown consistently or not at all. Unfortunately neither
of those cases is acceptable.
> Perhaps the most important thing to do is to explain to coders using jCIFS
> exactly what is going on and why jCIFS behaves as it does. To anyone
> who's used a web browser for more than a week, modifications to the URI in
> their Location bar seem natural. It's a surprise to get an error message
> that tells you you've got to be picky about things like adding a trailing
JCIFS is a low-level client library. A low-level HTTP client library would
behave the same (not automatically "add a trailing slash").
> > is a non-standard permutation of command line option definitions that
> > was made up by someone during the original SMB URL discussion on
> > samba.technical. It is not a real grammar and cannot be compared to the
> > said section in 2396.
> Well, it can be. It's a common short-hand, that's all. In any case,
> there is an RFC 2396-compliant grammar provided in the current Internet
> Draft, and that grammar follows the convention and uses leading slashes,
> not following slashes.
The RFC 2396 grammar is correct. But it only applies to complete URLs that
refer to leaf nodes. Parent nodes (addressed by a parent fragment of the
complete URL) do not apply. So the trailing slash is optional in which case
are legal but which one makes more sense? Would you rather encourage users
> smb:/foo.bar.biz/share + path/to/file
> That makes it a little clearer. The expected result is:
> What you actually get (per RFC 2396's algorithms) is:
Right. And this is exactly what jcifs (actually the java.net.URL class)
> So... (Hoping you've stuck with me this far without blowing a gasket...)
I'm not mad, I'm just pulling my hair out trying to get you to see that our
goofy condensed syntactopath cannot be compared to the "path_segments =
segment *( "/" segment )" part in 2396.
> ...and we reach agreement. Kewl.
Greedo shoots first? Not in my Star Wars.