[jcifs] Re: SMB URL encoding/decoding
James Nord
teilo at teilo.net
Sun Feb 24 13:28:55 EST 2002
Christopher R. Hertel wrote:
>Michael B Allen wrote:
>
>>On Sat, 23 Feb 2002 15:10:11 -0600
>>"Christopher R. Hertel" <crh at ubiqx.org> wrote:
>>
>>>Well, the first comment is that the SMB URL needs to meet the
>>>requirements of a URL string as described in RFC 2396. There are
>>>two reasons for this. The first is that IETF will never accept it if
>>>it doesn't, and the second is
>>>
>>Why wouldn't they if it followed the ftp RFC James is referencing?
>>
>
>I didn't say it wouldn't. I was speaking generally.
>
>>>that most of the browsers out there have generic URL parsers. Java
>>>also has a generic URL parser, IIRC.
>>>
>>But it's very limited. You have to pretty much override it entirely with
>>the exception of the most trivial of URLs. It only decodes the host,
>>port, file (path), and ref components and it will certainly fail to do
>>even that with anything that has arbitrary characters in the password
>>of some auth info.
>>
>> http://java.sun.com/products/jdk/1.2/docs/api/java/net/URL.html
>>
>
>Which arbitrary characters? This is where precedence comes in. If you
>parse based on the '/' character first (and escape any '/' characters in the
><server> string) then you will never get confused by '@'s elsewhere in the
>URL.
>
>I will try to get time to look at the link above.
>
>>Perhaps we could just drop the authentication credentials and pass them
>>as QUERY_STRING parameters instead.
>>
>
>No, because the generic URL/URI syntax already supports the use of
>credentials as part of the <server> field. RFC2396:
>
>
>3.2.2. Server-based Naming Authority
>
> URL schemes that involve the direct use of an IP-based protocol to a
> specified server on the Internet use a common syntax for the server
> component of the URI's scheme-specific data:
>
> <userinfo>@<host>:<port>
>
> where <userinfo> may consist of a user name and, optionally, scheme-
> specific information about how to gain authorization to access the
> server. The parts "<userinfo>@" and ":<port>" may be omitted.
>
> server = [ [ userinfo "@" ] hostport ]
>
> The user information, if present, is followed by a commercial at-sign
> "@".
>
> userinfo = *( unreserved | escaped |
> ";" | ":" | "&" | "=" | "+" | "$" | "," )
>
> Some URL schemes use the format "user:password" in the userinfo
> field. This practice is NOT RECOMMENDED, because the passing of
> authentication information in clear text (such as URI) has proven to
> be a security risk in almost every case where it has been used.
>
>
>Now, as you point out, the RFC does not recommend the inclusion of the
>password field. I agree (and put notes about it in the draft) but I
>seriously doubt that the commercial implementors would be willing to drop
>the password string.
>
And I don't think it should be dropped either. It makes things nicer for
spiders: they can take a URL and fetch the file with it, rather than
getting a URL and then having to go and get the password as well.
>>Let's not lose focus on the real problem here though. SMB servers
>>allow the '@' sign but the various URL RFCs do not. If we can circumvent
>>this one issue it will not be necessary to URL encode/decode anything.
>>
>
>The '@' sign has special meaning in the <server> field. Outside of the
>server field, it does not have meaning. More below...
>
Exactly
>
>>>As with most parsers, there is an operator hierarchy. I *think* that
>>>the '/' character is higher precedence than the '@' but I am not sure,
>>>as I haven't looked at it recently.
>>>
>>I don't see how assigning precedence helps.
>>
>
>Consider:
>
>ftp://user:pass@foo.com/some/dir/im@home/text.txt
>
>If the '/' has precedence over '@' then the parse tree is (roughly):
>
>                          ["://"]
>                          /     \
>             [scheme: "FTP"]    ['/']
>                                /   \
> [server: "user:pass@foo.com"]       [abs_path: "some/dir/im@home/text.txt"]
>
>
>The <abs_path> then parses into <path_segments> and the '@' in "im@home"
>is protected. There is no way to confuse it with the '@' in the <server>
>field.
>
>I did short-cut some of the syntax in the above, but the RFC makes it pretty
>clear. The '/' has higher precedence than the '@', so the <server> field is
>separated from the <path> before you even look for the '@'.
>
No, I don't follow the reasoning here (although the outcome is the same).
Neither '/' nor '@' has precedence over anything. It is just that this is
the only legal combination.
It is not legal to have ftp://user:pass@foo@foo:21/some/thing/else@/here
as we are only allowed zero or one '@' between the first two '/'s and the
next '/', and if that '@' appears we are only allowed zero or one ':'
between the second '/' and the first '@'.
Maybe it is my interpretation of the word precedence (to mean rank
higher) which is getting in my way, but I would not say that '/' comes
before '@'. To say that implies (to me at least) that you could have
ftp://user:pass@foo@foo:21/some/thing/else@/here
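
A minimal Java sketch of that rule, as read above: take everything between
the "//" and the next '/' as the server field, then allow at most one '@' in
it and at most one ':' before that '@'. The names ServerFieldCheck,
serverField and isLegalServerField are made up for illustration. This is not
jCIFS code, and it checks only this one rule rather than the whole of RFC 2396.

```java
class ServerFieldCheck {

    // Returns the text between "://" and the next '/' (or the end of the string).
    static String serverField(String url) {
        int start = url.indexOf("://");
        if (start < 0) {
            throw new IllegalArgumentException("no \"://\" in " + url);
        }
        start += 3;
        int end = url.indexOf('/', start);
        return end < 0 ? url.substring(start) : url.substring(start, end);
    }

    // Counts how many times c occurs in s.
    static int count(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    // Hypothetical check: zero or one '@' in the server field and, if there
    // is one, zero or one ':' in the userinfo in front of it.
    // Assumption: the one-colon limit models the "user:password" convention;
    // the generic userinfo grammar in RFC 2396 itself lists ':' as an
    // ordinary allowed character.
    static boolean isLegalServerField(String url) {
        String server = serverField(url);
        int ats = count(server, '@');
        if (ats > 1) {
            return false;
        }
        if (ats == 1) {
            String userinfo = server.substring(0, server.indexOf('@'));
            return count(userinfo, ':') <= 1;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalServerField(
            "ftp://user:pass@foo.com/some/dir/im@home/text.txt"));
        System.out.println(isLegalServerField(
            "ftp://user:pass@foo@foo:21/some/thing/else@/here"));
    }
}
```

The first URL is accepted and the second rejected. Note that the '@'
characters in the path are never examined, because the server field has
already been cut off at the first '/' after the "//".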
>
>Am I making any sense?
>
I think we may be arguing the same thing, but from the earlier posts I
thought we were slightly different.
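
Whichever way it is described, the split of the example URL comes out the
same. As a quick cross-check, here is a sketch that assumes a JDK with
java.net.URI (the class added in J2SE 1.4, which parses according to
RFC 2396). GenericParseDemo and split are made-up names.

```java
import java.net.URI;

class GenericParseDemo {

    // Returns {userinfo, host, path} as split by the generic java.net.URI parser.
    static String[] split(String url) {
        URI uri = URI.create(url);
        return new String[] { uri.getUserInfo(), uri.getHost(), uri.getPath() };
    }

    public static void main(String[] args) {
        String[] parts = split("ftp://user:pass@foo.com/some/dir/im@home/text.txt");
        System.out.println("userinfo = " + parts[0]);
        System.out.println("host     = " + parts[1]);
        System.out.println("path     = " + parts[2]);
    }
}
```

This reports "user:pass" as the userinfo, "foo.com" as the host and
"/some/dir/im@home/text.txt" as the path, so the '@' in the path is left
alone.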