[jcifs] Re: SMB URL encoding/decoding

Sun Feb 24 13:45:54 EST 2002

Christopher R. Hertel wrote:

>James Nord wrote:
>:
>
>>>Perhaps we could just drop the authentication credentials and pass them
>>>as QUERY_STRING parameters instead.
>>>
>>Oh god I hope not...  All the popular URLs have user:pass@host so can we
>>keep to convention?
>>
>
>We have to.  With the exception of the ":pass", that's the syntax of a
>URL.  The <userinfo> field is commonly parsed as "user:password" (again,
>RFC2396, section 3.2.2).  We could add a query string that supported
>passwords, but we couldn't get rid of the <userinfo> field and just about
>every commercial implementation will parse it into "user:password".
>
>>Just because the tool is there doesn't mean you have to use it...
>>
>
>...and again, the vulnerability is in the publication of the URL reference.
>For example, if I built a web page with the reference:
>
<snip>

I was referring to the tool as implemented, as opposed to the tool as
described in the URL draft.
I should have been more exact.

>
>>>Let's not lose the focus of the real problem here though. SMB servers
>>>allow the '@' sign but the various URL RFCs do not.
>>>
>>As do HTTP servers - FTP servers...  why is the smb url the special case?
>>
>
>It shouldn't be.  The SMB URL should be parsed as any other URL string, and
>then the resulting subfields should be parsed again in the context of
>the requirements of SMB.
>
See the mail that started this thread.  jCIFS, then, has a URL parsing bug.
I have been mixing up the implementation with the drafts and confusing
myself (and others?).

URL is smb://HI137/D$/Documents and Settings/ryar/seti@home.txt
java.net.UnknownHostException: home.txt
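
A minimal sketch of how a failure like this can arise, assuming the file
name in the URL above is really seti@home.txt (the archive appears to have
rewritten '@' as " at ") and assuming the parser looks for '@' across the
whole string. I have not checked the jCIFS source; the class and method
names below are made up for illustration, and ports are ignored.

```java
// Hypothetical illustration only; not taken from the jCIFS source.
// Both methods assume the input contains "://".
class HostExtraction {

    // Buggy variant: treats the last '@' anywhere in the URL as the end
    // of <userinfo>, so an '@' in the path is mistaken for the separator.
    static String naiveHost(String url) {
        String rest = url.substring(url.indexOf("://") + 3);
        int at = rest.lastIndexOf('@');
        if (at >= 0) {
            rest = rest.substring(at + 1);
        }
        int slash = rest.indexOf('/');
        return slash >= 0 ? rest.substring(0, slash) : rest;
    }

    // Safer variant: cut the authority off at the first '/' and only
    // then look for '@' inside it.
    static String authorityHost(String url) {
        String rest = url.substring(url.indexOf("://") + 3);
        int slash = rest.indexOf('/');
        String authority = slash >= 0 ? rest.substring(0, slash) : rest;
        int at = authority.lastIndexOf('@');
        return at >= 0 ? authority.substring(at + 1) : authority;
    }
}
```

For the URL above, naiveHost returns "home.txt", which matches the
exception, while authorityHost returns "HI137".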

>So, once we have parsed the URL using the RFC-standard parsing, we might
>have to do our own work to parse the <userinfo> into <username> and
><password>, and further parse <username> to extract an NTDomain name
>(which is an additional level of parsing provided by the SMB URL because
>of the special needs of SMB).
>
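
The second-stage parse described above might look roughly like this. It is
a sketch only: the UserInfo class is hypothetical, and the use of ';' to
separate the NT domain from the user name is an assumption about the SMB
URL syntax, not something stated in this thread.

```java
// Hypothetical helper; the ';' domain separator is an assumption.
class UserInfo {
    final String domain;   // null when no domain was given
    final String username;
    final String password; // null when no password was given

    UserInfo(String domain, String username, String password) {
        this.domain = domain;
        this.username = username;
        this.password = password;
    }

    // Splits an already-extracted <userinfo> field of the assumed form
    // [domain;]username[:password].
    static UserInfo parse(String userinfo) {
        String user = userinfo;
        String password = null;

        // The first ':' ends the user part; later colons stay in the password.
        int colon = userinfo.indexOf(':');
        if (colon >= 0) {
            user = userinfo.substring(0, colon);
            password = userinfo.substring(colon + 1);
        }

        // Only the user part is searched for the domain separator.
        String domain = null;
        int semi = user.indexOf(';');
        if (semi >= 0) {
            domain = user.substring(0, semi);
            user = user.substring(semi + 1);
        }
        return new UserInfo(domain, user, password);
    }
}
```

The point of the ordering is that this code never sees the host or path,
so an '@' or '/' elsewhere in the URL cannot affect it.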
>>>If we can circumvent
>>>this one issue it will not be necessary to URL encode/decode anything.
>>>
>>>>As with most parsers, there is an operator hierarchy.  I *think* that the
>>>>'/' character is higher precedence than the '@' but I am not sure, as I
>>>>haven't looked at it recently.
>>>>
>>>I don't see how assigning precedence helps.
>>>
>>ditto.  I think it just makes things more complex.
>>
>
>Quite the opposite, I believe.
>
>
>Unlike C or Java, you can't use parentheses to group subexpressions in a
>URL.  Going back to programming, what does  5 + 4 * 3 mean?  Because of
>operator precedence, you know that it means (5 + (4 * 3)) rather than
>((5 + 4) * 3).
>
>Likewise with URLs.  The separators are assigned precedence so that you know
>that <scheme>://<server>/<path> breaks into <scheme>, <server>, and <path>. 
>Then you also know that the '@' is significant in <server> but not in the
>others, which solves the problem.
>
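
As a sketch of the precedence argument above, the JDK's java.net.URI class
(an RFC 2396 parser) can be used to split a URL into its top-level fields
before any SMB-specific parsing. The example URL and the UriPrecedence
class are made up for illustration. Note that URI.create rejects unescaped
spaces, so the path here has none.

```java
import java.net.URI;

// Hypothetical wrapper around the standard java.net.URI parser.
class UriPrecedence {

    // Returns { scheme, userinfo, host, path }; userinfo is null if absent.
    // URI.create throws IllegalArgumentException on a malformed URL.
    static String[] split(String url) {
        URI uri = URI.create(url);
        return new String[] {
            uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPath()
        };
    }

    public static void main(String[] args) {
        for (String field : split("smb://user:pass@HI137/D$/seti@home.txt")) {
            System.out.println(field);
        }
    }
}
```

Here the first '/' after the authority ends the <server> part, so the '@'
before it separates "user:pass" from "HI137", while the '@' in
"/D$/seti@home.txt" is left alone as ordinary path data.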
I think (see my other email) that I am just taking the word "precedence" 
to mean something slightly different in the overall context - I agree 
with you but see comments in the other mail ;-)

/James