[jcifs] Re: SMB URL encoding/decoding

Sun Feb 24 13:28:04 EST 2002

James Nord wrote:
:
> >Perhaps we could just drop the authentication credentials and pass them
> >as QUERY_STRING parameters instead.
> >
> Oh god I hope not...  All the popular URLs have user:pass@host so can we
> keep to convention?

We have to.  With the exception of the ":pass", that's the syntax of a
URL.  The <userinfo> field is commonly parsed as "user:password" (again,
RFC2396, section 3.2.2).  We could add a query string that supported
passwords, but we couldn't get rid of the <userinfo> field and just about
every commercial implementation will parse it into "user:password".

> Just because the tool is there doesn't mean you have to use it...

...and again, the vulnerability is in the publication of the URL reference.
For example, if I built a web page with the reference:

  smb://crh:foobanana@scred/share/path/file.txt

and then published the web page it's *my* fault that the password
"foobanana" is being published.

On the other hand, if someone types the above URL string into a browser, the
browser should parse the string and handle whatever encryption or key
exchange or whatever the SMB protocol allows (including Kerberos).  The
password is only as vulnerable as the protocol makes it in that case.  Also,
if someone gave the URL string:

  smb://crh@scred/share/path/file.txt

to a browser (or other application) and the browser then prompted for a
password, you'd be in the same situation.  So, as you say, the vulnerability
is in the hands of the user.

> >Let's not lose the focus of the real problem here though. SMB servers
> >allow the '@' sign but the various URL RFCs do not.
> >
> As do HTTP servers - FTP servers...  why is the smb url the special case?

It shouldn't be.  The SMB URL should be parsed as any other URL string, and
then the resulting subfields should be parsed again in the context of
the requirements of SMB.

So, once we have parsed the URL using the RFC-standard parsing, we might
have to do our own work to parse the <userinfo> into <username> and
<password>, and further parse <username> to extract an NTDomain name
(which is an additional level of parsing provided by the SMB URL because
of the special needs of SMB).
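
A minimal sketch of that two-stage approach in Java follows.  The first
stage relies on java.net.URI to do the generic parsing; the second stage
splits the resulting <userinfo> for SMB.  The class name SmbUserInfo and
the "domain;user" separator are assumptions made for illustration (this
message does not say how the NTDomain name is delimited), and
percent-decoding of the subfields is left out.

```java
import java.net.URI;

// Hypothetical helper, not part of jCIFS: holds the SMB-specific subfields.
final class SmbUserInfo {
    final String domain;    // null if absent
    final String user;      // null if the URL had no <userinfo>
    final String password;  // null if absent

    private SmbUserInfo(String domain, String user, String password) {
        this.domain = domain;
        this.user = user;
        this.password = password;
    }

    // Stage 1: generic URL parsing, exactly as for any other scheme.
    static SmbUserInfo fromUrl(String url) {
        URI uri = URI.create(url);
        // Use the raw form so that an escaped ':' or ';' is not mistaken
        // for a separator in stage 2.
        return parse(uri.getRawUserInfo());
    }

    // Stage 2: SMB-specific parsing of the already-extracted <userinfo>.
    static SmbUserInfo parse(String userInfo) {
        if (userInfo == null) {
            return new SmbUserInfo(null, null, null);
        }
        String domain = null;
        String password = null;
        String rest = userInfo;

        // <userinfo> -> <username> [ ":" <password> ]
        int colon = rest.indexOf(':');
        if (colon >= 0) {
            password = rest.substring(colon + 1);
            rest = rest.substring(0, colon);
        }

        // <username> -> [ <ntdomain> ";" ] <user>   (assumed separator)
        int semi = rest.indexOf(';');
        if (semi >= 0) {
            domain = rest.substring(0, semi);
            rest = rest.substring(semi + 1);
        }
        return new SmbUserInfo(domain, rest, password);
    }
}
```

With this sketch, SmbUserInfo.fromUrl("smb://crh:foobanana@scred/share/path/file.txt")
yields the user "crh" and the password "foobanana", with no domain.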

> >If we can circumvent
> >this one issue it will not be necessary to URL encode/decode anything.
> >
> 
> >>As with most parsers, there is an operator hierarchy.  I *think* that the
> >>'/' character is higher precedence than the '@' but I am not sure, as I
> >>haven't looked at it recently.
> >>
> >I don't see how assigning precedence helps.
> >
> ditto.  I think it just makes things more complex.

Quite the opposite, I believe.

Unlike C or Java, you can't use parentheses to group subexpressions in a
URL.  Going back to programming, what does  5 + 4 * 3 mean?  Because of
operator precedence, you know that it means (5 + (4 * 3)) rather than
((5 + 4) * 3).

Likewise with URLs.  The separators are assigned precedence so that you know
that <scheme>://<server>/<path> breaks into <scheme>, <server>, and <path>. 
Then you also know that the '@' is significant in <server> but not in the
others, which solves the problem.
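
To make the precedence idea concrete, here is a rough sketch in Java
using plain string operations.  The class name UrlSplit and its methods
are made up for illustration, and it handles only the simple
<scheme>://<server>/<path> form discussed here (no port, query, or
fragment).  The higher-precedence separators "://" and '/' are applied
first; '@' is looked for only inside <server>.

```java
// Hypothetical illustration of separator precedence; not a full URL parser.
final class UrlSplit {
    final String scheme;
    final String server;  // may still contain "<userinfo>@"
    final String path;    // includes the leading '/', or "" if absent

    private UrlSplit(String scheme, String server, String path) {
        this.scheme = scheme;
        this.server = server;
        this.path = path;
    }

    static UrlSplit split(String url) {
        // Highest precedence: "://" separates <scheme> from the rest.
        int s = url.indexOf("://");
        if (s < 0) {
            throw new IllegalArgumentException("no \"://\" in: " + url);
        }
        String scheme = url.substring(0, s);
        String rest = url.substring(s + 3);

        // Next: the first '/' ends <server> and begins <path>.
        int slash = rest.indexOf('/');
        String server = (slash < 0) ? rest : rest.substring(0, slash);
        String path = (slash < 0) ? "" : rest.substring(slash);
        return new UrlSplit(scheme, server, path);
    }

    // Lowest precedence: '@' is examined only within <server>,
    // so an '@' in <path> is never treated as a separator.
    String userInfo() {
        int at = server.indexOf('@');
        return (at < 0) ? null : server.substring(0, at);
    }

    String host() {
        int at = server.indexOf('@');
        return (at < 0) ? server : server.substring(at + 1);
    }
}
```

Given a string such as "smb://crh@scred/share/some@file.txt", the '@' in
the file name stays in <path> untouched, while the one in <server>
separates "crh" from "scred".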

Chris -)-----

-- 
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh at ubiqx.mn.org