[jcifs] Re: SMB URL encoding/decoding
Christopher R. Hertel
crh at ubiqx.org
Mon Feb 25 11:39:46 EST 2002
James Nord wrote:
:
> Why is
>
> smb://HI137/D$/Documents%20and%20Settings/ryar/seti@home.txt
>
> incorrect? @ is a valid pchar.
Because it's a typo. Ooops. ;)
Only the spaces needed to be escaped.
> We could always escape all the valid characters-
>
> smb://HI137/D$/%44%6f%63...
>
> I see no reason why that is more or less correct?
Right. You could escape every character. Remember, though, that we are
talking about people typing the URL string at a command prompt or a browser
window or somesuch.
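
To make the two extremes concrete, here is a minimal sketch in Python (chosen only for illustration; it is not jCIFS code). The helper name escape_all is hypothetical, and the choice of "/" and "@" as safe characters is an assumption based on the pchar discussion above. Both forms decode to the same path; the minimal one is simply easier to type.

```python
from urllib.parse import quote, unquote

path = "Documents and Settings/ryar/seti@home.txt"

# Escape only what must be escaped. '/' and '@' are passed as "safe"
# (an assumption for this sketch), so only the spaces are touched.
minimal = quote(path, safe="/@")

def escape_all(text):
    """Hypothetical helper: percent-escape every byte of every path segment."""
    def escape_segment(segment):
        return "".join("%{:02x}".format(b) for b in segment.encode("utf-8"))
    # Keep the '/' separators so the segment structure is unchanged.
    return "/".join(escape_segment(s) for s in text.split("/"))

maximal = escape_all(path)

print(minimal)
print(maximal[:9] + "...")

# Both spellings name the same resource once decoded.
assert unquote(minimal) == path
assert unquote(maximal) == path
```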
> >RFC2396 goes into detail regarding
> >the use of whitespace in a URL string (they don't like it), but many
> >browsers will accept the spaces anyway. (Just as many browsers will
> >accept really bad HTML code and render it anyway...browsers are in the
> >business of making things easy when they can.)
> >
> Always be strict on forming and forgiving on parsing.
Yep.
> > A note... I looked all through the RFC and found nothing about
> > translating the '+' into a space. Annoying, as I know it was there
> > in the early days.
> >
> Is this not just special for form data in HTTP? I don't recall seeing
> this escape sequence in a path before. (not really important now in any
> case)
You may be very right. I know that the + is sometimes used, but I can't
remember where or why. It's a big "I haven't a clue" from me on this one.
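
For what it's worth, the '+' convention comes from HTML form encoding (application/x-www-form-urlencoded) rather than from the generic URL syntax. Python's standard library keeps the two behaviours in separate functions, which makes the difference easy to see. This is only an illustration; the sample string is made up.

```python
from urllib.parse import unquote, unquote_plus

sample = "seti+at+home%20notes.txt"  # made-up example string

# Generic URL decoding: '+' is an ordinary character and is left alone.
as_path = unquote(sample)

# Form decoding: '+' is translated to a space, then escapes are decoded.
as_form = unquote_plus(sample)

print(as_path)
print(as_form)
```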
> >So the problem is that you need to escape a ';' if you want to use it
> >in a path, but not if you use it in <userinfo>, and you have to escape
> >an '@' if you want to use it in <userinfo> but not in the path.
> >
> And then the domain bit falls over as it is permitted to have many ;'s in
> there :-(
Um, no... The <userinfo> field allows as many ';'s as you like, but the SMB
URL specifies a new (descendant) syntax for the <userinfo> field that makes
the ';' a delimiter within that field.
So, we replace:
    userinfo = *( unreserved | escaped |
                  ";" | ":" | "&" | "=" | "+" | "$" | "," )
with:
    userinfo = user [ ":" password ]
    user     = [ ntdomain ";" ] username
and then to be pedantic we would specify the valid characters for
username, password, and ntdomain.
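
A rough sketch of how a parser might apply that grammar, written in Python purely for illustration (it is not taken from jCIFS or the draft). The function name split_smb_userinfo is hypothetical. It assumes the caller has already cut the userinfo out of the URL at the '@', and it splits on the delimiters before decoding so that an escaped ';', ':' or '@' survives as data.

```python
from urllib.parse import unquote

def split_smb_userinfo(userinfo):
    """Hypothetical helper: split raw (still escaped) userinfo into
    (ntdomain, username, password). Missing parts come back as None."""
    # userinfo = user [ ":" password ]   (the first ':' is the delimiter)
    user, sep, password = userinfo.partition(":")
    if not sep:
        password = None

    # user = [ ntdomain ";" ] username   (the first ';' is the delimiter)
    ntdomain, sep, username = user.partition(";")
    if not sep:
        ntdomain, username = None, user

    def decode(part):
        # Decode only after splitting, so %3B, %3A and %40 stay data.
        return None if part is None else unquote(part)

    return decode(ntdomain), decode(username), decode(password)

print(split_smb_userinfo("NTDOM;ryar:p%40ss"))
print(split_smb_userinfo("ryar"))
```

Splitting first and decoding second is the design choice that matters here: decoding the whole string up front would turn an escaped delimiter back into a live one.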
> >It may contain anything that <userinfo> may contain, including a colon.
> >The *first* colon in <userinfo> is used as a delimiter.
> >
> But the *smb url draft* does not allow for more than one colon in the
> userinfo part.
>
> smb_server = [ [ smb_userinfo "@" ] smb_srv_name [ ":" port ] ]
>
> smb_userinfo = [ ntdomain ";" ] username [ ":" password ]
> ntdomain = *( unreserved | escaped |
> "&" | "=" | "+" | "$" | "," )
> username = *( unreserved | escaped |
> "&" | "=" | "+" | "$" | "," )
> password = *( unreserved | escaped |
> "&" | "=" | "+" | "$" | "," )
Yes. You're right. I did a better job in the draft than in my e'mail.
:)
> Ah, doesn't the draft have precedence in this case over the URL RFC as
> it is more explicit?
> (yes draft changing work in progress etc...) but I always look at the
> most specific ;-)
I think of the draft as defining a descendant type. Again, you're correct
here.
Mike wrote:
>
> Correction: An unescaped '%' would also cause it to fail because we
> have to URL decode regardless of how forgiving the parser is and the
> '%' might be interpreted as the beginning of an escape.
Yeah. I like James' comment about when to be strict and when to be
forgiving.
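
Mike's point about the stray '%' can be sketched like this, again in Python and only as an illustration. The name strict_unquote is hypothetical. Python's own unquote is forgiving and passes a malformed '%' through untouched, so this sketch adds a check in front of it. Note that no check can catch a literal '%' that happens to be followed by two hex digits; that one is silently read as an escape.

```python
import re
from urllib.parse import unquote

# A '%' that is not followed by exactly two hex digits is malformed.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")

def strict_unquote(text):
    """Hypothetical helper: decode escapes, rejecting a stray '%'."""
    match = _BAD_PERCENT.search(text)
    if match:
        raise ValueError("stray '%' at offset {}".format(match.start()))
    return unquote(text)

print(strict_unquote("100%25.txt"))   # properly escaped percent sign

try:
    strict_unquote("100%.txt")        # '%' followed by ".t" is not an escape
except ValueError as err:
    print("rejected:", err)

# "%ad" looks like a valid escape, so the intended literal '%' is lost.
print(strict_unquote("50%addon") == "50%addon")
```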
Chris -)-----
--
Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/ -)----- crh at ubiqx.mn.org