[jcifs] Creating file with hash ('#') in filename

Christopher R. Hertel crh at ubiqx.mn.org
Thu Jan 16 13:25:24 EST 2003


On Wed, Jan 15, 2003 at 09:03:09PM -0500, Allen, Michael B (RSCH) wrote:
> 
> 
> > -----Original Message-----
> > From:	Christopher R. Hertel [SMTP:crh at ubiqx.mn.org]
> > 
> > > The java.net.URL class parses the URL *before* the jcifs.smb.Handler
> > > gets it. So the '#ref' is getting picked out. I was just saying perhaps
> > > I can append it back on to create an internal path that retains it.
> > 
> > Only if you want to bypass the standard syntax for URLs.  :)
> > 
> 	You mean HTTP URLs. It's the HTTP URL that uses '#'. We don't have any
> 	use for it.

No, I mean that the parsing of URLs is standard, based on the RFC.  The # 
is defined as part of the syntax of generic URLs.  It's just that the 
semantics have no meaning for the SMB URL.
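
To illustrate the point, here is a minimal sketch using java.net.URI, which
follows the generic syntax and needs no protocol handler registered. The
class name, helper names, and the server/share names are made up for the
example; this is not how jCIFS itself does its parsing.

```java
import java.net.URI;

class FragmentDemo {
    // Path component as a generic URI parser sees it (escapes decoded).
    static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    // Fragment component, or null if the URL has none.
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        // Unescaped '#': the parser treats it as the fragment delimiter.
        String raw = "smb://server/share/notes#1.txt";
        System.out.println(pathOf(raw));          // /share/notes
        System.out.println(fragmentOf(raw));      // 1.txt

        // Escaped '#': it stays part of the path.
        String escaped = "smb://server/share/notes%231.txt";
        System.out.println(pathOf(escaped));      // /share/notes#1.txt
        System.out.println(fragmentOf(escaped));  // null
    }
}
```

The split happens in the generic parser, before any scheme-specific code
gets a look at the string.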

> > The # (if unescaped) in that position should be a delimiter and the
> > pedantic way to handle it is to cough back an error.
> > 
> 	For HTTP URLs. For SMB URLs this remains to be seen. We cannot
> 	conform to the HTTP URL closely without a cost.

We are trying to conform to the generic specification for URLs.  We have 
(by necessity) overloaded the general form by adding NBT name syntax, and 
further defining subfields within existing fields.  Nothing in the SMB URL 
syntax actually overrides generic URL syntax.

>       The main problem is that
> 	SMB path names need to represent just about any character including
> 	Unicode which we haven't even touched on. I personally do not want to
> 	decode paths. That is very costly.

We only need to unescape them.

>       It is very likely that SMB URLs will contain
> 	reserved characters like space, '@', and '#'. We cannot accept both encoded
> 	and non encoded URLs because URLs returned by jCIFS will need to be
> 	encoded.

If by "encoded" you mean "escaped" (I'm being pedantic).

Think of it as a translation.  There is the name in SMB format and the 
same name in URL format.  In the latter case, characters which are not 
permitted by URL syntax must be escaped.  So, when translating from URL 
format to SMB format, you unescape.  When translating from SMB to URL, you 
gotta escape them again.

>       Even if you pass back whatever was passed in how do you handle
> 	URLs derived from a parent during a list() operation.

Go through the string one character (ASCII or Unicode) at a time and 
rewrite it.

>       It gets very messy.

Two methods:  urlEscape() and urlUnEscape().
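
A rough sketch of what that pair might look like. Only the two method names
come from the line above; the rest is assumption: the class name is
hypothetical, the path is taken to use '/' separators already, escapes are
taken to carry UTF-8 octets, and the set of characters left alone is the
"unreserved" set from RFC 3986 plus '/'. jCIFS may well choose differently.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class SmbUrlEscape {
    // Assumption: these characters pass through unescaped.
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~/";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // SMB name -> URL form: escape every octet that is not in SAFE.
    static String urlEscape(String smbPath) {
        StringBuilder out = new StringBuilder();
        for (byte b : smbPath.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && SAFE.indexOf((char) v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    // URL form -> SMB name: turn each %XX back into its octet.
    static String urlUnEscape(String urlPath) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < urlPath.length()) {
            if (urlPath.charAt(i) == '%') {
                if (i + 3 > urlPath.length()) {
                    throw new IllegalArgumentException("truncated escape at " + i);
                }
                int hi = Character.digit(urlPath.charAt(i + 1), 16);
                int lo = Character.digit(urlPath.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("bad escape at " + i);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else {
                // Copy one whole character (ASCII or Unicode) as UTF-8 octets.
                int cp = urlPath.codePointAt(i);
                byte[] raw = new String(Character.toChars(cp))
                    .getBytes(StandardCharsets.UTF_8);
                out.write(raw, 0, raw.length);
                i += Character.charCount(cp);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String smb = "/share/my file#1.txt";
        String url = urlEscape(smb);
        System.out.println(url);               // /share/my%20file%231.txt
        System.out.println(urlUnEscape(url));  // /share/my file#1.txt
    }
}
```

Names coming back from a list() operation would go through urlEscape()
before being glued onto the parent URL, and anything the user typed would go
through urlUnEscape() before it hits the wire.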

>       Can
> 	someone give me a reason why we *have* to require URL encoding of the
> 	path component? Otherwise I think we should punt the '#ref' and just
>       integrate it into the path. Anything we would use it for can be
>       done with a query_string parameter.

The '#' character isn't the only problem.  You could fudge that one.  
There are other characters which are not legal URL characters:  spaces
and non-English language characters, for example.

The key thing, though, is that a user may type in an SMB URL with a URL
escape sequence included.

> 	Incidentally, speaking of query_string parameters, we got lucky with the '?'
> 	character. That *is* reserved in SMB pathnames. It's a wildcard character.
> 	Otherwise we really *would* have to require escaping path components.

We still do.  :)

> 	Anyway it looks like just tacking the '#ref' back onto the path component in
> 	Handler.java is going to do the trick.

...for that *one* case, and it is still a user convenience at the expense 
of correct syntax.

>	NetworkExplorer doesn't like it but that's
> 	because the SMB URLs are going through the browser as part of the path. In
> 	this case they *are* HTTP URLs and as such need to be escaped. I'll leave
> 	the NetworkExplorer fixes till later I think.

The HTTP URL is just an instance of a URL.  A descendant type.  The rules 
apply to all URLs.

Sorry.  :(

Chris -)-----

-- 
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh at ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh at ubiqx.org


