[jcifs] URL encoding

Michael B Allen mballen at erols.com
Fri Mar 8 19:08:34 EST 2002


On Fri, 08 Mar 2002 16:04:10 +0900
"Talbot David" <chukhonets at hotmail.com> wrote:

> Hello
> 
> This is my first direct posting to the list. I've been following it for a 
> while and just started delving into the API.
> 
> I've come up against a number of strange things with URL encodings of 
> non-ascii characters. Both the jcifs server and the NT server are using 
> the same charset.
> 
> The code below doesn't give any errors but the result is wrong: it says the 
> directory referred to by "smb://share/path" is empty and it isn't. It does 
> recognize whether the path is correct or not (throwing a file not found 
> exception when the unicode path is set incorrectly).
> 
> String path = "\u3042";
> String encoded = URLEncoder.encode( path );
> SmbFile file = new SmbFile( "smb://share/" + encoded );
> SmbFile[] files = file.listFiles();
> 
> for( int i = 0; i < files.length; i++ ) {
>     System.out.print( " " + files[i] );
> }
> 
> Also, when the non-ascii section of the path is higher up the tree, jcifs 
> seems to be able to recognize whether the file/directory actually exists or 
> not but doesn't return any contents of the directory. Always one file.
> 
> Other examples:
> Calling isDirectory() on an SmbFile with a non-ascii section in the path 
> returns true even for a file.
> 
> Calling listFiles() on the same SmbFile returns one SmbFile with a 
> non-existent path where the final part of the path is just repeated 
> (smb://share/path/path)
> 
> What should I look at next to see what's going on? Or should I just steer 
> clear of non-ascii URLs for the time being?

I'm not really sure what the problem is with each of these, but it sounds
like the encoding is wrong and therefore the paths are wrong. Try the
SmbCrawler and see how it prints those paths. You have to encode the path
in a way that URLDecoder.decode can turn back into the right Unicode
representation according to the server. This should simply be a matter of
iterating over each component in the path after the share and calling
URLEncoder.encode on it. Also, this assumes the server is using Unicode.
You can determine this with a NetMon trace and more specifically by
examining the flags field of any SMB in a -Dlog=ALL trace.
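
As a rough sketch of the per-component encoding described above (not
jCIFS code): the class and method names are made up for illustration,
"server" and "share" are placeholders, and UTF-8 is only an assumption.
The charset that actually works is whatever the decoding side expects.
The sketch uses the two-argument URLEncoder.encode from java.net so the
charset is explicit.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, not part of jCIFS.
class SmbPathEncoding {

    // Encodes each '/'-separated component on its own, so the
    // separators themselves are never escaped.
    public static String encodeComponents(String path, String charset) {
        String[] parts = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        try {
            for (int i = 0; i < parts.length; i++) {
                if (i > 0) {
                    out.append('/');
                }
                out.append(URLEncoder.encode(parts[i], charset));
            }
        } catch (UnsupportedEncodingException e) {
            // The charset name was not recognized by this JVM.
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return out.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Path below the share; "docs" and "file.txt" are made-up names.
        String path = "docs/\u3042/file.txt";
        String encoded = encodeComponents(path, "UTF-8"); // assumed charset

        System.out.println("smb://server/share/" + encoded);

        // Round-trip check: decoding must give back the original path.
        System.out.println(URLDecoder.decode(encoded, "UTF-8").equals(path));
    }
}
```

Note that URLEncoder produces form-style encoding, so a space comes out
as "+" rather than "%20"; whether that survives the decoding side is
something to verify rather than assume.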

Anyway, all of this is somewhat moot because the URL encoding in jCIFS
is wrong. There was just a big discussion about this in which it was
determined that encoding the path is not necessary with the exception of
two or three particular characters. That would solve the problem you're
having, provided the server you're running against really is using Unicode
(vanilla NT in an en locale should).

Sorry, but I think you'll just have to wait for proper SMB URL encoding ...

Mike

-- 
May The Source be with you.
