[jcifs] URL encoding

david talbot chukhonets at hotmail.com
Sun Mar 10 21:11:28 EST 2002


Running SmbCrawler gives exactly the same results as I got with my own test
class. Calling listFiles() on the Japanese directory just returns a
nonexistent file with the same name as the directory. Trying to access that
file throws a FileNotFound exception.

I had a go with some Russian share names and they worked fine, as you say.
The problem is that I'm trying to support Japanese. As a matter of fact, if I
stick to the single-byte Japanese characters (rarely used alone in real life),
jCIFS seems to work okay when the client and server machines have the same
default encoding (e.g. Windows MS932/Shift_JIS) and the path is not
URL-encoded.

I experimented with a few different share names. When the directory name is
a single double-byte character (e.g. %91%E5), calling list() only returns a
nonexistent file with the same name as the directory. When the directory
name is two characters or more, i.e. 4+ bytes long (for instance
%91%E5%8F%AC), an invalid directory SmbException is thrown instead.
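
To make the two test names concrete, here is a minimal Java sketch that turns
the percent-escapes into the raw bytes they stand for. It is only an
illustration: the class and method names (PercentDecode, percentDecode, toHex)
are made up and are not part of jCIFS, and it assumes every character outside
a %XX escape is plain ASCII.

```java
import java.io.ByteArrayOutputStream;

class PercentDecode {
    // Decode %XX escapes into raw bytes without applying any charset.
    // Assumption: characters outside an escape are ASCII and copied as-is.
    // A non-hex escape such as "%ZZ" makes Integer.parseInt throw.
    static byte[] percentDecode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write((byte) c);
            }
        }
        return out.toByteArray();
    }

    // Format bytes the way they appear in a hex dump, e.g. "91 E5".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(percentDecode("%91%E5")));       // 91 E5
        System.out.println(toHex(percentDecode("%91%E5%8F%AC"))); // 91 E5 8F AC
    }
}
```

Working on bytes rather than on a decoded String keeps the platform default
encoding out of the picture, so the output can be compared directly against
the bytes in a packet trace.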

With log=ALL set for list() on the single-character directory
(smb://server/share/%91%E5), the value after "find with path=" is correctly
decoded into the Japanese character I want. In the Trans2FindFirst2 request
the hex values for the directory name are also correct, except that the final
2A (i.e. "*") is missing after the last 5C ("\").


Trans2FindFirst2[command=SMB_COM_TRANSACTION2,received=false,errorCode=0x000
00000,flags=0x0018,flags2=0x0001,tid=55297,pid=4502,uid=0,mid=3,wordCount=15
,byteCount=18,totalParameterCount=17,totalDataCount=0,maxParameterCount=10,m
axDataCount=1200,maxSetupCount=0,flags=0x00,timeout=0,parameterCount=17,para
meterOffset=66,parameterDisplacement=0,dataCount=0,dataOffset=84,dataDisplac
ement=0,setupCount=1,pad=1,pad1=1,searchAttributes=0x16,searchCount=15,flags
=0x00,informationLevel=0x104,searchStorageType=0,filename=\大\*]

3 10 17:34:26.420 - smb sent
00000: FF 53 4D 42 32 00 00 00 00 18 01 00 00 00 00 00  |?SMB2...........|
00010: 00 00 00 00 00 00 00 00 01 D8 96 11 00 00 03 00  |.........?......|
00020: 0F 11 00 00 00 0A 00 B0 04 00 00 00 00 00 00 00  |.......°........|
00030: 00 00 00 11 00 42 00 00 00 00 00 01 00 01 00 12  |.....B..........|
00040: 00 00 16 00 0F 00 00 00 04 01 00 00 00 00 5C 91  |..............\.|
00050: E5 5C 00                                         |?\.             |
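
The dump above can be checked byte by byte. The sketch below is a hedged
illustration, not jCIFS code: the class and method names are hypothetical. It
assumes the standard layout of the TRANS2_FIND_FIRST2 request parameters
(12 fixed bytes followed by the NUL-terminated search pattern) and a
non-Unicode session, so the pattern is sent as raw codepage bytes.

```java
import java.util.Arrays;

class FindFirst2Check {
    // Fixed part of the TRANS2_FIND_FIRST2 request parameters:
    // SearchAttributes(2) + SearchCount(2) + Flags(2)
    // + InformationLevel(2) + SearchStorageType(4) = 12 bytes.
    static final int FIXED_PARAM_BYTES = 12;

    // Bytes left over for the search pattern, including its NUL terminator.
    static int filenameFieldLength(int parameterCount) {
        return parameterCount - FIXED_PARAM_BYTES;
    }

    // Build "\" + dirName + "\*" + NUL from the already-encoded name bytes.
    static byte[] expectedPattern(byte[] dirName) {
        byte[] out = new byte[dirName.length + 4];
        out[0] = 0x5C;                                  // '\'
        System.arraycopy(dirName, 0, out, 1, dirName.length);
        out[dirName.length + 1] = 0x5C;                 // '\'
        out[dirName.length + 2] = 0x2A;                 // '*'
        out[dirName.length + 3] = 0x00;                 // NUL terminator
        return out;
    }

    public static void main(String[] args) {
        byte[] dir = {(byte) 0x91, (byte) 0xE5};        // %91%E5
        byte[] expected = expectedPattern(dir);
        // Pattern bytes as they appear at the end of the dump above.
        byte[] sent = {0x5C, (byte) 0x91, (byte) 0xE5, 0x5C, 0x00};

        System.out.println(expected.length);            // 6
        System.out.println(filenameFieldLength(17));    // 5
        System.out.println(Arrays.equals(expected, sent)); // false
    }
}
```

With parameterCount=17 the pattern field is one byte short of the six bytes
the full pattern needs. Five bytes is what the four characters of \大\* plus a
NUL would give, so one possible reading (an assumption, not confirmed anywhere
in this thread) is that the length was taken from the character count rather
than from the encoded byte count.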



With the two-character directory name (smb://server/share/%91%E5%8F%AC), the
value of "find with path=" is already wrong, and the hex codes corresponding
to the directory name are not visible anywhere in the dump, including the
Trans2FindFirst2. But the final 2A is there!

3 10 17:46:04.140 - smb sent
Trans2FindFirst2[command=SMB_COM_TRANSACTION2,received=false,errorCode=0x000
00000,flags=0x0018,flags2=0x0001,tid=59393,pid=32658,uid=0,mid=3,wordCount=1
5,byteCount=20,totalParameterCount=19,totalDataCount=0,maxParameterCount=10,
maxDataCount=1200,maxSetupCount=0,flags=0x00,timeout=0,parameterCount=19,par
ameterOffset=66,parameterDisplacement=0,dataCount=0,dataOffset=86,dataDispla
cement=0,setupCount=1,pad=1,pad1=1,searchAttributes=0x16,searchCount=15,flag
s=0x00,informationLevel=0x104,searchStorageType=0,filename=\?Fャ\*]

3 10 17:46:04.140 - smb sent
00000: FF 53 4D 42 32 00 00 00 00 18 01 00 00 00 00 00  |?SMB2...........|
00010: 00 00 00 00 00 00 00 00 01 E8 92 7F 00 00 03 00  |.........?......|
00020: 0F 13 00 00 00 0A 00 B0 04 00 00 00 00 00 00 00  |.......°........|
00030: 00 00 00 13 00 42 00 00 00 00 00 01 00 01 00 14  |.....B..........|
00040: 00 00 16 00 0F 00 00 00 04 01 00 00 00 00 5C 3F  |..............\?|
00050: 46 AC 5C 2A 00                                   |F¬\*.           |

Any final suggestions would be really appreciated.

Cheers
David

----- Original Message -----
From: "Michael B Allen" <mballen at erols.com>
To: "Talbot David" <chukhonets at hotmail.com>
Cc: <jcifs at samba.org>
Sent: Saturday, March 09, 2002 4:07 PM
Subject: Re: [jcifs] URL encoding


> On Sat, 09 Mar 2002 12:32:56 +0900
> "Talbot David" <chukhonets at hotmail.com> wrote:
>
> >
> >
> > >What "standard windows client"? The smbclient program will not negotiate
> > >Unicode (actually the latest and greatest might but I don't think vanilla
> > >2.2 does).
> >
> > I traced the packets generated when the same directory is accessed via
> > "Network Computer" on Win 98.
>
> Win98 does not negotiate Unicode (for most things). Does this machine
> properly exhibit the behavior you seek?
>
> > > > and the non-ascii characters seem to have their native
> > > > encoding values. I suppose treating the path as a byte array should
> > > > work. I'll give it a go.
> >
> >
> > >It sounds like the server is negotiating ASCII (a.k.a the platform
> > >dependent 8 bit codepage). If this is indeed the case you are in uncharted
> > >territory.
> >
> > Probably not the right territory for me just yet ;)
> >
> > By the way which class actually has the job of converting the url string
> > into bytes that make up the SMB request packet? I assume that's where I'd
> > need to start.
>
> The codepage thing has nothing to do with jCIFS. On the other hand I
> can't help but think that's NOT the problem because it is only used to
> support foreign character sets. For example cp866 is Cyrillic Russian:
>
>   http://czyborra.com/charsets/codepages.html#CP866
>
> so if you were trying to support Hebrew or Nordic Win 98 clients or
> something that's probably something you would have mentioned already
> right? Otherwise, again, you're just not encoding the paths properly
> ... try the SmbCrawler.
>
> Mike
>
> --
> May The Source be with you.
>



