[jcifs] problem encoding

Christopher R. Hertel crh at ubiqx.mn.org
Fri Jan 17 15:45:39 EST 2003


On Thu, Jan 16, 2003 at 09:27:55PM -0500, Allen, Michael B (RSCH) wrote:
:
:
> I don't know the details but the important thing is that this not be
> confused with the ongoing SMB URL escaping discussion.

Agreed.

:
:
<snip a lot of interesting stuff about how Unicode may be encoded>
:
:

> SMB uses UCS-2LE and possibly UTF-16LE which are identical except in the
> UTF range used to identify characters that fall outside the supported UTF-16
> range.
> 
> See: http://www.cl.cam.ac.uk/~mgk25/unicode.html

I will take a look.  Unicode is not an area of expertise of mine.

In all that, however, I don't know if my point came across...

If both client and server support Unicode (in particular, UCS-2LE) then
the character encodings match and there is no problem.
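
A rough sketch of the matching case, in Python. The file name is made up
for illustration, and Python's "utf-16-le" codec stands in for whatever
each side really uses to put the name on the wire:

```python
# Hypothetical file name with non-ASCII characters.
name = "r\u00e9sum\u00e9.txt"

# Client side: encode the name as little-endian 16-bit code units.
wire = name.encode("utf-16-le")

# Server side: decode with the same encoding and get the same name back.
received = wire.decode("utf-16-le")
assert received == name

# Every character in this name takes exactly two bytes, so UCS-2LE and
# UTF-16LE would produce identical bytes for it.
assert len(wire) == 2 * len(name)

# A character outside the 16-bit range needs a surrogate pair (four bytes)
# in UTF-16LE, which is where the two encodings part ways.
clef = "\U0001D11E"
assert len(clef.encode("utf-16-le")) == 4
```

Both ends agree on the encoding, so nothing has to be negotiated.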

If they don't--if, in particular, there are DOS codepages involved--then
there is nothing in the protocol that allows negotiation of the correct
codepage and, therefore, no way to match the extended characters between
client and server.
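
To make the mismatch concrete, here is a small Python sketch. The choice
of codepages 850 and 437 and the file name are assumptions picked for
illustration; any pair of codepages that disagree on a byte value would
show the same thing:

```python
# Hypothetical name typed on a client that uses DOS codepage 850.
client_name = "s\u00f8k.txt"          # contains U+00F8 (o with stroke)

# The client sends the raw codepage bytes; 0x9B is that letter in cp850.
wire = client_name.encode("cp850")
assert wire == b"s\x9bk.txt"

# A server running codepage 437 reads the same bytes differently:
# 0x9B is the cent sign in cp437.
server_name = wire.decode("cp437")
assert server_name == "s\u00a2k.txt"

# Nothing on the wire said which codepage was meant, so the two sides
# silently disagree about the name.
assert server_name != client_name
```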

That is only related to the escaping issue in that--if Unicode is not
used--an escape sequence may not match the intended character.
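
The same effect shows up through an escape sequence. This sketch assumes,
purely for illustration, that an escape such as %9B names a raw byte and
that the receiving side interprets it in its own codepage; the URL path
and the codepages are hypothetical, and the SMB URL rules themselves are
still under discussion:

```python
from urllib.parse import unquote_to_bytes

# Hypothetical escaped path component from a URL.
escaped = "s%9Bk.txt"

# Unescaping yields bytes, not characters.
raw = unquote_to_bytes(escaped)
assert raw == b"s\x9bk.txt"

# The character those bytes stand for depends on the codepage in use.
as_cp850 = raw.decode("cp850")   # what the URL's author may have meant
as_cp437 = raw.decode("cp437")   # what a cp437 server would see
assert as_cp850 == "s\u00f8k.txt"
assert as_cp437 == "s\u00a2k.txt"
assert as_cp850 != as_cp437
```

The escape sequence is identical in both cases; only the interpretation
of the byte differs.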

This is a protocol bug, however, and not a problem to be solved in the SMB
URL.

Chris -)-----

-- 
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh at ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh at ubiqx.org


