[jcifs] localization problems
Allen, Michael B (RSCH)
Michael_B_Allen at ml.com
Fri Jul 12 08:20:26 EST 2002
> -----Original Message-----
> From: Dmitry Khlonin [SMTP:dmitry at khlonins.com]
> Sent: Thursday, July 11, 2002 4:59 AM
> To: Allen, Michael B (RSCH)
> Cc: jcifs at lists.samba.org
> Subject: Re: [jcifs] localization problems
>
> Allen, Michael B (RSCH) wrote:
>
> -----Original Message-----
> From: Dmitry Khlonin [SMTP:dmitry at khlonins.com]
> Sent: Tuesday, July 09, 2002 4:53 PM
> To: jcifs at lists.samba.org
> Subject: [jcifs] localization problems
>
> I use jcifs and have found some problems with localization.
> Some background:
> For Russian there are about six common single-byte encodings
> (Cp866, Cp1251, KOI8_r, ISO 8859-5, ...),
> in addition to Unicode and UTF-8.
>
> On a Windows network these get mixed: NT/2k/XP hosts send share names in
> Cp866 and everything else in Unicode, while 9x hosts use Cp866 for
> everything.
> In the code that creates file names and share names from packets I have
> hard-coded this encoding in the String constructors (this is only for my
> own use). My question: will you add a parameter to the jcifs library
> for localization?
>
>
>
> No. JCIFS will negotiate Unicode with each server as necessary. Just pass
> normal Java strings. If you are manually manipulating byte arrays in various
> encodings, it will be necessary to convert them back into their Java String
> equivalents with something like new String( bytearray, "cp1251" ) and
> getBytes( "cp1252" )...etc.
>
> Sorry, you didn't understand... I changed the jcifs code itself, like this...
>
> NetShareEnumResponse.java
>
> int readDataWireFormat( byte[] buffer, int bufferIndex, int len ) {
> //...
> try {
> results[i].netName = new String( buffer, bufferIndex,
> readStringLength( buffer, bufferIndex, 13 ), "Cp866");
> } catch (UnsupportedEncodingException e) {
> results[i].netName = null;
> }
> //...
>
Ahhh. I see. I was under the impression that clients would run in an appropriate locale
like ru_RU.cp866 (not sure if that encoding is legal), in which case new String() would
automatically use Cp866 to decode byte[] arrays. What locale is the client running in?
ISO-8859-5? I suppose it would be reasonable to add a property to change this behavior.
Did you say you need to control this on a server-by-server basis, or would a global
parameter do the job? I don't know enough about how locales are controlled in Java or
what typical client/server arrangements in non-English locations are like.
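
For the sake of discussion, here is a rough sketch of what a global parameter
might look like. It is only an illustration: the property name
"example.oem.encoding", the class OemDecodeSketch and the helper decodeOem are
all made up for this message and are not existing jCIFS settings or methods.
If the property is unset it falls back to the platform default, which is what
new String( byte[] ) does today.

```java
import java.nio.charset.Charset;

// Illustrative sketch only; none of these names exist in jCIFS.
class OemDecodeSketch {

    // Hypothetical system property naming the encoding for non-Unicode strings.
    static final String ENCODING_PROPERTY = "example.oem.encoding";

    // Decode a non-Unicode string field from a packet buffer.
    static String decodeOem( byte[] src, int srcIndex, int len ) {
        // Some servers count the trailing '\0' in the length, others do not.
        if( len > 0 && src[srcIndex + len - 1] == '\0' ) {
            len--;
        }
        String enc = System.getProperty( ENCODING_PROPERTY );
        if( enc == null ) {
            // No property set: use the platform default encoding.
            return new String( src, srcIndex, len );
        }
        // Charset.forName throws an unchecked exception for an unknown name.
        return new String( src, srcIndex, len, Charset.forName( enc ));
    }

    public static void main( String[] args ) {
        System.setProperty( ENCODING_PROPERTY, "Cp866" );
        // A Russian word encoded in Cp866, followed by a trailing '\0'.
        byte[] wire = {
            (byte)0x8F, (byte)0xE0, (byte)0xA8, (byte)0xA2, (byte)0xA5, (byte)0xE2, 0
        };
        System.out.println( decodeOem( wire, 0, wire.length ));
    }
}
```

A client would then be started with something like
-Dexample.oem.encoding=Cp866. A server-by-server setting would need a lookup
keyed on the host name instead of a single property.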
Mike
>
> and in Trans2FindFirst2Response
>
> String readString( byte[] src, int srcIndex, int len ) {
> String str = null;
> try {
> if( useUnicode ) {
> // should Unicode alignment be corrected for here?
> str = new String( src, srcIndex, len, "UnicodeLittle" );
> } else {
> /* On NT without Unicode the fileNameLength
> * includes the '\0' whereas on win98 it doesn't. I
> * guess most clients only support non-unicode so
> * they don't run into this.
> */
> /* UPDATE: Maybe not! Could this be a Unicode alignment issue. I hope
> * so. We cannot just comment out this method and use readString of
> * ServerMessageBlock.java because the arguments are different, however
> * one might be able to reduce this.
> */
> if( len > 0 && src[srcIndex + len - 1] == '\0' ) {
> len--;
> }
> str = new String( src, srcIndex, len, "Cp866" );
> }
> } catch( UnsupportedEncodingException uee ) {
> Log.printStackTrace( "smb exception", uee );
> }
> return str;
> }
>
> P.S. Sorry for my bad English