[jcifs] localization problems

Fri Jul 12 08:20:26 EST 2002

> -----Original Message-----
> From:	Dmitry Khlonin [SMTP:dmitry at khlonins.com]
> Sent:	Thursday, July 11, 2002 4:59 AM
> To:	Allen, Michael B (RSCH)
> Cc:	jcifs at lists.samba.org
> Subject:	Re: [jcifs] localization problems
> 
> Allen, Michael B (RSCH) wrote:
> 
> 		-----Original Message-----
> 		From:	Dmitry Khlonin [SMTP:dmitry at khlonins.com]
> 		Sent:	Tuesday, July 09, 2002 4:53 PM
> 		To:	jcifs at lists.samba.org
> 		Subject:	[jcifs] localization problems
> 		
> 		I use jcifs and have run into some localization problems.
> 		Some background:
> 		The Russian locale has about six common single-byte encodings
> 		(Cp866, Cp1251, KOI8_R, ISO 8859-5, ...)
> 		in addition to Unicode and UTF-8.
> 		
> 		On a Windows network these are mixed: NT/2k/XP hosts send share names in
> 		Cp866 and everything else in Unicode, while 9x hosts send everything in Cp866.
> 		In the code that builds file names and share names from packets I have
> 		hard-coded this encoding
> 		in the String constructors (this is only for my own use). My question:
> 		will you add a parameter
> 		to the jcifs library for localization?
> 		
> 		    
> 
> 		No. JCIFS will negotiate Unicode with each server as necessary. Just pass
> 		normal Java strings. If you are manually manipulating byte arrays in various
> 		encodings, you will need to convert them to and from their Java String
> 		equivalents, e.g. new String( bytearray, "cp1251" ) and getBytes( "cp1252" ), etc.
> 
> Sorry, you misunderstood... I changed the jcifs code itself, like this:
> 
> NetShareEnumResponse.java
> 
> int readDataWireFormat( byte[] buffer, int bufferIndex, int len ) {
> //...
>             try {
>                 results[i].netName = new String( buffer, bufferIndex,
>                             readStringLength( buffer, bufferIndex, 13 ), "Cp866");
>             } catch (UnsupportedEncodingException e) {
>                 results[i].netName = null;
>             }
> //...
> 
	Ahhh. I see. I was under the impression that clients would run in an appropriate locale
	like ru_RU.cp866 (not sure if that encoding is legal) in which case new String() would
	automatically use Cp866 to decode byte[] arrays. What locale is the client running in?
	ISO-8859-5? I suppose it would be reasonable to add a property to change this behavior.
	Did you say you need to control this on a server by server basis or would a global
	parameter do the job? I don't know enough about how locales are controlled in Java and 
	what typical client/server arrangements in non-English locations are like.
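
	For illustration, a global parameter could look roughly like the sketch below.
	This is only a sketch under stated assumptions, not existing JCIFS code: the
	class OemStrings and the property name example.smb.oemEncoding are made up for
	this example. When the property is unset it falls back to the JVM default
	charset, which is what new String( byte[] ) uses.

```java
import java.nio.charset.Charset;

final class OemStrings {
    // Hypothetical property name, invented for this sketch only.
    static final String ENCODING_PROPERTY = "example.smb.oemEncoding";

    // Resolve the charset for non-Unicode (OEM) strings: the property if set,
    // otherwise the JVM default charset.
    static Charset oemCharset() {
        String name = System.getProperty(ENCODING_PROPERTY);
        // Charset.forName throws an unchecked exception for unknown names.
        return name == null ? Charset.defaultCharset() : Charset.forName(name);
    }

    // Decode a slice of a packet buffer using the configured OEM charset.
    static String decode(byte[] buffer, int offset, int length) {
        return new String(buffer, offset, length, oemCharset());
    }

    public static void main(String[] args) {
        System.setProperty(ENCODING_PROPERTY, "Cp866");
        // Six Cyrillic letters encoded in Cp866.
        byte[] wire = { (byte) 0x8F, (byte) 0xE0, (byte) 0xA8,
                        (byte) 0xA2, (byte) 0xA5, (byte) 0xE2 };
        System.out.println(decode(wire, 0, wire.length));
    }
}
```

	A per-server setting would presumably need a lookup keyed by host name in place
	of the single property, so the global form is the simpler one to try first.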

	Mike 
> 
> and in Trans2FindFirst2Response
> 
>     String readString( byte[] src, int srcIndex, int len ) {
>         String str = null;
>         try {
>             if( useUnicode ) {
>                 // should Unicode alignment be corrected for here?
>                 str = new String( src, srcIndex, len, "UnicodeLittle" );
>             } else {
>                 /* On NT without Unicode the fileNameLength
>                  * includes the '\0' whereas on win98 it doesn't. I
>                  * guess most clients only support non-unicode so
>                  * they don't run into this.
>                  */
>                 /* UPDATE: Maybe not! Could this be a Unicode alignment issue. I hope
>                  * so. We cannot just comment out this method and use readString of
>                  * ServerMessageBlock.java because the arguments are different, however
>                  * one might be able to reduce this.
>                  */
>                 if( len > 0 && src[srcIndex + len - 1] == '\0' ) {
>                     len--;
>                 }
>                 str = new String( src, srcIndex, len, "Cp866" );
>             }
>         } catch( UnsupportedEncodingException uee ) {
>             Log.printStackTrace( "smb exception", uee );
>         }
>         return str;
>     }
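
	The length handling in the non-Unicode branch above can be tried in isolation.
	The following is a standalone sketch, not JCIFS code: OemNameReader and
	readOemName are hypothetical names, and the charset is passed in as an argument
	instead of being hard-coded. It shows that a name with a trailing '\0' (as the
	code comment describes for NT) and one without (win98) decode to the same string.

```java
import java.nio.charset.Charset;

final class OemNameReader {
    // Decode a non-Unicode name, dropping one trailing NUL if the reported
    // length includes it.
    static String readOemName(byte[] src, int srcIndex, int len, Charset oem) {
        if (len > 0 && src[srcIndex + len - 1] == 0) {
            len--;
        }
        return new String(src, srcIndex, len, oem);
    }

    public static void main(String[] args) {
        Charset cp866 = Charset.forName("Cp866");
        byte[] withNul = { (byte) 0x80, (byte) 0x81, 0 };   // length counts the NUL
        byte[] withoutNul = { (byte) 0x80, (byte) 0x81 };   // length excludes it
        String a = readOemName(withNul, 0, withNul.length, cp866);
        String b = readOemName(withoutNul, 0, withoutNul.length, cp866);
        System.out.println(a.equals(b));
    }
}
```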
> 
> 		P.S. Sorry for my bad English