Ali Hasnain Baqri ali.baqri at
Fri May 3 05:28:32 GMT 2002

Hi Allen

Thanks for the reply. Your support is faster than anyone could ask for.
But then that's an engineer helping, not a businessman! :)

Modified UTF-8? That's news!
I did try your suggestion of setting LANG to en_US.utf8. The corruption
does change, but not in the Solaris console; it changes only in a KDE
terminal. Moreover, some character values come out as
65533, 65389, 65533, 65416 in KDE. Are these Japanese?
So the Java program using InputStreamReader(in,"Shift_JIS") seems to read
the stream as ASCII in a Solaris console, since none of the values exceeds
255, but as something else in a KDE terminal, where some values exceed 65000!
However, I doubt that the series of numbers printed by the program in KDE
is correct either, as very few numbers in the list exceed 65000. I would
expect the names to contain more such characters.
Also, DataInputStream.readUTF did not work. It threw an EOFException. That
may mean that, as it read the bytes as UTF characters, the last few bytes
did not form a complete UTF character.
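
For reference, a minimal sketch of another way readUTF can end in an EOFException, assuming the input was not produced by DataOutputStream.writeUTF. readUTF first reads a two-byte length and then expects exactly that many bytes of modified UTF-8, so on arbitrary data the first two bytes are taken as a length and the stream usually ends early. The class and method names (ReadUtfSketch, writeUtfBytes, readUtfOrNull) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ReadUtfSketch {

    // Bytes produced by writeUTF: a two-byte length, then modified UTF-8.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Returns the decoded string, or null if readUTF ran out of input.
    static String readUtfOrNull(byte[] raw) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(raw))) {
            return in.readUTF();
        } catch (EOFException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Round trip works because the length prefix is present.
        System.out.println(readUtfOrNull(writeUtfBytes("abc")));

        // Plain text "Hello" has no prefix: 0x48 0x65 is read as the
        // length 18533, but only three bytes follow.
        byte[] plain = {0x48, 0x65, 0x6C, 0x6C, 0x6F};
        System.out.println(readUtfOrNull(plain) == null);

        // "Modified": U+0000 is written as the two bytes C0 80.
        for (byte b : writeUtfBytes("\0")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

If this is what happened here, the EOFException says little about the last few bytes of the data; readUTF only suits streams written with writeUTF.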

Any comments?

Allen, Michael B (RSCH) wrote:

>>-----Original Message-----
>>From:	Allen, Michael B (RSCH) 
>>	InputStreamReader created for a certain encoding will read the stream 
>>	and give out a stream of correct characters. I have even tried reading 
>>	all the bytes into a ByteArrayOutputStream and creating a String with 
>>	all the encoding types. It still does not work. Funnily, none of the 
>>	characters in the String so created is above 255! They are all ASCII! 
>>	Some should be Japanese!
>>	How do you know it's not working, rather than the display you're on being unable to handle
>>	the output? Try running a UTF-8 XTerm like:
>>	$ xterm -u8
>>	Then run your program in a UTF-8 locale like:
>>	xterm$ LANG=en_US.utf8 java MyProg blah blah blah
>>	Does the "corruption" change? Be careful NOT to do stuff like:
>		Oops! I mean the other way around. You DO want to do it like the example below and NOT like the
>		example after it. In other words, you do not want to use InputStreamReader. Actually, I just found
>		out that the "UTF-8" used by Java and DataInput is modified!
>		Mike
>>	char[] d = (new String( b, 0, b.length, "UTF8" )).toCharArray();
>>	This just doesn't work and I don't know why exactly. You have to do more like:
>>	BufferedReader in = new BufferedReader(
>>	                          new InputStreamReader( new FileInputStream( "testfile" )));
>>	char[] d = new char[4096]; 
>>	in.read( d, 0, 4096 );
>>	...etc. How they know it's UTF-8 is beyond me but this is how it works.
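
A minimal sketch of the check discussed in this thread, assuming the goal is to see which numeric values a given charset produces for the same bytes. An InputStreamReader built without a charset uses the platform default, which would explain output that follows LANG. The value 65533 is U+FFFD, the replacement character a decoder emits for bytes it cannot decode. 65389 and 65416 are U+FF6D and U+FF88, which lie in the halfwidth katakana range. The names CharsetProbe and decodeBytes are hypothetical, and the sample bytes are made up rather than taken from the data in question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CharsetProbe {

    // Decode raw bytes with an explicitly named charset and return each
    // resulting char as a number, so the terminal plays no part.
    static List<Integer> decodeBytes(byte[] raw, Charset cs) {
        List<Integer> values = new ArrayList<>();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(raw), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                values.add(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // UTF-8 encoding of the single character U+65E5.
        byte[] sample = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5};

        // Matching charset: one value above 255.
        System.out.println(decodeBytes(sample, StandardCharsets.UTF_8));

        // Single-byte charset: one value per byte, none above 255.
        System.out.println(decodeBytes(sample, StandardCharsets.ISO_8859_1));

        // A byte that is never valid in UTF-8 decodes to U+FFFD.
        System.out.println(decodeBytes(new byte[] {(byte) 0xFF},
                                       StandardCharsets.UTF_8));
    }
}
```

Passing Charset.forName("Shift_JIS") in place of the constants above should work the same way on a JVM that ships that charset. Many 65533 values in the result would suggest the bytes are not in the charset that was named.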

More information about the smb-clients mailing list