[jcifs] Re: smbclient/solaris/smbclient

Michael B. Allen mballen at erols.com
Fri May 3 17:11:48 EST 2002


On Fri, 03 May 2002 11:43:35 +0530
Ali Hasnain Baqri <ali.baqri at sun.com> wrote:
> >>
> >>Thanks for the reply. Indeed your support is faster than any one can 
> >>get. But then that's an engineer helping. Not a business man! :)
> >>
> >>UTF-8 modified? That's news!

Yes, at least for DataInputStream.readUTF and DataOutputStream.writeUTF
(which I suspect are used by InputStreamReader for UTF-8 somehow): a
two-byte length marker is prepended to the encoded string.
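
To make the length marker concrete, here is a minimal sketch that compares
what writeUTF emits with plain UTF-8 for the same string. The class and
helper names (ModifiedUtf8Demo, writeUtfBytes, hex) are made up for
illustration and are not part of jcifs.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of jcifs.
class ModifiedUtf8Demo {

    // Returns the exact bytes DataOutputStream.writeUTF produces for s.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "A\u0000";
        // 2-byte length (00 03), then 'A' (41), then NUL encoded as c0 80
        System.out.println("writeUTF: " + hex(writeUtfBytes(s)));
        // Plain UTF-8 has no length prefix and encodes NUL as a single 00
        System.out.println("UTF-8:    " + hex(s.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The two differences visible here (the length prefix and the two-byte NUL)
are why a stream of ordinary UTF-8 text cannot be handed to readUTF.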

> >>I did try your option of setting the LANG to en_US.utf8. The corruption 
> >>does change. But not in Solaris console. On KDE terminal
> >>it changes. Moreover, some characters values come out to be  
> >>65533,65389,65533,65416 in KDE. Are these Japanese?

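The 65533s are not. That may be an artifact of integer overflow (-3 as an
unsigned 16-bit value is 65533), but 65533 is also U+FFFD, the replacement
character a decoder emits for bytes it cannot map. 65389 (U+FF6D) and 65416
(U+FF88) do fall in the halfwidth katakana range, so those may be real.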
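
A rough way to check what those numbers are is to ask Java which Unicode
block each one belongs to. This is only a sketch; CharValueDemo and its
methods are illustrative names, and the three values are the ones quoted
above.

```java
// Hypothetical demo class for inspecting the reported char values.
class CharValueDemo {

    // Reinterprets an int as an unsigned 16-bit Java char value.
    static int asUnsigned16(int value) {
        return (char) value;
    }

    // Looks up the Unicode block that contains the given code point.
    static Character.UnicodeBlock block(int codePoint) {
        return Character.UnicodeBlock.of(codePoint);
    }

    public static void main(String[] args) {
        // -3 truncated to 16 bits is 0xFFFD, i.e. 65533
        System.out.println("(char) -3 = " + asUnsigned16(-3));

        int[] reported = {65533, 65389, 65416};
        for (int cp : reported) {
            System.out.printf("%d = U+%04X in block %s%n", cp, cp, block(cp));
        }
    }
}
```

65533 lands in the Specials block (it is U+FFFD, the replacement
character), while 65389 and 65416 land in Halfwidth and Fullwidth Forms,
whose U+FF61 to U+FF9F range holds the halfwidth katakana.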

> >>So the Java program using InputStreamReader(in,"Shift_JIS") reads stream 
> >>as ASCII in a Solaris console as none of the values exceed 255 but 
> >>something else in a KDE terminal where some values exceed 65000!! 

Are you sure it's Shift_JIS? Perhaps it's UTF-8. Pipe the output to
hexdump -c and post it.
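
If hexdump is not handy, the same check can be sketched in Java: dump the
raw bytes first, then decode them under each candidate charset and compare
the resulting code points. EncodingProbe is a made-up name, and the two
sample bytes are a hypothetical stand-in, not real smbclient output.
Shift_JIS is not guaranteed to be installed in every Java runtime, so the
sketch checks for it first.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper for comparing how candidate charsets decode raw bytes.
class EncodingProbe {

    // Formats bytes as space-separated lowercase hex pairs.
    static String hex(byte[] raw) {
        StringBuilder sb = new StringBuilder();
        for (byte b : raw) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decodes raw with the named charset and returns the resulting code points.
    // Bytes the charset cannot map come back as U+FFFD (65533).
    static int[] decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName)).codePoints().toArray();
    }

    public static void main(String[] args) {
        // Hypothetical sample bytes; substitute the bytes actually captured.
        byte[] sample = {(byte) 0xAD, (byte) 0xC8};
        System.out.println("raw: " + hex(sample));

        for (String name : new String[] {"Shift_JIS", "UTF-8", "ISO-8859-1"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            System.out.println(name + ": " + Arrays.toString(decode(sample, name)));
        }
    }
}
```

Whichever charset yields sensible code points with no 65533 entries is the
likeliest candidate for what the program is really emitting.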

> >>However I wonder if the series of number printed by the program in KDE 
> >>is correct either as there are very few numbers in the list that exceed 
> >>65000. I guess that names contain more characters.

You must consider that the terminal must convert the output to an encoding
suitable for display. If smbclient is really outputting Shift_JIS then the
locale of the terminal must be changed to accommodate that. I don't know
how you would do that with a KDE terminal but you might try xterm -u8. In
other words, it might all be working, but you don't know it because you
cannot display it. Even if you get the encoding right you still need a
Japanese font!

> >>Also DataInpuStream.getUTF did not work. It threw EOFException. It may 
> >>mean that as it read the bytes as UTF chars, the last few bytes did not 
> >>form a proper UTF char.

You definitely don't want to use readUTF. It expects a two-byte length
followed by "modified" UTF-8, not plain text. See the API docs for
DataInput.readUTF.
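
As an illustration of how the EOFException can arise, this sketch feeds
readUTF plain UTF-8 text and then a properly framed buffer. ReadUtfDemo and
tryReadUtf are hypothetical names, and "Hello" is just a stand-in for the
real data.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of readUTF on framed versus unframed input.
class ReadUtfDemo {

    // Calls DataInputStream.readUTF on raw and reports what happened.
    static String tryReadUtf(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return "ok: " + in.readUTF();
        } catch (EOFException e) {
            return "EOFException";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Plain text: readUTF takes 'H','e' (0x48 0x65) as a length of 18533
        // and then runs out of input while trying to read that many bytes.
        byte[] plain = "Hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(tryReadUtf(plain));   // EOFException

        // Framed the way writeUTF would: 2-byte length 5, then the bytes.
        byte[] framed = {0, 5, 'H', 'e', 'l', 'l', 'o'};
        System.out.println(tryReadUtf(framed));  // ok: Hello
    }
}
```

So the exception need not mean the trailing bytes were a broken character;
the first two bytes of ordinary text are simply being misread as a length.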

Mike

-- 
May The Source be with you.




