smbclient -M: charset conversion?

Tue Jun 3 19:00:54 GMT 2003

You may recall that a few weeks ago I mentioned and fixed a bug in
"libsmb/climessage.c".  In creating its SMBsend* messages, it was allowing
the endpoint names to be converted into arbitrary multibyte UNICODE
charsets when it should have been forcing a much more conservative
ASCII-like charset.

But this set a further nagging doubt in my mind: what about the text of
such messages themselves?  Similarly, what about our processing of that
text in "client/client.c" (i.e. in "smbclient -M")?

Is it legal to have UNICODE in the text of SMBsend* messages?

If it is legal, then haven't we got a potential problem with packet
size?  As I understand it, the packets-on-the-wire in SMBsend sequences
can carry no more than about 127 bytes of data.  But smbclient splits the
potentially large message (in client/client.c) *before* the charset
translation is done (in libsmb/climessage.c).

Now suppose a nice simple byte-per-character ASCII message gets translated
into a multibyte-per-character UNICODE charset?  Then the 127(-ish)-byte
ASCII chunks are going to become too big for the SMBsend packets, aren't
they?
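
To make the arithmetic concrete, here is a rough sketch in Python (not
Samba's C).  It assumes a hard limit of 127 bytes, which is only my
approximate figure from above, and it uses UTF-16LE as one example of a
multibyte charset.  The function names are invented for illustration and
do not correspond to anything in client/client.c or libsmb/climessage.c.

```python
# Assumed limit: the "about 127 bytes" figure quoted above, not a value
# checked against the SMB specification.
MAX_CHUNK_BYTES = 127


def split_then_convert(text, encoding, limit=MAX_CHUNK_BYTES):
    """Split text into pieces of `limit` characters, then encode each piece.

    This mimics the order described above: split first, convert second.
    """
    pieces = [text[i:i + limit] for i in range(0, len(text), limit)]
    return [piece.encode(encoding) for piece in pieces]


def oversized(encoded_pieces, limit=MAX_CHUNK_BYTES):
    """Return the byte lengths of any encoded pieces that exceed the limit."""
    return [len(p) for p in encoded_pieces if len(p) > limit]


message = "x" * 300  # a plain one-byte-per-character message

ascii_sizes = [len(p) for p in split_then_convert(message, "ascii")]
utf16_sizes = [len(p) for p in split_then_convert(message, "utf-16-le")]

print("ASCII piece sizes: ", ascii_sizes)
print("UTF-16 piece sizes:", utf16_sizes)
print("Over the limit:    ", oversized(split_then_convert(message, "utf-16-le")))
```

If the limit really is enforced on the converted bytes, every full-sized
piece doubles in this example, which is exactly the overflow I am worried
about.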

[ As you can tell, my knowledge of charsets is minimal!  Hopefully someone
out there who knows charsets will be able to confirm or refute my concerns
about this quickly. ]

-- 

:  David Lee                                I.T. Service          :
:  Systems Programmer                       Computer Centre       :
:                                           University of Durham  :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham                :
:  Phone: +44 191 334 2752                  U.K.                  :