Character set and client code page - once again.
Superuser
opole at pios.gov.pl
Wed Jan 5 15:03:26 GMT 2000
Hello!
Thank you for the fast reply. Below is a more detailed
description of the problems with character set and client
code page mappings in samba ver. 2.0.6.
First of all, some initial information about settings and
systems. All Windows clients (3.11, 9x) use CP852. Samba is
installed on SCO OpenServer 5.0.5, which uses the ISO8859-2
character set. The smb.conf file contains these two lines:
character set = ISO8859-2
client code page = 852
With these settings all clients are able to copy files with
CP852 characters in their names to the samba server.
Filenames are converted "on the fly" to their ISO8859-2
equivalents and can easily be manipulated from the server
console or terminals. The Windows clients still see them as
if they had CP852 characters - everything works as expected.
But what about communication in the opposite direction - from
the server to a Windows client?
Some of the clients' shares contain CP852 characters in their
names. I will use an example from my network. One of the
Win95 clients, named logos, has a share named "Udział Samby"
(it means "Samba's share" in Polish) - the last letter of the
first word has code 0x88 in CP852. Its equivalent in the
ISO8859-2 character set has code 0xb3 and I'm using it now.
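The two byte values can be checked with a short sketch. It assumes Python 3 and its standard cp852 and iso8859_2 codecs; the helper name cp852_to_iso8859_2 is made up for this illustration and is not part of samba.

```python
def cp852_to_iso8859_2(raw: bytes) -> bytes:
    """Re-encode a CP852 byte string as ISO8859-2 (hypothetical helper)."""
    return raw.decode("cp852").encode("iso8859_2")


share = "Udział Samby"
as_cp852 = share.encode("cp852")
as_iso = share.encode("iso8859_2")

# Index 5 is the last letter of the first word.
print(hex(as_cp852[5]), hex(as_iso[5]))  # 0x88 0xb3

# Converting the CP852 form yields the ISO8859-2 form.
print(cp852_to_iso8859_2(as_cp852) == as_iso)  # True
```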
Let's look at logos shares:
./smbclient -L logos -U%
added interface ip=194.181.130.209 bcast=194.181.130.255 nmask=255.255.255.0
        Sharename      Type      Comment
        ---------      ----      -------
        TMP            Disk
        UDZIAŁ SAMBY   Disk
        IPC$           IPC       Remote Inter Process Communication

        Server               Comment
        ---------            -------

        Workgroup            Master
        ---------            -------
The result is correctly displayed in the ISO8859-2 character set.
Now we can connect to this share using the following command
on the server:
./smbclient //logos/"UDZIAŁ SAMBY" -U%
The share name was typed in ISO8859-2 (only one letter is
affected here). Let's try to execute the dir command:
smb: \> dir
added interface ip=194.181.130.209 bcast=194.181.130.255 nmask=255.255.255.0
. D 0 Thu Jul 8 15:10:38 1999
.. D 0 Thu Jul 8 15:10:38 1999
Katalog udziau D 0 Mon Dec 6 15:49:08 1999
Mae wtpliwoÂci.doc A 4608 Mon Dec 6 12:03:06 1999
archiw D 0 Mon Dec 6 12:04:34 1999
biera_95.xls A 22800 Mon Dec 6 15:59:38 1999
51345 blocks of size 4096. 8634 blocks available
The result is incorrect! It should be as below:
added interface ip=194.181.130.209 bcast=194.181.130.255 nmask=255.255.255.0
. D 0 Thu Jul 8 15:10:38 1999
.. D 0 Thu Jul 8 15:10:38 1999
Katalog udziału D 0 Mon Dec 6 15:49:08 1999
Małe wątpliwości.doc A 4608 Mon Dec 6 12:03:06 1999
archiw D 0 Mon Dec 6 12:04:34 1999
biera_95.xls A 22800 Mon Dec 6 15:59:38 1999
51345 blocks of size 4096. 8634 blocks available
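The difference between the two listings is consistent with CP852 bytes reaching an ISO8859-2 terminal untranslated. Below is a rough sketch of that effect, assuming Python 3 and its standard codecs. The function name is hypothetical, and dropping the bytes in the 0x80-0x9f range (control codes in ISO8859-2) is only an assumption about what a terminal might do with them.

```python
def as_seen_on_iso_terminal(name: str) -> str:
    """Show a name sent as CP852 bytes but displayed as ISO8859-2."""
    misread = name.encode("cp852").decode("iso8859_2")
    # Assumption: the terminal shows nothing for C1 control codes
    # (0x80-0x9f), so they are dropped here.
    return "".join(ch for ch in misread if not 0x80 <= ord(ch) <= 0x9F)


print(as_seen_on_iso_terminal("Katalog udziału"))  # Katalog udziau
```

This reproduces the first garbled directory name from the incorrect listing; plain ASCII names such as "archiw" pass through unchanged.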
By the way: samba completely ignores my LANG=pl_PL.ISO8859-2
setting and displays month and weekday names in English. It
doesn't bother me, but...
There is no way to cd to the first directory or to get
its listing:
smb: \> cd "Katalog udziału"
cd \Katalog udziału\: ERRDOS - ERRbadpath (Directory invalid.)
The command was typed using the ISO8859-2 character set,
just like the first one, when we were connecting to the
client. And one more remark: smbclient has a very bad
feature - it continues to execute the following commands
after one of them has failed. For example this command:
./smbclient //logos/"UDZIAŁ SAMBY" -c "cd \"Katalog udziału\";del *"
will remove all files (but not directories) from the top
level of the share "UDZIAŁ SAMBY", because it is not able to
cd to "Katalog udziału". Funny, isn't it?
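One way to guard against this in a script is to try the cd on its own first and to send the destructive command only if no error was reported. The sketch below assumes Python 3.7 or later; the function names are hypothetical, and treating "ERRDOS" in the output as a failure is a heuristic based on the error message shown above, not documented smbclient behaviour.

```python
import subprocess


def run_smbclient(service: str, commands: str) -> str:
    """Run smbclient with -c and return everything it printed."""
    result = subprocess.run(
        ["smbclient", service, "-U%", "-c", commands],
        capture_output=True, text=True, errors="replace",
    )
    return result.stdout + result.stderr


def run_in_directory(service, directory, action, run=run_smbclient):
    """Run `action` inside `directory`, but only if the cd succeeds."""
    cd = 'cd "{}"'.format(directory)
    probe = run(service, cd)
    # Heuristic: the failed cd shown above printed an ERRDOS message.
    if "ERRDOS" in probe:
        raise RuntimeError("cd failed, not running: " + action)
    return run(service, cd + ";" + action)
```

A call such as run_in_directory('//logos/UDZIAŁ SAMBY', 'Katalog udziału', 'del *') would then raise an error instead of deleting files from the top level of the share.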
The only way to manipulate the PCs' shares is to comment out
the character set and client code page lines in smb.conf.
After restarting the server it is possible to do what is
needed on Windows clients using smbclient and the CP852
character set on the console, but then none of these clients
is able to copy files or create directories containing CP852
characters. More precisely: it is possible to copy such a
file, but only once. If one then tries to copy it to the
local disk, Windows reports that copying is impossible
because the named file cannot be found. A newly created
directory can't be accessed either - "Folder does not exist".
With these lines removed (commented out) from smb.conf, a
Windows client is not able to access any directory with CP852
or ISO8859-2 characters in its name, although both kinds are
visible. After restoring these lines, directories with CP852
characters vanish. All of the above also applies to CP852
characters in the name of a samba share - it is visible, but
cannot be accessed.
In previous samba versions, with character set and client
code page set correctly in smb.conf, manipulating clients'
shares was possible only when I used ISO8859-2 characters to
connect to the client's share, and then CP852 characters to
copy, remove etc. I used appropriate scripts for these
operations. I have to apologise for the error in my first
letter, where I attributed this behaviour to the current
2.0.6 version. I spent all day yesterday testing samba and I
think that this time I haven't made any mistake.
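The scripts used with the previous versions are not shown here. As an illustration only, the mixed-encoding idea could be sketched as below, assuming Python 3 on a POSIX system: the share name goes out in ISO8859-2 and the commands in CP852, both as raw bytes. The function name is hypothetical.

```python
from typing import List


def mixed_encoding_argv(service: str, commands: str) -> List[bytes]:
    """Build smbclient arguments with two different encodings."""
    return [
        b"smbclient",
        service.encode("iso8859_2"),  # share name as the server expects it
        b"-U%",
        b"-c",
        commands.encode("cp852"),     # commands as the client expects them
    ]


argv = mixed_encoding_argv('//logos/UDZIAŁ SAMBY', 'cd "Katalog udziału";dir')
print(argv)
```

On a POSIX system such a list of bytes can be passed directly to subprocess.run.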
Kind regards
K. Hrebeniuk
PS.
Interestingly, with character set and client code page set in
smb.conf, it is impossible to manipulate files and
directories on a samba server share when connecting to it
with smbclient locally (server and smbclient on the same
machine). It doesn't matter whether the names contain CP852
or ISO8859-2 characters: only ASCII characters work. After
removing these lines from smb.conf all files can be accessed
- one should use CP852 characters if the filename contains
them, or ISO8859-2 if needed.
I think the best solution would be to accept that if
somebody places character set = ISO8859-2 in the
configuration file, then he expects that all operations on
files and directories can be carried out using this
character set. Mixing CP852 in the results of a directory
listing with ISO8859-2 in commands is a horrible idea.
-----------------------------------------------
Krzysztof Hrebeniuk
Wojewodzki Inspektorat Ochrony Srodowiska
ul. Nysy Luzyckiej 42
45-035 Opole
tel. (+48)(+77) 454-22-89, 453-99-06; fax: 453-00-69