Character set and client code page - once again.

Superuser opole at
Wed Jan 5 15:03:26 GMT 2000


    Thank you for the fast reply. Below is a more detailed
    description of the problems with the character set and client
    code page mappings in samba ver. 2.0.6.

    First of all, some initial information about the settings and
    systems. All Windows clients (3.11, 9x) use CP852. Samba is
    installed on SCO OpenServer 5.0.5, which uses the ISO8859-2
    character set. The smb.conf file contains these two lines:

                      character set = ISO8859-2
                      client code page = 852

    With these settings all clients are able to copy files with
    CP852 characters in their names to the samba server.
    Filenames are converted "on the fly" to their ISO8859-2
    equivalents and can easily be manipulated from the server
    console or terminals. To the Windows clients they still appear
    to have CP852 characters - everything works as expected. But
    what about communication in the opposite direction - from the
    server to a Windows client?
    Some of the clients' shares contain CP852 characters in their
    names. I will use an example from my network. One of the Win95
    clients, named logos, has a share named "Udział Samby" (it
    means "Samba's share" in Polish) - the last letter of the
    first word has code number 0x88 in CP852. Its equivalent in
    the ISO8859-2 character set has number 0xb3, and that is what
    I'm using now. Let's look at the shares on logos:

     ./smbclient -L logos -U%

    added interface ip= bcast= nmask=

         Sharename      Type      Comment
         ---------      ----      -------
         TMP            Disk
         UDZIAŁ SAMBY   Disk
         IPC$           IPC       Remote Inter Process Communication

         Server               Comment
         ---------            -------

         Workgroup            Master
         ---------            -------
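
    The code numbers quoted above for the letter "ł" can be
    checked with Python's standard codecs. This is only a minimal
    sketch, assuming a Python 3 interpreter is at hand; the helper
    name code_numbers is made up for illustration and has nothing
    to do with samba itself.

```python
import unicodedata


def code_numbers(ch):
    """Return (CP852 byte, ISO8859-2 byte) for a single character."""
    return ch.encode("cp852")[0], ch.encode("iso8859_2")[0]


# "\u0142" is the small l with stroke, "\u0141" its capital form
# (the share name is listed in capitals).
for ch in "\u0142\u0141":
    cp852, iso = code_numbers(ch)
    print(f"{unicodedata.name(ch)}: CP852 0x{cp852:02x}, ISO8859-2 0x{iso:02x}")
```

    The same character has a different byte value in each code
    page, so any name that crosses between the client and the
    server has to be translated.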

    The result is correctly displayed in the ISO8859-2 character
    set. Now we can connect to this share using the following
    command on the server:

     ./smbclient //logos/"UDZIAŁ SAMBY" -U%

    The share's name was written in ISO8859-2 (here only one
    letter is affected). Let's try to execute the dir command:

    smb: \> dir
    added interface ip= bcast= nmask=
      .                                   D        0  Thu Jul  8 15:10:38 1999
      ..                                  D        0  Thu Jul  8 15:10:38 1999
      Katalog udziau                     D        0  Mon Dec  6 15:49:08 1999
      Mae wtpliwoÂci.doc                A     4608  Mon Dec  6 12:03:06 1999
      archiw                              D        0  Mon Dec  6 12:04:34 1999
      biera_95.xls                        A    22800  Mon Dec  6 15:59:38 1999

              51345 blocks of size 4096. 8634 blocks available

    The result is incorrect! It should be as below:
    added interface ip= bcast= nmask=
      .                                   D        0  Thu Jul  8 15:10:38 1999
      ..                                  D        0  Thu Jul  8 15:10:38 1999
      Katalog udziału                     D        0  Mon Dec  6 15:49:08 1999
      Małe wątpliwości.doc                A     4608  Mon Dec  6 12:03:06 1999
      archiw                              D        0  Mon Dec  6 12:04:34 1999
      biera_95.xls                        A    22800  Mon Dec  6 15:59:38 1999

              51345 blocks of size 4096. 8634 blocks available
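
    The kind of damage seen in the incorrect listing can be
    reproduced by taking a name encoded in CP852 and reading its
    bytes as ISO8859-2. This is only a sketch of the suspected
    mechanism, assuming Python 3; seen_on_iso_terminal is a
    hypothetical helper, and the sketch does not claim to show
    what smbclient does internally.

```python
def seen_on_iso_terminal(name):
    """Encode a name as CP852 and misread the bytes as ISO8859-2."""
    return name.encode("cp852").decode("iso8859_2")


# "\u0142" is the Polish letter l with stroke.
garbled = seen_on_iso_terminal("Katalog udzia\u0142u")

# ascii() makes the non-printable character visible.
print(ascii(garbled))  # prints 'Katalog udzia\x88u'
```

    Byte 0x88 is a control code in ISO8859-2, so a terminal shows
    nothing in that position, which would be consistent with the
    letter missing from the directory name in the listing.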

    By the way: samba completely ignores my LANG=pl_PL.ISO8859-2
    setting and displays the names of months and weekdays in
    English. It doesn't bother me, but...
    We have no means to cd to the first directory or to obtain
    its listing:

    smb: \> cd "Katalog udziału"

    cd \Katalog udziału\: ERRDOS - ERRbadpath (Directory invalid.)

    The command was written using the ISO8859-2 character set -
    just like the first one, when we were connecting to the
    client. And one more remark: smbclient has a very bad
    feature - it continues to execute the next commands after
    failing to complete the first one. For example, this command:

     ./smbclient //logos/"UDZIAŁ SAMBY" -c "cd \"Katalog udziału\";del *"

    will remove all files (but not directories) from the top
    level of the share "UDZIAŁ SAMBY", because it is not able to
    cd to "Katalog udziału" and del * then runs where it is.
    Funny, isn't it?
    The only way to manipulate the PCs' shares is to comment out
    the lines concerning character set and client code page in
    smb.conf. After restarting the server it is possible to do
    what is needed on the Windows clients using smbclient and the
    CP852 character set on the console, but none of these clients
    will be able to copy files or create directories containing
    CP852 characters. More precisely: it is possible to copy such
    a file to the server - but only once. If one tries to copy it
    back to the local disk, Windows reports that copying is
    impossible because the named file cannot be found. A newly
    created directory can't be accessed either - "Folder does not
    exist". With these lines removed (commented out) from
    smb.conf, a Windows client will not be able to access any
    directory with CP852 or ISO8859-2 characters - although both
    are visible. After restoring these lines, directories with
    CP852 characters will vanish. All of the above also applies
    to CP852 characters in the name of a samba share - the share
    is visible, but cannot be accessed.

    In previous samba versions, with character set and client
    code page set correctly in smb.conf, manipulating clients'
    shares was possible only when I used ISO8859-2 characters to
    connect to the client's share, and then CP852 characters to
    copy, remove etc. I was using appropriate scripts for these
    operations. I have to apologise for the error in my first
    letter, where I attributed this behaviour to the current
    2.0.6 version. I spent all day yesterday testing samba and I
    think that this time I haven't made any mistake.

    Kind regards

		    K. Hrebeniuk

    Interestingly, with character set and client code page set in
    smb.conf, it is impossible to manipulate files and
    directories on a samba server share when connecting to it
    with smbclient locally (server and smbclient on the same
    machine). It doesn't matter whether the names contain CP852
    or ISO8859-2 characters. Only ASCII characters are allowed.
    After removing these lines from smb.conf all files can be
    accessed - one should use CP852 characters if the filename
    contains them, or ISO8859-2 if

    I think the best solution would be to accept that if
    somebody places character set = ISO8859-2 in the
    configuration file, then he expects that all operations on
    files and directories can be carried out using this
    character set - mixing CP852 in the results of a directory
    listing with ISO8859-2 in commands is a horrible idea.
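
    What I have in mind can be sketched as a pair of symmetric
    translations: names typed on the server go out as CP852, and
    names coming back from the client are shown as ISO8859-2.
    This is only an illustration in Python 3, assuming every
    character in the name exists in both code pages (true for
    the Polish letters); unix_to_client and client_to_unix are
    hypothetical names, not samba functions.

```python
def unix_to_client(raw):
    """Translate ISO8859-2 bytes (server side) to CP852 bytes (client side)."""
    return raw.decode("iso8859_2").encode("cp852")


def client_to_unix(raw):
    """Translate CP852 bytes (client side) to ISO8859-2 bytes (server side)."""
    return raw.decode("cp852").encode("iso8859_2")


# "Katalog udzialu" with l-stroke, as typed on an ISO8859-2 console (0xb3).
typed = b"Katalog udzia\xb3u"

sent = unix_to_client(typed)      # what the Windows client should receive
shown = client_to_unix(sent)      # what a listing should show on the server

print(sent)   # prints b'Katalog udzia\x88u'
print(shown)  # prints b'Katalog udzia\xb3u'
```

    With both directions applied consistently, the name used in
    a command and the name shown in a listing are the same bytes
    on the server side.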

Krzysztof Hrebeniuk
Wojewodzki Inspektorat Ochrony Srodowiska
ul. Nysy Luzyckiej 42
45-035 Opole
tel. (+48)(+77) 454-22-89, 453-99-06; fax: 453-00-69

More information about the samba-technical mailing list