[Samba] samba 3.0.22 and hebrew file names

Tue Jun 13 15:05:39 GMT 2006

On Tue, 13 Jun 2006, Shlomi . wrote:
> We had an old Sun server running Solaris 2.6 with Samba 2.2.2.
> Now we have upgraded it to Solaris 9 with Samba 3.0.22, but we have one
> problem.  The file names that are in Hebrew show up on the Windows clients
> as lines or squares.
>
> On the old samba server there were no char settings, on the new samba server
> I set the char to 862
> and the display and unix chars to ISO8859-8 and UTF-8 - it didn't help.
>
> I guess that Samba doesn't know where to get the CP862 file.

I researched internationalization with Samba a while back, and
these are the conclusions I came to:

1.  Any given installation of Samba 3 uses three different
     character sets:  (1) the character set of filenames on disk,
     (2) Unicode for speaking to (Windows) clients that support
     Unicode in CIFS, and (3) a "legacy" codepage for clients
     that use an older version of CIFS and don't support Unicode.

2.  Samba 3 converts freely between these different character sets
     at runtime as needed.

3.  Samba 2 doesn't support Unicode at all (or at least not for
     filenames), so its on-disk character set is always the same
     as the character set it uses when communicating with clients,
     and it does no conversion.
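
To make the three character sets concrete, here is a small Python
sketch.  It is only an illustration:  Python's codecs stand in for
Samba's own conversion layer, and the choice of UTF-8 on disk and
CP862 as the legacy codepage is an assumption for this example,
not something Samba picks for you.

```python
# Illustration only: Python codecs standing in for Samba's charset conversion.
name = "\u05e9\u05dc\u05d5\u05dd.txt"  # a file name with a Hebrew base name

on_disk = name.encode("utf-8")      # (1) assumed on-disk character set
on_wire = name.encode("utf-16-le")  # (2) Unicode as sent to Unicode-aware clients
legacy = name.encode("cp862")       # (3) assumed legacy DOS codepage

# The bytes differ in every case, but all three decode to the same name.
assert on_disk.decode("utf-8") == name
assert on_wire.decode("utf-16-le") == name
assert legacy.decode("cp862") == name

for label, raw in [("utf-8", on_disk), ("utf-16-le", on_wire), ("cp862", legacy)]:
    print(f"{label:10} {len(raw):2d} bytes  {raw.hex(' ')}")
```

The point is that the same name is a different byte string in each
character set, so the server has to know which one it is looking at.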

Based on these three facts (if I'm remembering them right),
I would guess what has happened is this:  under Samba 2, the
filenames were written to disk in whatever codepage the clients
were using (the Hebrew one, 862, I guess), with no conversion.
But then you upgraded to Samba 3 and are using the same set of
files.  Now Samba 3 is expecting the filenames to be in its
configured on-disk character set (ISO8859-8 or UTF-8 in your
tests), but the files are still codepage 862.
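
A quick way to see why that mismatch produces garbage, again as a
Python sketch rather than anything Samba actually runs:  take the
bytes a CP862 name would have on disk (assuming that really is what
the old server wrote) and try to read them as UTF-8.

```python
# Sketch: what happens when CP862 bytes are read as if they were UTF-8.
name = "\u05e9\u05dc\u05d5\u05dd"   # a Hebrew word
old_bytes = name.encode("cp862")    # assumed bytes left on disk by the old setup

try:
    old_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoding with the wrong charset yields replacement characters, not Hebrew.
garbled = old_bytes.decode("utf-8", errors="replace")
# Decoding with the charset the bytes were written in recovers the name.
recovered = old_bytes.decode("cp862")

print(valid_utf8, garbled != name, recovered == name)
```

The lines or squares on the Windows side are presumably the client's
rendering of the same kind of failed conversion.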

The best solution is probably to set Samba to use UTF-8
on the disk ("unix charset = UTF-8" in smb.conf), then rename
all your files so their names are UTF-8.  There are scripts
out there that can do this (convmv is one, I believe).  Samba
should automatically speak Unicode to newer Windows clients, so
as long as you work out the on-disk character set and have that
set up properly, everything should be good.
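
If you would rather roll your own than hunt for a script, the rename
could look roughly like the Python sketch below.  The names LEGACY,
TARGET, convert_name and rename_tree are made up for this example,
and it assumes the existing names really are CP862.  It skips names
that already read as valid UTF-8 (a heuristic, not a guarantee), and
it only reports what it would do unless dry_run is turned off.  Try
it on a copy of the data first.

```python
import os

LEGACY = "cp862"   # assumed encoding of the existing file names
TARGET = "utf-8"   # encoding Samba 3 will be told to expect on disk

def convert_name(raw: bytes) -> bytes:
    """Re-encode one name from LEGACY to TARGET; leave valid UTF-8 (incl. ASCII) alone."""
    try:
        raw.decode(TARGET)
        return raw
    except UnicodeDecodeError:
        return raw.decode(LEGACY).encode(TARGET)

def rename_tree(root: bytes, dry_run: bool = True) -> list[tuple[bytes, bytes]]:
    """Walk root bottom-up and rename entries whose names need converting."""
    changes = []
    # Bottom-up, so a directory is renamed only after its contents are done.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = convert_name(name)
            if new == name:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new)
            if os.path.lexists(dst):
                continue  # never overwrite an existing entry
            changes.append((src, dst))
            if not dry_run:
                os.rename(src, dst)
    return changes
```

Calling rename_tree(b"/path/to/share") lists the planned renames;
passing dry_run=False performs them.  The path is a placeholder.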

Once you have Samba set up to do Unicode on disk, you should
be able to connect from a Windows client and create some files
using Hebrew characters and they should show up properly.
That would be a good test and would help prove that all you
need to do is get the existing filenames into the right format.

One more thing:  since (as I understand it) Samba can also
speak a fixed 8-bit codepage to legacy clients that do not
support Unicode, you might want to set that codepage to 862
in the configuration file.  I believe the directive that
controls what Samba speaks on the wire to those clients is
"dos charset", so that would be "dos charset = CP862".
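
Why the legacy codepage matters can be shown with one more Python
sketch.  The helper to_legacy is hypothetical, and replacing
unmappable characters with "?" is just Python's behaviour here;
what Samba itself does with characters that don't fit is not
something this shows.  CP850 is used for comparison because I
believe it is the usual default legacy codepage.

```python
def to_legacy(name: str, codepage: str = "cp862") -> bytes:
    """Encode a name for a non-Unicode client; unmappable characters become '?'."""
    return name.encode(codepage, errors="replace")

hebrew = "\u05e9\u05dc\u05d5\u05dd"

# With the Hebrew codepage the name survives the round trip.
print(to_legacy(hebrew).decode("cp862") == hebrew)

# With a Western codepage there is nowhere to put the Hebrew letters.
print(to_legacy(hebrew, "cp850"))
```

So a legacy client can only see Hebrew names if the 8-bit codepage
on the wire actually contains the Hebrew letters.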

   - Logan