[Samba] Languages and encoding: file system and file contents

tlaronde at kergis.com tlaronde at kergis.com
Mon Feb 5 13:59:03 UTC 2024


On Mon, Feb 05, 2024 at 04:16:27PM +0300, Michael Tokarev wrote:
> 05.02.2024 15:18, Thierry LARONDE via samba :
> > I'm rather unclear about the way CIFS/Samba deal with languages,
> > encoding, and the, perhaps, encoding of pathnames vs the encoding of
> > files considered by MS Windows to be text file and hence with, perhaps
> > a language and an encoding.
> > 
> > So I will try to formulate questions about the elementary points, and
> > I will be grateful to the ones who can share lights about these:
> > 
> > "dos charset" and "unix charset" are global parameters. As far as I
> > understand the description, this:
> > 	a) fixes the _contents_ of files;
> 
> Absolutely not. Samba does not do anything with contents of the files,
> it treats all files as binary objects, not changing contents in any
> way.

Then I would suggest that the smb.conf(5) man page be clarified,
because:

dos charset (G)

           DOS SMB clients assume the server has the same charset as they do.
           This option specifies which charset Samba should use to talk to DOS
           clients.

           The default depends on which charsets you have installed.
           Samba tries to use charset 850 but falls back to ASCII in case it
           is not available. Run testparm(1) to check the default on
           your system.

neither this entry nor the one for "unix charset" says that the
parameter applies to pathnames only.

The other question, then, is: supposing that the Unix filesystem one
wants to share has plain C strings as pathnames (i.e. no encoding at
all; just a string of arbitrary bytes, NUL terminated), is there an
encoding like "POSIX" that can be specified in smb.conf so that Samba
translates this stream of bytes into, say, a UTF-8 string (encoding
each non-ASCII byte as the corresponding UTF-8 sequence)?
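
To make the question concrete, here is a minimal Python sketch of the
kind of mapping I have in mind. It assumes that "the corresponding
UTF-8 sequence" means reading each byte value N as the code point
U+00NN (the ISO-8859-1 view), which is only one possible
interpretation. The function name is hypothetical, and nothing here
describes what Samba actually does.

```python
def bytes_to_utf8(raw: bytes) -> bytes:
    """Map an arbitrary byte string to UTF-8, byte by byte.

    Assumption: byte value N stands for code point U+00NN, so ASCII
    bytes are unchanged and each byte >= 0x80 becomes a two-byte
    UTF-8 sequence.
    """
    return raw.decode("latin-1").encode("utf-8")


def utf8_to_bytes(encoded: bytes) -> bytes:
    """Inverse mapping: recover the original byte string."""
    return encoded.decode("utf-8").encode("latin-1")


name = b"caf\xe9.txt"            # pathname with one non-ASCII byte
converted = bytes_to_utf8(name)
print(converted)                 # b'caf\xc3\xa9.txt'
print(utf8_to_bytes(converted) == name)
```

Such a mapping never fails and is reversible, which is why it would
suit pathnames that carry no declared encoding.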

Or do the pathnames on Unix have to be compatible with the encoding
specified in smb.conf?

And what happens if, even though the pathnames on the Unix side have an
encoding compatible with what is declared in smb.conf, a name cannot
be encoded in the peer's encoding? (That is, when the equivalent of
iconv -f <unix charset> -t <dos charset> returns an error.)
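
The failure case can be reproduced outside Samba. The sketch below is
only an illustration with Python's own codecs, assuming UTF-8 on the
Unix side and code page 850 on the DOS side; the helper name is
hypothetical, and it says nothing about how Samba reacts in that
situation.

```python
def convertible(raw: bytes, src: str, dst: str) -> bool:
    """Return True if raw, read as src, can be re-encoded in dst."""
    try:
        raw.decode(src).encode(dst)
    except UnicodeError:
        # Either raw is not valid in src, or some character
        # has no representation in dst.
        return False
    return True


# "é" exists in code page 850, so this name converts.
print(convertible("café.txt".encode("utf-8"), "utf-8", "cp850"))

# The euro sign does not exist in code page 850.
print(convertible("prix_€.txt".encode("utf-8"), "utf-8", "cp850"))

# A lone 0xE9 byte is not valid UTF-8 in the first place.
print(convertible(b"caf\xe9.txt", "utf-8", "cp850"))
```

The second and third cases are the two distinct ways the conversion
can go wrong: a character missing from the target charset, and a
pathname that does not match the declared source charset at all.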

Thanks for the main answer (pathname and not content)!
-- 
        Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


