International charset in path/file names

Matthew Geier matthew at arts.usyd.edu.au
Tue Mar 27 12:25:16 GMT 2001


Kenichi Okuyama wrote:
> 
> >>>>> "STM" == Samba-JP TAKAHASHI Motonobu <monyo at samba.gr.jp> writes:
> >> client code page = 932
> >>
> >> is always required. Also one of :
> >>
> >> coding system = HEX
> >> coding system = EUC
> >> coding system = SJIS
> >> coding system = CAP
> >>
> >> are required too, for we have many way to describe same character.
> >>
> >> And even though you did this, it only means we now supports
> >> 'Japanese & English'.

 Our school teaches about 15 languages; just about every major language
is covered, modern and ancient! A bilingual server isn't on - the users
of the left-out languages would scream!

 In fact there is a grouping dedicated to Asian languages - Japanese,
Korean and Chinese in various forms are all taught. Fortunately very few
people actually use these languages to name files - but if they want to,
we don't stop them.

> STM> As far as I know, CAP and HEX works regardless of "client code page".
> STM> So you may be happy to set "coding system = HEX or CAP" in your smb.conf.
> 
> Ah... But. If you did this, then it means character code will be
> shared between 'client code pages'.
> 
> I mean, if you had 2 byte character code of 0x8765 which points one
> character(with code 0x8765) on Japanese and totally different
> character(with code 0x8765) on Chinese, and if you shared same samba
> server between, you'll find filename of confusion.

 It isn't actually a problem for us - as long as what the user puts in
is what they get back! Anything common is likely to be in English and
the Latin character set. If a Chinese teacher can save files in their
home directory on the server with Chinese names and get them back, and a
Japanese teacher can do the same, we are happy. We can happily live with
incorrect (or, I guess, stupid-sounding 'gibberish') filenames if someone
uses a 16-bit character set name on a file in a common area. We can ask
that they restrict themselves to Latin characters in shared areas. We
have more Macs than PCs, so we have a much bigger problem with sharing
files between machines.
(Now if Netatalk and Samba could use the same code page maps and
encoding... :-)
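
 To make that trade-off concrete, here is a rough Python sketch (not
Samba code). The helper names cap_style_encode and cap_style_decode are
made up for illustration, and the ':xx' escaping is only my assumption
of roughly what "coding system = CAP" or HEX does with high bytes; the
real rules may well differ. It shows the name coming back intact to the
code page 932 client that saved it, while a client on a Chinese code
page (GBK is assumed here) reads the same stored bytes as different
characters.

```python
def cap_style_encode(raw: bytes) -> str:
    """Write bytes >= 0x80 as ':xx'; keep 7-bit bytes as ASCII.

    Hypothetical helper: an assumption about CAP/HEX-style escaping,
    not taken from the Samba source.
    """
    return "".join(chr(b) if b < 0x80 else ":%02x" % b for b in raw)


def cap_style_decode(name: str) -> bytes:
    """Invert cap_style_encode.

    Assumes ':' only ever appears as an escape prefix, which holds for
    names produced by cap_style_encode from Shift-JIS input without a
    literal ':' in it.
    """
    out = bytearray()
    i = 0
    while i < len(name):
        if name[i] == ":":
            out.append(int(name[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(name[i]))
            i += 1
    return bytes(out)


# Bytes a Japanese client (code page 932) would send for this name.
raw = "日本語.txt".encode("cp932")

# The server stores only the escaped bytes; it never interprets them.
stored = cap_style_encode(raw)

# The same client decodes the stored bytes and gets its own name back.
round_trip = cap_style_decode(stored).decode("cp932")

# A client on a Chinese code page decodes the very same bytes differently.
seen_by_gbk_client = cap_style_decode(stored).decode("gbk", errors="replace")

print(stored)
print(round_trip)
print(seen_by_gbk_client)
```

 The stored name is plain ASCII, so the server does not need to know
which code page produced it. That is exactly why the owner always gets
their name back, and also why nobody else is guaranteed to.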

