Where to patch for toupper/tolower, locale
Deniz Akkus Kanca
deniz at arayan.com
Fri Nov 16 22:23:02 GMT 2001
I need to formulate a patch for an issue seen in Linux samba installations
where system locale is set to Turkish.
Please note that this does not have anything to do with Turkish character
support *in* samba. It is simply about how charset filenames are formed: a
parameter (iso8859-9) is taken from smb.conf and the associated filename
(unicode_map.ISO8859-9) is derived from it. This works if the system locale
is not Turkish, but yields the filename unicode_map.İSO8859-9 if the system
locale is Turkish (which is correct upper casing for Turkish, but the file
does not have that name).
The problem arises from the different lower/upper case mapping for i in
Turkish, specifically i->İ (idot) and ı->I (dotless i). This feature of the
alphabet is also shared by other Turkic languages, Azeri, Uzbek etc.
When the system locale is not set to Turkish, smb.conf is read in, the
character set field is recognized, and the correct charmap file name is
derived by upper-casing the character set field and concatenating it in
various ways.
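To make the derivation concrete, here is a minimal sketch of upper-casing a
charset name without consulting the locale and prepending a prefix. This is
not the actual samba code: ascii_toupper and build_map_name are hypothetical
names I made up for illustration, and the "unicode_map." prefix is just the
one from the example above.

```c
#include <stdio.h>
#include <string.h>

/* Upper-case only ASCII a-z; every other byte is left untouched.
 * Unlike toupper(), this never consults the current locale. */
static char ascii_toupper(char c)
{
    return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
}

/* Hypothetical helper: write "<prefix><UPPER-CASED charset>" into out.
 * Returns 0 on success, -1 if the result would not fit in outlen bytes. */
static int build_map_name(char *out, size_t outlen,
                          const char *prefix, const char *charset)
{
    size_t plen = strlen(prefix);
    size_t clen = strlen(charset);
    size_t i;

    if (plen + clen + 1 > outlen)
        return -1;

    memcpy(out, prefix, plen);
    for (i = 0; i < clen; i++)
        out[plen + i] = ascii_toupper(charset[i]);
    out[plen + clen] = '\0';
    return 0;
}
```

Calling build_map_name(buf, sizeof buf, "unicode_map.", "iso8859-9") fills buf
with unicode_map.ISO8859-9 whatever the system locale is, because only the
ASCII range a-z is ever mapped.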
When the system locale is set to Turkish, the character set field is read in
and upper cased to form the file name. Since the upper casing is different in
Turkish for i, the charset files looked for are xxx-İSO8859-9 (idot), which
do not exist.
Samba installations using Turkish charmaps may or may not have their system
locale set to Turkish. Currently, if they do not, everything works fine. If
they do, charset map files can't be located, which shows up as an error
message in log.smbd.
There are various ways of getting around the problem:
1. smb.conf can be made to accept upper case charset definitions in
2. a setlocale can be done (there is an ifdef'd setlocale statement in
charset_initialize in lib/charset.c ) so that samba does not use Turkish
3. The specific function making up the file names can be made to do something
different if locale is Turkish ( load_unicode_unix_map in lib/util_unistr.c )
4. Definition of strupper can be changed.
And probably a lot more...
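As a rough illustration of the setlocale idea (option 2), the sketch below
switches LC_CTYPE to "C" only while upper-casing and then restores the
previous setting. strupper_c_locale is a hypothetical name, not an existing
samba function, and I am assuming a single-threaded caller, since setlocale
changes the locale for the whole process.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: upper-case s in place using "C" locale rules,
 * then restore whatever LC_CTYPE was active before the call.
 * Returns 0 on success, -1 on failure (s is left unchanged). */
static int strupper_c_locale(char *s)
{
    const char *cur = setlocale(LC_CTYPE, NULL); /* query only */
    char *saved;
    size_t n;

    if (cur == NULL)
        return -1;

    /* Copy the name: a later setlocale() call may overwrite it. */
    n = strlen(cur) + 1;
    saved = malloc(n);
    if (saved == NULL)
        return -1;
    memcpy(saved, cur, n);

    if (setlocale(LC_CTYPE, "C") == NULL) {
        free(saved);
        return -1;
    }

    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);

    setlocale(LC_CTYPE, saved); /* put the original locale back */
    free(saved);
    return 0;
}
```

This keeps the rest of the program in the Turkish locale, at the cost of two
setlocale calls per conversion. Changing the locale once at startup would be
simpler but would affect everything else that depends on LC_CTYPE.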
Any ideas on where to patch samba so that it is as trivial a patch as
possible while making sure charset filenames are derived correctly regardless
of the system locale?