CH_DISPLAY and gettext

TAKAHASHI Motonobu monyo at monyo.com
Tue Jun 21 17:20:41 MDT 2011


> > So I'd support dropping CH_DISPLAY, unless someone can point out a
> > good reason to keep it.
> 
> +1 on dropping CH_DISPLAY.

Basically I agree. I myself do not set the "display charset" parameter
explicitly, and I recommend that it not be set.

"display charset" was first introduced to support SWAT i18n. In those
days most web browsers could not handle UTF-8 well. But after Samba
3.0.8, SWAT was changed to talk UTF-8, because by then most browsers
supported UTF-8 well.

Currently (Samba 3.0.8 or later), "display charset" affects only how
some messages of terminal commands are displayed. The default value of
"display charset" is LOCALE, which means to obey the LANG variable;
this is effectively what the libintl/gettext system already does.
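
As a rough illustration of what "obey the LANG variable" amounts to, the
sketch below resolves a LOCALE setting to a charset name from the usual
POSIX locale variables. It is Python rather than Samba's C and is not
Samba's code: the helper name resolve_display_charset is hypothetical,
and the ASCII fallback is an assumption (real code would more likely ask
nl_langinfo(CODESET)).

```python
import os

def resolve_display_charset(setting, environ=None):
    """Hypothetical helper: map a "display charset" value to a charset name."""
    if environ is None:
        environ = os.environ
    if setting.upper() != "LOCALE":
        return setting  # an explicit charset was configured
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var, "")
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            codeset = value.partition(".")[2].partition("@")[0]
            return codeset or "ASCII"  # assumption: no codeset means ASCII
    return "ASCII"
```

For example, a LANG of ja_JP.eucJP would resolve to eucJP, which is the
same information gettext uses when it picks an output encoding.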

So I propose:
  (1) Remove the "display charset" parameter and d_printf().
  (2) Remove the d_printf() calls in web/cgi.c (SWAT) and simply always
    output UTF-8.
  (3) Change the other d_printf() calls to gettext() if the messages
    are expected to be written to a terminal, or convert them to
    CH_UNIX if they are expected to be written to (log) files.

Although we usually like command output to be shown in our own
language, we sometimes want the output forced into English, for
scripting or because of a translation bug. So I think the commands
need to keep gettext() capability.
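
A small sketch of why gettext-style lookup makes this easy: the same
message id yields the translation under a Japanese locale and the
original English under the C locale. The in-memory catalog, the
DictTranslations class, pick_translations, and the sample message are
all hypothetical stand-ins for a compiled .mo file and the real locale
machinery.

```python
import gettext

class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a compiled .mo file."""
    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        # Fall back to the untranslated message id when there is no entry.
        return self._catalog.get(message, message)

JA = DictTranslations({"Password changed": "パスワードを変更しました"})

def pick_translations(lang):
    """Choose a catalog from a LANG-style value (simplified assumption)."""
    if lang.startswith("ja"):
        return JA
    # "C", "POSIX", or any locale without a catalog: untranslated output.
    return gettext.NullTranslations()
```

With real gettext the equivalent is to run the command with LANG=C (or
LC_ALL=C), which a script can do without touching smb.conf.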

From: Jeremy Allison <jra at samba.org>
Date: Tue, 21 Jun 2011 09:04:19 -0700

> > I suspect this sort of setup is either gone completely now, or so rare
> > as to hardly matter. Basically UTF-8 has won, and if you're not using
> > UTF-8 on unix these days then you're pretty crazy.
> 
> Japan :-).
> 
> > It's even arguable that we could drop CH_UNIX
> > and just do UTF-8 at some stage, although I suspect there are still
> > sites around who use something other than UTF-8, just not sites that
> > use something different for terminals and filesystems.
> 
> Japan :-). Last time I heard there were still sites using SJIS
> or EUC-JP. Maybe monyo can comment ? There are also some Eastern
> European sites using older char sets as we still get bugs on these.

SJIS and EUC-JP are still widely used in Japan for backward
compatibility. A character set is like a network protocol: if one
system uses EUC-JP, we cannot easily use UTF-8 on another system.
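
To make the protocol analogy concrete, here is a minimal Python sketch
in which bytes written by an EUC-JP system are not even valid input for
a reader that assumes UTF-8. The helper readable_as and the sample
string are hypothetical.

```python
def readable_as(data, charset):
    """Return True if the bytes decode cleanly in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False

# A name as an EUC-JP system would store it on disk or send it on the wire.
name_on_disk = "日本語".encode("euc_jp")
```

Here readable_as(name_on_disk, "euc_jp") succeeds while
readable_as(name_on_disk, "utf-8") fails, so both sides have to agree on
the charset just as they would on a protocol version.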

Changing the character set also affects databases, for example
increased data size, a different sort order, and performance issues.
There are also some conversion issues between Unicode and legacy
charsets such as SJIS and EUC-JP.
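
These effects can be seen in a few lines of Python. This is a sketch
under assumptions: Python's euc_jp, shift_jis, and cp932 codecs stand in
for the legacy charsets, the helper names are hypothetical, and sorting
by encoded bytes approximates a database's binary collation.

```python
words = ["日", "ｱ"]  # a kanji and a halfwidth katakana letter

def encoded_size(text, charset):
    """Number of bytes the text occupies in the given charset."""
    return len(text.encode(charset))

def byte_order(items, charset):
    """Sort strings by their encoded bytes, as a binary collation would."""
    return sorted(items, key=lambda s: s.encode(charset))

def decode_wave(charset):
    """Decode the SJIS byte pair 0x81 0x60 under the given codec."""
    return b"\x81\x60".decode(charset)
```

The three-character string "日本語" takes 6 bytes in EUC-JP and 9 in
UTF-8; the two entries in words sort in opposite orders under the two
encodings; and the same SJIS bytes decode to U+301C (WAVE DASH) with
shift_jis but to U+FF5E (FULLWIDTH TILDE) with cp932, so a round trip
through Unicode depends on which conversion table each side uses.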

... so changing the character set is a highly difficult matter.

I myself do not want to use SJIS or EUC-JP; I am only explaining the
present situation in Japan.

---
TAKAHASHI Motonobu <monyo at monyo.com> / @damemonyo
  http://damedame.monyo.com/ / http://facebook.com/monyot





