samba-3.0.0beta1 codeset issue on non-Linux

Steve Langasek vorlon at netexpress.net
Thu Jun 12 16:35:24 GMT 2003


On Thu, Jun 12, 2003 at 10:51:02AM +0100, David Lee wrote:
> On Wed, 11 Jun 2003, Steve Langasek wrote:

> > On Wed, Jun 11, 2003 at 04:06:21PM +0100, David Lee wrote:

> > > 2. The AC_TRY_RUN test in "configure.in" is based upon:
> > >       iconv_open("ASCII", "UCS-2LE");
> > >    But Solaris (at least), while having support for codeset conversions,
> > >    seems not to have any involving "ASCII".  There are plenty involving
> > >    "UCS-2LE" (and "UTF-8") and a small number involving "CP850".
> > >    So, based on this one limited test, "configure" erroneously concludes
> > >    that it cannot do charset translation: clearly wrong.

> > Hmm, this is a detail that I missed on first reading.  This autoconf
> > test really doesn't seem to belong, because on my systems, I see that
> > Samba handles the conversion between UCS-2LE and ASCII *internally*,
> > even though the system iconv supports it!  One way or another this test
> > needs to be changed, because it's not testing for the features that are
> > actually used.  Changing it to iconv_open("CP850", "UCS-2LE") would make
> > for a much more useful test.

> Thanks for your replies, Steve.

> Before answering the detail, can I ask a more general question?  What is
> the _aim_ of this run-time "iconv_open(<X1>, <X2>)" test in configure.in?

> Is it to establish merely that there is some sort of iconv which can do
> some translation function?  That there is at least one <X1> and one <X2>
> for which this works, with the actual values of <X1> and <X2> at this
> stage being unimportant?

> Or is it more than that: that there is an iconv which can handle a known,
> particular set of translations?  That we require not only "iconv_open()"
> but also certain specific values of <X1> and <X2>?

Specifically, to be useful to Samba, the system iconv must support some
set of conversions in both directions between UCS-2LE and charsets other
than ASCII and UTF-8.  This is because Samba uses a two-step charset conversion
process, with UCS-2LE as an intermediate encoding (chosen because it's
pretty much guaranteed to support all characters that are also supported
by Windows clients, Unicode or not).  So the test should test for
features that will actually be used, and the specific charset values
chosen are indeed important: converting between UCS-2LE and ASCII isn't
useful.  Converting between UCS-2LE and CP850 definitely is.
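
To make the two-step idea concrete, here is a rough sketch in C of
converting through UCS-2LE using the system iconv.  This is not Samba's
actual code: the helper names convert_once() and convert_via_ucs2le()
and the fixed 1024-byte intermediate buffer are assumptions made purely
for illustration, and the cast on the input pointer assumes the
glibc-style iconv() prototype.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: run a single iconv conversion over a whole buffer.
 * Returns the number of bytes written to out, or (size_t)-1 on failure. */
static size_t convert_once(const char *to, const char *from,
                           const char *in, size_t in_len,
                           char *out, size_t out_cap)
{
    /* Note the argument order: destination charset first, source second. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* this charset pair is not supported */

    char *in_ptr = (char *)in;      /* glibc's iconv() takes char **, not const */
    char *out_ptr = out;
    size_t in_left = in_len;
    size_t out_left = out_cap;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1 || in_left != 0)
        return (size_t)-1;          /* bad input or output buffer too small */
    return out_cap - out_left;
}

/* Hypothetical two-step conversion: src -> UCS-2LE -> dst. */
static size_t convert_via_ucs2le(const char *dst, const char *src,
                                 const char *in, size_t in_len,
                                 char *out, size_t out_cap)
{
    char mid[1024];                 /* assumed large enough for this sketch */
    size_t mid_len = convert_once("UCS-2LE", src, in, in_len, mid, sizeof mid);
    if (mid_len == (size_t)-1)
        return (size_t)-1;
    return convert_once(dst, "UCS-2LE", mid, mid_len, out, out_cap);
}
```

With something like this, a call such as convert_via_ucs2le("CP850",
"UTF-8", ...) only works if the system iconv knows both UTF-8->UCS-2LE
and UCS-2LE->CP850, which is why the choice of charsets in the configure
test matters.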

> > It is required that the system iconv be able to support UCS-2LE, since
> > that's used internally by Samba as an intermediate encoding.  It just
> > doesn't make sense to test for ASCII, which Samba already knows how to
> > handle.

> I suspect that begins to answer my earlier, fumbling, question about the
> principle: it means we require a working iconv_open() which is known to
> support "UCS-2LE".  (Is that "from" or "to" or both?)

Support for bidirectional conversion is certainly needed for proper
functioning.  Whether it's necessary to test for bidirectional
conversion in configure.in, I don't know; I doubt it's a major problem
in practice.

> I had earlier just hacked "configure.in" to test a range of "from" and "to"
> charsets:
>    FROM="CP850 850 646"
>    TO="8859 UTF-8 UCS-2LE"
>    <foreach FROM>
>      <foreach TO>
>        run the "iconv_open()" and print succeed/fail
>      <>
>    <>

>    # This next block is simply "from" and "to" reversed
>    <foreach TO>      # now used in "from" position
>      <foreach FROM>  # now used in "to" position
>        run test and print succeed/fail
>      <>
>    <>
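
For anyone wanting to repeat the survey quoted above on another
platform, the nested loops could be written as a small standalone C
routine along these lines.  This is only a sketch: the function names
probe_pair() and probe_matrix() are made up for illustration, the
charset names are simply the ones from the quoted lists, and the output
format only approximates the results shown in this thread.

```c
#include <iconv.h>
#include <stdio.h>

/* Hypothetical helper: try one conversion and report it.
 * Returns 1 if iconv_open() accepts the pair, 0 otherwise. */
static int probe_pair(const char *from, const char *to)
{
    /* iconv_open() takes the destination charset first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1) {
        printf("fail: %s %s\n", from, to);
        return 0;
    }
    iconv_close(cd);
    printf("succeed: %s %s\n", from, to);
    return 1;
}

/* Try every FROM x TO pair, then the same pairs with the roles reversed.
 * Returns the number of conversions the system iconv supports. */
static int probe_matrix(void)
{
    static const char *const from_set[] = { "CP850", "850", "646" };
    static const char *const to_set[]   = { "8859", "UTF-8", "UCS-2LE" };
    const int n_from = (int)(sizeof from_set / sizeof from_set[0]);
    const int n_to   = (int)(sizeof to_set / sizeof to_set[0]);
    int ok = 0;

    for (int i = 0; i < n_from; i++)
        for (int j = 0; j < n_to; j++)
            ok += probe_pair(from_set[i], to_set[j]);

    /* Second block: "from" and "to" reversed. */
    for (int j = 0; j < n_to; j++)
        for (int i = 0; i < n_from; i++)
            ok += probe_pair(to_set[j], from_set[i]);

    return ok;
}
```

Calling probe_matrix() from main() prints one line per pair, so the
output from different systems can be compared directly.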

Though it's a useful test for seeing what's available, I think that's
too much complexity to use in the distribution.  The default unix
charset, the default display charset, and the internal charset are all
handled internally by Samba; it's only the default DOS charset that's
missing.  So assuming CP850 is really a reasonable default, checking for
CP850<->UCS-2LE alone should be reasonable.
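
As a rough sketch of what such a reduced check might look like (this is
not the test currently in configure.in, and the helper names
conversion_available() and dos_charset_usable() are invented for this
example):

```c
#include <iconv.h>

/* Returns 1 if iconv_open(to, from) succeeds, 0 otherwise. */
static int conversion_available(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}

/* Returns 1 only if the system iconv can convert in both directions
 * between the given DOS charset and UCS-2LE. */
static int dos_charset_usable(const char *dos_charset)
{
    return conversion_available(dos_charset, "UCS-2LE")
        && conversion_available("UCS-2LE", dos_charset);
}
```

A run-time configure test would then presumably have its main() return
dos_charset_usable("CP850") ? 0 : 1, so that exit status 0 means the
system iconv is usable for Samba's purposes.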

> The results:

> Solaris 2.5.1
>    (Sadly, so ancient that our environment has moved on, and I can no
>    longer build samba.  Suspect no 850-ish.)

> Solaris 7:

> Comment: apparently no 850-ish functionality.

> Solaris 8:
> Comment: have CP850<->UTF-8 but not CP850<->UCS-2LE

So, changing the configure test wouldn't help for Solaris. :/

> Redhat 9:
>    succeed: : CP850 UCS-2LE
>    succeed: : UCS-2LE CP850

Yep -- no problems at all with glibc...

> And what about IRIX, HPUX, *BSD flavours?

No idea, but I bet there are lots of platforms out there which would
need to have GNU iconv installed to take advantage of Samba charset
support.

-- 
Steve Langasek
postmodern programmer