UTF-8 support and other quirks in the LDAP backend (in 2.2.4).

Simo Sorce idra at samba.org
Tue Jun 18 15:09:08 GMT 2002


On Tue, 2002-06-18 at 20:30, Alexander Bokovoy wrote:
> On Tue, Jun 18, 2002 at 11:31:16AM -0700, Jeremy Allison wrote:
> > On Tue, Jun 18, 2002 at 01:24:12PM -0500, Steve Langasek wrote:
> > > 
> > > I do hope that tdb ends up going with UTF-8.  UCS2 is not particularly
> > > pleasant to work with under Unix; it's not endian-neutral, it doesn't
> > > provide ASCII as a compatibility subset, and it has to be converted to
> > > something else before it can be used by the majority of Unix tools.
> > > Granted, to a certain extent this is already true with tdb because it's
> > > a binary format, but making the import/export tools more complex gives
> > > you less margin for error.  Unless Samba chooses UCS-2 as an internal
> > > format for string processing (which I also don't think is the best idea
> > > in the world ;), using UCS-2 as a backend charset seems like an
> > > all-around bad idea, IMHO.
> > 
> > Yes, I think internal format (and format for tdbs) of utf8 seems
> > like the best idea (IMHO).
> There is a problem with utf8 for many fixed-size records in various tdbs.
> Also, most of data is in UCS-2 already.

Not only that, UTF-8 is not easy to manipulate: characters are not
fixed length, and the upper-case and lower-case forms of a character
are not guaranteed to occupy the same number of bytes.
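
A minimal sketch of the byte-length point, in Python rather than
anything taken from the Samba sources. The helper name encoded_lengths
is hypothetical, and "utf-16-le" stands in for UCS-2 here, which is only
equivalent for characters in the Basic Multilingual Plane. It compares
the dotless i (U+0131) with its upper-case mapping, the plain ASCII "I":

```python
# Hypothetical helper for illustration only; not part of Samba.
def encoded_lengths(text):
    """Return (UTF-8 byte count, UTF-16LE byte count) for text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

lower = "\u0131"        # LATIN SMALL LETTER DOTLESS I
upper = lower.upper()   # Unicode maps this to ASCII "I" (U+0049)

# In UTF-8 the two cases differ in size: 2 bytes versus 1 byte.
# In UTF-16LE (assumed here as a stand-in for UCS-2) both take 2 bytes.
print(encoded_lengths(lower))
print(encoded_lengths(upper))
```

So an in-place case conversion on a UTF-8 buffer can change the length
of the string, while on a UCS-2 buffer this particular one cannot.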

So UCS-2 is more suitable for most manipulations, while UTF-8 is more
suitable for dealing with the Unix system (file names, etc.).

But since Windows still speaks UCS-2 with us, it is better to use that
internally, so that conversions are kept to a minimum and manipulation
of data is much easier and faster.

In the long term, that relegates UTF-8 to an internal VFS conversion
for file name storage purposes (yes, I advocate a UCS-2 VFS interface
for the next NTFS-like semantic rewrite).


Simo.

> -- 
> / Alexander Bokovoy
> ---
> Most people have a mind that's open by appointment only.
> 
-- 
Simo Sorce
----------
Una scelta di liberta': Software Libero.
A choice of freedom: Free Software.
http://www.softwarelibero.it



