UTF-8 support and other quirks in the LDAP backend (in 2.2.4).
Tim Potter
tpot at samba.org
Tue Jun 18 16:50:02 GMT 2002
On Wed, Jun 19, 2002 at 12:02:00AM +0200, Simo Sorce wrote:
> > > Yes, I think internal format (and format for tdbs) of utf8 seems
> > > like the best idea (IMHO).
> > There is a problem with utf8 for many fixed-size records in various tdbs.
> > Also, most of the data is in UCS-2 already.
I don't think that's true. Most data should be in the unix character set.
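
For illustration only, here is a minimal Python sketch of the fixed-size record concern quoted above: a field with a fixed byte budget holds a variable number of utf8 characters, and a naive cut can split a multi-byte sequence. The helper name truncate_utf8 is hypothetical and is not part of Samba or tdb.

```python
def truncate_utf8(text: str, max_bytes: int) -> bytes:
    """Hypothetical helper: fit text into max_bytes of UTF-8 without
    leaving a partial multi-byte sequence at the end."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return encoded
    # The input was valid UTF-8, so after slicing only the tail can be
    # incomplete; errors="ignore" drops that dangling partial sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")


# "h" takes 1 byte and "é" takes 2, so a 2-byte field keeps only "h".
short = truncate_utf8("héllo", 2)
```

With a fixed-width encoding the same field always holds the same number of characters; with utf8 the count depends on the content.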
> Not only that, utf-8 is not easy to manipulate, as characters are not
> fixed length and upper case and lower case ones are not guaranteed to
> be the same number of bytes long.
Why would you need to manipulate the string on a character by character
basis? The only case I can think of is the name mangling system. Every
other part of Samba only cares about the total length of the string.
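
As a rough sketch of the case-conversion point in the quoted text, the Python snippet below compares byte lengths before and after upper-casing. It assumes Python's built-in Unicode case mapping and uses UTF-16-LE as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane); the helper names are made up for the example.

```python
def utf8_len(s: str) -> int:
    """Number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def ucs2_len(s: str) -> int:
    """Number of bytes s occupies as UTF-16-LE (stand-in for UCS-2)."""
    return len(s.encode("utf-16-le"))


lower = "\u0131"       # LATIN SMALL LETTER DOTLESS I
upper = lower.upper()  # maps to plain ASCII "I"

# UTF-8: the lower-case form needs 2 bytes, the upper-case form 1 byte.
utf8_sizes = (utf8_len(lower), utf8_len(upper))
# UCS-2: both forms need 2 bytes.
ucs2_sizes = (ucs2_len(lower), ucs2_len(upper))
```

Code that only passes whole strings around never hits this; it matters only where a string is case-converted in place or within a fixed-size buffer.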
> So UCS-2 is more suitable for most of the manipulations, utf8 is more
> suitable for dealing with the unix system (file names, etc.).
>
> But, as windows still speaks ucs-2 to us, it is better to use that
> internally, so that conversions are kept to a minimum, and manipulation
> of data is much easier and faster.
>
> That relegates utf8, in the long term, to an internal vfs conversion
> for file name storage purposes (yes, I advocate a ucs2 vfs interface
> for the next ntfs-like semantics rewrite).
Yuck. (-:
Tim.