i18n question.

Peter Waechtler peter at helios.de
Wed Mar 10 08:12:26 GMT 2004

On Tuesday, 9 March 2004 at 13:56, Benjamin Riefenstahl wrote:
> Mac OS X uses de-composed UTF-8 for the file system.  This is a fixed,
> non-changeable constant (which otherwise is a good thing IMO).  The
> de-composition is enforced by the Mac OS X kernel.

Out of curiosity: can you name one advantage of using decomposed UTF-8
on Mac OS X? I can't, and if it has none, it just complicates things,
because you have to 'look backwards' to get the multibyte sequence.

> De-composed Unicode means that characters like adieresis (ä) are
> represented not as <U+00E4> ("pre-composed"), but as the sequence
> <U+0061,U+0308> ("de-composed") where the character U+0061 is just
> ASCII 'a'.  In UTF-8 that's pre-composed {0xC3,0xA4} and de-composed
> {0x61,0xCC,0x88}.
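
To make the byte sequences quoted above concrete, here is a minimal
Python sketch. It assumes the standard Unicode NFD form from Python's
unicodedata module, which may differ in details from the decomposition
the Mac OS X kernel applies. The helper name utf8_hex is hypothetical
and exists only for this illustration.

```python
import unicodedata

# Pre-composed a-diaeresis: the single code point U+00E4.
composed = "\u00e4"

# Canonical decomposition (standard NFD, assumed here as a stand-in
# for the file system's behaviour): U+0061 followed by U+0308.
decomposed = unicodedata.normalize("NFD", composed)


def utf8_hex(text):
    """Return the UTF-8 bytes of text as a list of hex strings."""
    return [f"0x{byte:02X}" for byte in text.encode("utf-8")]


print(utf8_hex(composed))    # ['0xC3', '0xA4']
print(utf8_hex(decomposed))  # ['0x61', '0xCC', '0x88']

# The two strings differ byte for byte, but compare equal once both
# are brought to the same normalization form.
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The last two lines show why a byte-wise comparison of file names fails
across the two forms unless one side is normalized first.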
