i18n question.

Andrew Bartlett abartlet at samba.org
Sun Mar 7 04:26:44 GMT 2004


On Sun, 2004-03-07 at 15:04, Richard Sharpe wrote:
> On Sun, 7 Mar 2004, Kenichi Okuyama wrote:
> 
> > >> I do agree with you that one of the most favorable filesystem
> > >> charsets is UTF-8, but the FS charset is not something we can have
> > >> control over.
> > Michael> UTF-8 is favorable because it is a Unicode encoding that can be used
> > Michael> directly with the filesystem api. My understanding was that Japanese could
> > Michael> be adequately represented using Unicode. I would very much like to see
> > Michael> specific examples where this is not true. Please provide a link so that I
> > Michael> can educate myself. I sincerely want to make my C projects as accessible
> > Michael> as I can.
> > 
> > 
> > You believe that 'Unicode can be used' as long as the interface is
> > 'UTF-8'. Well, of course that's true. But 'Unicode is not the one and
> > only way to describe Japanese', or should I say 'Unicode only contains
> > a very small subset of Japanese'. So the very fact that UTF-8
> > supports some Japanese does not mean it supports ENOUGH.
> 
> Are you saying that UTF-8 does not encode all the Japanese glyphs that 
> Japanese people want to use?
> 
> This is a genuine question. I do not know the answer, and it seems to me
> like that is what you are saying.

Along similar lines:

Does the failure to encode that Japanese glyph in UTF-8 matter?  I.e., even
if we chose different internal and unix charsets, would we be able to
deal with that character on the wire, where UTF-16 is the only available
encoding?

Or is the unwillingness to use UTF-8 due to unix-side applications that
know how to use these Unicode-unsupported glyphs?

If so, how do Windows users create files with these glyphs against a
Windows server?

Or do Windows users never use these glyphs, and therefore never expect
to see files with these glyphs in their filenames?

Against a Samba server, assuming we find characters in our file system
that are not in UTF-16, what should we do with them?  Should we mangle
them (which would at least make them visible, but which would render
them even less readable)?
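
To make that last question concrete, here is a minimal Python sketch of
the "convert, else mangle" idea. It is not Samba's code and not how Samba
mangles names: the function name to_wire_name, the default unix charset
(euc_jp) and the hex-escape mangling scheme are all assumptions made up
for illustration.

```python
def to_wire_name(raw: bytes, unix_charset: str = "euc_jp") -> bytes:
    """Return a UTF-16LE name for on-disk filename bytes.

    If the bytes cannot be decoded in the (assumed) unix charset, fall
    back to a hypothetical mangled name: '_' followed by the hex of the
    raw bytes. The file stays reachable, but the name is unreadable.
    """
    try:
        # Strict decoding raises when a byte sequence has no mapping.
        text = raw.decode(unix_charset)
    except UnicodeDecodeError:
        # Hypothetical mangling scheme, for illustration only.
        text = "_" + raw.hex().upper()
    return text.encode("utf-16-le")


if __name__ == "__main__":
    good = "日本語.txt".encode("euc_jp")
    print(to_wire_name(good).decode("utf-16-le"))
    print(to_wire_name(b"\xff", unix_charset="ascii").decode("utf-16-le"))
```

The trade-off is the one raised above: the mangled file exists for the
client, but its name carries no meaning for the user.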

Andrew Bartlett

-- 
Andrew Bartlett                                 abartlet at pcug.org.au
Manager, Authentication Subsystems, Samba Team  abartlet at samba.org
Student Network Administrator, Hawker College   abartlet at hawkerc.net
http://samba.org     http://build.samba.org     http://hawkerc.net