i18n question.

Mon Mar 8 16:31:42 GMT 2004

On Mon, 2004-03-08 at 17:16, Benjamin Riefenstahl wrote:
> Hi Simo,
> 
> 
> Simo Sorce <simo.sorce at xsec.it> writes:
> > What's the problem in being charset agnostic (with some rules of
> > course)?
> 
> The problem is that some of the FE encodings and the variant of UTF-8
> mandated on Mac OS X don't conform to all of the rules stated before
> on this thread.  So things get complicated with exceptional handling,
> work-arounds, #ifdefs and even add-on modules.

Which rules exactly do they not conform to, please? (A real question;
I'm simply not an expert on Mac OS X.)

> This is, after all, the reason why Unicode and UTF-8 were invented in
> the first place.  Because being really encoding agnostic is hard in
> practice.

Yes, but we have to live with filesystems that are not able to support
UTF-8.

> > I would really like to know what problem we are really trying to
> > fix with this proposal :-)
> 
> I am not yet proposing anything tangible.  Just trying to find out why
> some of the strategies that I know about were not or cannot be
> applied.
> I came to the list some time ago because I used to have problems
> porting Samba 3 to Mac OS X 10.2 because of a variant of the issues
> discussed now.  (Note that when Apple ported Samba 3 to 10.3, they
> actually used an early release candidate, maybe in part because of the
> fast-path optimizations in later versions which broke Samba 3 on Mac
> OS X.)
> 
> Some of the things said in this discussion sounded more like gut
> reactions than the results of investigation, so I tried to add input.
> When people say "this-or-that doesn't work" I propose ideas based on
> my own experience.  I expect that ideas have to be investigated in
> context and even tried out to be really useful and they can't be
> implemented right away.  But OTOH I don't like to accept a blanket
> "doesn't work" without good reason.  Even if there is a good reason
> against some strategy, it's good to know that reason.

Have you read my post that explains why it is not good for us to have a
unix charset AND an internal charset that are not the same charset?

I sent it on Mon, 08 Mar 2004 17:16:05 +0100

> > Truth is: the code is vast.  Changing the internals of such code is
> > a _big_ operation, you can't do that in a few days of hacking ...
> 
> No, of course not.  But I expect that adding optimizations based on
> assumptions that do actually *not* cover all cases isn't making it
> easier either.

If the optimization breaks any charset, then it is a bug we need to fix.

> In the short run you get to name the string constants.  I value that
> for its documentation value (if the names are chosen wisely).  In the
> medium run it can pay to look at the ways that those constants are
> used and to make sure that all the places that use them, use them
> consistently and correctly.
> 
> You can achieve the same with thorough commenting and code review, but
> just having to name the strings can make it much more obvious.

Ok, thanks.

Simo.

-- 
Simo Sorce - simo.sorce at xsec.it
Xsec s.r.l. - http://www.xsec.it
via Garofalo, 39 - 20133 - Milano
mobile: +39 329 328 7702
tel. +39 02 2953 4143 - fax: +39 02 700 442 399