init_unistr2 length calculation

Fri Feb 14 18:55:29 GMT 2003

Thanks for clearing that up.

I took a look at the log for the file and saw that tridge expected the
'len' argument to init_unistr2() to be the character length, not the byte
length of the string. So it appears the callers will have to be fixed, not
the function as I thought.

It would be good to have a function that calculates the character length
after conversion to UCS2, since that is much cheaper to compute (byte
length / 2) than counting characters in a multi-byte charset. Maybe there
already is one; I need to take a look.
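
As a rough sketch of the difference (the helper names below are
hypothetical, not existing Samba functions, and the sketch assumes valid
UTF-8 input limited to characters that fit in a single 2-byte UCS2 unit):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the characters in a NUL-terminated UTF-8
 * string by skipping continuation bytes (bit pattern 10xxxxxx).
 * Assumes the input is valid UTF-8. Cost is one pass over every byte.
 */
size_t utf8_char_count(const char *s)
{
	size_t n = 0;

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	}
	return n;
}

/*
 * Hypothetical helper: character count of a UCS2 buffer, given its
 * length in bytes. Every UCS2 character is exactly two bytes, so this
 * is a single division.
 */
size_t ucs2_char_count(size_t byte_len)
{
	return byte_len / 2;
}
```

For "caf\xc3\xa9" (the word "café" encoded as UTF-8), strlen() returns 5,
while utf8_char_count() returns 4. The same string occupies 8 bytes in
UCS2, so ucs2_char_count(8) is also 4.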

Thanks,
Shirish

On Fri, 14 Feb 2003, Gerald (Jerry) Carter wrote:
>
>On Thu, 13 Feb 2003, Shirish Kalele wrote:
>
>> >> In init_unistr2, the string length for the UNISTR2 structure seems to
>> >> be set equal to the number of bytes occupied by the string when
>> >> encoded in the Unix charset (i.e. the value returned by strlen()).
>> >> This is not necessarily the number of characters in the string (given
>> >> UTF-8 and other variable-byte charsets).
>> >>
>> >> Shouldn't this actually be set to half the number of bytes occupied
>> >> by the string after encoding it in UCS2? Here's a patch that does
>> >> this.
>> >
>> >I think you might get into trouble here due to differences in the MS
>> >unicode marshalling "flexibility".
>>
>> I don't understand. Could you elaborate?
>
>I guess if (length_of_bytes_in_orig_string != num_character_in_string)
>then we would have a problem.  Had to think through this a bit.
>
>I think I misunderstood you to start with.  I thought we were talking
>about UNISTR2 length == num_characters.  My point was that sometimes this
>is actually == num_characters*2 (as you mentioned).
>