utf8 vs ucs2

Andrew Tridgell tridge at samba.org
Tue May 22 13:27:34 GMT 2001


> typedef struct sambastring_tag
> {
> 	UINT16* buffer;
> 	int length;
> 	...
> } sambastring;

Yes, this is similar to what Tim proposed a while back. It would be a
good thing, but it does require a *lot* of code rewriting. The
pstring/fstring stuff is stack allocated, and changing to explicit
allocation would require a lot of thought.

The big step for me was realising that we don't have to do this string
structure change at the same time as the change to ucs2/utf8. So we
can keep going with the old pstring/fstring stuff until we have the
string formats sorted out, then deal with the allocation and string
structure problem later. That makes the problem much more tractable.
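
As a rough illustration of doing the format change first, here is a
minimal sketch that keeps a fixed-size, stack-allocated pstring and only
changes what goes into it. It assumes UTF-8 is the internal format and
that UCS-2 arrives in host byte order. The PSTRING_LEN value and the
ucs2_to_pstring() helper are assumptions made for this example, not
existing Samba code.

```c
#include <stddef.h>
#include <stdint.h>

#define PSTRING_LEN 1024            /* assumed size, for illustration only */
typedef char pstring[PSTRING_LEN];  /* fixed-size buffer, usually on the stack */

/*
 * Hypothetical helper: convert n UCS-2 code units (host byte order) into
 * NUL-terminated UTF-8 in dest. Each 16-bit unit is encoded on its own,
 * so surrogate pairs are not combined. Returns the number of bytes
 * written (excluding the NUL), or (size_t)-1 if the result does not fit.
 */
static size_t ucs2_to_pstring(pstring dest, const uint16_t *src, size_t n)
{
    unsigned char *d = (unsigned char *)dest;
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        uint16_t c = src[i];
        size_t need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        /* Always leave room for the terminating NUL. */
        if (out + need >= PSTRING_LEN) {
            d[0] = '\0';
            return (size_t)-1;
        }
        if (need == 1) {
            d[out++] = (unsigned char)c;
        } else if (need == 2) {
            d[out++] = (unsigned char)(0xC0 | (c >> 6));
            d[out++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            d[out++] = (unsigned char)(0xE0 | (c >> 12));
            d[out++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            d[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    d[out] = '\0';
    return out;
}
```

Callers would still declare a pstring on the stack exactly as they do
now, so no allocation changes are needed for this step.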

> Like this, and we can change the internal character code easily,
> because all the relevant character manipulations are only in those
> member functions.
> Also we can cache some string properties (character counts) in
> structure.

I also thought that the structure could have a flags field that would
say (for example) what format the string is currently in. That would
make multi-format support much easier.
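
To make the flags idea concrete, here is a minimal sketch of what such a
structure might look like. The SSTR_FMT_* names, the void* buffer and the
sstr_byte_length() helper are all assumptions made for this example.
The quoted proposal used a UINT16* buffer and did not have a flags field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format flags; these names are not from Samba. */
enum {
    SSTR_FMT_UCS2 = 1u << 0,  /* buffer holds 16-bit UCS-2 code units */
    SSTR_FMT_UTF8 = 1u << 1   /* buffer holds UTF-8 bytes */
};
#define SSTR_FMT_MASK (SSTR_FMT_UCS2 | SSTR_FMT_UTF8)

typedef struct sambastring_tag {
    void    *buffer;  /* raw storage, interpreted according to flags */
    size_t   length;  /* number of code units (not characters) in buffer */
    unsigned flags;   /* exactly one SSTR_FMT_* bit is expected to be set */
} sambastring;

/*
 * Size of the string data in bytes. Code that copies or sends the
 * string can ask this instead of knowing the current format itself.
 * Returns 0 if the format bits are not recognised.
 */
static size_t sstr_byte_length(const sambastring *s)
{
    switch (s->flags & SSTR_FMT_MASK) {
    case SSTR_FMT_UCS2:
        return s->length * sizeof(uint16_t);
    case SSTR_FMT_UTF8:
        return s->length;
    default:
        return 0;
    }
}
```

With something like this, a routine can check the flags and convert only
when the string is not already in the format it needs.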

Some day we will have decent string handling in Samba, but it will
take a while :)




