> >> What *really* sucks is the 2x (or 4x) overhead you pay for with UCS-2
> >> and UCS-4 - it impacts the amount of memory that applications need
> >> (bloat bloat bloat), the amount of bandwidth used, all of the
> >> standard functions people use, etc.  UTF-8 (and other variable-length
> >> encodings) can support Unicode > 64k, too, which is a big reason
> >> why the IETF is pushing UTF-8 instead of UCS-2 to support Unicode.
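
To put rough numbers on the overhead claim, here is a minimal sketch that counts bytes for the same string in each encoding. The helper name `encoded_sizes` and the sample strings are my own, and I'm using Python's "utf-16-le" and "utf-32-le" codecs as stand-ins for UCS-2 and UCS-4, which only matches UCS-2 for characters inside the 64k range.

```python
def encoded_sizes(text):
    """Return the byte count of `text` in three encodings (hypothetical helper)."""
    return {
        "utf8": len(text.encode("utf-8")),
        # utf-16-le equals UCS-2 only for characters below U+10000 (assumption)
        "ucs2": len(text.encode("utf-16-le")),
        "ucs4": len(text.encode("utf-32-le")),
    }


# ASCII-only filename: UCS-2 is 2x and UCS-4 is 4x the UTF-8 size.
print(encoded_sizes("README.TXT"))

# CJK text: each character takes 3 bytes in UTF-8 but 2 in UCS-2.
print(encoded_sizes("\u65e5\u672c\u8a9e"))
```

Note that the 2x/4x figure holds for ASCII-heavy data; for CJK names UTF-8 can come out larger than UCS-2, so any benchmark result will depend on the test data.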

> >I still maintain that it's easy to make mistakes programming with
> >multi-length characters. I should know, I've made most of them :-).

> >Now for storage on disk, or traversal on the wire, UTF-8 is great.
> >But when you read that stuff into program memory for manipulation, then
> >fixed length is the way to go.
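
One example of the kind of mistake being described is truncating a UTF-8 buffer at a byte limit and cutting a character in half. The sketch below is only an illustration, not anything from Samba: `safe_truncate` is a hypothetical helper, and it assumes the input is already valid UTF-8.

```python
def safe_truncate(raw: bytes, limit: int) -> bytes:
    """Cut valid UTF-8 `raw` to at most `limit` bytes without splitting a character."""
    if limit >= len(raw):
        return raw
    end = limit
    # Continuation bytes look like 10xxxxxx. If the cut would land on one,
    # back up until the cut falls on the first byte of a character.
    while end > 0 and (raw[end] & 0xC0) == 0x80:
        end -= 1
    return raw[:end]


raw = "r\u00e9sum\u00e9.txt".encode("utf-8")  # each e-acute is 0xC3 0xA9

# The naive cut leaves a dangling lead byte that no longer decodes.
try:
    raw[:2].decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

print(naive_ok)               # False
print(safe_truncate(raw, 2))  # b'r'
```

With a fixed-width encoding the same truncation is just a multiple of the character size, which is the convenience being argued for in-memory manipulation.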

> If we could prove (via benchmark/testcase) that using UTF-8 vs. UCS-2 over
> the wire improved performance measurably, we would have a case to add a
> capabilities bit (or SMB flag). I will have to see how hard that would be
> to negotiate and implement in my Linux VFS client prototype and the Samba
> server. It might be easier to get a rough estimate by benchmarking Samba
> with Unicode disabled (i.e. single-byte ASCII) vs. UCS-2 enabled. I don't
> know whether the difference would be large enough to show up in
> benchmarks, and it would depend on the client. With Windows clients and
> netbench it would be tricky because Windows 2000 clients are presumably
> "natively" UCS-2, but with a Linux client to Samba it might be measurable.

You may want to look at client/server endianness when benchmarking as
well. If I'm not mistaken, the UCS-2 support in SMB is little-endian
(x86 byte order) rather than the traditional big-endian network byte
order. UTF-8, being a bytewise encoding, is of course endian-neutral.
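
To make the byte-order point concrete, here is a small sketch of the per-string work a big-endian host would have to do. `swap_ucs2` is a hypothetical helper of mine, not a Samba function, and I'm again using Python's "utf-16-le"/"utf-16-be" codecs as stand-ins for the two UCS-2 byte orders.

```python
def swap_ucs2(raw: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (expects an even-length buffer)."""
    swapped = bytearray(raw)
    swapped[0::2], swapped[1::2] = raw[1::2], raw[0::2]
    return bytes(swapped)


text = "A\u20ac"  # 'A' followed by the euro sign, U+20AC

le = text.encode("utf-16-le")  # 41 00 ac 20, little-endian units
be = text.encode("utf-16-be")  # 00 41 20 ac, big-endian units
u8 = text.encode("utf-8")      # 41 e2 82 ac, same bytes on any host

print(le.hex(), be.hex(), u8.hex())
print(swap_ucs2(le) == be)  # True
```

The swap is cheap per character, but it is extra work on every string for a big-endian client or server, whereas the UTF-8 bytes can be used as they arrive.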

