i18n question.

Alexander Bokovoy a.bokovoy at sam-solutions.net
Tue Mar 9 13:10:01 GMT 2004


On Tue, Mar 09, 2004 at 09:15:24PM +0900, Kenichi Okuyama wrote:
> Simo> Another thing that will break for sure are all getpw* calls so the
> Simo> system _need_ to be in utf8 (if you use utf8 as the default charset) and
> Simo> only some secondary disk can be in a different charset with samba3+vfs
> Simo> after all ...
> 
> Hmmm....
> 
> What you're saying is, that if /etc/passwd contains EUC character as
> username, we have to use EUC as 'unix charset'? Even if filesystem
> is managed in UTF-8? We have to have pass modules for converting
> EUC->UTF8 against all the file IO?
> 
> That sounds sad.. Though I have never heard of unix system allowing
> EUC nor CP932 as username... But there might be.
Basically, if your system uses inconsistent charsets for filenames and
user/group names, you're already screwed. :) On the other hand, I see no
problem with non-ASCII user/group names as long as they use the same
encoding as all other components (file names, etc.). We do this with
UTF-8, provided the UTF-8 string for each group or user name does not
exceed the real limit of 32 bytes per name, which is enforced by the GNU
tar implementation (for group names) and the utmp struct (for user
names). Oh, and on HP-UX the group name is limited to 16 bytes -- and the
regular GNU/Linux shadow package follows suit during group creation
unless you patch it appropriately. Have fun. The real limitation is
tar's implementation, which simply strips everything after the 32nd byte
of group and user names.
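
To make the byte limit concrete, here is a minimal Python sketch. It
assumes the 32-byte figure quoted above; the helper names and the sample
group name are hypothetical and not taken from tar, utmp, or Samba. It
shows that a short-looking UTF-8 name can exceed the limit, and that
cutting at a fixed byte offset can split a multibyte character.

```python
LIMIT = 32  # byte limit per name, as quoted above (assumption of this sketch)


def fits(name: str, limit: int = LIMIT) -> bool:
    """Return True if the UTF-8 encoding of name is within the byte limit."""
    return len(name.encode("utf-8")) <= limit


def truncate_bytes(name: str, limit: int = LIMIT) -> bytes:
    """Cut the UTF-8 encoding at a fixed byte offset, ignoring character boundaries."""
    return name.encode("utf-8")[:limit]


# Hypothetical group name: 21 characters, but 41 bytes in UTF-8.
group = "администраторы_домена"

print(len(group), len(group.encode("utf-8")), fits(group))

cut = truncate_bytes(group)
try:
    cut.decode("utf-8")
    print("truncated name is still valid UTF-8")
except UnicodeDecodeError:
    # The cut landed inside a two-byte character.
    print("truncated name is no longer valid UTF-8")
```

A check like fits() would have to run at account or group creation time
if you want names that survive such tools intact.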


> Maybe we should have some modules for that purpose, independent of
> filesystem module layer.  So that if filesystem is UTF8, then we can
> have UTF-8 as 'unix charset', and take time on handling username,
> which does not happen as often as file IO.
This is what Tridge has already suggested: per-request re-encoding in
Samba4.
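
As a rough illustration of the idea only (this is not Samba4 code; the
function name and the sample user name are hypothetical), re-encoding a
name between the charset used in /etc/passwd and a UTF-8 'unix charset'
amounts to a decode followed by an encode:

```python
def reencode(raw: bytes, src: str, dst: str) -> bytes:
    """Convert a byte string from charset src to charset dst."""
    return raw.decode(src).encode(dst)


# Hypothetical user name as it would be stored in an EUC-JP passwd file.
passwd_name = "太郎".encode("euc_jp")

# The same name as it would be needed when the filesystem side is UTF-8.
utf8_name = reencode(passwd_name, "euc_jp", "utf-8")

print(len(passwd_name), len(utf8_name))
```

Since user name lookups are far less frequent than file IO, paying the
conversion cost only there keeps the file path free of extra work.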

-- 
/ Alexander Bokovoy
Samba Team                      http://www.samba.org/
ALT Linux Team                  http://www.altlinux.org/
Midgard Project Ry              http://www.midgard-project.org/

