Proposal for modifying Get_Pwnam() [Re: couple of getpwnam() questions]
Kenichi Okuyama
okuyamak at dd.iij4u.or.jp
Mon Nov 27 06:27:33 GMT 2000
>>>>> "GC" == Gerald Carter <gcarter at valinux.com> writes:
>> 1. Why did you not make "AS-IS" test first?
>> # I thought this is the easiest, and yet highest possible...
GC> Because I have never met a UNIX sysadmin which did not
GC> use lower case usernames. Also remember that all Win9x clients
GC> transmit the username in upper case.
Though I have seen 'Administrator' in /etc/passwd before, I agree.
>> 2. I wonder what will happen if Japanese version of Windows Client
>> having username of Japanese......
>> Maybe we should make Get_Pwnam() as 'pointer to function',
>> and leave the way to switch to correct one according to 'coding
>> system' option.
GC> I would assume the local getpwnam() call should handle
GC> the foreign characters. But I am by no means someone
GC> who should be talking about foreign character support :-)
Ah... It's not as easy as you think.
Suppose you have to port Get_Pwnam() to systems that store user names
in the password database in one of these coding systems:
1) US-ASCII
2) EBCDIC
3) UTF-*
4) UCS-*
# This is just an example to give you an idea of how hard it is
# to handle all of them in one function. So never mind whether you
# really have such a system or not. At least, if you think about
# porting Samba to System/390, you do face EBCDIC.
You do know which coding system will be passed as Get_Pwnam()'s
parameter. It's US-ASCII. But you don't know which coding system
your system uses until you actually run Samba.
# You might think you will know at compile time, but what if your system
# allows you to SELECT the coding system, and you're building a
# binary package for that system? In such a case, you can't treat the
# information from your build machine as valid everywhere. You have to
# select the coding system dynamically.
The easiest way to solve this problem is to make Get_Pwnam a pointer
to function, (*Get_Pwnam)(), and set it somewhere at startup so that
the selected function matches your environment.
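
To make the idea concrete, here is a minimal sketch of such a switch. It is
only an illustration under assumptions: the type get_pwnam_fn, the table
pwnam_table, the option strings "ascii" and "euc", and select_get_pwnam() are
all hypothetical names, and the one-argument signature is simplified and may
not match the real Get_Pwnam() in Samba. The EUC variant does no conversion
yet; it only marks where a port would add it.

```c
#include <pwd.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical type: one lookup routine per local coding system. */
typedef struct passwd *(*get_pwnam_fn)(const char *user);

/* Local user names are US-ASCII compatible: call getpwnam() as is. */
static struct passwd *get_pwnam_ascii(const char *user)
{
    return getpwnam(user);
}

/* Placeholder for a system whose password database holds EUC names.
 * A real port would convert 'user' before the lookup; this sketch
 * does not implement any conversion. */
static struct passwd *get_pwnam_euc(const char *user)
{
    return getpwnam(user);
}

/* Hypothetical table mapping a 'coding system' option to a routine. */
struct pwnam_entry {
    const char *coding_system;
    get_pwnam_fn fn;
};

static const struct pwnam_entry pwnam_table[] = {
    { "ascii", get_pwnam_ascii },
    { "euc",   get_pwnam_euc   },
};

/* The pointer the rest of the code calls as (*Get_Pwnam)(user). */
get_pwnam_fn Get_Pwnam = get_pwnam_ascii;

/* Point Get_Pwnam at the routine for the given option value.
 * Returns 0 on success, -1 (pointer unchanged) if the name is unknown. */
int select_get_pwnam(const char *coding_system)
{
    size_t i;

    for (i = 0; i < sizeof(pwnam_table) / sizeof(pwnam_table[0]); i++) {
        if (strcmp(pwnam_table[i].coding_system, coding_system) == 0) {
            Get_Pwnam = pwnam_table[i].fn;
            return 0;
        }
    }
    return -1;
}
```

With something like this, the option parser would call
select_get_pwnam("euc") once at startup, and every caller keeps writing
(*Get_Pwnam)(user) without knowing which routine is behind it.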
Just four patterns? Well, in the Japanese case, any of the following
may be required on the output side:
1) Your system does not support Japanese usernames.
2) Supports JIS (or should I call it ISO-2022-JP) usernames
3) Supports SJIS (Microsoft Kanji Code) usernames
4) Supports EUC usernames
5) Supports UTF-8 usernames
6) Supports UCS-4 usernames
7) Supports EBCDIC (IBM) usernames
8) Supports EBCDIC (Fujitsu) usernames
9) Supports EBCDIC (Toshiba) usernames
#.... Ah... maybe more. 7, 8, and 9 all differ. Slightly.
And currently, the given character code will be SJIS. But in the
future, this will turn to UTF-8. And we have to treat each case
correctly, and dynamically.
# Remember that the given character code can be a MIXTURE of US-ASCII
# and SJIS too.
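
One reason the mixture matters: the second byte of a double-byte SJIS
character can fall in the range of US-ASCII letters, so folding case one byte
at a time can damage it. The sketch below assumes that Get_Pwnam() folds case
before the lookup, which this thread suggests but which I have not checked
against the source. The names is_sjis_lead() and strlower_sjis() are
hypothetical.

```c
#include <stddef.h>

/* SJIS lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int is_sjis_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Lowercase, in place, only the single-byte US-ASCII letters of a
 * mixed US-ASCII/SJIS string. Both bytes of a double-byte character
 * are skipped, so a trail byte such as 0x41 is left untouched. */
void strlower_sjis(char *s)
{
    unsigned char *p = (unsigned char *)s;

    while (*p != '\0') {
        if (is_sjis_lead(*p) && p[1] != '\0') {
            p += 2;                 /* double-byte character */
        } else {
            if (*p >= 'A' && *p <= 'Z')
                *p = (unsigned char)(*p - 'A' + 'a');
            p++;                    /* single-byte character */
        }
    }
}
```

For example, katakana "A" is the byte pair 0x83 0x41 in SJIS. A plain
byte-wise tolower() loop would turn the 0x41 into 0x61 and produce a
different character, while strlower_sjis() leaves the pair alone.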
And still, think. I have not mentioned Korean, Chinese,
Vietnamese, etc. (mainly because I simply don't know what kinds
of choices they have. But I'm very sure there are a lot).
So, here comes my suggestion. Just use a pointer to function. When
someone starts porting Samba to a new system, he (or she) will select
(or simply create) a new_Get_Pwnam() that matches their requirements,
and change (*Get_Pwnam)() accordingly.
Leave the problem to native speakers. It's easy to build the I18N
structure. But L10N is not an easy job.
----
Kenichi Okuyama at Tokyo Research Lab., IBM-Japan, Co.