Proposal for modifying Get_Pwnam() [Re: couple of getpwnam() questions]

Kenichi Okuyama okuyamak at dd.iij4u.or.jp
Mon Nov 27 06:27:33 GMT 2000


>>>>> "GC" == Gerald Carter <gcarter at valinux.com> writes:
>> 1. Why did you not make "AS-IS" test first?
>> # I thought this is the easiest, and yet highest possible...
GC> Because I have never met a UNIX sysadmin which did not 
GC> use lower case usernames.  Also remember that all Win9x clients
GC> transmit the username in upper case.

Though I have seen 'Administrator' in /etc/passwd before, I agree.


>> 2. I wonder what will happen if Japanese version of Windows Client
>> having username of Japanese......
>> Maybe we should make Get_Pwnam() as 'pointer to function',
>> and leave the way to switch to correct one according to 'coding
>> system' option.
GC> I would assume the local getpwnam() call should handle 
GC> the foreign characters.  But I am by no means someone
GC> who should be talking about foreign character support :-)

Ah... It's not as easy as you think.


Suppose you have to port Get_Pwnam() to a system that stores user
names in one of these coding systems:

1) US-ASCII
2) EBCDIC
3) UTF-*
4) UCS-*

# This is just an example to give you an idea of how hard it is to
# handle them all in one function, so never mind whether you really
# have such a system. But if you think about porting Samba to
# System/390, you do face EBCDIC.
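
# To make the EBCDIC case concrete, here is a minimal sketch (not code
# from Samba). It assumes EBCDIC code page 037 and handles only letters
# and digits; ascii_to_ebcdic037() and name_to_ebcdic037() are
# hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper: map one US-ASCII byte to EBCDIC code page 037.
 * Numeric constants are used for the ASCII side so that the code does
 * not depend on the character set of the compiler itself.
 * Anything that is not a letter or digit becomes 0x6F (EBCDIC '?'). */
static unsigned char ascii_to_ebcdic037(unsigned char c)
{
    if (c >= 0x61 && c <= 0x69) return (unsigned char)(0x81 + (c - 0x61)); /* a-i */
    if (c >= 0x6A && c <= 0x72) return (unsigned char)(0x91 + (c - 0x6A)); /* j-r */
    if (c >= 0x73 && c <= 0x7A) return (unsigned char)(0xA2 + (c - 0x73)); /* s-z */
    if (c >= 0x41 && c <= 0x49) return (unsigned char)(0xC1 + (c - 0x41)); /* A-I */
    if (c >= 0x4A && c <= 0x52) return (unsigned char)(0xD1 + (c - 0x4A)); /* J-R */
    if (c >= 0x53 && c <= 0x5A) return (unsigned char)(0xE2 + (c - 0x53)); /* S-Z */
    if (c >= 0x30 && c <= 0x39) return (unsigned char)(0xF0 + (c - 0x30)); /* 0-9 */
    return 0x6F;
}

/* Convert a NUL-terminated US-ASCII name into out[].
 * Returns 0 on success, -1 if out[] is too small. */
static int name_to_ebcdic037(const char *in, unsigned char *out, size_t outlen)
{
    size_t i;

    for (i = 0; in[i] != '\0'; i++) {
        if (i + 1 >= outlen)
            return -1;
        out[i] = ascii_to_ebcdic037((unsigned char)in[i]);
    }
    if (outlen == 0)
        return -1;
    out[i] = 0x00;
    return 0;
}
```

# The US-ASCII byte 0x72 ('r') comes out as 0x99, so the name the
# client sent and the name the system stores share no bytes at all.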

You do know which coding system will be passed as Get_Pwnam()'s
parameter: it's US-ASCII. But you don't know which coding system your
system uses until you actually run Samba.

# You might think you will know at compile time, but what if your
# system allows you to SELECT the coding system, and you are building
# a binary package for it? In such a case you can't treat what the
# build machine tells you as valid everywhere. You have to select the
# coding system dynamically.

The easiest way to solve this problem is to make Get_Pwnam a pointer
to function, (*Get_Pwnam)(), and switch it at run time so that the
selected function matches your environment.


Just four patterns? Well, in the Japanese case alone, you may need any
of the following on the output side:

1) Your system does not support Japanese username.
2) Supports JIS ( or should I call it ISO-2022-JP ) username
3) Supports SJIS ( Microsoft Kanji Code ) username
4) Supports EUC username
5) Supports UTF-8 username
6) Supports UCS-4 username
7) Supports EBCDIC(IBM) username
8) Supports EBCDIC(Fujitsu) username
9) Supports EBCDIC(Toshiba) username
# ... Ah, maybe more. 7, 8, and 9 all differ. Slightly.

And currently, the character code passed in will be SJIS. In the
future, this will turn into UTF-8. We have to treat each case
correctly, and dynamically.
# Remember that the given string can be a MIXTURE of US-ASCII
# and SJIS, too.
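
# Why the mixture matters, as a sketch with hypothetical helper names:
# in SJIS the second byte of a two-byte character can have the same
# value as a US-ASCII letter, so any upper/lower case test on the name
# has to skip two-byte characters instead of converting byte by byte.
# The lead-byte ranges assumed here are 0x81-0x9F and 0xE0-0xFC.

```c
#include <stddef.h>

/* Nonzero if c is the first byte of a two-byte Shift_JIS character. */
static int sjis_is_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Lower-case only the US-ASCII letters of a mixed US-ASCII/SJIS
 * string, in place. Both bytes of every two-byte character are left
 * untouched. Hypothetical helper, not taken from Samba. */
static void sjis_ascii_lower(unsigned char *s)
{
    while (*s != '\0') {
        if (sjis_is_lead(*s) && s[1] != '\0') {
            s += 2;                               /* skip two-byte character */
        } else {
            if (*s >= 0x41 && *s <= 0x5A)         /* 'A'..'Z' */
                *s = (unsigned char)(*s + 0x20);  /* -> 'a'..'z' */
            s++;
        }
    }
}
```

# For the bytes 0x83 0x41 0x41 0x42 only the last two are converted.
# The first 0x41 is the second half of a two-byte character, and a
# plain tolower() loop would have turned it into a different character.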


And think further: I have not even mentioned Korean, Chinese,
Vietnamese, etc. ( mainly because I simply don't know what kinds of
choices exist there. But I'm very sure there are a lot. ).


So here comes my suggestion: just use a pointer to function. When
someone starts porting Samba to a new system, he ( or she ) will
select ( or simply create ) a new_Get_Pwnam() that matches the
system's requirements, and point (*Get_Pwnam)() at it.
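
# A minimal sketch of the idea, under assumptions of my own: the
# signature is simplified to a single name argument,
# select_get_pwnam() and the "ascii"/"euc" option values are
# hypothetical, and the code conversion inside new_Get_Pwnam() is
# left out.

```c
#include <pwd.h>
#include <string.h>
#include <sys/types.h>

typedef struct passwd *(*get_pwnam_fn)(const char *name);

/* Default: hand the name to the C library unchanged. */
static struct passwd *get_pwnam_asis(const char *name)
{
    return getpwnam(name);
}

/* What a porter would write for a new system. A real version would
 * convert the name to the local coding system before the lookup;
 * that step is omitted here. */
static struct passwd *new_Get_Pwnam(const char *name)
{
    /* (conversion to the local coding system would go here) */
    return getpwnam(name);
}

/* The pointer that callers use: (*Get_Pwnam)(user). */
struct passwd *(*Get_Pwnam)(const char *name) = get_pwnam_asis;

/* One row per supported coding system; a port adds a row. */
static const struct {
    const char  *coding_system;
    get_pwnam_fn fn;
} pwnam_variants[] = {
    { "ascii", get_pwnam_asis },
    { "euc",   new_Get_Pwnam  },
};

/* Hypothetical hook, called once the 'coding system' option is known.
 * Returns 0 on success, -1 (pointer unchanged) for an unknown name. */
int select_get_pwnam(const char *coding_system)
{
    size_t i;

    for (i = 0; i < sizeof(pwnam_variants) / sizeof(pwnam_variants[0]); i++) {
        if (strcmp(pwnam_variants[i].coding_system, coding_system) == 0) {
            Get_Pwnam = pwnam_variants[i].fn;
            return 0;
        }
    }
    return -1;
}
```

# A port then costs one function and one table row, and every caller
# keeps writing (*Get_Pwnam)(user) without knowing which one it gets.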

Leave the problem to native speakers.  It's easy to build an I18N
structure, but L10N is not an easy job.
---- 
Kenichi Okuyama at Tokyo Research Lab., IBM-Japan, Co.



