my trip to a big win2k/wk3 forest

Andreas andreas at conectiva.com.br
Tue Oct 19 14:55:57 GMT 2004


Hi all,

I would just like to report on my trip to a site with several dozen
Active Directory servers in a country-wide forest (several WAN links).

I installed samba-3.0.7 (plus the winbind group patch to allow more than
64 group memberships) on a Linux desktop machine and was given the task of
integrating this machine into the domain and having it authenticate all users.
After a few days I also tried samba 3_0 from svn (latest was from yesterday),
but it didn't change much.

I joined the domain without problems and could authenticate either via kerberos5
or pam_winbind.

The first problem was with kde-3.2.3. A simple right-click on a folder and then
selecting properties would cause the machine to hang for about 6 minutes. The
network traffic went through the roof. It turns out KDE was enumerating all
groups via getgrent in order to present the user with a nice dialog box with
all groups the user belongs to so that the file permissions could be easily
changed. Well, in this situation it was trying to enumerate all groups in all
domains. I opened a ticket about this and tried a simple patch that made it use
getgrouplist() instead, and it worked.

(http://bugs.kde.org/show_bug.cgi?id=89646)

The patch is not very nice, it's just to prove a point. I hope some kde developer
can come up with a better one.
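
To illustrate the difference between the two approaches, here is a minimal C
sketch. It is not the KDE patch itself: the function names count_all_groups()
and user_groups() are made up for this example, and it assumes glibc on Linux.
The first function walks the whole group database the way getgrent() callers
do, the second asks only for the groups of a single user.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* getgrouplist() is a BSD/glibc extension */
#endif
#include <assert.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/* Expensive path: walk the whole group database.
   With winbind in nsswitch.conf this enumerates every group it can reach. */
int count_all_groups(void)
{
    int count = 0;

    setgrent();
    while (getgrent() != NULL)
        count++;
    endgrent();
    return count;
}

/* Cheap path: ask only for the groups of one user.
   Stores a malloc'd array in *out (caller frees) and returns its length,
   or -1 if memory runs out. */
int user_groups(const char *user, gid_t primary, gid_t **out)
{
    assert(user != NULL && out != NULL);

    int cap = 16;
    gid_t *list = malloc(cap * sizeof *list);
    if (list == NULL)
        return -1;

    for (;;) {
        int n = cap;
        if (getgrouplist(user, primary, list, &n) != -1) {
            *out = list;
            return n;
        }
        /* Buffer too small: glibc reports the needed size in n.
           Fall back to doubling in case n was not updated. */
        cap = (n > cap) ? n : cap * 2;
        gid_t *bigger = realloc(list, cap * sizeof *list);
        if (bigger == NULL) {
            free(list);
            return -1;
        }
        list = bigger;
    }
}
```

A caller would normally look up the user with getpwnam() and pass pw_gid as
the primary group. The point is only that the second function never has to
touch groups the user is not a member of.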

But before that I realised I had to limit the scope of my userbase. Having winbind
query all AD servers was slow and unnecessary for the most part, since roaming users
are not that frequent at this site. So I added to smb.conf:

winbind trusted domains only = yes

This made even KDE (unpatched) behave better, since it would only enumerate groups
local to that domain (about 400 groups in this case).

While debugging this without the patch and without the trusted domains limitation, I
often saw messages in the winbind logs about winbind not being able to contact
a specific AD server and giving up after 10s. At the same time it seemed that whatever
task was being done (like enumerating all groups) would stop, so I was left wondering
whether winbind had stopped even trying to reach the other AD servers.

Now, a question more about the windows side of this. How does a windows 2k client behave
in this scenario when one wants, for example, to give permission to a user from another
domain? Does it contact the AD server for that domain to retrieve the user list or does
it contact the local AD server and expect it to fetch this user list? From my observations,
it seems samba would try to contact all AD servers it could.

Also during testing I ran into some flaws related to groups. For instance, on one occasion
getgrouplist() returned gids which getgrgid() could not resolve to a group entry. Checking
the gids manually showed that they were actually gids associated with several BUILTIN
groups.
I also saw some discrepancies between the output of getgrouplist() and "id user". This may
be due to a cache issue, I think. I can't come up with a complete test scenario for this yet.
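
For anyone who wants to check for the same symptom, a rough C sketch of the
test follows. It is only an illustration, not the exact test I ran: the
function name count_unresolved() is made up, and the gid array is assumed to
come from a getgrouplist() call for the user in question.

```c
#include <assert.h>
#include <errno.h>
#include <grp.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Print each gid with its group name and return how many of them
   getgrgid() could not resolve to a group entry. */
int count_unresolved(const gid_t *gids, int n)
{
    assert(gids != NULL || n == 0);

    int missing = 0;
    for (int i = 0; i < n; i++) {
        errno = 0;  /* lets us tell "not found" apart from a lookup error */
        const struct group *gr = getgrgid(gids[i]);
        if (gr != NULL) {
            printf("%lu -> %s\n", (unsigned long)gids[i], gr->gr_name);
        } else {
            printf("%lu -> unresolved (%s)\n", (unsigned long)gids[i],
                   errno ? strerror(errno) : "no such group entry");
            missing++;
        }
    }
    return missing;
}
```

A non-zero return value for a domain user would reproduce the first problem
above. Comparing the printed list with the output of "id user" would show the
second one.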


So, executive summary:
- I think setting "winbind trusted domains only = yes" is a must for this network, at least
  for now. I will lose the ability to authenticate roaming users, though.
- it seems group resolution is somewhat fragile at the moment. That or the idmap backend, I
  can't be sure. Several times I wiped out the cache and the idmap file, especially after
  upgrading to 3.0.8 (perhaps due to the lowercasing that is being done in that version).
  The BUILTIN groups also seem to be causing some trouble, being mixed with real groups.
  (what is the purpose of BUILTIN groups anyway? Are they like the local groups on a win machine?)
- getgrent() and friends should be eradicated :)


