[Samba] winbind trouble under load?

Andrew Bartlett abartlet at samba.org
Tue Oct 1 13:42:00 GMT 2002


Samba Samba /pers wrote:
> 
> We have a large W2K domain with numerous terminalservers at the local
> sites. Those sites also have a linux-2.2.20 server with samba-2.2.5.
> The samba is used to store the profiles for both the terminalservers
> and for the windows 2000/xp clients.
> 
> I use winbind and have joined the server to the domain without problem. I
> can set rights on directories and so on. However from time to time when the
> users login to the W2K terminalserver they get a popup-message:
> 
> "
> Windows cannot locate your roaming profile and is attempting to log you on
> with your local profile. Changes to the profile will not be propagated to
> the server.
> 
> DETAIL - The specified network password is not correct.
> "
> 
> However, since the user can log in, there is nothing wrong with their
> password. One of my theories is that something goes wrong when Samba tries
> to authenticate the user against the W2K domain. Either it has lost the
> connection (and can't reconnect automatically) or there is some other
> error. The user does get logged on but is of course missing their profile
> and such. Since this is a school environment, the users log in at much the
> same time, and the problem seems to show up when many users log in at
> once.

Yes, well, Samba can do nasty things to a DC when it has to hit it like
that.  That's one connection to the DC per authentication. :-(

> I have tried both samba-2.2.5 and currently samba-2.2.6cvs (020926). The
> problem still persists, which is what brings me to the mailing list in
> search of an answer.
> 
> I have disabled the "winbind enum users" and "winbind enum groups"
> options, since if I enable them winbind goes into a nonresponsive state,
> probably because we have 10K users or more.

Yes, that would be 'a good idea' :-).

> I'm also testing letting Samba create the user's profile directory, but
> that didn't affect the problem.
> 
> Samba also seems to lose the ability to look up the user's name in the
> domain and displays it like this:
> 
> drwx------    4 10283     SKOLA\Do     4096 Aug 22 23:30 dla0826
> 
> instead of:
> drwx------    4 SKOLA\dla0826     SKOLA\Do     4096 Aug 22 23:30 dla0826

This would happen when winbind gets itself stuck.
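
A quick way to tell whether winbind is the part that is stuck is to test
an NSS lookup for a domain user directly. A minimal sketch, assuming
getent is available and that passwd lookups are routed through winbind in
/etc/nsswitch.conf; the function name is only illustrative:

```shell
#!/bin/sh
# check_lookup USER: report whether NSS (and therefore winbind) can resolve USER.
# Assumes passwd lookups go through winbind in /etc/nsswitch.conf.
check_lookup() {
  user=$1
  if getent passwd "$user" >/dev/null 2>&1; then
    printf 'OK: %s resolves\n' "$user"
  else
    printf 'FAIL: %s does not resolve\n' "$user"
    return 1
  fi
}
```

For example, check_lookup 'SKOLA\dla0826'. If that fails while the DC is
still reachable, winbind is the likely culprit.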

> I have enclosed all my logs and the configuration.
> 
> This is turning into a major problem with the users and if I can't get this
> fixed then my only other option is to move the profiles back to the
> windows2000 fileservers. However that option would leave me with needing
> to transfer the profiles over the WAN to the users site.
> 
> smb.conf (from testparm)

testparm now (2.2.6pre2) has an option to only display non-default
values.  That makes it easier to figure out what you have actually
changed...

I would avoid the exec on open, just because I see Win2k doing a *lot*
of tree connects/disconnects.  I would instead suggest using
pam_mkhomedir (or a modified variant) because it runs once per session,
not once per tree connect.
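
Whichever hook ends up running it, it helps if the directory creation is
idempotent, so that a repeated call is harmless. A rough sketch of your
crehome.sh reworked along those lines; the function name and the
PROFILE_BASE and PROFILE_GROUP variables are made up for illustration,
while the path and group come from your script:

```shell
#!/bin/sh
# Create the profile tree for one user; safe to call repeatedly.
# PROFILE_BASE and PROFILE_GROUP are illustrative names, not Samba settings.
PROFILE_BASE=${PROFILE_BASE:-/samba/profiler}
PROFILE_GROUP=${PROFILE_GROUP:-}   # e.g. 'SKOLA\Domain Users'; empty skips chgrp

create_profile_dirs() {
  user=$1
  # Refuse empty names and anything that could escape PROFILE_BASE.
  case $user in
    ''|*/*|.|..) printf 'bad user name: %s\n' "$user" >&2; return 1 ;;
  esac
  dir=$PROFILE_BASE/$user
  # mkdir -p succeeds even if the directories already exist.
  mkdir -p "$dir/nt" "$dir/ts" || return 1
  if [ -n "$PROFILE_GROUP" ]; then
    chgrp -R "$PROFILE_GROUP" "$dir" || return 1
  fi
  chmod 700 "$dir"
}
```

Calling create_profile_dirs "$1" from the existing script would keep the
current interface while dropping the separate existence test.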

> -----------------
> /usr/local/bin/crehome.sh
> #!/bin/sh
> 
> # 1.0.1 (2002-09-23)
> 
> SMBUSER=$1
> 
> if [ ! -d /samba/profiler/$SMBUSER ]; then
>   echo creating $SMBUSER >> /tmp/crehome.txt
>   mkdir /samba/profiler/$SMBUSER >> /tmp/crehome.txt
>   mkdir /samba/profiler/$SMBUSER/nt >> /tmp/crehome.txt
>   mkdir /samba/profiler/$SMBUSER/ts >> /tmp/crehome.txt
>   chgrp -R "SKOLA\Domain Users" /samba/profiler/$SMBUSER >> /tmp/crehome.txt
>   chmod 700 /samba/profiler/$SMBUSER >> /tmp/crehome.txt
>   echo "-----------------" >> /tmp/crehome.txt
> fi
> 
> -----------------
> 
> Error on terminalserver:
> 
> Event Type:     Error
> Event Source:   Userenv
> Event Category: None
> Event ID:       1000
> Date:           2002-10-01
> Time:           09:26:03
> User:           SKOLA\llu0731
> Computer:       KA-WTS01
> Description:
> Windows cannot locate your roaming profile and is attempting to log you on
> with your local profile. Changes to the profile will not be propagated to
> the server.
> 
> DETAIL - The specified network password is not correct.
> 
> Event Type:     Error
> Event Source:   Userenv
> Event Category: None
> Event ID:       1000
> Date:           2002-10-01
> Time:           09:26:04
> User:           NT AUTHORITY\SYSTEM
> Computer:       KA-WTS01
> Description:
> Windows cannot find the local profile and is logging you on with a
> temporary profile. Changes you make to this profile will be lost when you
> log off.
> 
> ....
> 
> "The specified network password is not correct" is however bullshit.

Well, that very much depends on what Samba told Win2k.

> -------------------
> 
> Error on W2K DC
> 
> Event Type:     Error
> Event Source:   Srv
> Event Category: None
> Event ID:       2006
> Date:           2002-09-30
> Time:           12:28:58
> User:           N/A
> Computer:       DC01
> Description:
> The server received an incorrectly formatted request from \\193.180.x.y
> Data:
> 0000: 00 00 34 00 02 00 7c 00   ..4...|.
> 0008: 00 00 00 00 d6 07 00 c0   ....Ö..À
> 0010: 00 00 00 00 01 20 98 c0   ..... ?À
> 0018: 00 00 00 00 00 00 00 00   ........
> 0020: 00 00 00 00 00 00 00 00   ........
> 0028: b3 06 00 00 ff 53 4d 42   ³...ÿSMB
> 0030: 25 00 00 00 00 08 01 c0   %......À
> 0038: 00 00 00 00 00 00 00 00   ........
> 0040: 00 00 00 00 00 d0 6d 38   .....Ðm8
> 0048: 02 50 01 00 10 00 00 48   .P.....H
> 0050: 00 00 00 48 00 00 00 00   ...H....
> 0058: 00 00 00 00               ....

Now *this* is interesting.  I've only heard of it once, and it was not
reproducible.  Can you reproduce this error, and try to get a packet
sniff of it?  I would be interested to see what it actually is.
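
Something along these lines, run as root on the Samba server, would do for
the capture. It is only a sketch: the function name, the default interface
and the output file are assumptions to adjust for your setup.

```shell
#!/bin/sh
# capture_dc_traffic DC_HOST [IFACE] [OUTFILE]
# Writes the SMB traffic exchanged with the DC to a pcap file.
# The defaults (eth0, /tmp/winbind-dc.pcap) are placeholders.
capture_dc_traffic() {
  dc_host=$1
  iface=${2:-eth0}
  out=${3:-/tmp/winbind-dc.pcap}
  # -s 0 captures whole packets rather than just the headers;
  # ports 139 and 445 cover SMB over NetBIOS and direct-hosted SMB.
  tcpdump -i "$iface" -s 0 -w "$out" host "$dc_host" and '(' port 139 or port 445 ')'
}
```

Leave it running across a login rush, stop it with Ctrl-C once the DC has
logged the error, and note the time of the event so the packet can be
found in the trace.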

> -------------------
> 
> /var/log/samba/log.winbind (earlier errors not related in time to the
> login troubles)
> 
> [2002/09/30 01:01:31, 0] lib/util_sock.c:read_socket_with_timeout(300)
>   read_socket_with_timeout: timeout read. read error = Connection reset by
> peer.
> [2002/09/30 01:01:31, 0] rpc_client/cli_pipe.c:rpc_api_pipe(359)
>   cli_pipe: return critical error. Error was SUCCESS - 0
> [2002/09/30 01:25:31, 0] lib/util_sock.c:read_socket_with_timeout(300)
>   read_socket_with_timeout: timeout read. read error = Connection reset by
> peer.
> [2002/09/30 01:25:31, 0] rpc_client/cli_pipe.c:rpc_api_pipe(359)
>   cli_pipe: return critical error. Error was SUCCESS - 0
> [2002/09/30 07:51:59, 0] lib/util_sock.c:read_socket_with_timeout(300)
>   read_socket_with_timeout: timeout read. read error = Connection reset by
> peer.
> [2002/09/30 07:51:59, 0] rpc_client/cli_pipe.c:rpc_api_pipe(359)
>   cli_pipe: return critical error. Error was SUCCESS - 0
> 
> -------------------
> 
> /var/log/samba/log.193.180.x.y (the terminalserver)
> 
> [2002/10/01 13:21:50, 0] smbd/sec_ctx.c:initialise_groups(244)
>   Unable to initgroups. Error was Input/output error
> 
> The logs are full of those messages. However, I think they are due to
> the fact that I have winbind enum groups = no in /etc/samba/smb.conf

That should not be.  That error is probably something else...
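
To see whether these errors line up with the login rush, it might help to
count them per hour. A rough sketch, assuming the log header format shown
in your excerpts; the function name is made up:

```shell
#!/bin/sh
# count_per_hour PATTERN LOGFILE
# Prints "YYYY/MM/DD HH count" for each hour in which a log header line
# (the line starting with "[date time, level]") contains PATTERN.
count_per_hour() {
  awk -v pat="$1" '
    index($0, pat) && substr($1, 1, 1) == "[" {
      hour = substr($1, 2) " " substr($2, 1, 2)
      n[hour]++
    }
    END { for (h in n) print h, n[h] }
  ' "$2" | sort
}
```

For example, count_per_hour 'initialise_groups(' on the terminal server's
log and count_per_hour 'read_socket_with_timeout(' on log.winbind; if both
peak in the same hours as the logins, load is the likely common factor.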

In any case, one course of action might be (assuming you are running an
Active Directory setup) to move to Samba 3.0.  If the Win2k clients get
Kerberos credentials, then Samba doesn't need to contact the DC at all
for authentication.  (It might need to contact it for other things,
but these can be cached too.)  Also, Samba 3.0 uses an LDAP
client on AD, which I suspect will cope much better with 10000 users.

Samba 3.0 also has a 'dual daemon' mode where it can operate out of
its cache while waiting for new answers from the DC, which might help
avoid a blocking winbind call backlogging the entire system.

Finally, Samba 3.0 has *much* better error reporting, so you might get a
meaningful error message too!

Andrew Bartlett

-- 
Andrew Bartlett                                 abartlet at pcug.org.au
Manager, Authentication Subsystems, Samba Team  abartlet at samba.org
Student Network Administrator, Hawker College   abartlet at hawkerc.net
http://samba.org     http://build.samba.org     http://hawkerc.net


