[Samba] winbind ubuntu 9.10 crashing machine

Laurent BARRAILLE laurent.barraille at iut-nimes.fr
Tue May 11 03:16:37 MDT 2010


On 10/05/2010 19:14, Jim Kusznir wrote:
> Hi all:
>
> I've got a couple Ubuntu 9.10 machines that are suffering from a
> recurring failure of winbind that essentially crash the machine.  When
> the system is in the "crashed state", one can ping the system, but all
> forms of login fail.
That is expected: once winbind stops working, every service that 
authenticates through PAM stops working too.
> It will not even respond to tftpd requests; ssh
> connections "time out", but the initial port is opened (just no
> connect).  Rebooting does NOT recover from this, in order to recover,
> I need to:
>
> 1) reboot into single user mode
>    
Do you have enough free space on your partitions at this point?
> 2) edit /etc/nsswitch.conf and remove winbind
> 3) remove winbind from all pam.d/*
> 4) boot normally
> 5) stop samba and winbind
> 6) delete /var/lib/samba/* and /var/cache/samba/*
> 7) start samba
> 8) rejoin domain
> 9) start winbind
> 10) undo #2 and 3 above
>
> After this, winbind will work for a week or two.  If I stop after step
> 4 above the system is usable, but without domain users able to log in.
>   My diagnostics show that net ads users (and all other "samba"
> commands) work just fine and find all users.  All winbind-specific
> commands (wbinfo -u, etc) fail.  Oh, if I leave the system up in the
> crashed state, it begins to fill up logs to the tune of 32gigs in a
> few days.  The above procedure repeats approximately once every 5 days
> on our main production system.  I have a second workstation that sees
> very little use, and it has suffered the same crash, but far less
> frequently.  I have also tried inserting step 6.5 where I delete the
> machine account on the DC, but that doesn't change anything.  Also,
> our Ubuntu 9.04 system running the same configuration files has no
> issues.  We have not tried 10.04.
>
> This problem has been plaguing our operations for over two months now,
> so any assistance would be greatly appreciated.
>
> Some log file snippets:
>
> (from some point "in the middle" of the crash):
> May  7 15:32:45 casas-lin winbindd[20677]:   sys_select: pipe failed
> (Too many open files)
>
"Too many open files" means the limit on open files has been reached.

Use the lsof command to see which process has too many files open:

lsof|wc -l

shows how many files are open,

lsof|less

lists all open files, and

cat /proc/sys/fs/file-max

shows the system-wide limit.
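
Error 24 in the log below is EMFILE on Linux, the per-process limit, so it is also worth comparing each winbindd process against its own limit and not only against file-max. A minimal sketch, assuming a Linux /proc filesystem and that pgrep is installed; the helper names count_fds and soft_limit are made up for this example, and it needs to run as root to read the descriptors of another user's processes:

```shell
#!/bin/sh
# count_fds PID: print how many file descriptors the process has open,
# by counting the entries in /proc/PID/fd (prints 0 if unreadable).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# soft_limit PID: print the soft "Max open files" limit of the process,
# taken from the fourth field of that line in /proc/PID/limits.
soft_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits" 2>/dev/null
}

# Report every process whose name is exactly "winbindd".
for pid in $(pgrep -x winbindd 2>/dev/null); do
    echo "winbindd pid=$pid open=$(count_fds "$pid") limit=$(soft_limit "$pid")"
done
```

Running this every few minutes (from cron, for example) and watching whether "open" keeps climbing toward "limit" should show whether winbindd is leaking descriptors before the machine locks up.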

> May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0]
> lib/events.c:287(s3_event
> _debug)
> May  7 15:32:45 casas-lin winbindd[20677]:   s3_event: sys_select()
> failed: 24:Too many open f
> iles
> May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0]
> lib/select.c:64(sys_selec
> t)
> May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0]
> lib/debug.c:663(reopen_lo
> gs)
> May  7 15:32:45 casas-lin winbindd[20677]:   Unable to open new log
> file /var/log/samba/log.wb
> -CASAS: Too many open files
> ------
>  From startup (step 4 above):
> May 10 08:36:50 casas-lin kernel: May 10 08:38:42 casas-lin
> winbindd[1571]: [2010/05/10 08:38:
> 42,  0] libsmb/smb_signing.c:255(signing_good)
> May 10 08:38:42 casas-lin winbindd[1571]:   signing_good: BAD SIG: seq 41
> May 10 08:42:25 casas-lin winbindd[1562]: [2010/05/10 08:42:25,  0]
> winbindd/winbindd_dual.c:1
> 86(async_request_timeout_handler)
> May 10 08:42:25 casas-lin winbindd[1562]:
> async_request_timeout_handler: child pid 1571 is n
> ot responding. Closing connection to it.
> May 10 08:42:25 casas-lin winbindd[1571]: [2010/05/10 08:42:25,  0]
> winbindd/winbindd.c:190(wi
> nbindd_sig_term_handler)
> May 10 08:42:25 casas-lin winbindd[1571]:   Got sig[15] terminate (is_parent=0)
> May 10 08:42:25 casas-lin winbindd[1825]: [2010/05/10 08:42:25,  0]
> rpc_client/cli_pipe.c:687(
> cli_pipe_verify_schannel)
> May 10 08:42:25 casas-lin winbindd[1825]:   cli_pipe_verify_schannel:
> auth_len 56.
> May 10 08:43:37 casas-lin winbindd[1825]: [2010/05/10 08:43:37,  0]
> libsmb/smb_signing.c:255(s
> igning_good)
> May 10 08:43:37 casas-lin winbindd[1825]:   signing_good: BAD SIG: seq 23
> May 10 08:47:25 casas-lin winbindd[1562]: [2010/05/10 08:47:25,  0]
> winbindd/winbindd_dual.c:1
> 86(async_request_timeout_handler)
> May 10 08:47:25 casas-lin winbindd[1562]:
> async_request_timeout_handler: child pid 1825 is n
> ot responding. Closing connection to it.
> May 10 08:47:25 casas-lin winbindd[1825]: [2010/05/10 08:47:25,  0]
> winbindd/winbindd.c:190(wi
> nbindd_sig_term_handler)
> May 10 08:47:25 casas-lin winbindd[1825]:   Got sig[15] terminate (is_parent=0)
> May 10 08:47:25 casas-lin winbindd[1832]: [2010/05/10 08:47:25,  0]
> rpc_client/cli_pipe.c:687(
> cli_pipe_verify_schannel)
> May 10 08:47:25 casas-lin winbindd[1832]:   cli_pipe_verify_schannel:
> auth_len 56.
> May 10 08:48:38 casas-lin winbindd[1832]: [2010/05/10 08:48:38,  0]
> libsmb/smb_signing.c:255(s
> igning_good)
> May 10 08:48:38 casas-lin winbindd[1832]:   signing_good: BAD SIG: seq 23
> May 10 08:52:25 casas-lin winbindd[1562]: [2010/05/10 08:52:25,  0]
> winbindd/winbindd_dual.c:1
> 86(async_request_timeout_handler)
> May 10 08:52:25 casas-lin winbindd[1562]:
> async_request_timeout_handler: child pid 1832 is n
> ot responding. Closing connection to it.
> May 10 08:52:25 casas-lin winbindd[1832]: [2010/05/10 08:52:25,  0]
> winbindd/winbindd.c:190(wi
> nbindd_sig_term_handler)
>
> ---------
> log.wb-CASAS (my domain is CASAS.WSU.EDU)
> [2010/05/10 09:12:26,  1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
>    ads_krb5_mk_req: krb5_get_credentials failed for ad1$@CASAS (KDC
> reply did not match expectations)
> [2010/05/10 09:12:26,  1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
>    cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: KDC
> reply did not match expectations
> [2010/05/10 09:12:26,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
>    cli_pipe_verify_schannel: auth_len 56.
> [2010/05/10 09:12:26,  1]
> rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
>    cli_pipe_validate_current_pdu: RPC fault code DCERPC fault
> 0x00000721 received from host ad1.casas.wsu.edu!
> -------
> log-wb-CASAS.old (during "crashed state"):
> [2010/04/19 08:17:23,  1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
>    ads_krb5_mk_req: krb5_get_credentials failed for ad1$@CASAS (Cannot
> resolve network address
> for KDC in requested realm)
> [2010/04/19 08:17:23,  1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
>    cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: Cannot
> resolve network address f
> or KDC in requested realm
> [2010/04/19 08:17:23,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
>    cli_pipe_verify_schannel: auth_len 56.
> [2010/04/19 08:17:23,  1]
> rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
>    cli_pipe_validate_current_pdu: RPC fault code DCERPC fault
> 0x00000721 received from host ad1
> .casas.wsu.edu!
> ------------
> My configuration
> ------------
> smb.conf
> ------------
> [global]
>          security = ads
>          netbios name = casas-lin
>          realm = CASAS.WSU.EDU
> 	workgroup = CASAS
>          password server = ad1.casas.wsu.edu
>          workgroup = CASAS
>          idmap uid = 10000-20000
>          idmap gid = 10000-20000
> 	idmap backend = rid:CASAS.WSU.EDU=10000-20000
>          winbind enum users = yes
>          winbind enum groups = yes
>          winbind use default domain = yes
>          #template homedir = /home/%U
>          template homedir = /net/files/home/%U
>          template shell = /bin/bash
> ;        client use spnego = yes
>          domain master = no
> --------------
> /etc/krb5.conf
> -------------
> [logging]
>   default =FILE:/var/log/krb5libs.log
>   kdc =FILE:/var/log/krb5kdc.log
>   admin_server =FILE:/var/log/kadmind.log
>
> [libdefaults]
>   default_realm = CASAS.WSU.EDU
>   dns_lookup_realm = false
>   dns_lookup_kdc = true
>   ticket_lifetime = 24h
>   forwardable = yes
>
> [realms]
>   EXAMPLE.COM = {
>    kdc = kerberos.example.com:88
>    admin_server = kerberos.example.com:749
>    default_domain = example.com
>   }
>
>   CASAS.WSU.EDU = {
>    kdc = ad1.casas.wsu.edu
>    admin_server = ad1.casas.wsu.edu
>    kdc = ad1.casas.wsu.edu
>   }
>
>   CASAS = {
>    kdc = ad1.casas.wsu.edu
>    admin_server = ad1.casas.wsu.edu
>    kdc = ad1.casas.wsu.edu
>   }
>
> [domain_realm]
>   .example.com = EXAMPLE.COM
>   example.com = EXAMPLE.COM
>
>   casas.wsu.edu = CASAS.WSU.EDU
>   .casas.wsu.edu = CASAS.WSU.EDU
> [appdefaults]
>   pam = {
>     debug = false
>     ticket_lifetime = 36000
>     renew_lifetime = 36000
>     forwardable = true
>     krb4_convert = false
>   }
> ---------------
> /etc/pam.d/common-account
> ---------------
> account	[success=1 new_authtok_reqd=done default=ignore]	pam_unix.so
> account	requisite			pam_deny.so
> account	required			pam_permit.so
> account	sufficient			pam_winbind.so
> account	required			pam_krb5.so minimum_uid=1000
> ------------
> /etc/pam.d/common-auth
> ------------
> auth	[success=3 default=ignore]	pam_winbind.so krb5_auth krb5_ccache_type=FILE
> auth	[success=2 default=ignore]	pam_krb5.so minimum_uid=1000 try_first_pass
> auth	[success=1 default=ignore]	pam_unix.so nullok_secure try_first_pass
> auth	requisite			pam_deny.so
> auth	required			pam_permit.so
> ------------
> /etc/pam.d/common-password
> ------------
> password	requisite			pam_winbind.so
> password	requisite			pam_krb5.so minimum_uid=1000 use_authtok
> password	[success=1 default=ignore]	pam_unix.so obscure use_authtok
> try_first_pass sha512
> password	requisite			pam_deny.so
> password	required			pam_permit.so
> password	optional	pam_gnome_keyring.so
> -------------
> /etc/nsswitch.conf
> -------------
> passwd:         compat winbind
> group:          compat winbind
> shadow:         compat
>
> hosts:          files dns mdns4
> networks:       files
>
> protocols:      db files
> services:       db files
> ethers:         db files
> rpc:            db files
>
> netgroup:       nis
> ----------------
>
> Thanks!
> --Jim
>    

