Process smbd using 100% CPU and impossible to kill
Sam Liddicott
sam at liddicott.com
Mon Feb 16 13:44:22 MST 2009
I think Alt-SysRq-T will dump a stack trace of the process on the CPU to syslog. At 100% CPU, chances are that it will be smbd, so you could get lucky.
Sam
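For a box with no keyboard attached, the same dump can usually be requested by writing the command letter to /proc/sysrq-trigger as root. What follows is only a minimal sketch of that idea, not something from the original message. The helper name trigger_sysrq and its path parameter are made up for illustration, and it assumes a kernel built with magic SysRq support.

```python
# Hypothetical helper: ask the kernel for a SysRq action without a keyboard.
# Assumes magic SysRq support is compiled in and the caller is root.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def trigger_sysrq(command, path=SYSRQ_TRIGGER):
    """Write a single SysRq command character (e.g. 't') to the trigger file."""
    if len(command) != 1 or not command.isalnum():
        raise ValueError("SysRq command must be a single letter or digit")
    # The kernel acts on the character written; output goes to the kernel log.
    with open(path, "w") as trigger:
        trigger.write(command)
```

Calling trigger_sysrq("t") as root should have the same effect as the key combination, with the task dump appearing in the kernel log (dmesg or syslog). The path parameter exists only so the function can be tried against an ordinary file first.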
-----Original Message-----
From: David Collier-Brown <davec-b at rogers.com>
Sent: 16 February 2009 19:25
To: Cedric Simon <cedric at solucionjava.com>
Cc: samba-technical at lists.samba.org
Subject: Re: Process smbd using 100% CPU and impossible to kill
Cedric Simon wrote:
> Hello,
>
> We have recently installed a Samba server on OpenSuse 11.1 and we have
> the following problem: after some time, an smbd process starts using 100%
> of the CPU, and it is impossible to kill it, even with a kill -9 pid.
>
> The Samba service can be stopped/started, but the smbd process keeps using
> 100% CPU. Shutdown does not work either. Only a power off of the server
> can 'solve' the problem.
>
You've tripped over a low-level problem of some sort which is doing a
denial-of-service attack on Samba, and therefore on everyone else (;-))

If it's always the same user that triggers the problem, or if the first
user who logs on will trigger it, you can attach a debugger to the samba
process, induce the problem and tell the samba folks where it died,
which *may* give you a clue about what failed.

If not, you can run strace on it and see if it loops on a system call.

Failing that, I'd try swapping parts (:-().
--dave
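To make the strace suggestion concrete: one way to spot a loop is to capture some output (for example with strace -p PID -o FILE) and check whether a single system call dominates it. The sketch below is an illustration under assumptions, not part of the original advice. The function name dominant_syscall, the 0.9 threshold, and the sample lines in the comments are all hypothetical, and the regex only covers the common "name(args) = result" line shape.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a typical strace line.
# Hypothetical examples of the shape it expects:
#   read(27, "", 4096) = 0
#   [pid 11878] select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout)
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?([a-z_][a-z0-9_]*)\(")


def dominant_syscall(lines, threshold=0.9):
    """Return (name, fraction) if one syscall makes up at least `threshold`
    of the recognised lines, otherwise None."""
    names = [m.group(1) for m in map(SYSCALL_RE.match, lines) if m]
    if not names:
        return None  # nothing recognisable, e.g. strace printed no calls
    name, count = Counter(names).most_common(1)[0]
    fraction = count / len(names)
    return (name, fraction) if fraction >= threshold else None
```

Feeding it the lines of the capture file (open(FILE).read().splitlines()) gives a quick yes/no on whether the process is spinning on one call. If strace prints nothing at all while the CPU stays pegged, the process is probably stuck inside the kernel rather than looping through system calls, and this approach will not help.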
> Please find below my findings and info.
>
> The pid is running as root instead of admon, and the running time = time
> since user (IP 192.168.1.67) actually disconnected from Samba. I assume
> something goes wrong while closing the process, and the process enters
> an unstable/phantom state, using 100% of the CPU.
>
> As CPU is used 100%, it affects the whole server :-(((
>
> If you have any idea of what could be wrong/solve this problem, feel
> free to tell me. Also if you need some more info, if I can get it I'll
> send it to you.
>
> As the server is in prod at a client's site, I can't do 'what I want' with
> the server. We are investigating moving the Windows clients to NFS.
>
> Please note most users, e.g. the IP 192.168.1.67, are using a wireless
> connection, and in some cases can lose the network. This might be part of
> the problem.
>
> But my major concern is how Linux can have a running process (smbd) that
> is impossible to kill and that prevents shutdown of the server, as well as
> 'normal' operation, since it uses 100% CPU.
>
> Many thanks in advance for your help.
>
> Cedric Simon.
>
>
> smb.conf
>
> [global]
> workgroup = MEDLAB
> server string = Servidor de archivos de Medlab
> map to guest = Bad User
> null passwords = Yes
> guest account = samba
> printcap name = cups
> ldap ssl = no
> create mask = 0777
> force create mode = 0777
> force security mode = 0777
> directory mask = 0777
> force directory mode = 0777
> force directory security mode = 0777
> cups options = raw
>
> [users]
> comment = All users
> path = /shared
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [admon]
> comment = Administracion
> path = /shared/admon
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [clientes]
> comment = Clientes
> path = /shared/clientes
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [gerencia]
> comment = Gerencia
> path = /shared/gerencia
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [medicos]
> comment = Medicos
> path = /shared/medicos/
> inherit acls = yes
> veto files = /aquota.user/groups/shares/
> guest ok = yes
> read only = no
>
>
> [compartido]
> comment = All groups
> path = /shared/compartido/
> username = samba
> read only = No
> acl check permissions = No
> force unknown acl user = Yes
> guest ok = Yes
> hosts allow = 192.168.1.
>
>
> relih:~ # top
> top - 12:35:00 up 19:40, 1 user, load average: 3.01, 2.92, 2.33
> Tasks: 132 total, 4 running, 128 sleeping, 0 stopped, 0 zombie
> Cpu(s): 0.0%us, 25.0%sy, 0.0%ni, 74.5%id, 0.3%wa, 0.0%hi, 0.2%si, 0.0%st
> Mem: 2048884k total, 1996896k used, 51988k free, 98664k buffers
> Swap: 2104504k total, 28k used, 2104476k free, 1506752k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 11878 root 20 0 16984 5188 3932 R 100 0.3 37:24.13 smbd
> 14763 root 20 0 2432 1132 848 R 1 0.1 0:00.04 top
> 1 root 20 0 1008 380 332 S 0 0.0 0:02.00 init
> 2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
> 3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
> 4 root 15 -5 0 0 0 S 0 0.0 0:00.84 ksoftirqd/0
> 5 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/1
>
> log.smbd:
>
> [2009/02/14 03:45:15, 0] smbd/server.c:main(1208)
> smbd version 3.2.6-0.3.1-2042-SUSE-CODE11 started.
> Copyright Andrew Tridgell and the Samba Team 1992-2008
> [2009/02/14 07:25:36, 1] smbd/service.c:make_connection_snum(1194)
> nadia (::ffff:192.168.1.104) connect to service admon initially as
> user admon (uid=1002, gid=100) (pid 11622)
> [2009/02/14 07:34:17, 1] smbd/service.c:make_connection_snum(1194)
> lenovo_medicos (::ffff:192.168.1.80) connect to service medicos
> initially as user medicos (uid=1004, gid=100) (pid 11651)
> [2009/02/14 07:34:17, 1] smbd/service.c:make
[The entire original message is not included]