Process smbd using 100% CPU and impossible to kill

Sam Liddicott sam at liddicott.com
Mon Feb 16 13:44:22 MST 2009


I think Alt-SysRq-T will dump task stack traces to syslog, including whatever is on the CPU. At 100% CPU, chances are that it will be smbd, so you could get lucky.
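
If there is no keyboard on the console, the same dump can usually be requested by writing the command key to /proc/sysrq-trigger. Here is a minimal Python sketch of that, assuming you are root and the kernel was built with magic SysRq support. The function name and the `trigger_path` parameter are only mine, for illustration:

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"  # usual location on Linux


def request_task_dump(trigger_path=SYSRQ_TRIGGER, key="t"):
    """Write a single SysRq command key to the trigger file.

    The 't' key asks the kernel to log the current tasks and their
    state, which is what Alt-SysRq-T does from the keyboard.
    """
    if len(key) != 1:
        raise ValueError("SysRq key must be a single character")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
    return key


# Typical use, as root:
#   request_task_dump()
# then look for the smbd entry in the kernel log (dmesg or syslog).
```

The output can be long, since every task is listed, so it may be easier to read it from the log file than from dmesg.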

Sam

-----Original Message-----
From: David Collier-Brown <davec-b at rogers.com>
Sent: 16 February 2009 19:25
To: Cedric Simon <cedric at solucionjava.com>
Cc: samba-technical at lists.samba.org
Subject: Re: Process smbd using 100% CPU and impossible to kill


Cedric Simon wrote:
> Hello,
>
> We have recently installed a Samba server on openSUSE 11.1 and we have
> the following problem: after some time, an smbd process starts using 100%
> of the CPU, and it is impossible to kill it, even with a kill -9 pid.
>
> The Samba service can be stopped and started, but the smbd process keeps
> using 100% CPU. Shutdown does not work either. Only a power off of the
> server can 'solve' the problem.
>   
You've tripped over a low-level problem of some sort which is doing a
denial-of-service attack on Samba, and therefore on everyone else (;-))

If it's always the same user that triggers the problem, or if the first
user who logs on will trigger it, you can attach a debugger to the samba
process, induce the problem and tell the samba folks where it died, which
*may* give you a clue about what failed.
If not, you can run strace on it and see if it loops on a system call.
Failing that, I'd try swapping parts (:-().
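
Before choosing between the debugger and strace, it may help to check whether the process is burning user time or system time. The quoted top header shows 0.0%us and 25.0%sy, which hints at a loop inside the kernel. In that case strace would likely show nothing, and kill -9 would not take effect until the process returns to user mode. A rough Python sketch that reads the counters from /proc/<pid>/stat follows. The function names are hypothetical, and the field positions are the ones given in proc(5):

```python
def parse_proc_stat(text):
    """Extract state and CPU tick counters from /proc/<pid>/stat content."""
    # The command name sits in parentheses and may itself contain
    # spaces or ')', so split on the last ')' in the line.
    head, _, tail = text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()
    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": fields[0],              # field 3: R, S, D, Z, ...
        "utime_ticks": int(fields[11]),  # field 14: time in user mode
        "stime_ticks": int(fields[12]),  # field 15: time in kernel mode
    }


def read_proc_stat(pid):
    """Read and parse the stat file of a live process."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_proc_stat(stat_file.read())


# Typical use: call read_proc_stat(pid) twice, a few seconds apart, and
# compare how much utime_ticks and stime_ticks each grew.
```

If only stime_ticks grows between two readings, the time is going into the kernel, and the SysRq dump is probably the better tool.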

--dave


> Please find below my findings and info.
>
> The process is running as root instead of admon, and its running time
> equals the time since the user (IP 192.168.1.67) actually disconnected
> from Samba. I assume something goes wrong while closing the process, and
> the process enters an unstable/phantom state, using 100% of the CPU.
>
> As the CPU is 100% used, it affects the whole server :-(((
>
> If you have any idea of what could be wrong or how to solve this problem,
> feel free to tell me. Also, if you need some more info, I'll send it to
> you if I can get it.
>
> As the server is in production at a client's site, I can't do 'what I
> want' with the server. We are investigating moving the Windows clients
> to NFS.
>
> Please note that most users, including the one at IP 192.168.1.67, use a
> wireless connection and in some cases can lose the network. This might be
> part of the problem.
>
> But my major concern is how Linux can have a running process (smbd) that
> is impossible to kill and that prevents shutdown of the server, as well
> as 'normal' operation, since it uses 100% CPU.
>
> Many thanks in advance for your help.
>
> Cedric Simon.
>
>
> smb.conf
>
> [global]
> workgroup = MEDLAB
> server string = Servidor de archivos de Medlab
> map to guest = Bad User
> null passwords = Yes
> guest account = samba
> printcap name = cups
> ldap ssl = no
> create mask = 0777
> force create mode = 0777
> force security mode = 0777
> directory mask = 0777
> force directory mode = 0777
> force directory security mode = 0777
> cups options = raw
>
> [users]
> comment = All users
> path = /shared
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [admon]
> comment = Administracion
> path = /shared/admon
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [clientes]
> comment = Clientes
> path = /shared/clientes
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [gerencia]
> comment = Gerencia
> path = /shared/gerencia
> read only = No
> inherit acls = Yes
> veto files = /aquota.user/groups/shares/
>
> [medicos]
> comment = Medicos
> path = /shared/medicos/
> inherit acls = yes
> veto files = /aquota.user/groups/shares/
> guest ok = yes
> read only = no
>
>
> [compartido]
> comment = All groups
> path = /shared/compartido/
> username = samba
> read only = No
> acl check permissions = No
> force unknown acl user = Yes
> guest ok = Yes
> hosts allow = 192.168.1.
>
>
> relih:~ # top
> top - 12:35:00 up 19:40,  1 user,  load average: 3.01, 2.92, 2.33
> Tasks: 132 total,   4 running, 128 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us, 25.0%sy,  0.0%ni, 74.5%id,  0.3%wa,  0.0%hi,  0.2%si,  0.0%st
> Mem:   2048884k total,  1996896k used,    51988k free,    98664k buffers
> Swap:  2104504k total,       28k used,  2104476k free,  1506752k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 11878 root      20   0 16984 5188 3932 R  100  0.3  37:24.13 smbd
> 14763 root      20   0  2432 1132  848 R    1  0.1   0:00.04 top
>     1 root      20   0  1008  380  332 S    0  0.0   0:02.00 init
>     2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
>     3 root      RT  -5     0    0    0 S    0  0.0   0:00.00 migration/0
>     4 root      15  -5     0    0    0 S    0  0.0   0:00.84 ksoftirqd/0
>     5 root      RT  -5     0    0    0 S    0  0.0   0:00.00 migration/1
>
> log.smbd:
>
> [2009/02/14 03:45:15,  0] smbd/server.c:main(1208)
>   smbd version 3.2.6-0.3.1-2042-SUSE-CODE11 started.
>   Copyright Andrew Tridgell and the Samba Team 1992-2008
> [2009/02/14 07:25:36,  1] smbd/service.c:make_connection_snum(1194)
>   nadia (::ffff:192.168.1.104) connect to service admon initially as user admon (uid=1002, gid=100) (pid 11622)
> [2009/02/14 07:34:17,  1] smbd/service.c:make_connection_snum(1194)
>   lenovo_medicos (::ffff:192.168.1.80) connect to service medicos initially as user medicos (uid=1004, gid=100) (pid 11651)
> [2009/02/14 07:34:17,  1] smbd/service.c:make

[The entire original message is not included]