stuck Samba3 smbd processes on build farm hosts

Andrew Tridgell tridge at osdl.org
Sun Jun 12 02:56:35 GMT 2005


Jeremy,

I've just been fixing some ways in which smbd/smbtorture could get
stuck while running the build farm tests in Samba4. In doing so, I
noticed that Samba3 is hitting the same problem. For example, I see:

bash-2.03$ ps -fu build|grep Jun.01|grep bin/smbd
   build  4316  4315  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 19883 19882  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build  5593     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build  5594  5593  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 19882     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 22305     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 18144     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 18145 18144  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 22306 22305  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build  4315     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 21137     1  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd
   build 21138 21137  0   Jun 01 ?        0:00 /home/build/build_farm/prefix/samba_3_0/sbin/smbd


That is on sun1, a Solaris 2.8 box, and as you can see those processes
have been stuck for 11 days now. How do you want to handle this? In
Samba4 I have just added a --maximum-runtime switch to smbd, and it
exits if that is exceeded. This is not like the ulimit cputime
limit, as that only helps when the process is chewing cpu. This is to
limit the wall clock time the process runs for, which you can't do
with ulimit :-(

Unfortunately Samba3 uses alarm() quite a lot already, so we can't
just use a global alarm, and I can't think of any single place we can
put a hook in that guarantees a limit on process lifetime. Can you
think of a solution? I know that fixing the individual bug that is
causing this is one approach, but ideally I'd like some way of
guaranteeing that build farm tasks don't run forever.
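
One possible shape for a wall clock limit that stays out of the way of
alarm() is a separate watchdog process. The sketch below is only an
illustration in Python, not Samba code: start_watchdog and its
parameters are made-up names, and it assumes a POSIX system with
fork(). The watchdog sleeps on the wall clock and sends SIGKILL to the
process that started it once the limit passes, so no SIGALRM handler
is involved.

```python
import os
import signal
import time


def start_watchdog(max_runtime, poll_interval=0.1):
    """Fork a helper that kills the calling process after max_runtime seconds.

    Hypothetical helper for illustration. Returns the watchdog's pid
    in the calling process; the watchdog itself never returns.
    """
    target = os.getpid()
    pid = os.fork()
    if pid != 0:
        return pid  # original process: carry on with normal work

    # Watchdog child. time.monotonic() measures wall clock time, so the
    # limit applies even if the target is blocked and using no cpu.
    deadline = time.monotonic() + max_runtime
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        if os.getppid() != target:
            # The target has already exited and we were reparented.
            os._exit(0)
        time.sleep(min(poll_interval, remaining))

    # Re-check the parent so a reused pid is not killed by mistake.
    # A very small race between this check and the kill remains.
    if os.getppid() == target:
        os.kill(target, signal.SIGKILL)
    os._exit(0)
```

The cost is one extra process per server process, and a process that
forks again after the call is not covered by the same watchdog.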

Also note that the overall build farm test suite script does exit, but
it leaves behind these smbd processes (along with the corresponding
smbtorture processes). That is why so many are able to build up.
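
The leftover processes could also be tackled from the test script
side. As a rough sketch under stated assumptions (run_with_deadline is
a hypothetical name, this is not how the build farm scripts work
today, and it needs a POSIX system), the harness can start each test
command in its own session and kill the whole process group when the
command finishes or a wall clock deadline passes:

```python
import os
import signal
import subprocess


def run_with_deadline(argv, max_runtime):
    """Run argv in a new session, then kill everything left in its group.

    Returns (returncode, timed_out). returncode is negative if the
    top-level command was killed by a signal.
    """
    # start_new_session=True makes the child a session and process group
    # leader, so its process group id equals proc.pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    timed_out = False
    try:
        proc.wait(timeout=max_runtime)
    except subprocess.TimeoutExpired:
        timed_out = True
    finally:
        try:
            # Kill the command and any children still in its group.
            os.killpg(proc.pid, signal.SIGKILL)
        except (ProcessLookupError, PermissionError):
            pass  # nothing left in the group
        proc.wait()
    return proc.returncode, timed_out
```

This only catches processes that stay in the group. A daemon that
calls setsid() when it starts leaves the group, so the server would
have to be run without daemonizing for this to cover it.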

Cheers, Tridge

