[PATCHES] CTDB - fix PID file problem

Amitay Isaacs amitay at gmail.com
Wed Jul 2 21:49:36 MDT 2014


On Thu, Jul 3, 2014 at 1:22 PM, Martin Schwenke <martin at meltin.net> wrote:

> On Wed, 2 Jul 2014 14:58:06 +0200, Sumit Bose <sbose at redhat.com> wrote:
>
> > On Wed, Jul 02, 2014 at 09:01:37PM +1000, Martin Schwenke wrote:
> > > On Wed, 2 Jul 2014 12:36:33 +0200, Sumit Bose <sbose at redhat.com>
> wrote:
> > >
> > > > On Wed, Jul 02, 2014 at 07:56:43PM +1000, Martin Schwenke wrote:
> > > > > 2 patches:
> > > > >
> > > > > * ctdb-daemon: Check PID in ctdb_remove_pidfile(), not unreliable
> flag
> > > > >
> > > > >   If something unexpectedly uses fork() then an exiting child will
> > > > >   remove the PID file while the main daemon is still running.  The
> real
> > > > >   test is whether the current process has the PID of the main CTDB
> > > > >   daemon.
> > > > >
> > > > > * ctdb-daemon: Remove ctdb->ctdbd_pid, just use ctdbd_pid
> > > >
> > > > Wouldn't it be nicer to remove the global variable?
> > > >
> > > > >
> > > > >   The global variable is needed anyway so just use it.
> > > >
> > > > Why? I have found 4 places where ctdbd_pid is used. One is the
> > > > assignment, at two other place the ctdb_context is already available.
> > > > The remaining one is in ctdb_tevent_trace() where the ctdb_context
> must
> > > > be given as private data in tevent_set_trace_callback() to make
> > > > ctdb->ctdbd_pid available.
> > >
> > > ... and an extra use is introduced in the 1st patch.  We need a
> > > reliable way of checking whether the main ctdbd is exiting so that we
> > > can remove the PID file.  The current method is unreliable (e.g. if a
> > > library call unexpectedly forks a child that then exits then the PID
> > > file is mistakenly removed). So, we need to compare against the main
> > > ctdbd PID.  We remove the PID file in an atexit() handler, which needs
> a
> > > global variable.
> > >
> > > If there's a better way of doing the 1st patch then I'm happy to drop
> > > the 2nd patch.  :-)
> >
> > Well, you can read the PID from the PID file :-) A more elegant solution
> > might be getpgrp() which returns the process group ID aka the PID of the
> > main ctdb daemon. So, getpgrp() == getpid() should be only true for the
> > main ctdb daemon as long as you do not do weird stuff with setsid(),
> > setpgid() and friends in the child processes.
>
> Sold!
>
> New patches attached.  I've used session ID instead of progress group
> because we're more likely to do weird stuff with the latter.  There's
> an extra patch to make sure that the call to setsid() succeeds at
> startup, since that's not really optional.
>
> Thanks for pointing out a better way!  :-)
>
>
You missed one change from the original patches. :-)

Amitay.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ctdb-recoverd-No-need-to-set-ctdbd_pid-again.patch
Type: text/x-patch
Size: 873 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20140703/4c116040/attachment.bin>


More information about the samba-technical mailing list