[PATCH] Fix CTDB PID file handling

Martin Schwenke martin at meltin.net
Fri Sep 23 04:53:59 UTC 2016


On Fri, 23 Sep 2016 11:37:28 +1000, Amitay Isaacs <amitay at gmail.com>
wrote:

> On Thu, Sep 22, 2016 at 11:01 PM, Volker Lendecke <vl at samba.org> wrote:
> 
> > Hi, Martin!
> >
> > On Thu, Sep 22, 2016 at 04:04:15PM +1000, Martin Schwenke wrote:  
> > > At the moment ctdbd_wrapper is solid but multiple ctdbd's can start at
> > > the same time.  If they overtake each other in interesting ways then
> > > one ctdbd can exit early and remove another's PID file.
> > >
> > > Add a PID file abstraction, use it, make PID file creation the first
> > > action, clean up some surrounding code.
> > >
> > > Please review and maybe push...  
> >
> > We have lib/util/pidfile.c. You do destruction with a talloc context,
> > but the locking engine is also there in lib/util/pidfile.c. It's not
> > rocket science, but also not entirely trivial, so -- can't we share
> > code here between ctdb/common and lib/util?
> >  
> 
> Yes, we should.  I did look at lib/util/pidfile.c, but decided not to use
> it.
> 
> The first reason is the big comment at the top of the file which says:
> 
>    /* this code is broken - there is a race condition with the unlink
> (tridge) */
> 
> The code leaks fd and I did not find any torture tests.
> 
> There is a slight difference in the semantics and I have added tests that
> check the semantics CTDB will use.
> In CTDB pidfile is already available, so I don't want to split it into a
> directory name and a base name for the pid file.
> Since pidfile is generally used by rc script (or ctdbd_wrapper in CTDB's
> case), API with directory name did not make sense.
> To be able to use configuration directories in general, we need a config
> parser in CTDB which is currently missing.

Agreed.  We had a problem we wanted to fix quickly so went with
not-invented here.  ;-)  However, we did plan to revisit it later and
see if we could merge the two.  There are very few callers so it should
be easy.  The PID file = directory + basename issue can be solved with a
wrapper that takes the 2 arguments and combines them.

The only issue is the use of O_EXCL when opening the PID file.  This
doesn't seem necessary, since the file will be locked anyway.  It seems
to mean that you have to be very careful to clean up stale PID files to
avoid problems...

Thoughts?

peace & happiness,
martin



More information about the samba-technical mailing list