CTDB Segfault in a container-based env - Looking for pointers

Thu Aug 5 11:37:12 UTC 2021

On Wednesday, August 4, 2021 12:18:37 AM EDT Martin Schwenke via
samba-technical wrote:
> On Tue, 20 Jul 2021 15:59:02 +1000, Amitay Isaacs via samba-technical
> 
> <samba-technical at lists.samba.org> wrote:
> > On Fri, Jul 16, 2021 at 5:47 PM Michael Adam <obnox at samba.org> wrote:
> > 
> > The issue is that CTDB makes assumptions about the orphan processes.
> > On most unix systems, an orphan process gets re-parented to init which
> > traditionally has pid = 1.  This assumption is built into the code to
> > avoid runaway orphan processes in CTDB.
> 

Would it be worthwhile to have ctdbd explicitly reject running as pid 1? For 
example, it could get the pid and if equal to 1 log an error (and exit 
nonzero?). I felt a little foolish not having determined this rule on my own, 
but had this been part of ctdbd already, it would have saved time. I don't 
know if I'll be the last person to try it either :-)

If you agree that this is a good, small improvement to ctdb, should
I file a bug?
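
To make the suggestion concrete, here is a minimal sketch of the kind of
guard I have in mind. The helper name refuse_pid1 is hypothetical (nothing
like it exists in ctdb as far as I know), and printing to stderr is only a
stand-in for whatever logging ctdbd would really use:

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an existing ctdb function.
 * Returns 0 if it is fine to continue, -1 if the given pid is 1.
 * The pid is passed in so the check can be exercised without
 * actually running as init.
 */
int refuse_pid1(pid_t pid)
{
	if (pid == 1) {
		/* Stand-in for real logging: explain why we refuse to start. */
		fprintf(stderr,
			"ctdbd must not run as PID 1; "
			"start it under a minimal init\n");
		return -1;
	}
	return 0;
}
```

Early in startup this would be called as refuse_pid1(getpid()), exiting
nonzero when it returns -1.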

> Yes, we explicitly check if the parent process is 1 in the lock helper
> before continuing.  As discussed offline, we should try something with a
> file descriptor event to try to determine whether the parent has gone
> away.
> 
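
For what it's worth, one common way to do the file descriptor approach is
with a pipe: the parent creates it before forking and keeps the write end
open without ever writing, and the child watches the read end. When the
parent exits for any reason the kernel closes the write end and the child
sees EOF, no matter which process it gets re-parented to. A rough sketch,
assuming that pattern (the helper name parent_gone is hypothetical, and I
don't know what the eventual ctdb fix will look like):

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper. Meant to be called in the child with the read
 * end of a pipe whose only write end is held open by the parent.
 * Returns 1 if the parent has gone away (EOF on the pipe), 0 if it is
 * still there after timeout_ms milliseconds, -1 on error.
 */
int parent_gone(int read_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = read_fd, .events = POLLIN };
	char c;
	ssize_t n;
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0) {
		return -1;	/* poll failed (including EINTR) */
	}
	if (ret == 0) {
		return 0;	/* timed out: write end is still open */
	}
	if (pfd.revents & (POLLIN | POLLHUP)) {
		/* A read of 0 bytes means every write end is closed. */
		n = read(read_fd, &c, 1);
		if (n == 0) {
			return 1;
		}
		if (n < 0) {
			return -1;
		}
		return 0;	/* parent wrote a byte, so it is alive */
	}
	return -1;		/* POLLERR or POLLNVAL */
}
```

Unlike comparing getppid() against 1, this does not care what pid the new
parent has. In ctdb itself I would expect this to be a tevent fd handler
rather than a direct poll() call.
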
> > In the container world, what happens to orphan processes?
> 
> Everything I can find says they are re-parented to process 1 in the
> container.
> 

Agreed. However, to add some detail: an orphan is re-parented to PID 1 of
its current pid namespace. So, even with multiple "containers", if they
share the same pid namespace, the new parent is whichever process was
started first in that namespace, regardless of which container it came from.
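
This is easy to observe directly. Below is a small test sketch (the helper
name orphan_new_parent is hypothetical) that orphans a grandchild on
purpose and reports which pid it ends up with as its parent. Inside a
container that is normally PID 1 of the pid namespace; on a regular Linux
host it can instead be a process that registered itself as a subreaper
with PR_SET_CHILD_SUBREAPER, so the result is not always 1:

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper. Forks a child that forks a grandchild and
 * then exits at once. The grandchild waits until it has been
 * re-parented and reports its new parent pid through a pipe.
 * Returns the new parent pid and stores the pid of the exited middle
 * process in *old_parent; returns -1 on error.
 */
pid_t orphan_new_parent(pid_t *old_parent)
{
	int fds[2];
	pid_t child, new_parent = -1;

	if (pipe(fds) != 0) {
		return -1;
	}
	child = fork();
	if (child < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (child == 0) {
		pid_t self = getpid();

		if (fork() == 0) {
			/* Grandchild: poll getppid() until it changes. */
			pid_t ppid;

			close(fds[0]);
			while ((ppid = getppid()) == self) {
				poll(NULL, 0, 1);	/* sleep 1 ms */
			}
			if (write(fds[1], &ppid, sizeof(ppid)) !=
			    (ssize_t)sizeof(ppid)) {
				_exit(1);
			}
			_exit(0);
		}
		/* Middle process: exit now so the grandchild is orphaned. */
		_exit(0);
	}
	*old_parent = child;
	close(fds[1]);
	waitpid(child, NULL, 0);
	if (read(fds[0], &new_parent, sizeof(new_parent)) !=
	    (ssize_t)sizeof(new_parent)) {
		new_parent = -1;
	}
	close(fds[0]);
	return new_parent;
}
```

Running it once directly as the container's command and once under an
init makes the difference visible: in the first case the caller itself is
PID 1 and becomes the new parent.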

> > > Even if you don’t see a real benefit of this containerized layout
> > > just yet, it might still be beneficial for the code to consider
> > > some modifications to make ctdb more “container-ready”...
> > 
> > Provided it makes sense. ;-)
> 
> Yep!  If there is no sane re-parenting of orphan processes inside
> containers then we should recommend that CTDB is always run via a
> minimal init.  CTDB launches a lot of processes and if it goes away
> then something needs to look after them.
> 

The good thing is that I have found that the container runtimes docker and 
podman come bundled with such a minimal init. This can be started by providing 
the "--init" option to the `{docker,podman} run` command.

If you are using the "pod" approach to running containers - where some of the 
namespaces are shared between the containers (as in kubernetes or podman) 
there will be a "pause" container started first. When the pid namespace is 
shared this pause process also reaps processes, serving as the minimal init 
needed (AFAICT).

(It's amazing all the things you can find when you know what to look for!)

> As we discussed offline, at the moment the current crash reminds us we
> have a problem to solve, so we shouldn't just "fix" it to avoid the
> crash.  We should find a better solution for detecting that the parent
> has gone away, use that and then fix the crash that may still occur.
> We might also be doing a similar thing elsewhere...
> 
> peace & happiness,
> martin

Sounds good to me. To circle back to my first suggestion in this message, had
the crash been more obviously linked back to ctdb running as pid 1, I probably
would not have started this thread. :-)
I was concerned it was something more subtle than just being run as PID 1. Now
that I know it's well known that ctdbd can't be PID 1, I think that
putting that knowledge into the code to act as a "wrong way" sign could be
useful for future travelers.