Any strong views on the exact model for s3fs startup?

Fri Jun 1 10:37:43 MDT 2012

On Fri, 2012-06-01 at 23:52 +1000, Andrew Bartlett wrote: 
> I've been working recently to try and make s3fs startup more reliable.
> 
> Some of you testing Samba4 as an AD DC will have noticed that sometimes
> after a 'killall samba' a pid file is left behind by the smbd child,
> blocking a restart of Samba.
> 
> Also, if for some reason a port required by Samba is being used (be it
> 88, 389 or quite likely 139/445), we still attempt to start the other
> services, rather than fail outright.
> 
> For Samba4-internal services, I am planning to have them signal the main
> process to say 'failure to startup', and shut the server down.  However,
> this made me think about how s3fs starts up.  I spent some time creating
> this patch set, which moves the socket listen to the parent 'samba'
> process:
> https://git.samba.org/?p=abartlet/samba.git/.git;a=shortlog;h=refs/heads/s3fs-parent-listen
> 
> The problem with this patch series is that because of socket_wrapper,
> the individual 'make test' connections passed across the exec() boundary
> are treated as unix domain sockets, not TCP connections, and smbd can't
> cope with non-TCP clients.
> 
> Rather than exec(), I would like to call into the shared library that
> already contains the smbd codebase, and have it run most of the same
> startup routines as library functions in a fork()ed child.
> 
> The advantage is much closer integration: we can (better than we can
> with a pile of command line arguments) clearly indicate that logging
> is already initialised (stopping the current silly logging to stdout
> only), that no pid file should be written, and where the smb.conf is,
> without relying on the semantics of writing out fileserver.conf.
> 
> In short, we can make using s3fs just as seamless and painless for our
> users as running the ntvfs server. 
> 
> The disadvantage would be that it would no longer be possible to
> replicate the file server environment by running just smbd.  (I've not
> seen anybody use this mode however.)
> 
> I normally prefer to make proposals such as this with patches already in
> hand, and the branch above would be a starting point, but I would like
> to canvass any very strong views on the matter before I put much more
> effort into this.  (And then of course address detailed concerns when I
> have patches).
> 
> On the original problem, I do realise that there are other ways to solve
> this particular pain point (treat the exit of smbd as fatal, for
> example), and I may do that in the shorter term.

I am strongly opposed to the proposed model.

We need to make our pieces more independent, interacting through clear
interfaces and APIs at the IPC level, not tie them up even more in a
big ball of mud.

For one thing, we carefully reduced the amount of state that is
initialized (especially memory allocation) before starting preforked
daemons like lsasd and spoolssd. The reason is that when smbd fork()s
we pay a hefty price that is directly proportional to the amount of
memory allocated and referenced before the fork(): although the Linux
kernel uses COW, there is still a certain amount of setup to perform
for each new process. So we want daemons that will have completely
different allocation paths and structures to fork() as early as
possible to minimize the waste.
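
To make the fork() cost argument concrete, here is a rough sketch (in
Python rather than C, purely for brevity) that times fork() in the
parent after a given amount of memory has been allocated and touched.
The function name fork_seconds is made up for this illustration and is
not part of Samba; the numbers it produces depend heavily on the kernel
and machine, so treat it as an experiment, not a benchmark.

```python
import os
import time

# Page size as reported by the OS (typically 4096 on Linux/x86).
PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def fork_seconds(resident_bytes):
    """Touch resident_bytes of memory, then time one fork() in the parent.

    Hypothetical helper for illustration only.
    """
    buf = bytearray(resident_bytes)
    # Write to every page so the memory is actually mapped and referenced
    # before the fork, as it would be after real initialization work.
    for offset in range(0, resident_bytes, PAGE_SIZE):
        buf[offset] = 1

    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(0)
    elapsed = time.perf_counter() - start

    os.waitpid(pid, 0)  # reap the child so no zombie is left behind
    return elapsed
```

Comparing, say, fork_seconds(1 << 20) with fork_seconds(256 << 20) on
the same machine should give a feel for how much of the cost comes from
memory set up before the fork.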

Even worse is using exec(). Using it for each new connection is just a
disaster: on exec() there is no COW, the full set of pages is thrown
away, and a complete new setup of the process is required, from the
dynamic linker (the abuse of .so objects in samba4 makes this part
quite important, as the linker will slow down startup) to the actual
setup of libraries and other facilities we do within smbd.

Then add all the things Metze said.

Managing child processes or process groups is not rocket science and we
can add messages to the already available messaging interfaces if you
need to convey information from one process to another.
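
As a minimal sketch of conveying a 'failure to startup' message from a
child to its parent, the following uses a plain pipe. It does not use
Samba's messaging interfaces; start_service and the "ok" / "fail: ..."
status strings are assumptions made up for this example, and a real
daemon would keep running after reporting instead of exiting.

```python
import os


def start_service(init):
    """Fork a child, run init() in it, and return the status it reports.

    Hypothetical helper: the child writes "ok" or "fail: <reason>" to a
    pipe and exits; the parent reads the message and reaps the child.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: only the write end of the pipe is needed here.
        os.close(read_fd)
        try:
            init()
            message, code = b"ok", 0
        except Exception as exc:
            message, code = b"fail: " + str(exc).encode(), 1
        os.write(write_fd, message)
        os.close(write_fd)
        os._exit(code)

    # Parent: close the write end so read() sees EOF once the child is done.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode()
```

The parent can then decide what to do with a "fail: ..." reply, for
example shut everything down, without the child's code having to live
in the same process image.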

/sbin/samba needs to become more of a monitor and not try to swallow
everything. In the normal case it should ideally become a very thin
shim that starts all the necessary daemons and then just monitors them
and restarts them if they crash; see above for the reasons.
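
A thin monitor of this kind can be sketched in a few lines. This is an
illustration only, in Python rather than C: supervise, its parameters
and the restart limit are all assumptions of this example, not existing
Samba code, and a real monitor would also handle signals and shutdown.

```python
import subprocess
import sys
import time


def supervise(commands, max_restarts=3, poll_interval=0.05):
    """Start every command and restart any that exits with an error.

    commands maps a service name to its argv list. Returns how many
    times each service was restarted. Hypothetical sketch: a service
    is dropped after a clean exit or after max_restarts restarts.
    """
    procs = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    restarts = {name: 0 for name in commands}

    while procs:
        time.sleep(poll_interval)
        for name, proc in list(procs.items()):
            status = proc.poll()
            if status is None:
                continue  # still running, nothing to do
            if status != 0 and restarts[name] < max_restarts:
                # Abnormal exit: start a fresh process for this service.
                restarts[name] += 1
                procs[name] = subprocess.Popen(commands[name])
            else:
                # Clean exit, or restart budget used up: stop watching it.
                del procs[name]
    return restarts
```

For example, supervise({"crasher": [sys.executable, "-c", "import sys;
sys.exit(1)"]}, max_restarts=2) restarts the failing command twice and
then gives up. The monitor itself allocates almost nothing, which is
the point of keeping it thin.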

There is also a lot of value, again as Metze says, in having a
standalone file server smbd behave in exactly the same way
(process-wise) as a full DC smbd.

Simo.

-- 
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>