[PATCH] Convert Samba to OFD locking.

Jeremy Allison jra at samba.org
Thu May 19 16:16:09 UTC 2016


On Wed, May 18, 2016 at 02:42:20PM -0700, Jeremy Allison wrote:
> From eb7f30b606a006cca80f4de990bb1c9e620a7f97 Mon Sep 17 00:00:00 2001
> From: Jeremy Allison <jra at samba.org>
> Date: Thu, 12 May 2016 21:46:50 +0200
> Subject: [PATCH 10/10] s3: wscript: Add checks for open file description
>  locks.
> 
> We now use these if available.
> 
> Signed-off-by: Jeremy Allison <jra at samba.org>
> ---
>  source3/wscript | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/source3/wscript b/source3/wscript
> index 3b6f8a4..51c1542 100644
> --- a/source3/wscript
> +++ b/source3/wscript
> @@ -1089,6 +1089,22 @@ syscall(SYS_setgroups32, 0, NULL);
>                  execute=True,
>                  msg='Checking whether fcntl locking is available')
>  
> +    conf.CHECK_CODE('''
> +           struct flock lck = {
> +              .l_whence = SEEK_SET,
> +              .l_type = F_WRLCK,
> +              .l_start = 0,
> +              .l_len = 1,
> +              .l_pid = 0,
> +            };
> +            int ret = fcntl(0, F_OFD_SETLK, &lck);
> +            int ret1 = fcntl(0, F_OFD_SETLKW, &lck);
> +            int ret2 = fcntl(0, F_OFD_GETLK, &lck);
> +            ''',
> +            'HAVE_OFD_LOCKS',
> +            msg="Checking whether fcntl lock supports open file description locks",
> +            headers='unistd.h sys/types.h sys/stat.h fcntl.h')
> +
>  # glibc up to 2.3.6 had dangerously broken posix_fallocate(). DON'T USE IT.
>      if not conf.CHECK_CODE('''
>  #define _XOPEN_SOURCE 600
> -- 
> 2.8.0.rc3.226.g39d4020

Just to clarify with the people who wanted runtime checks,
I'm trying to reduce the complexity of runtime code paths.

If for every new system feature we have to not only check
for existence (which is reasonable) but to check at runtime
if it *actually works* then we're going to end up with code
that's really difficult to maintain and support in the long
run.

The reason I want to move away from this comes from the
experience with code inside tevent that tries to cope with broken kernel
epoll implementations by building a dynamic environment that
can switch over to the poll() model on failure.

Even to test that this code worked required an internal
test framework to tevent_epoll that can simulate a 50%
system call failure rate.

And there's also the issue of trying to explain code
like this:

static int std_event_loop_once(struct tevent_context *ev, const char *location)
{
        void *glue_ptr = talloc_parent(ev->ops);
        struct std_event_glue *glue =
                talloc_get_type_abort(glue_ptr,
                struct std_event_glue);
        int ret;

        ret = glue->epoll_ops->loop_once(ev, location);
        if (glue->epoll_ops != NULL) {
                /* No fallback */
                return ret;
        }

        if (!glue->fallback_replay) {
                /*
                 * The problem happened while modifying an event.
                 * An event handler was triggered in this case
                 * and there is no need to call loop_once() again.
                 */
                return ret;
        }

        return glue->poll_ops->loop_once(ev, location);
}

Note that we indirect glue->epoll_ops->loop_once, *then*
we check it for NULL.

Now this code is perfectly correct (I had the pleasure of
having to explain this to a Googler who was utterly *convinced*
that this was a crash bug), but outside of the people who wrote and
maintain it, it's utterly incomprehensible.

This is where runtime checks for features lead us.

I'd rather not go there again :-).

Cheers,

	Jeremy.



More information about the samba-technical mailing list