Potential race in lib/util_tdb.c:tdb_chainlock_with_timeout_internal

Richard Sharpe realrichardsharpe at gmail.com
Thu Jan 8 18:22:44 GMT 2009


Hi,

I am looking into a problem that relates to fcntl locks being held for
extraordinarily long times on secrets.tdb. I have come across this code in
lib/util_tdb.c:tdb_chainlock_with_timeout_internal in both 3.2.7 and 3.0.3x:

static int tdb_chainlock_with_timeout_internal( TDB_CONTEXT *tdb, TDB_DATA
key, unsigned int timeout, int rw_type)
{
        /* Allow tdb_chainlock to be interrupted by an alarm. */
        int ret;
        gotalarm = 0;

        if (timeout) {
                CatchSignal(SIGALRM, SIGNAL_CAST gotalarm_sig);
                tdb_setalarm_sigptr(tdb, &gotalarm);
                alarm(timeout);
        }

        if (rw_type == F_RDLCK)
                ret = tdb_chainlock_read(tdb, key);
        else
                ret = tdb_chainlock(tdb, key);

        if (timeout) {
                alarm(0);
                tdb_setalarm_sigptr(tdb, NULL);
                CatchSignal(SIGALRM, SIGNAL_CAST SIG_IGN);
                if (gotalarm) {
                        DEBUG(0,("tdb_chainlock_with_timeout_internal: alarm
(%u) timed out for key %s in tdb %s\n",
                                timeout, key.dptr, tdb_name(tdb)));
                        /* TODO: If we time out waiting for a lock, it might
                         * be nice to use F_GETLK to get the pid of the
                         * process currently holding the lock and print that
                         * as part of the debugging message. -- mbp */
                        return -1;
                }
        }

        return ret;

It seems to me that if the lock is already held by another process when we
enter this code, there is a race between the timeout and the granting. If
the lock is subsequently granted, the process releasing the lock will signal
the wait variable (or whatever) and our process will be scheduled. However,
if the timeout occurs before we are scheduled, the timeout will be delivered
first.

We will have the lock but will forget we have the lock, and never release
it.

Does this seem like it is a problem?

I would think that in the timeout path there should be a call to release the
lock, or check if we got the lock or something.

-- 
Regards,
Richard Sharpe


More information about the samba-technical mailing list