[SCM] CTDB repository - branch lock-wait created - ctdb-1.0.108-77-g47d63bc

Ronnie Sahlberg sahlberg at samba.org
Tue Dec 15 03:39:00 MST 2009


The branch, lock-wait has been created
        at  47d63bc9cf3885c16eec7e7f2cdef27f8b6c4e93 (commit)

- Log -----------------------------------------------------------------
commit 47d63bc9cf3885c16eec7e7f2cdef27f8b6c4e93
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Tue Dec 15 21:28:23 2009 +1100

    initial support for a lockdown protocol.
    
    We use a dummy file and byte range locks on this file to make records sticky.
    
    With the current design, very hot database records may migrate between the nodes faster than the client applications can access the data.
    Once a client application has requested a record and asked ctdbd to migrate it onto the node, the record might be migrated off the node again before the client gets a chance to access it.
    
    This can now be prevented by using a "pindown" mechanism.
    This pindown mechanism is implemented using fcntl() locks on a file shared between ctdbd and the clients on the local nodes.
    
    Records that are pinned down cannot be migrated off the node by ctdbd. Instead, any such request is blocked until the pindown disappears.
    Records can however be migrated onto the node while there is an active pin-down.
    
    A client can set a pindown on a record even before it asks to have the record migrated onto the node. The record is then fetched and remains pinned down on the node until the client has finished processing it.
    
    Multiple clients can pin down the same record, in which case the record remains on the local node until all clients have released their pin-down.
    
    Client pindown is implemented by a read-lock on the pindown file.
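
    As a rough illustration of the client side, the sketch below takes and releases a shared fcntl() lock on one byte of the pindown file. The names pin_offset(), record_pin() and record_unpin() are hypothetical, and the mapping from a record key to a byte offset (a hash of the key) is an assumption; the commit message does not say how records map to byte ranges.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical: derive a byte offset in the pindown file from the record
 * key. A hash is assumed here, so two keys may share a byte; that only
 * causes extra pinning, never a missed pin. */
static off_t pin_offset(const unsigned char *key, size_t len)
{
    uint32_t h = 2166136261u;               /* FNV-1a, picked for the sketch */
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return (off_t)(h & 0x3fffffffu);        /* keep the offset non-negative */
}

/* Apply one fcntl() byte-range lock operation to a single byte. */
static int pin_lock_op(int fd, off_t off, short type, int cmd)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = 1;
    return fcntl(fd, cmd, &fl);
}

/* Pin the record: shared (read) lock, waiting if a writer holds the byte.
 * Returns 0 on success, -1 with errno set otherwise (e.g. EINTR). */
static int record_pin(int fd, const unsigned char *key, size_t len)
{
    return pin_lock_op(fd, pin_offset(key, len), F_RDLCK, F_SETLKW);
}

/* Release the pin taken by record_pin(). */
static int record_unpin(int fd, const unsigned char *key, size_t len)
{
    return pin_lock_op(fd, pin_offset(key, len), F_UNLCK, F_SETLK);
}
```

    Because fcntl() read locks are shared, several client processes can hold a pin on the same byte at once. The file descriptor must be open for reading for F_RDLCK to succeed.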
    
    Ctdbd tries to take a write-lock on the same region of the pindown file when determining whether a migrate-off-node request should be allowed or postponed until all clients have finished.
    
    Clients use read-locks to pin the record down.
    Ctdbd will not allow the record to be migrated off the node until it can take out a write-lock.
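
    The daemon-side check could look roughly like the sketch below: a non-blocking attempt at an exclusive lock on the record's byte. The function names are hypothetical and this is not taken from the ctdbd source. How a refused request is deferred and retried is left to the caller, and holding the write lock for the duration of the migration is an assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to take an exclusive (write) lock on the record's byte without waiting.
 * Returns 1 if the lock was obtained (no client holds a pin; the lock stays
 * held), 0 if some client still holds a read lock, -1 on any other error. */
static int migrate_off_allowed(int fd, off_t off)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = 1;

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 1;
    if (errno == EACCES || errno == EAGAIN)
        return 0;                    /* pinned: postpone the migration */
    return -1;
}

/* Drop the write lock once the migration has completed. */
static int migrate_off_done(int fd, off_t off)
{
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = off;
    fl.l_len = 1;
    return fcntl(fd, F_SETLK, &fl);
}
```

    POSIX allows F_SETLK to fail with either EACCES or EAGAIN when another process holds a conflicting lock, so the sketch treats both as "pinned". The descriptor must be open for writing for F_WRLCK to succeed.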
    
    Since this requires two extra trips to the kernel and back for the clients, a client may try a cheaper non-pinning fetch for the first few iterations of the fetch-lock loop and only involve the heavier pin-down once this has failed a few times for the record.
    This keeps non-contended records as fast as possible while still allowing a client to use the slightly heavier pin-down when it gets tired of waiting.
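
    The escalation described above can be sketched as a loop that only takes the pin after a few cheap attempts have failed. Everything named here is hypothetical: struct fetch_ops, CHEAP_ATTEMPTS (the threshold of 3 is an arbitrary assumption) and the demo_* stubs, which stand in for the real fetch-lock and pin operations.

```c
#include <stdbool.h>

#define CHEAP_ATTEMPTS 3   /* assumption: cheap tries before pinning */

/* Hypothetical hooks standing in for the real fetch-lock and pin calls. */
struct fetch_ops {
    bool (*try_fetch)(void *ctx);  /* true once the record is local and locked */
    int  (*pin)(void *ctx);        /* 0 on success, e.g. a read lock on the pindown file */
    void *ctx;
};

/* Returns the number of attempts used, or -1 if the pin failed or
 * max_attempts ran out. *pinned tells the caller whether a pin was taken
 * and must be released later, including on the -1 path. */
static int fetch_with_escalation(const struct fetch_ops *ops,
                                 int max_attempts, bool *pinned)
{
    *pinned = false;
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        /* Pay for the pin only after the cheap attempts have failed. */
        if (attempt > CHEAP_ATTEMPTS && !*pinned) {
            if (ops->pin(ops->ctx) != 0)
                return -1;
            *pinned = true;
        }
        if (ops->try_fetch(ops->ctx))
            return attempt;
    }
    return -1;
}

/* Demo stubs: the fetch fails fail_first times, then succeeds. */
struct demo_ctx { int fail_first; int calls; int pins; };

static bool demo_try_fetch(void *p)
{
    struct demo_ctx *c = p;
    c->calls++;
    return c->calls > c->fail_first;
}

static int demo_pin(void *p)
{
    struct demo_ctx *c = p;
    c->pins++;
    return 0;
}
```

    With the stubs, a record that is fetched on the first try never touches the pin, while one that fails four times is pinned before the fourth attempt and fetched on the fifth.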
    
    Clients using pin-down can coexist with, and access the same data and records as, clients not using pin-down.

-----------------------------------------------------------------------


-- 
CTDB repository

