using virtual synchrony for CTDB

Tracy Camp campt at
Fri Oct 6 20:15:56 GMT 2006

I'm not sure we are talking about the same thing here at all... Wouldn't 
some form of response be required within the GC layer to guarantee 
delivery?  I'm not familiar with the GC layer that you mentioned, but that 
certainly seems to be the logical conclusion and implemented fact in 
spread (which requires _two_ token rounds to deliver a guaranteed 
message).  Now if you don't care about delivery guarantees, just message 
ordering, certainly no response is necessary.

It is not clear that CTDB cares about message ordering at all (including 
the reqid seems sufficient), so I'm not entirely sure what VS would 
provide here except a way to know if a message was sent in the current 
view of the cluster or not.
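
To make that concrete, here is a rough sketch (Python, purely illustrative, 
not CTDB code) of what I mean.  Replies are matched by reqid, so arrival 
order doesn't matter, and a generation number, which I am assuming the VS 
layer hands out on each view change, is used only to reject messages from 
an older view.  The names Message, Node, on_view_change and handle_reply 
are all made up for the example, and dropping pending requests on a view 
change is an assumed policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    generation: int  # cluster view number (assumed to come from the VS layer)
    reqid: int       # per-sender request id, used to match replies
    payload: str


class Node:
    def __init__(self, generation):
        self.generation = generation
        self.next_reqid = 0
        self.pending = {}  # reqid -> request payload still awaiting a reply

    def make_request(self, payload):
        """Build a p-t-p request stamped with the current view and a new reqid."""
        self.next_reqid += 1
        self.pending[self.next_reqid] = payload
        return Message(self.generation, self.next_reqid, payload)

    def on_view_change(self, new_generation):
        """Hypothetical hook called when the VS layer reports a new view."""
        self.generation = new_generation
        self.pending.clear()  # assumed policy: recovery reissues old requests

    def handle_reply(self, msg):
        """Return the request this reply answers, or None if stale or unknown."""
        if msg.generation != self.generation:
            return None  # sent in an older (or newer) view: ignore it
        return self.pending.pop(msg.reqid, None)
```

Note that nothing here depends on the order in which replies arrive; the 
only thing the group communication layer contributes is the generation.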

CTDB appears to want a semantic where every node is free to asynchronously 
issue and handle messages without regard to other nodes issuing or 
handling messages.  The whole point of the DMASTER, LMASTER concepts 
appears to be to avoid needing to broadcast state to every cluster member.  I 
can assure you that in large clusters this works out much better.  VS 
might provide a more elegant way to initiate recoveries than simply 
relying on message timeouts (and broadcast message semantics in a recovery 
are handy, though not necessary, since recovery is not unlike RAID-5 
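
As a back-of-the-envelope illustration of the broadcast point (again a 
Python sketch, not CTDB code): I am assuming here that the LMASTER for a 
key is picked by hashing the key across the nodes and that it remembers 
which node is currently DMASTER for the record.  Neither detail is spelled 
out in this thread, and lmaster_for, Cluster and fetch_hops are names 
invented for the example.

```python
import zlib


def lmaster_for(key: bytes, num_nodes: int) -> int:
    """Assumed placement rule: hash the key onto one of the nodes."""
    return zlib.crc32(key) % num_nodes


class Cluster:
    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.dmaster = {}  # key -> node currently holding the record

    def fetch_hops(self, requester, key):
        """List the p-t-p messages needed for `requester` to obtain `key`."""
        lm = lmaster_for(key, self.num_nodes)
        dm = self.dmaster.get(key, lm)  # record starts out at its LMASTER
        if dm == requester:
            return []  # record is already local, no network traffic at all
        hops = []
        if requester != lm:
            hops.append((requester, lm))  # ask the LMASTER where the record is
        if lm != dm:
            hops.append((lm, dm))         # LMASTER forwards to the DMASTER
        hops.append((dm, requester))      # DMASTER hands the record over
        self.dmaster[key] = requester     # requester becomes the new DMASTER
        return hops
```

Under these assumptions a fetch costs at most three messages no matter how 
many nodes there are, where a broadcast would cost num_nodes - 1.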

Tracy Camp

On Fri, 6 Oct 2006, Steven Dake wrote:

> Tracy,
> If latency is an issue in messages, any message that has roundtrip
> response time over Ethernet medium will have higher latency than those
> messages that do not have round trip responses.
> If no response is required over Ethernet from the server before
> proceeding to do new operations, then ptp will have less latency than
> virtual synchrony.
> The performance problem that vs solves is removing the round trip
> response time, since every node has a copy of the data and can
> immediately handle the request and may continue processing as soon as
> the lock request is delivered (self-delivered) instead of waiting for a
> response from a server over TCP/IP or some other PTP protocol running
> over Ethernet.
> Regards
> -steve
> On Fri, 2006-10-06 at 11:09 -0700, Tracy Camp wrote:
>> Snake oil aside, 'DLM' like clustering schemes, which the CTDB proposal
>> seems like it could be grouped with, are best implemented with p-t-p
>> messages for the latency concerns already expressed.  However also using a
>> VS group communications layer to provide a generation number that can then
>> be embedded in each P-T-P message provides P-T-P w/o the overhead of VS
>> for the latency sensitive messages.  Sort of a scheme that breaks the
>> 'control' apart from the 'data' transports.
>> Tracy Camp
>> On Fri, 6 Oct 2006, David Boreham wrote:
>>> Steven Dake wrote:
>>>> I have a suggestion to use virtual synchrony for the transport mechanism
>>>> of CTDB.  I think you will find using something like TCPIP unsuitable
>>>> for a variety of reasons.
>>> I'm very far from being a VS expert, but when I looked into a few
>>> of the open source implementations available a while back it became
>>> clear (to me at least) that they have a kind of 'snake oil' property
>>> in that they appear to deliver magical services but do so only by
>>> using quite inefficient methods underneath the covers. For example
>>> it appears that one is avoiding network round-trips but in fact to
>>> implement its delivery guarantees the message middleware layer needs
>>> to propagate a token around the set of participating nodes which of
>>> course involves many sends and receives.

More information about the samba-technical mailing list