[CTDB] progress

Fri Dec 15 10:43:07 GMT 2006

Tridge,

tridge at samba.org wrote:
> Alexander,
> 
>> we are close to implement CTDB protocol and EDRs in 
>> http://samba.org/~ab/ctdb. Please see revision 26 of
>> common/ctdb_call.c include/ctdb_private.h, there are number of
>> helpers which aren't implemented yet but other than that we are
>> almost done with basic logic of DMASTER/LMASTER.
> 
> sorry for the delay in reviewing this.
> 
> Some questions :-)
> 
> - What is the lmasterdb tdb for? (in ctdb_private.h)
It is the database for LMASTER information about DMASTERs. If LMASTER
isn't DMASTER for the record and the record wasn't in its tdb with any
info how could it know what to respond to a requester? Did you mean here
that LMASTER stores a record being requested and by its EDR finds out
who is the DMASTER? I.e., all nodes a bit dumb, they just store what was
passed through them or requested.

Initially my thought was that we don't have much of duplication between
nodes, each one stores information specific to it and LMASTERs solve
routing.

> - why the additional _struct suffixes on the various structures?
I find it inconvenient to have same function and structure names. It is
error-prone from my point of view, you might accidentally to use
function name (as pointer) instead of struct name. Not you specifically
:-) but it is definitely misleading. So this ambiguity isn't good.

> - In struct ctdb_reply_redirect_struct, you have this:
> 
> struct ctdb_reply_redirect_struct { struct ctdb_req_header hdr; 
> uint32_t datalen; uint8_t  data[0]; /* uint32_t for dmaster id */ };
> 
> why the fancy data buffer? It would be much simpler to have this:
> 
> struct ctdb_reply_redirect { struct ctdb_req_header hdr; uint32_t
> dmaster; };
> 
> The use of a [0] buffer is only needed for variable length data, and 
> this reply is not variable length.
I tried to follow common pattern, but you're right here.

> - for the record header changes, I'd suggest not putting that logic
> in ctdb_call.c, instead it would be cleaner to do it in ctdb_ltdb.c, 
> and replace the tdb_fetch() call in ctdb_call() with a 
> ctdb_ltdb_fetch() call, which will handle removing the extra header 
> (and returning it as an extra parameter perhaps). Otherwise the logic
> in ctdb_call.c just gets too convoluted.
The problem there is that we then need to pass-through header as each
fetch is actually fetch->execute->store sequence: you fetch the record,
pass it to the called function, analyze the result and either update
header and data or only update header. So essentially, you can't do
fetch-only, you would need to make ctdb_ltdb_update() as well and copy a
portion of data.

> - This comment is a bit misleading: /* if c->hdr.destnode ==
> ctdb->vnn then we are possible DMASTER (if nothing changed between
> issuing the call and receiving it here) */
> 
> A test for (c->hdr.destnode == ctdb->vnn) is really only useful for 
> checking if the vnn logic has screwed up. When these are not equal it
> means the mapping of vnn to physical node is not correct, which 
> probably happened because either source or destination nodes didn't 
> participate in a recovery.
> 
> The way to check if we are the DMASTER is to try to load the record 
> from the ltdb. If we are the DMASTER then either the record header 
> will list our vnn as the DMASTER, or the record is not there and we 
> are the LMASTER.
So you are leaning towards 'single db, fetch to know what is there'?
This certainly streamlines logic a lot but adds a bit of load on the db.
  I'm not sure that we can't really use header's info to make such
decisions...

> - similarly for this comment: /* TODO: If we are not the recepient
> (hdr->dstnode != ctdb->vnn): 1. redirect it to the recepient 2. Note
> the requester to forward response later */ if we get a packet for
> which we are not the recipient then either sender or receiver didn't
> participate properly in a recovery. We need to start a new recovery
> and vnn assignment.
> 
> - The logic in ctdb_request_call() is not right. It must check if we 
> are the DMASTER, not if we are the LMASTER. In this function, being 
> the LMASTER is only important if the ltdb record is empty (which 
> means we are by default the DMASTER too)
Why? We've got a call for a record we don't own. This means we are
LMASTER for this record or there was network split. Thus we check for
LMASTER.

-- 
/ Alexander Bokovoy
Samba Team                      http://www.samba.org/
ALT Linux Team                  http://www.altlinux.org/
Midgard Project Ry              http://www.midgard-project.org/