Rev 40: Raw impl. of ibwrapper test tool. in http://samba.org/~tridge/psomogyi/

Peter Somogyi psomogyi at gamax.hu
Mon Dec 18 17:51:25 GMT 2006


Hi Tridge,

> So in your ib backend, I expect you will do much the same thing. You
> will have a fixed size ib/verbs level queue, but if when you are asked
> to queue a packet it doesn't fit then you will link the packet into a
> linked list and wait for an event to tell you that there is more room
> available in the ib level queue.

Sorry, I don't have such an event facility in verbs. However, specifically for 
sending I can use (registered!) buffers of any size - verbs should 
manage that in RC mode...

Fixed-size queue here means that you specify the size at initialization time 
(e.g. 256 send requests). Earlier I was told this is something I don't have 
to design for. Importantly, this limit applies to the # of send requests, not 
to the message size (in RC mode).
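
To make the quoted suggestion concrete, here is a minimal sketch of a fixed 
number of send-request slots with an overflow linked list. It assumes the 
backend learns about a finished send by some means (polling, for instance); 
post_send(), send_completed(), MAX_SEND_REQS and the struct names are 
hypothetical stand-ins, not verbs or ctdb API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SEND_REQS 4   /* hypothetical; fixed at init time, e.g. 256 */

struct pending_pkt {
    struct pending_pkt *next;
    const void *data;
    size_t len;
};

struct send_queue {
    unsigned in_flight;              /* send requests currently posted */
    unsigned posted_total;           /* how often post_send() was called */
    struct pending_pkt *head, *tail; /* overflow list, FIFO order */
};

/* Stand-in for the real "post a send request" call (assumption). */
static void post_send(struct send_queue *q, struct pending_pkt *p)
{
    (void)p;
    q->in_flight++;
    q->posted_total++;
}

/* Returns 1 if the packet was posted at once, 0 if it was parked. */
static int queue_pkt(struct send_queue *q, struct pending_pkt *p)
{
    p->next = NULL;
    /* Only post directly if nothing is parked, to preserve ordering. */
    if (q->head == NULL && q->in_flight < MAX_SEND_REQS) {
        post_send(q, p);
        return 1;
    }
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    return 0;
}

/* Call once for each send request that is known to have finished. */
static void send_completed(struct send_queue *q)
{
    assert(q->in_flight > 0);
    q->in_flight--;
    /* Drain parked packets into the slots that are now free. */
    while (q->head != NULL && q->in_flight < MAX_SEND_REQS) {
        struct pending_pkt *p = q->head;
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
        post_send(q, p);
    }
}
```

The limit here is on the number of outstanding requests only; nothing in 
the sketch restricts the size of an individual packet.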

I will work something out for this, e.g. doing an {allocate, register mr, send, 
deregister mr} cycle only for large messages (expected to be rare). 
Presumably register mr + mem. allocation won't be "so" slow...

>
> I expect that actually filling the outgoing queue will be very rare,
> and I suspect that a queue size of just a few packets of a few kb each
> will be all that is required, but I think the backend should correctly
> handle the case of lots of packets.

The receiver side is more interesting: your implementation is also incomplete 
there (ctdb_tcp_incoming_read: see "combined or partial packets").

BTW, I now see that a 4-byte message length prefix before each message is 
unavoidable for this. Strangely, I don't see one in your transport 
implementation even though you are streaming. (Having the length in the hdr 
doesn't tell our transport how to feed the upper layer, i.e. where to cut the 
message end.)
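
A minimal sketch of such a prefix on the sending side, assuming the length 
is written in network byte order (big-endian) and counts only the payload; 
both are my assumptions here, and frame_msg(), put_len() and get_len() are 
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PREFIX_LEN 4

/* Store a 32-bit length in big-endian order (byte order is an assumption). */
static void put_len(uint8_t *out, uint32_t len)
{
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
}

/* Read the 32-bit big-endian length back. */
static uint32_t get_len(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/*
 * Write prefix + payload into out[0..cap).
 * Returns the number of bytes written, or 0 if it does not fit.
 */
static size_t frame_msg(uint8_t *out, size_t cap, const void *msg, uint32_t len)
{
    assert(out != NULL);
    if (cap < PREFIX_LEN || len > cap - PREFIX_LEN)
        return 0;
    put_len(out, len);
    if (len > 0)
        memcpy(out + PREFIX_LEN, msg, len);
    return PREFIX_LEN + (size_t)len;
}
```

With the prefix in place the transport can find the message boundary without 
looking inside the upper layer's hdr.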

I'm going to implement a solution for this one too - in the ib case only for 
receiving split buffers (I can't tell in advance how big a buffer is 
needed). So the upper layer would receive each message _in one_ block and 
_not more_.
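
The receiving side could look roughly like the sketch below: chunks of any 
size go in, and the upper layer gets called once per complete message. It 
assumes a 4-byte big-endian length prefix in front of each message; struct 
reasm, reasm_feed() and the collect() callback are hypothetical names made up 
for illustration, and the callback must not call reasm_feed() itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*msg_cb)(const uint8_t *msg, uint32_t len, void *priv);

struct reasm {
    uint8_t *buf;      /* bytes received but not yet delivered */
    size_t used, cap;
    msg_cb cb;         /* upper layer: called with exactly one message */
    void *priv;
};

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Feed one received chunk. Returns 0 on success, -1 if out of memory. */
static int reasm_feed(struct reasm *r, const uint8_t *chunk, size_t n)
{
    size_t off = 0;

    assert(r->used <= r->cap);
    if (n == 0)
        return 0;
    if (n > r->cap - r->used) {            /* grow to fit the new chunk */
        uint8_t *nb = realloc(r->buf, r->used + n);
        if (nb == NULL)
            return -1;
        r->buf = nb;
        r->cap = r->used + n;
    }
    memcpy(r->buf + r->used, chunk, n);
    r->used += n;

    /* Deliver every message that is now complete, one at a time. */
    while (r->used - off >= 4) {
        uint32_t len = be32(r->buf + off);
        if (r->used - off - 4 < len)
            break;                          /* partial message: wait for more */
        r->cb(r->buf + off + 4, len, r->priv);
        off += 4 + (size_t)len;
    }
    /* Keep only the undelivered tail for the next call. */
    memmove(r->buf, r->buf + off, r->used - off);
    r->used -= off;
    return 0;
}

/* Example upper layer: remembers how many messages arrived and the last one. */
struct seen {
    int count;
    uint32_t last_len;
    uint8_t last[16];
};

static void collect(const uint8_t *msg, uint32_t len, void *priv)
{
    struct seen *s = priv;
    s->count++;
    s->last_len = len;
    memcpy(s->last, msg, len < sizeof(s->last) ? len : sizeof(s->last));
}
```

This handles both cases: two messages arriving in one chunk are 
delivered separately, and a message spread over several chunks is delivered 
only once its last byte has arrived.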

BTW2: I now see that I accidentally relied on RD mode, not RC... fixing it...
(RD = Reliable Datagram, RC = Reliable Connection)

Peter

