[RFC] Performance improvements

Swen Schillig swen at vnet.ibm.com
Fri Jun 22 12:29:15 UTC 2018


On Fri, 2018-06-22 at 16:58 +1000, Martin Schwenke wrote:
> Hi Swen,
> 
> On Thu, 21 Jun 2018 08:39:39 +0200, Swen Schillig via samba-technical
> <samba-technical at lists.samba.org> wrote:
> 
> > On Wed, 2018-06-20 at 15:09 -0700, Jeremy Allison wrote:
> > > On Wed, Jun 20, 2018 at 10:51:58PM +0200, Swen Schillig wrote:  
> > > > 
> > > > Therefore, I would really like you to have a look at the code
> > > > again
> > > > and
> > > > "verify" if the changes are as evil as you claim.  
> > > 
> > > You do have some integer wrap and variable type
> > > work to do first - so why not do that and then resubmit
> > > with the "questionable" patch as the last entry
> > > so it can be reviewed carefully individually ?
> > >   
> > 
> > Thanks for all your support....and here we go.
> 
> As far as I can tell, commit 1 (ctdb: calculate queue input buffer
> size
> correctly) reverts a non-trivial amount of the following commit:
> 
>   commit a61a4b1254e162f39ac22665868ec116a96815f3
>   Author: Amitay Isaacs <amitay at gmail.com>
>   Date:   Wed Aug 21 14:42:06 2013 +1000
> 
>     common/io: Limit the queue buffer size for fair scheduling via
>     tevent
> 
> I'd like the commit message for commit 1 refer to the above commit
> message, making sure that the issue addressed there is still
> addressed.
> 
> Similarly, commit 8 (ctdb: Increase queue's default buffer size to
> 8k) basically reverts the following commit (though with a different
> number):
> 
>   commit 5dc18882fdbaed627262156dd453e6a8a2c0140f
>   Author: Amitay Isaacs <amitay at gmail.com>
>   Date:   Tue Jul 26 14:50:10 2016 +1000
> 
>     ctdb-daemon: Reduce QueueBufferSize from 16k to 1k
> 
> Again, I'd like to see reference to the above commit to ensure the
> issue mentioned there is being considered.
> 
> So, while these 2 commits are likely to provide a performance
> improvement, they might also introduce harmful side-effects that were
> addressed by the 2 commits above from Amitay.
Ok, what about the side effect introduced with leaving available data
on the socket ?  Where was that considered in the above 2 patches ?
How was the buffer size of 1k gathered ?

Besides, I thought the argument for NOT having patch-10 (tevent) is
because of fair-scheduling via tevent. 
Are you telling me now this cannot achieved via tevent ?
So if tevent is just processing all immediate events without looking
left or right, then what is the reason again for not having patch-10 ?

Ok, excuse my slight touch of sarcasm here but I really believe that
the chosen strategy is a good compromise between the 5 year old 16k and
the current 1k value. 
(Besides.... I planned to make it "tunable" which you dislike as well.)

> 
> Commit 2 (ctdb: replace talloc / memcpy by talloc_memdup) is fine and
> already Reviewed-by: Volker.  It isn't a performance improvement but
> is
> reasonable cleanup, so I'll just give it a:
> 
>   Reviewed-by: Martin Schwenke <martin at meltin.net>
> 
> and push now to get it out of this list of commits.

Wooohoo :-)

> 
> Why isn't the destructor in commit 3 (ctdb: remove queue destructor
> as
> it isn't needed anymore) needed anymore?  Is it due to one of the
> previous commits?  Or hasn't it been needed for a while?  If the
> latter, is there a particular commit that stops it from being needed?
> My guess is this one:

Well, if you look at the content of the destructor then it seems
obvious that it was never really useful.
1. It is the "queue" destructor, so there's no need to zero queue-attributes
2. The memory referenced by queue->buffer.data is of the queue's mem context and 
   therefore free'd automatically when the queue is destroyed. So no need to do that manually.
3. There's absolutely no point to set queue->destroyed = true right before I destroy/free the queue.

Besides we did exactly the same for sock_io not too long ago.

commit eae2d35fec071b020f420ba74ac6551c84140a4d
Author: Swen Schillig <swen at vnet.ibm.com>
Date:   Mon Jan 8 14:13:46 2018 +0100

    ctdb-common: Remove sock_queue_destructor
    
    The sock_queue_destructor is not needed.
    The performed tasks will be performed automatically.
    
    Signed-off-by: Swen Schillig <swen at vnet.ibm.com>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    Reviewed-by: Martin Schwenke <martin at meltin.net>

> 
>   commit e097b7f8ff1a9992de1d11474dac4969e30cd679
>   Author: David Disseldorp <ddiss at suse.de>
>   Date:   Sun Jul 31 03:14:54 2011 +0200
> 
>     io: Make queue_io_read() safe for reentry
> 
> I doubt that commit 6 (ctdb: Consolidate all ALIGNMENT
> instructions) provides any performance benefits.  Is it needed as
> part
> of this patch set?  If not, I'd still like to take time to consider
> it
> separately when time permits.
It is needed by the succeeding patches.
...and as far as I remember the two of us talked about it extensively
and decided to go for it if it's not too hard / intrusive ?
Did you change your mind ? 

...or do you think it's just the wrong time and place ?
> 
> OK, that leaves 3 (other sets of) changes that improve performance.
> 
> Commit 4 (ctdb: Introduce buffer.offset to avoid memmove) adds an
> offset into the buffer, similar to the begin offset in
> sock_io.c.  That
> seems like a good idea.
> 
> I know I'm strange, but I'd rather be bored by seeing relevant fields
> explicitly zeroed instead of:
> 
> 	memset(&queue->buffer, 0, sizeof(queue->buffer)); 
> 
> which I have to decode to understand why it is being done.
Well, I think this is now a decision of preference versus making sure
it's all done as expected. What if an attribute gets added to the
struct ? what about speed , memory related things tend to be
performance critical. ....and 5 or more assignments to do one thing
just look dirty..I think.

> 
> I'm also not sure why you dropped the "failed:" goto label and
> inlined
> that instead.  It just makes the patch bigger.  :-(

I think it is a good idea to add one extra line and return right away
instead of jumping 50 lines.
Besides, that jump-tag was used ..... once !
> 
> Commits 5, 7 & 9 (ctdb: Adding memory pool for queue callback, ctdb:
> Introduce tunable parameter for queue mempool size, ctdb: Enabling
> ctdb_queue_send to use memory from data_pool) add a memory pool to
> reduce the number of malloc()s.  Seems like a reasonable plan.
> 
> In commit 5 please don't use CTDB_NO_MEMORY_NULL().  We would like to
> get rid of that and similar macros.  Used lazily, they cause memory
> leaks.
> 
> I'm hoping we can do without commit 7.  We need less tunables instead
> of more.  :-)  The plan is to get rid of tunables completely,
> replacing
> them with configuration file settings.  Hopefully there is a number
> that
> generally works. I suspect that without some very good advice in the
> documentation, most users won't know what to do with it anyway and
> will
> never use it.
> 
> Like others, I don't think commit 10 is a good idea.  
Ok, I'm not arguing about that one anymore.

> So, my question is: how much improvement do you get with just commits
> 4, 5 and 9?

I'm counting slightly different
patches 1, 2(already pushed), 3, 4, 5,
6 (alignment !?), 7 (tunable !?), 8, 9 are good to go.

But you all have time to think and re-think about this for 
the next 3 weeks because I will go on vacation now....with no laptop
being close :-)

Cheers Swen




More information about the samba-technical mailing list