[PATCH] avoid calling casefold_internal if the dn is already casefolded
mat at samba.org
Wed Dec 12 17:20:45 MST 2012
On 12/12/2012 04:02 PM, Andrew Bartlett wrote:
> On Wed, 2012-12-12 at 14:02 -0800, Matthieu Patou wrote:
>> On 12/12/2012 03:26 AM, Andrew Bartlett wrote:
>>> On Tue, 2012-12-11 at 22:23 -0800, Matthieu Patou wrote:
>>>> Can someone review this patch ?
>>>> It comes from analysis of indexed ldb_search with scope = one or subtree.
>>>> For each entries in the index we will compare the dn in the index to dn
>>>> specified as base for the search. And we keep casefolding the later one
>>>> which is useless and costly.
>>>> With a hundreds of entries in the index the cost of cost folding is
>>>> starting to be one of the most important one. With this patch we can
>>>> avoid useless calls.
>>> Did this really help? I can't see how this is any different to the
>>> dn->valid_case check we already do in ldb_dn_casefold_internal().
>>> Additionally, the reason that dn->valid_case needs to be checked is that
>>> if the DN is changed, the casefolded version may not be valid, and may
>>> need to be regenerated. (ie, there is a correctness issue here as
>> Looking at the code it seems that if the dn is changed not only
>> valid_case is changed but also casefold too or at least casefolded is
>> freed and nulled.
>> So it seems to be correct still.
> We should avoid looking inside the opaque struct ldb_dn. I'm supprised
> it is even available here, but we must be bringing in the private
> headers as this is a core ldb module.
> You might be falling into an analysis trap, due to how the profiler
> works. If it finds that ldb_dn_casefold_internal() is expensive (the
> timer finds that function in use a lot of the time), and that there are
> a lot of calls to ldb_dn_casefold_internal() from the index code, it is
> possible to make the false assumption that the calls from the index code
> are the expensive calls. I had similar things when I last ran perf, and
> had to discount some of the call stacks as being fantasy.
Actually I do think they are from the index code but I'll bring the
details soon but maybe not in the case of a multiple calls of
ldb_dn_casefold_internal on the same dn.
> Otherwise, I don't know how to explain this, but you might want to
> single step a simple case in gdb and prove it one way or the other.
I might also had more than one optimization that could drive the # of
instruction and cpu time down.
As I said let me bring the details of this with new traces.
More information about the samba-technical