Michael B Allen mba2000 at
Wed Feb 14 04:50:02 GMT 2007

On Tue, 13 Feb 2007 17:06:49 -0800
Howard Chu <hyc at> wrote:

> Love Hörnquist Åstrand wrote:
> >
> > I don't see malloc happy as a big problem.
> Most people don't, they've been trained to believe that malloc is ok. 
> But in the optimization work that went on between OpenLDAP 2.0 and 2.1 
> we found that 50% of our CPU time was in libc, 30% malloc() related and 
> 20% strlen(). One major reason that slapd in 2.1 benchmarks 200 times 
> faster than slapd in 2.0 is because we eliminated most of those calls. 
> Using BerVals everywhere was also a part of that.
> It's true that using a better malloc library can help (see my malloc 
> benchmark results ) but it's 
> better not to use it at all if it can be avoided.

Flow control in a server is simpler wrt the stack and allocation, and it
is hidden, whereas with a client the inverse is true. I think making the
user supply buffers could be ugly. Believe me, I would love to do that. A
lot of my code operates on buffers as opposed to using allocation. But
with the sprawling trees produced by LDAP responses I just don't see a
way around using allocation.

Here's a possibly better way to handle memory allocation though:

First, parameterize allocation by allowing the user to set function
pointers (a typical "handler" structure) for an LDAP context. Samba devs
will want this anyway so they can use their "talloc" code and "steal"
objects. I too would like to use my allocators to eliminate copies. All
memory for data that will be returned to the user would be allocated
using these user supplied functions (but not for internal stuff).
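
A minimal sketch of what such a handler structure might look like, in
C. Every name here (ldapx_allocator, ldapx_context, ldapx_set_allocator,
ldapx_dup_value, the counting_* callbacks) is made up for illustration
and is not part of any existing LDAP library. The counting allocator
just stands in for whatever the application really uses (talloc, a
pool, etc.).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler struct: the user fills in the function pointers. */
struct ldapx_allocator {
    void *(*alloc)(void *ctx, size_t size);   /* required */
    void  (*release)(void *ctx, void *ptr);   /* may be NULL if nothing is freed */
    void  *ctx;                               /* opaque state handed back to callbacks */
};

/* Hypothetical LDAP context; only the allocator part is shown. */
struct ldapx_context {
    struct ldapx_allocator user_al;  /* used only for data returned to the user */
};

void ldapx_set_allocator(struct ldapx_context *lc,
                         const struct ldapx_allocator *al)
{
    assert(lc != NULL && al != NULL && al->alloc != NULL);
    lc->user_al = *al;
}

/* Copy a decoded value into memory obtained from the user's allocator. */
char *ldapx_dup_value(struct ldapx_context *lc, const char *src, size_t len)
{
    char *dst = lc->user_al.alloc(lc->user_al.ctx, len + 1);

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}

/* Example user-supplied allocator: malloc(3) plus a call counter. */
void *counting_alloc(void *ctx, size_t size)
{
    size_t *calls = ctx;

    (*calls)++;
    return malloc(size);
}

void counting_release(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}
```

The library would call lc->user_al.alloc wherever it builds something
that ends up in the caller's hands, and keep using its own internal
allocation for everything else.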

Second, make the default allocator one that does not implement the 'free'
function since memory management for decoding responses is pretty much
all 'malloc' and no 'free' anyway. The allocator would be backed by a
"chunk chain" of increasingly larger memory chunks allocated with real
malloc(3).

There are a number of performance benefits to doing this that should be
obvious but I'll list them anyway:

  1) The number of objects allocated from malloc(3) is small. A large
  response that used 1000 objects might take only 10 chunks. The bulk
  of the overhead in using an allocator is juggling the free list.

  2) The number of calls to malloc(3) is small. Again it could be 100 to
  1. Remember the stdlib allocator uses locks.

  3) The implementation of the default allocator would be efficient and
  simple because, without the need for 'free', an allocation would simply
  advance a pointer and return.

  4) One function is used to free the chain of malloc(3)'d chunks. Again,
  100 to 1 maybe.

  5) Again, fewer copies when used with an application's own allocator
  (e.g. talloc).

This also makes freeing stuff a lot easier on the user. You can either
just unbind or call one function to free the "chunk chain".
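
The free-less "chunk chain" allocator could be sketched roughly like
this in C. The names (chunk, chunk_chain, chunk_chain_init/alloc/free),
the 16-byte alignment, the 1024-byte default and the doubling growth
policy are all assumptions made for the example, not something taken
from existing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the idea, not code from any library. */
struct chunk {
    struct chunk *next;  /* previously allocated (smaller) chunk */
    size_t size;         /* usable bytes following the header */
    size_t used;         /* bytes already handed out */
};

struct chunk_chain {
    struct chunk *head;  /* newest chunk; allocations come from here */
    size_t next_size;    /* size to request for the next chunk */
};

#define CC_ALIGN 16u
#define CC_ALIGN_UP(n) (((size_t)(n) + (CC_ALIGN - 1)) & ~(size_t)(CC_ALIGN - 1))
#define CC_HDR CC_ALIGN_UP(sizeof(struct chunk))

void chunk_chain_init(struct chunk_chain *cc, size_t first_size)
{
    assert(cc != NULL);
    cc->head = NULL;
    cc->next_size = first_size ? first_size : 1024;
}

/* Allocate by advancing an offset; there is deliberately no per-object free. */
void *chunk_chain_alloc(struct chunk_chain *cc, size_t size)
{
    struct chunk *c;
    void *p;

    assert(cc != NULL);
    if (size > SIZE_MAX / 4)
        return NULL;                   /* keeps the arithmetic below from overflowing */
    size = CC_ALIGN_UP(size ? size : 1);
    c = cc->head;

    if (c == NULL || c->size - c->used < size) {
        size_t csize = cc->next_size;

        while (csize < size)
            csize *= 2;
        if (csize > SIZE_MAX - CC_HDR)
            return NULL;
        c = malloc(CC_HDR + csize);    /* the only call into malloc(3) */
        if (c == NULL)
            return NULL;
        c->next = cc->head;            /* the old chunk's unused tail is abandoned */
        c->size = csize;
        c->used = 0;
        cc->head = c;
        if (csize <= SIZE_MAX / 2)
            cc->next_size = csize * 2; /* each new chunk is larger than the last */
    }
    p = (char *)c + CC_HDR + c->used;
    c->used += size;
    return p;
}

/* One call releases every object ever allocated from the chain. */
void chunk_chain_free(struct chunk_chain *cc)
{
    assert(cc != NULL);
    while (cc->head != NULL) {
        struct chunk *next = cc->head->next;

        free(cc->head);
        cc->head = next;
    }
}
```

With a chain like this hanging off the context, unbind would only need
to call chunk_chain_free() once, however many objects the response
decoder created.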

Also, to implement the old API on top of the new one, simply set the
stdlib allocator functions on the context.
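
Something along these lines is what I'd expect that compatibility shim
to look like in C. Again the names (ldapx_allocator, ldapx_context,
ldapx_use_stdlib, ldapx_memfree) are hypothetical and only illustrate
the shape; a real library would already have its own per-object free
function to map onto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handler struct and context (illustration only). */
struct ldapx_allocator {
    void *(*alloc)(void *ctx, size_t size);
    void  (*release)(void *ctx, void *ptr);  /* NULL for a free-less allocator */
    void  *ctx;
};

struct ldapx_context {
    struct ldapx_allocator user_al;
};

/* Thin adapters that give malloc(3)/free(3) the callback signature. */
void *ldapx_stdlib_alloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size);
}

void ldapx_stdlib_release(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* "Old API" mode: every returned object is its own malloc(3) block. */
void ldapx_use_stdlib(struct ldapx_context *lc)
{
    assert(lc != NULL);
    lc->user_al.alloc = ldapx_stdlib_alloc;
    lc->user_al.release = ldapx_stdlib_release;
    lc->user_al.ctx = NULL;
}

/* Old-style per-object free; a no-op when the allocator has no release. */
void ldapx_memfree(struct ldapx_context *lc, void *ptr)
{
    if (ptr != NULL && lc->user_al.release != NULL)
        lc->user_al.release(lc->user_al.ctx, ptr);
}
```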

I do this stuff all over the place in my code. I have an allocator
implementation that can be initialized with a fixed-size chunk of
memory. So I can do things like initialize it with stack memory, use it
like I would the stdlib allocator, and then just return without concern
for freeing anything.
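
To give an idea of the stack-memory case, here is a rough C sketch.
This is not my actual allocator; the names (fixed_alloc,
fixed_alloc_init, fixed_alloc_get, join_len) and the 16-byte alignment
are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FA_ALIGN 16u

/* Hypothetical allocator over one caller-supplied, fixed-size buffer. */
struct fixed_alloc {
    unsigned char *base;  /* aligned start of the usable region */
    size_t size;          /* usable bytes starting at base */
    size_t used;          /* bytes already handed out */
};

void fixed_alloc_init(struct fixed_alloc *fa, void *mem, size_t size)
{
    uintptr_t addr = (uintptr_t)mem;
    size_t pad = (FA_ALIGN - (size_t)(addr % FA_ALIGN)) % FA_ALIGN;

    assert(fa != NULL && mem != NULL);
    if (pad > size)
        pad = size;                       /* buffer too small to align: leave it empty */
    fa->base = (unsigned char *)mem + pad;
    fa->size = size - pad;
    fa->used = 0;
}

/* Returns NULL when the buffer is exhausted; there is no free function. */
void *fixed_alloc_get(struct fixed_alloc *fa, size_t size)
{
    size_t remaining, need;
    void *p;

    assert(fa != NULL);
    remaining = fa->size - fa->used;
    if (size == 0)
        size = 1;
    if (size > remaining)
        return NULL;
    need = (size + (FA_ALIGN - 1)) & ~(size_t)(FA_ALIGN - 1);
    if (need > remaining)
        need = remaining;                 /* last object may take the unaligned tail */
    p = fa->base + fa->used;
    fa->used += need;
    return p;
}

/* Example: scratch work done entirely in stack memory. */
size_t join_len(const char *a, const char *b)
{
    unsigned char stack_mem[256];
    struct fixed_alloc fa;
    size_t la = strlen(a), lb = strlen(b);
    char *s;

    fixed_alloc_init(&fa, stack_mem, sizeof(stack_mem));
    s = fixed_alloc_get(&fa, la + lb + 1);
    if (s == NULL)
        return 0;                         /* inputs too long for the scratch buffer */
    memcpy(s, a, la);
    memcpy(s + la, b, lb + 1);            /* copies b's terminating NUL too */
    return strlen(s);                     /* nothing to free: stack_mem goes with the frame */
}
```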


Michael B Allen
PHP Active Directory SSO

More information about the samba-technical mailing list