talloc vs malloc speed

Andreas Schneider asn at samba.org
Mon Apr 17 17:24:00 UTC 2023


On Friday, 14 April 2023 20:32:33 CEST Andrew Walker via samba-technical 
wrote:
> I was playing around with building talloc with jemalloc not too long
> ago:
> https://github.com/truenas/samba/pull/241/commits/2476bfd3012a95e8015e2b61d3475d6f8cf11476
>
> Some thoughts:
> 1) there was some benefit to removing the memlimit API. Might be worth
> a shot retesting with it ripped out at different optimization levels.

I wonder whether anyone actually uses the memlimit API.

> 2) there was also some benefit for the case of talloc_zero() if we
> called directly into calloc() rather than doing malloc() followed by
> memset().

Replacing the malloc() call in __talloc_with_prefix() with calloc() for the
talloc_zero() case doesn't show much benefit in my testing.
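
To make the comparison concrete, here is a minimal sketch of the two ways to get zeroed memory. The helper names are hypothetical and this is not the talloc code itself, which also has to set up its chunk header; it only shows the difference under discussion.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zeroed allocation via malloc() followed by memset(). */
static void *zero_alloc_memset(size_t size)
{
	void *p = malloc(size);

	if (p != NULL) {
		memset(p, 0, size);
	}
	return p;
}

/* Hypothetical helper: zeroed allocation obtained directly from calloc(). */
static void *zero_alloc_calloc(size_t size)
{
	return calloc(1, size);
}
```

Both return either NULL or a buffer of `size` zero bytes. Any difference is in how the allocator produces the zeroes, so the gain, if any, probably depends on the allocator and the allocation size.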

> 
> Andrew
> 
> On Fri, Apr 14, 2023 at 1:12 PM Andreas Schneider via samba-technical
> <samba-technical at lists.samba.org> wrote:
> > Hi,
> > 
> > Florian Weimer implemented hwcaps support in glibc [1]. This allows you
> > to drop in optimized libraries.
> > 
> > The support for this is enabled in openSUSE Tumbleweed right now [2]. I've
> > enabled it for libtalloc as you want it to be as fast as possible.
> > 
> > 
> > Here are the results from my AMD Ryzen 9 3900X 12-Core processor.
> > 
> > talloc x86_64_v1 (testsuite compiled with -O0)
> > 
> > test: speed
> > # TALLOC VS MALLOC SPEED
> > talloc:       46623469 ops/sec
> > talloc_pool:  74121933 ops/sec
> > malloc:       66443400 ops/sec
> > success: speed
> > 
> > => talloc is 30% slower
> > 
> > 
> > 
> > talloc x86_64_v3 (testsuite compiled with -O0)
> > 
> > test: speed
> > # TALLOC VS MALLOC SPEED
> > talloc:       47783809 ops/sec
> > talloc_pool:  75068595 ops/sec
> > malloc:       68073710 ops/sec
> > success: speed
> > 
> > => talloc is 30% slower
> > 
> > 
> > 
> > talloc x86_64_v3 (testsuite compiled with -O2)
> > test: speed
> > # TALLOC VS MALLOC SPEED
> > talloc:       50633005 ops/sec
> > talloc_pool:  74245533 ops/sec
> > malloc:      219259200 ops/sec
> > success: speed
> > 
> > => talloc is 77% slower
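
The "slower" figures above can be reproduced from the ops/sec numbers as 1 - talloc/malloc. A small sketch of that arithmetic follows; the function name is made up for illustration.

```c
/*
 * Hypothetical helper: how much slower talloc is than malloc, in percent,
 * given the ops/sec figures printed by the speed test.
 */
static double slowdown_percent(double talloc_ops, double malloc_ops)
{
	return (1.0 - talloc_ops / malloc_ops) * 100.0;
}
```

With the numbers above this gives roughly 29.8% (v1, -O0), 29.8% (v3, -O0) and 76.9% (v3, -O2), which matches the rounded 30%, 30% and 77%.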
> > 
> > 
> > It looks like the optimizer is able to optimize the code a lot more if
> > malloc is used.
> > 
> > I wonder if it would be possible to give the optimizer more hints. Maybe
> > Florian has some ideas :-)
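
As one possible form of "hints", GCC and Clang accept function attributes that describe allocator-like functions. The sketch below applies them to a hypothetical wrapper that is not part of talloc. Whether they are valid for talloc's real entry points, and whether they would change the numbers above, is an assumption I have not tested.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * GCC/Clang function attributes:
 *   malloc             - the returned pointer does not alias other objects
 *   alloc_size(n)      - argument n gives the size of the allocation
 *   warn_unused_result - warn if the caller ignores the return value
 */
#if defined(__GNUC__)
#define HINT_ALLOC(n) __attribute__((malloc, alloc_size(n), warn_unused_result))
#else
#define HINT_ALLOC(n)
#endif

/* Hypothetical wrapper, used only to show where the attributes go. */
static void *hinted_alloc(size_t size) HINT_ALLOC(1);

static void *hinted_alloc(size_t size)
{
	return malloc(size);
}
```

The attributes do not change behaviour; they only give the compiler more information about the returned pointer when it optimizes the callers.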
> > 
> > 
> > Best regards
> > 
> >         Andreas
> > 
> > P.S. The talloc website states it is 4% slower than malloc. This was
> > probably a long long time ago ;-)
> > 
> > 
> > [1] https://www.phoronix.com/news/glibc-hwcaps-RFC
> > [2] https://news.opensuse.org/2023/03/02/tw-gains-optional-optimizations/
> > 
> > --
> > Andreas Schneider                      asn at samba.org
> > Samba Team                             www.samba.org
> > GPG-ID:     8DFF53E18F2ABC8D8F3C92237EE0FC4DCC014E3D


-- 
Andreas Schneider                      asn at samba.org
Samba Team                             www.samba.org
GPG-ID:     8DFF53E18F2ABC8D8F3C92237EE0FC4DCC014E3D




