[Samba] Samba4 consumes more CPU

Thiago Fernandes Crepaldi tognado at gmail.com
Mon Sep 30 15:19:58 MDT 2013


Agreed. For some strange reason I thought perf would "follow" the newly
forked smbd processes and account for their data too =)

Unfortunately, I don't have the libc symbols (at least for today) to see
what is going on there, but here is what I got from the child smbd process
on the server side. The client is a Windows 7 virtual machine running
NASPT.

Could this result mean that most of the performance drop I am
experiencing is due to libc?
I've never worked with perf before, but I will still try to resolve those
crazy addresses.
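
One possible way to start resolving such addresses, sketched in Python: find
the mapping in /proc/<pid>/maps that contains the runtime address and convert
it to an offset inside the library file. The mapping line and the library
path below are hypothetical (the real base address has to be read from the
running smbd child), and the helper names are made up for illustration.

```python
def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line into (start, end, file_offset, path)."""
    fields = line.split()
    start, end = (int(x, 16) for x in fields[0].split("-"))
    file_offset = int(fields[2], 16)
    path = fields[5] if len(fields) > 5 else ""
    return start, end, file_offset, path


def to_library_offset(addr, maps_lines):
    """Return (path, offset) for the mapping that contains addr, or None."""
    for line in maps_lines:
        start, end, file_offset, path = parse_maps_line(line)
        if start <= addr < end:
            return path, addr - start + file_offset
    return None


# HYPOTHETICAL mapping line; the real one comes from /proc/<pid>/maps.
example_maps = [
    "7ffab9e9d000-7ffaba01d000 r-xp 00000000 08:01 1234 /lib/libc-2.13.so",
]

hit = to_library_offset(0x7ffab9f2043c, example_maps)
if hit:
    path, offset = hit
    print(f"{path} +{offset:#x}")
```

Once the libc debug symbols are installed, an offset obtained this way can
usually be passed to something like "addr2line -f -e <library> <offset>".
That assumes the file offset and the ELF virtual address coincide for the
text segment, which is common for shared libraries but not guaranteed.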

Events: 45K cycles
-   7.37%  smbd  libc-2.13.so              [.] 0x11e465
   - 0x7ffab9f2043c
        41.73% 0
        5.32% 0x1b3fbe0
        5.29% 0x2c4dab0
        3.60% 0x1b0b130
        3.37% 0x1b0b2a0
        2.94% 0x1b5af80
        2.70% 0x1b0d850
        2.64% 0x2825fb0
        1.86% 0x28e06d0
        1.83% 0x2afcc80
        1.71% 0x1b2ccb0
        1.64% 0x2a4deb0
        1.63% 0x1b56e00
        1.51% 0x1b6bd00
        1.16% 0x1b49eb0
        1.15% 0x1b506e0
        1.13% 0x1b4da00
        1.07% 0x1b35100
        0.93% 0x1af9050
        0.92% 0x2b03680
        0.91% 0x2ae21f0
        0.90% 0x1b21210
        0.89% 0x1b5de80
        0.89% 0x1b5aa80
        0.89% 0x1b2e0e0
        0.88% 0x1b59be0
        0.87% 0x1b4c600
        0.86% 0x1b2aa20
        0.85% 0x1b4a940
        0.85% 0x1b45f50
        0.84% 0x1b4a6d0
        0.84% 0x1b23940
        0.82% 0x1b37210
        0.82% 0x1b2cf30
        0.82% 0x1b33320
        0.77% 0x2c96d50
        0.76% 0x202f380
        0.75% 0x2bd0bd0
        0.66% 0x1b5e1d0
   - 0x7ffab9f27e10
        37.72% 0x2f62696c2f3365
      + 23.78% 0
      + 11.24% 0x7fffc9f76d40
      + 6.25% set_unix_security_ctx
        3.13% 0x645f6e656b6f74
        2.46% 0x1000900000000
      + 2.17% 0x11b9f22aac
        2.16% 0x1b53000
      + 2.12% 0x2a29850
        2.08% 0xbe70f000004c4c
        2.01% 0x1b0af00
        1.94% 0x1b07390
        1.51% 0x1b49b00
        1.41% 0x2010
   - 0x7ffab9fc6c10
      + 18.08% 0
      + 13.63% 0x2c5fc20
      + 11.62% 0x2be7b10
      + 7.90% 0x2be8560
      + 6.61% 0x2a29850
      + 6.30% 0x2b3d6c0
        5.67% 0x4e6f5479706f43
      + 5.64% 0x29d7110
      + 5.54% 0x2467130
      + 5.53% 0x2b3d5e0
      + 5.31% 0x28c81a0
      + 4.20% 0x2c5fa30
      + 3.98% 0x2a98990
   + 0x7ffab9f20438
   + 0x7ffab9f2045c
     0x7ffab9fc8e03
   + 0x7ffab9fc425e
   + 0x7ffab9f2a715
   + 0x7ffab9f2a6d0
     0x7ffab9f1f851
     0x7ffab9f1f2ac
   + 0x7ffab9f27e25
   + 0x7ffab9f2a648
   + 0x7ffab9fc4240
     0x7ffab9fc8654
     0x7ffab9f206bf
   + 0x7ffab9f20548
   + 0x7ffab9f20bc2
   + 0x7ffab9f1f130
   + 0x7ffab9f26310
   + 0x7ffab9f20422
     0x7ffab9f1e0db
     0x7ffab9f1f179
   + 0x7ffab9f2a6f2
   + 0x7ffab9f20572
   + 0x7ffab9f2054c
   + 0x7ffab9fc42c5
-   1.72%  smbd  [kernel.kallsyms]         [k] kmem_cache_alloc
   + kmem_cache_alloc
-   1.30%  smbd  libtalloc.so.2.0.7        [.] _talloc_free
   + _talloc_free
-   1.10%  smbd  libtalloc.so.2.0.7        [.] _talloc_free_children_internal.i
   + _talloc_free_children_internal.isra.4
-   1.07%  smbd  [kernel.kallsyms]         [k] copy_user_generic_unrolled
   + copy_user_generic_unrolled
-   0.95%  smbd  [kernel.kallsyms]         [k] __kmalloc
   + __kmalloc
-   0.78%  smbd  [kernel.kallsyms]         [k] ext4_htree_store_dirent
   + ext4_htree_store_dirent
   + 0x7ffab9f4f2f5
-   0.73%  smbd  [kernel.kallsyms]         [k] kmem_cache_free
   + kmem_cache_free
-   0.73%  smbd  [kernel.kallsyms]         [k] link_path_walk
   + link_path_walk
-   0.69%  smbd  libc-2.13.so              [.] malloc
   + malloc
-   0.69%  smbd  libtalloc.so.2.0.7        [.] _talloc_zero
   + _talloc_zero
-   0.62%  smbd  [kernel.kallsyms]         [k] fcntl_setlk
   + fcntl_setlk
   + 0x7ffabcf93238
-   0.59%  smbd  [kernel.kallsyms]         [k] __d_lookup_rcu
   + __d_lookup_rcu
-   0.57%  smbd  libtalloc.so.2.0.7        [.] talloc_alloc_pool
   + talloc_alloc_pool
-   0.55%  smbd  libtalloc.so.2.0.7        [.] talloc_get_name
   + talloc_get_name
-   0.55%  smbd  [kernel.kallsyms]         [k] __posix_lock_file
   + __posix_lock_file
   + 0x7ffabcf93238
-   0.50%  smbd  [kernel.kallsyms]         [k] _raw_spin_lock
   + _raw_spin_lock
+   0.49%  smbd  [kernel.kallsyms]         [k] tg3_start_xmit
+   0.48%  smbd  [kernel.kallsyms]         [k] system_call_after_swapgs
+   0.46%  smbd  libtalloc.so.2.0.7        [.] talloc_named_const
+   0.46%  smbd  [kernel.kallsyms]         [k] memset
+   0.46%  smbd  libtalloc.so.2.0.7        [.] _talloc_get_type_abort
+   0.45%  smbd  [kernel.kallsyms]         [k] str2hashbuf_signed
+   0.45%  smbd  [kernel.kallsyms]         [k] kfree
+   0.45%  smbd  libc-2.13.so              [.] free
+   0.44%  smbd  [kernel.kallsyms]         [k] __alloc_skb
+   0.42%  smbd  libtalloc.so.2.0.7        [.] talloc_is_parent
+   0.41%  smbd  libtalloc.so.2.0.7        [.] _talloc_array
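
To put the libc question in numbers, the top-level lines of a report like the
one above can be summed per shared object. The following is only a rough
sketch: the regular expression assumes the "perf report" text layout exactly
as pasted here, and the function name is made up for illustration.

```python
import re
from collections import defaultdict

# Top-level entries look like:
#   "-   7.37%  smbd  libc-2.13.so   [.] 0x11e465"
# Nested call-chain lines are indented, so anchoring at column 0 skips them.
TOP_LEVEL = re.compile(r"^[-+]\s+(\d+\.\d+)%\s+(\S+)\s+(\S+)\s+\[(.)\]")


def overhead_by_dso(report_text):
    """Sum the overhead percentage of top-level entries per shared object."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = TOP_LEVEL.match(line)
        if match:
            totals[match.group(3)] += float(match.group(1))
    return dict(totals)


# A few lines copied from the report above, as sample input.
sample = """\
-   7.37%  smbd  libc-2.13.so              [.] 0x11e465
   - 0x7ffab9f2043c
        41.73% 0
-   1.72%  smbd  [kernel.kallsyms]         [k] kmem_cache_alloc
-   1.30%  smbd  libtalloc.so.2.0.7        [.] _talloc_free
-   0.69%  smbd  libc-2.13.so              [.] malloc
+   0.45%  smbd  libc-2.13.so              [.] free
"""

for dso, pct in sorted(overhead_by_dso(sample).items(), key=lambda kv: -kv[1]):
    print(f"{pct:6.2f}%  {dso}")
```

In the listing as pasted, the three libc-2.13.so entries (7.37%, 0.69% and
0.45%) add up to 8.51% of the samples. The paste stops at entries of 0.41%,
so the rest of the profile is not visible here.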





On Mon, Sep 30, 2013 at 5:39 PM, Jeremy Allison <jra at samba.org> wrote:

> On Mon, Sep 30, 2013 at 05:21:44PM -0300, Thiago Fernandes Crepaldi wrote:
> > Andrew, in my company we are also experiencing a higher CPU usage of
> > Samba 4 (smbd) if compared to Samba 3.
> >
> > In fact, it almost reaches 100% of CPU and uses all the memory during
> > *dir copies* (individual file copy is as good as samba 3's). I strongly
> > believe that this CPU usage is the responsible for a worse samba 4's
> > throughput if compared to Samba 3 tests.
> >
> > Giving that, I would like to contribute with this investigation and share
> > my data regarding perf profiling on smbd (parent process)
> >
> > Events: 7  cycles
> > -  90.01%  smbd  [kernel.kallsyms]  [k] copy_pte_range
> >      copy_pte_range
> >      __libc_fork
> >      smbd_accept_connection
> > -   9.77%  smbd  [kernel.kallsyms]  [k] handle_edge_irq
> >      handle_edge_irq
> >      smbd_accept_connection
> > -   0.22%  smbd  [kernel.kallsyms]  [k] perf_pmu_rotate_start.isra.57
> >      perf_pmu_rotate_start.isra.57
> >      __poll
> > -   0.00%  smbd  [kernel.kallsyms]  [k] native_write_msr_safe
> >      native_write_msr_safe
> >      __poll
>
> It's the client process that should have the interesting
> profile data, the parent is just going to sit there doing
> accept().
>
> Jeremy.
>



-- 
Thiago Crepaldi

