SMB2 performance is worse than SMB1 while iometer 512byte transfer

Jones jones.kstw at gmail.com
Mon Sep 16 13:24:03 CEST 2013


Hi Volker and Jeremy,

OK, let's rock with Ubuntu 13.04!
I inserted a new disk into my original platform
and installed Ubuntu 13.04 Desktop AMD64 on it.
The test environment and hardware are the same as before,
except that the platform OS is now Ubuntu 13.04 Desktop AMD64.

Because I had trouble running Samba-4 on Ubuntu 13.04,
this test case runs Samba-3.6.
Detailed steps follow:


A. samba-3.6.9, apt-get install samba
==============================
3.6.9 is installed straight from apt-get.
SMB2 is slower than SMB1,
and SMB2 shows vfprintf at the top in perf.
Although I ran "apt-get install samba-dbg libtalloc-dbg",
some symbols are still missing in smbd.

Protocol   R/W     IOps    %CPU    %us    %sy   %id    %si
----------------------------------------------------------
SMB1       Read   94021   98.3%  28.1%  53.1%  2.1%  16.7%
SMB1       Write  82423  100.0%  30.0%  57.0%  0.0%  13.0%
SMB2       Read   31546  100.0%  68.0%  22.0%  0.0%  10.0%
SMB2       Write  27890  100.0%  65.0%  23.0%  0.0%  12.0%

"perf top -p <smbd_pid>" SMB2 enabled shows:
  7.28%  libc-2.17.so        [.] vfprintf
  6.55%  smbd                [.] 0x0000000000188004
  4.76%  libc-2.17.so        [.] _int_malloc
  4.09%  libc-2.17.so        [.] _int_free
  3.80%  libc-2.17.so        [.] malloc_consolidate
  3.02%  libtalloc.so.2.0.7  [.] _talloc_free_internal
  2.33%  libc-2.17.so        [.] _IO_default_xsputn
  2.21%  libc-2.17.so        [.] malloc
  1.91%  libc-2.17.so        [.] __memset_sse2
  1.50%  libtalloc.so.2.0.7  [.] _talloc_zero
  1.11%  libtalloc.so.2.0.7  [.] _talloc_get_type_abort
  1.11%  libtalloc.so.2.0.7  [.] _talloc_array
  1.09%  [kernel]            [k] tcp_sendmsg
  1.04%  smbd                [.] event_add_to_poll_args
  0.99%  libc-2.17.so        [.] __strchrnul
  0.88%  libc-2.17.so        [.] _itoa_word
  0.84%  libtalloc.so.2.0.7  [.] _talloc_free
  0.83%  libc-2.17.so        [.] __vasprintf_chk
  0.82%  libtalloc.so.2.0.7  [.] talloc_get_name
  ...

"perf top -p <smbd_pid>" SMB1 enabled shows:
  6.02%  smbd                [.] 0x0000000000174c05
  3.41%  [kernel]            [k] copy_user_generic_string
  2.30%  [e1000e]            [k] e1000_xmit_frame
  2.18%  [kernel]            [k] __ticket_spin_lock
  1.51%  [kernel]            [k] tcp_sendmsg
  1.34%  smbd                [.] reply_read_and_X
  1.33%  [kernel]            [k] tcp_recvmsg
  1.30%  libtalloc.so.2.0.7  [.] _talloc_free_internal
  1.16%  [vdso]              [.] 0x000000000000070c
  1.03%  [kernel]            [k] skb_copy_datagram_iovec
  1.01%  [kernel]            [k] fget_light
  0.93%  [kernel]            [k] do_sync_read
  0.92%  smbd                [.] event_add_to_poll_args
  0.90%  smbd                [.] run_events_poll
  0.89%  [kernel]            [k] tcp_transmit_skb
  0.88%  [e1000e]            [k] e1000_clean_tx_irq
  0.80%  [kernel]            [k] system_call
  ...

B. samba-3.6.18, wget from Samba.Org
==============================
The source code is fetched directly from Samba.Org with wget
and compiled with -O.

Protocol   R/W     IOps    %CPU    %us    %sy   %id    %si
----------------------------------------------------------
SMB1(-O)   Read   89686    100%  33.0%  56.0%  0.0%  11.0%
SMB1(-O)   Write  70437    100%  32.0%  56.0%  0.0%  16.0%
SMB2(-O)   Read   25202    100%  65.0%  22.0%  0.0%  13.0%
SMB2(-O)   Write  19994    100%  63.0%  29.0%  0.0%   8.0%

C. samba-3.6.18, wget from Samba.Org, events.c patched
==============================
OK, I fetched the source from Samba.Org with wget again,
manually edited samba-3.6.18/source3/lib/events.c,
and added CHECK_DEBUGLVL() to ease the pain from vfprintf.
I compiled it with -O and with -O3.
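
As a minimal sketch of the idea (this is not the actual Samba code), the guard checks the debug level before any string formatting happens, so disabled messages never reach the printf family. The names current_debug_level, LEVEL_ENABLED(), format_calls and event_debug() are hypothetical stand-ins for illustration only; the real edit uses CHECK_DEBUGLVL() in source3/lib/events.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration, not Samba's real definitions. */
static int current_debug_level = 0;      /* messages above this level are dropped */
static unsigned long format_calls = 0;   /* counts how often we really format */

#define LEVEL_ENABLED(level) ((level) <= current_debug_level)

/* Format and emit a message only if its level is enabled. */
static void event_debug(int level, const char *fmt, ...)
{
    char buf[256];
    va_list ap;

    assert(fmt != NULL);

    /* Cheap integer comparison first: skip all formatting work when disabled. */
    if (!LEVEL_ENABLED(level)) {
        return;
    }

    va_start(ap, fmt);
    vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    format_calls++;
    fprintf(stderr, "event: %s", buf);
}
```

With current_debug_level at 0, a call such as event_debug(10, "fd %d\n", fd) returns before vsnprintf() runs. On a hot path that logs once per request, that avoided formatting is presumably the kind of cost that shows up as vfprintf in perf.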

Protocol   R/W     IOps    %CPU    %us    %sy   %id    %si
----------------------------------------------------------
SMB1(-O)   Read   93220    100%  30.0%  55.0%  0.0%  15.0%
SMB1(-O)   Write  74227    100%  32.0%  55.0%  0.0%  13.0%
SMB2(-O)   Read   34410    100%  61.0%  26.0%  0.0%  13.0%
SMB2(-O)   Write  30246    100%  60.0%  27.0%  0.0%  13.0%

Protocol   R/W     IOps    %CPU    %us    %sy   %id    %si
----------------------------------------------------------
SMB1(-O3)  Read   99094    100%  28.0%  55.0%  0.0%  17.0%
SMB1(-O3)  Write  79507    100%  26.0%  59.0%  0.0%  15.0%
SMB2(-O3)  Read   38663    100%  56.0%  31.0%  0.0%  12.0%
SMB2(-O3)  Write  33787    100%  53.0%  37.0%  0.0%  10.0%
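
To put the -O versus -O3 difference in numbers, here is a small helper that computes the relative IOps gain between the two tables above. The function names are made up for this illustration; the IOps values are copied from the tables.

```c
#include <assert.h>
#include <stdio.h>

/* Relative gain of `after` over `before`, in percent. */
static double gain_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Print one comparison row: label, both IOps values, and the gain. */
static void print_gain(const char *label, double before, double after)
{
    printf("%-11s %6.0f -> %6.0f  %+5.1f%%\n",
           label, before, after, gain_percent(before, after));
}

/* IOps from section C: patched 3.6.18 built with -O, then with -O3. */
static void print_section_c_gains(void)
{
    print_gain("SMB1 Read",  93220.0, 99094.0);
    print_gain("SMB1 Write", 74227.0, 79507.0);
    print_gain("SMB2 Read",  34410.0, 38663.0);
    print_gain("SMB2 Write", 30246.0, 33787.0);
}
```

Calling print_section_c_gains() gives roughly +6.3% and +7.1% for SMB1 read and write, and roughly +12.4% and +11.7% for SMB2 read and write, so SMB2 benefits about twice as much from -O3 in this test.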

"perf top -p <smbd_pid>" SMB2 enabled shows: (-O1)
  4.68%  libtalloc.so.2.0.5   [.] _talloc_free_internal
  4.57%  libc-2.17.so         [.] _int_malloc
  3.07%  libc-2.17.so         [.] _int_free
  2.30%  smbd                 [.] event_add_to_poll_args
  2.08%  libc-2.17.so         [.] __memset_sse2
  1.85%  libc-2.17.so         [.] malloc
  1.53%  libtalloc.so.2.0.5   [.] _talloc_zero
  1.39%  libc-2.17.so         [.] malloc_consolidate
  1.32%  libtalloc.so.2.0.5   [.] _talloc_get_type_abort
  1.25%  [kernel]             [k] tcp_sendmsg
  1.23%  [kernel]             [k] __ticket_spin_lock
  1.21%  libtalloc.so.2.0.5   [.] _talloc_free
  1.19%  libtalloc.so.2.0.5   [.] _talloc_array
  1.08%  [kernel]             [k] copy_user_generic_string
  0.95%  libtalloc.so.2.0.5   [.] _talloc_free_children_internal
  0.94%  libc-2.17.so         [.] __strcmp_sse42
  ...

"perf top -p <smbd_pid>" SMB1 enabled shows: (-O1)
  3.06%  [kernel]             [k] __ticket_spin_lock
  3.00%  [kernel]             [k] copy_user_generic_string
  1.80%  [e1000e]             [k] e1000_xmit_frame
  1.48%  libtalloc.so.2.0.5   [.] _talloc_free_internal
  1.46%  [kernel]             [k] fget_light
  1.46%  [kernel]             [k] tcp_sendmsg
  1.32%  [kernel]             [k] tcp_poll
  1.29%  [kernel]             [k] tcp_recvmsg
  1.18%  [kernel]             [k] do_sys_poll
  1.16%  [vdso]               [.] 0x000000000000070c
  1.16%  smbd                 [.] event_add_to_poll_args
  1.11%  [kernel]             [k] tcp_transmit_skb
  1.10%  smbd                 [.] reply_read_and_X
  1.04%  [kernel]             [k] _raw_spin_unlock_irqrestore
  1.02%  smbd                 [.] run_events_poll
  0.97%  smbd                 [.] switch_message
  ...

"perf top -p <smbd_pid>" SMB2 enabled shows: (-O3)
  4.84%  libc-2.17.so         [.] _int_malloc
  3.50%  libc-2.17.so         [.] _int_free
  3.31%  libtalloc.so.2.0.5   [.] _talloc_free_internal
  2.05%  smbd                 [.] event_add_to_poll_args
  2.00%  libc-2.17.so         [.] __memset_sse2
  1.81%  libc-2.17.so         [.] malloc
  1.70%  libtalloc.so.2.0.5   [.] _talloc_zero
  1.54%  libc-2.17.so         [.] malloc_consolidate
  1.52%  libtalloc.so.2.0.5   [.] _talloc_array
  1.35%  [kernel]             [k] __ticket_spin_lock
  1.34%  [kernel]             [k] tcp_sendmsg
  1.31%  libtalloc.so.2.0.5   [.] _talloc_get_type_abort
  1.19%  [kernel]             [k] copy_user_generic_string
  1.10%  libtalloc.so.2.0.5   [.] _talloc_free
  1.05%  libc-2.17.so         [.] __strcmp_sse42
  0.99%  [kernel]             [k] fget_light
  0.92%  libtalloc.so.2.0.5   [.] _talloc_free_children_internal.isra.4
  0.89%  libtalloc.so.2.0.5   [.] talloc_get_name
  0.85%  [kernel]             [k] fib_table_lookup
  0.85%  [e1000e]             [k] e1000_xmit_frame
  ...

"perf top -p <smbd_pid>" SMB1 enabled shows: (-O3)
  3.39%  [kernel]             [k] __ticket_spin_lock
  3.26%  [kernel]             [k] copy_user_generic_string
  2.14%  [e1000e]             [k] e1000_xmit_frame
  1.68%  [kernel]             [k] fget_light
  1.56%  [kernel]             [k] tcp_poll
  1.52%  [kernel]             [k] tcp_sendmsg
  1.30%  [kernel]             [k] tcp_recvmsg
  1.19%  libtalloc.so.2.0.5   [.] _talloc_free_internal
  1.18%  [kernel]             [k] do_sys_poll
  1.16%  [kernel]             [k] tcp_transmit_skb
  1.14%  smbd                 [.] reply_read_and_X
  1.10%  [vdso]               [.] 0x000000000000070c
  1.06%  smbd                 [.] event_add_to_poll_args
  1.05%  [kernel]             [k] _raw_spin_unlock_irqrestore
  0.99%  [kernel]             [k] skb_copy_datagram_iovec
  0.98%  smbd                 [.] run_events_poll
  0.97%  [kernel]             [k] __pollwait
  ...

D. In short summary
==============================
1. Ubuntu 13.04 ships with kernel-3.8.0-19 and glibc-2.17.
I found some interesting symbols in perf;
it looks like Intel SSE is being leveraged, but I am not sure.
These SSE-suffixed symbols do not show up with my original glibc-2.6.1:
__memset_sse2
__strcmp_sse42
__memcpy_ssse3

2. CFLAGS="-O3" yields more IOps than plain CFLAGS="-O":
in section C, about 6-7% more for SMB1 and about 12% more for SMB2.

3. vfprintf() shows up in perf top with the samba-3.6 series.
After checking out the v3-6-test branch,
I see that CHECK_DEBUGLVL() is not used inside s3_event_debug()
in source3/lib/events.c.
IMHO it might be worth pushing this to v3-6-test.

E. Next steps
==============================
Even with the latest Ubuntu 13.04,
SMB2 still spends more CPU cycles in user space than in kernel space,
and both glibc and smbd contribute to the user-space time.
Replacing my original glibc-2.6.1 with glibc-2.17 or later would require
porting effort.
CFLAGS="-O3" really rocks,
and I also found this one: https://bugzilla.samba.org/show_bug.cgi?id=9412
Next I would like to go back to my original development environment,
test with -O3 and #9412, and report status ASAP.
Any further suggestions are appreciated,
thanks.

Regards,
Jones

