Memory corruption issue when samba4 is the target of a net vampire command
Angelos Oikonomopoulos
angelos.oikonomopoulos at fp-commerce.de
Wed Oct 27 09:30:30 MDT 2010
On 10/27/2010 09:58 AM, Andrew Bartlett wrote:
> On Wed, 2010-10-27 at 18:52 +1100, Andrew Bartlett wrote:
>> On Tue, 2010-10-26 at 17:23 +0200, Angelos Oikonomopoulos wrote:
>>> Hello all,
>>>
>>> I've been playing around with Samba 4 from git master (specifically
>>> 5785f08268bac332d09bdf71d1907ecb54f3b5bd from last Thursday). It seems
>>> to work well so far, but I've run into a bug when trying to add a second
>>> samba4 server as an additional DC, following the instructions in
>>> http://wiki.samba.org/index.php/Samba4/HOWTO/Join_a_domain_as_a_DC.
>>>
>>> Specifically, the net vampire command described in that page crashes the
>>> existing DC pretty reliably. I've tried looking into it today but I
>>> think I'll need some help to track down the root cause (and produce a fix).
>>
>> If this is reproducible, then put printf() statements in near every
>> questionable variable. valgrind will then complain when you print
>> invalid things, now that it knows you expected it to be valid NOW,
>> rather than later.
>>
>> I don't see from what you have sent why the task would not be valid. It
>> is passed in to the socket code at the top level, and ends up being
>> passed to the accept handler.
>
> Also, liberal addition of talloc_get_type_abort() is often a quick way
> to assert that pointers are current, valid memory.
Thank you for your help. The use-after-free is real:
==661== Invalid read of size 8
==661== at 0x8637922: ldapsrv_accept_nonpriv (ldap_server.c:757)
==661== by 0x853FEE8: stream_new_connection (service_stream.c:230)
==661== by 0x8511873: single_accept_connection (process_single.c:74)
==661== by 0x853FF5D: stream_accept_handler (service_stream.c:245)
==661== by 0x8D3479B: epoll_event_loop (tevent_standard.c:310)
==661== by 0x8D34F11: std_event_loop_once (tevent_standard.c:545)
==661== by 0x8D30B11: _tevent_loop_once (tevent.c:493)
==661== by 0x8D30D4E: tevent_common_loop_wait (tevent.c:594)
==661== by 0x8D30E19: _tevent_loop_wait (tevent.c:613)
==661== by 0x404D7B: binary_smbd_main (server.c:480)
==661== by 0x404DC1: main (server.c:491)
==661== Address 0x199190b8 is 104 bytes inside a block of size 136 free'd
==661== at 0x4C240FD: free (vg_replace_malloc.c:366)
==661== by 0x99D6C17: _talloc_free_internal (talloc.c:669)
==661== by 0x99D7C16: _talloc_free (talloc.c:1141)
==661== by 0x8540FE3: task_server_terminate (service_task.c:52)
==661== by 0x863827C: ldapsrv_task_init (ldap_server.c:985)
==661== by 0x8541103: task_server_callback (service_task.c:90)
==661== by 0x85118F9: single_new_task (process_single.c:95)
==661== by 0x85411A1: task_server_startup (service_task.c:110)
==661== by 0x853F609: server_service_init (service.c:63)
==661== by 0x853F727: server_service_startup (service.c:95)
==661== by 0x404CFF: binary_smbd_main (server.c:471)
==661== by 0x404DC1: main (server.c:491)
with:
epoll_event_loop: fde=0x1a055180, priv=0x1a032090
c=0x147dfee0, ldapsrv_service=0x19ef3060, task=0x199190a0, lp_ctx=(nil),
&session_info=0x7ff000560
Terminating connection - 'failed to setup anonymous session info'
single_terminate: reason[failed to setup anonymous session info]
But it only happens on an error path. Specifically, I had done a chown
on private/ that prevented the ldap server from starting and that caused
samba to die in an unfriendly way :) After lots of tracing I eventually
tracked down the problem and only then found the ldap termination
message which had been lost in all the debug output :)
After this, the net vampire command worked successfully. Replication
works as expected and the combination of the two DCs seems to be pretty
robust in the face of arbitrary kills. Tomorrow I'll take a look and see
if there's an easy way to avoid the use-after-free in this case so
that samba will die in a less mysterious way if the ldap server can't start.
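
For illustration, here is a minimal sketch of the pattern the valgrind trace
suggests, and one possible way to avoid it. This is not Samba's actual code:
struct task, struct listener, task_terminate() and accept_connection() are
hypothetical stand-ins, and plain malloc/free is used in place of talloc. The
idea is that a task which fails during init is freed while a listener still
holds a pointer to it, so the terminate path detaches the listener first.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT Samba's real structures. */
struct task {
    const char *name;
    void *lp_ctx;
};

struct listener {
    struct task *task;  /* borrowed pointer: must not outlive the task */
    int active;
};

/* Terminate a task: detach every listener that borrows it, then free it.
 * Without the detach loop, a later accept would read freed memory. */
static void task_terminate(struct task *t, struct listener *ls, size_t n,
                           const char *reason)
{
    assert(t != NULL);
    fprintf(stderr, "task_terminate: reason[%s]\n", reason);
    for (size_t i = 0; i < n; i++) {
        if (ls[i].task == t) {
            ls[i].task = NULL;
            ls[i].active = 0;
        }
    }
    free(t);
}

/* Accept handler: refuses to touch a listener whose task is gone.
 * Returns 0 on success, -1 if the listener has been detached. */
static int accept_connection(const struct listener *l)
{
    if (!l->active || l->task == NULL) {
        return -1;
    }
    printf("accepted connection for task %s\n", l->task->name);
    return 0;
}
```

With this arrangement, an accept that arrives after the failed init gets a
clean -1 instead of dereferencing freed memory. In talloc-based code, a
destructor registered with talloc_set_destructor() might be one place to do
the detaching, but that is an assumption and not something checked against
the Samba tree.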
Again, thanks for your time,
Aggelos