Winbindd using 100% of CPU. Any solution?

Jeremy Allison jra at samba.org
Wed Dec 4 13:00:14 MST 2013


On Wed, Dec 04, 2013 at 11:49:49AM -0800, Richard Sharpe wrote:
> On Wed, Dec 4, 2013 at 11:27 AM, Jeremy Allison <jra at samba.org> wrote:
> > On Wed, Dec 04, 2013 at 11:11:46AM -0800, Richard Sharpe wrote:
> >>
> >> Those line numbers seem messed up. I caught it in talloc_free, so it
> >> looks like we are in a loop here:
> >>
> >>         for (domain = domain_list(); domain; domain = domain->next) {
> >>                 TALLOC_FREE(domain->check_online_event);
> >>         }
> >>
> > >> Here is what the list of domains looks like and the prev pointers are
> > >> seriously messed up:
> >
> > Yes, we know that. The problem is finding out *HOW*
> > the pointers got messed up :-(.
> 
> OK, after fixing my line numbers, I now find that we are looping here:
> 
>         /* Destroy all possible events in child list. */
>         for (cl = winbindd_children; cl != NULL; cl = cl->next) {
>                 TALLOC_FREE(cl->lockout_policy_event);
>                 TALLOC_FREE(cl->machine_password_change_event);
> 
>                 /* Children should never be able to send
>                  * each other messages, all messages must
>                  * go through the parent.
>                  */
>                 cl->pid = (pid_t)0;
> 
>                 /*
>                  * Close service sockets to all other children
>                  */
>                 if ((cl != myself) && (cl->sock != -1)) {
>                         close(cl->sock);
>                         cl->sock = -1;
>                 }
>         }
> 
> and the winbindd_children list is seriously screwed in a couple of ways:
> 
> (gdb) p winbindd_children
> $22 = (struct winbindd_child *) 0x803358940
> (gdb) p *winbindd_children
> $23 = {next = 0xeac360, prev = 0x8033589a0, pid = 0, domain = 0x803345400,
>   logfilename = 0x8033d7c80 "/var/log/samba/log.wb-XCHANGE", sock = -1,
>   queue = 0x8033d7c50, binding_handle = 0x8033d7d50, lockout_policy_event = 0x0,
>   machine_password_change_event = 0x0, table = 0xe09580}
> (gdb) p *(winbindd_children->next)
> $24 = {next = 0x803358940, prev = 0x803358940, pid = 0, domain = 0x0,
>   logfilename = 0x803330300 "/var/log/samba/log.winbindd-idmap", sock = -1,
>   queue = 0x8033302d0, binding_handle = 0x8033303d0, lockout_policy_event = 0x0,
>   machine_password_change_event = 0x0, table = 0xe09680}
> 
> The last element's next pointer points back to the first element
> (0x803358940), which is the cause of the infinite loop, but the first
> element has a weird value in its next pointer ...

Yes, it's almost certainly a memory overwrite problem,
but as it's writing onto valid memory it's really
difficult to find. Valgrind wouldn't flag it :-(.
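
One way to narrow down when the list gets damaged is to call a cheap
consistency check at suspect points (after a fork, after a list add or
remove) and abort on the first failure. What follows is only a minimal
sketch, not Samba code: struct node is a hypothetical stand-in for
struct winbindd_child, and check_list() and the enum names are made up
for illustration. It deliberately never inspects head->prev, since list
macros differ in what they store there.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct winbindd_child: only the links matter. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Illustrative result codes, not part of any real API. */
enum list_status {
	LIST_OK = 0,
	LIST_BAD_BACKLINK = 1,	/* some node->next->prev != node */
	LIST_CYCLE = 2		/* a plain for-walk would never terminate */
};

/*
 * Walk the list with a slow and a fast pointer (Floyd's cycle detection).
 * Every link the fast pointer crosses is also checked for a matching
 * back-pointer, so a mismatch is reported as soon as it is reached.
 */
static enum list_status check_list(const struct node *head)
{
	const struct node *slow = head;
	const struct node *fast = head;

	while (fast != NULL && fast->next != NULL) {
		if (fast->next->prev != fast) {
			return LIST_BAD_BACKLINK;
		}
		fast = fast->next;

		if (fast->next != NULL && fast->next->prev != fast) {
			return LIST_BAD_BACKLINK;
		}
		fast = fast->next;

		slow = slow->next;
		if (fast != NULL && fast == slow) {
			return LIST_CYCLE;
		}
	}
	return LIST_OK;
}
```

On a list shaped like the gdb dump above, the check returns non-zero
instead of spinning. Calling it in several places and looking at which
call fails first brackets the code that did the overwrite.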

Jeremy.

