Winbindd using 100% of CPU. Any solution?

Andreas Schneider asn at samba.org
Wed Dec 4 12:27:11 MST 2013


On Wednesday 04 December 2013 11:11:46 Richard Sharpe wrote:
> On Wed, Dec 4, 2013 at 10:49 AM, Richard Sharpe <realrichardsharpe at gmail.com> wrote:
> > On Thu, Nov 21, 2013 at 12:09 AM, Andreas Schneider <asn at samba.org> wrote:
> >> On Tuesday 19 November 2013 16:55:43 Jeremy Allison wrote:
> >>> On Tue, Nov 19, 2013 at 04:50:16PM -0800, Richard Sharpe wrote:
> >>> > On Tue, Nov 19, 2013 at 3:35 PM, Jeremy Allison <jra at samba.org> wrote:
> >>> > > On Tue, Nov 19, 2013 at 03:31:55PM -0800, Richard Sharpe wrote:
> >>> > >> Hi folks,
> >>> > >> 
> >>> > >> We are seeing the same problem as in
> >>> > >> http://samba.2283325.n4.nabble.com/Winbind-using-100-CPU-td4646572.html
> >>> > >> 
> >>> > >> This is with Samba 3.6.12+ under FreeBSD.
> >>> > >> 
> >>> > >> Does anyone have a solution?
> >>> > > 
> >>> > > Ask Jim for his DLIST_ macro changes.
> >>> > 
> >>> > There were some changes there already.
> >>> > 
> >>> > Rather than panic, would it be reasonable to simply refuse to add the
> >>> > duplicate entry and log more info and dump the stack?
> >>> 
> >>> Whatever you need to help track it down. But IMHO panic == dump the
> >>> stack.
> >> 
> >> Maybe also get a
> >> 
> >> talloc_report_full(0, fopen("/tmp/talloc_report.log","w"))
> >> 
> >> which often gives you a hint what's going on.
> > 
> > Well, it actually seems to be a different problem because my core-dump
> > patch did not hit, as far as I can see.
> > 
> > Here is the traceback:
> > 
> > (gdb) where
> > #0  0x00000000004bc8c7 in winbindd_reinit_after_fork (myself=0x80334cdc0, logfilename=0x8033d7b80 "/var/log/samba/log.wb-XCHANGE") at winbindd/winbindd_dual.c:1244
> > #1  0x00000000004bce5a in fork_domain_child (child=0x80334cdc0) at winbindd/winbindd_dual.c:1362
> > #2  0x00000000004b9273 in wb_child_request_trigger (req=0x803387d50, private_data=0x0) at winbindd/winbindd_dual.c:145
> > #3  0x00000000005bd8c9 in tevent_queue_immediate_trigger (ev=0x80331e110, im=0x803387b10, private_data=0x8033d7b50) at ../lib/tevent/tevent_queue.c:144
> > #4  0x00000000005bbcbf in tevent_common_loop_immediate (ev=0x80331e110) at ../lib/tevent/tevent_immediate.c:139
> > #5  0x00000000005b8852 in run_events_poll (ev=0x80331e110, pollrtn=0, pfds=0x0, num_pfds=0) at lib/events.c:197
> > #6  0x00000000005b90ab in s3_event_loop_once (ev=0x80331e110, location=0xb51e84 "winbindd/winbindd.c:1456") at lib/events.c:331
> > #7  0x00000000005ba31f in _tevent_loop_once (ev=0x80331e110, location=0xb51e84 "winbindd/winbindd.c:1456") at ../lib/tevent/tevent.c:494
> > #8  0x000000000048a585 in main (argc=3, argv=0x7fffffffecb0, envp=0x7fffffffecd0) at winbindd/winbindd.c:1456
> 
> Those line numbers seem messed up. I caught it in talloc_free, so it
> looks like we are in a loop here:
> 
>         for (domain = domain_list(); domain; domain = domain->next) {
>                 TALLOC_FREE(domain->check_online_event);
>         }
> 
> Here is what the list of domains looks like; the prev pointers are
> seriously messed up:

Well, that's what we need to find out: where do they get messed up? We still
have no clue here. I think Günther is running into this issue too, but he
wasn't able to find out what's going on. I hope you have more luck. Please
keep digging.
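
One way to narrow down where the links go bad might be a small consistency
check called at suspect points (for example before and after each list add
or remove, and before the fork). The following is only a minimal sketch under
assumptions: struct node and dlist_check() are made-up names for illustration,
not Samba code, and struct node merely stands in for a list element with
prev/next pointers. The head's prev pointer is deliberately not inspected,
since list implementations differ on what it holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a list element; only the links matter here. */
struct node {
	struct node *prev, *next;
	const char *name;
};

/*
 * Walk the list from head and verify that every forward link has a
 * matching back link. max_nodes bounds the walk, so a cycle in the
 * next pointers is reported instead of looping forever.
 * Returns 0 if the list looks consistent, -1 otherwise.
 */
static int dlist_check(const struct node *head, size_t max_nodes, FILE *out)
{
	const struct node *n;
	size_t count = 0;

	assert(out != NULL);

	for (n = head; n != NULL; n = n->next) {
		if (++count > max_nodes) {
			fprintf(out, "more than %zu nodes, probable cycle\n",
				max_nodes);
			return -1;
		}
		if (n->next != NULL && n->next->prev != n) {
			fprintf(out, "bad prev link after '%s'\n", n->name);
			return -1;
		}
	}
	return 0;
}
```

Calling something like this with a generous max_nodes and logging the caller's
location on failure should show the first point at which the list is no longer
sane, which is usually much closer to the real bug than the loop that ends up
spinning.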


-- 
Andreas Schneider                   GPG-ID: CC014E3D
Samba Team                             asn at samba.org
www.samba.org


