I'm probably looking for suggestions on tracing
problems rather than a specific fix, since I don't see
any links relevant to my problem.  

We have a Samba PDC with an ldap backend, serving
about 100 active users and another 400 relatively
inactive lab machines that connect once at the
beginning of the day and possibly load a file or two
from a network share.  

We have a problem where occasionally (anywhere from a
couple of times a day to a couple of times a week) one
user's smbd process pegs the CPU, causing login failures
for other users.  This is always one of the active
users, not a lab account.  Existing sessions proceed
smoothly.  The problem persists until we identify the
process with top and kill it.  In an attempt to drown
the problem in resources, we recently upgraded the
server from a generally underworked PIII 500 / 1 GB RAM
/ Ultra2 SCSI disks to a 2.4 GHz / 2 GB server with
U320 disks.  The problem still comes up.  I conclude
the box is at least adequate.  Hell, the previous one
was plenty.

Some details:
[root@student1 sbin]# rpm -q samba
[root@student1 sbin]# rpm -q openldap
[root@student1 sbin]# uname -a
Linux server.redacted.edu 2.6.10-1.770_FC3smp #1 SMP Thu Feb 24 14:20:06 EST 2005 i686 i686 i386 GNU/Linux

top - 11:05:39 up 2 days, 22:57,  3 users,  load average: 0.08, 0.06, 0.08
Tasks: 362 total,   1 running, 361 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.0% us,  4.1% sy,  0.0% ni, 91.2% id,  0.0% wa,  0.5% hi,  2.1% si
Mem:   2074988k total,  2046280k used,    28708k free,   353724k buffers
Swap:  4096532k total,      200k used,  4096332k free,  1353236k cached

Here's a top excerpt taken while experiencing the
problem:
 6185 jesterj   22   0 17180 5008 3912 R 99.6  0.2  17:49.13 smbd
[root@student1 /var]# pstree jesterj

I omit the config files, because the setup works, most
of the time.  It's not a basic setup problem.  I'd be
happy to provide the configs if they'll help. 

It's possible that the smbd process is actually
waiting for ldap to do something; we have a tool for
account creation that writes directly to ldap (i.e.,
it does not go through Samba).  That tool is affected
by the slowdown, too.  This could be explained by smbd
bogging the CPU down in general, or it could indicate
that a problem with ldap is causing a symptom in smbd.

If nobody has seen the same thing, I'd appreciate some
pointers on debugging this.  If we turn on significant
logging on the ldap server, things slow to a crawl.
We can do it during off-peak hours, though.

I turned oplocks off because there were some oplock
break failures in a log, but that didn't fix it and I
can't confirm there was any relationship.  The user
may have been on a separate session entirely.  

So to recap:  I need to dig information out of a
running smbd process, which has not spawned any
children, and find out what it is doing/trying to do.
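
To make that concrete, here is a rough sketch of the sort
of first check I have in mind. It is untested on this box.
It reads /proc/<pid>/stat twice and reports how many clock
ticks the process spent in user mode versus kernel mode in
between. The function names parse_stat_line and cpu_split
are made up for illustration; the field positions (3 =
state, 14 = utime, 15 = stime) are the ones documented in
proc(5).

```shell
# Print "state utime stime" from one line in /proc/<pid>/stat format.
# (parse_stat_line is a hypothetical helper name, not a standard tool.)
parse_stat_line() {
    # Field 2 (the command name) is in parentheses and may contain spaces,
    # so drop everything up to the last ") " before splitting on whitespace.
    rest=${1##*") "}
    set -- $rest
    # After the strip, $1 is field 3 (state), ${12} is field 14 (utime)
    # and ${13} is field 15 (stime), both in clock ticks.
    echo "$1 ${12} ${13}"
}

# Sample a live process twice and print the ticks it burned in user mode
# and in kernel mode during the interval (default: 5 seconds).
# (cpu_split is likewise a hypothetical name.)
cpu_split() {
    pid=$1
    secs=${2:-5}
    [ -r "/proc/$pid/stat" ] || { echo "cannot read /proc/$pid/stat" >&2; return 1; }
    set -- $(parse_stat_line "$(cat "/proc/$pid/stat")")
    u0=$2
    s0=$3
    sleep "$secs"
    [ -r "/proc/$pid/stat" ] || { echo "process $pid went away" >&2; return 1; }
    set -- $(parse_stat_line "$(cat "/proc/$pid/stat")")
    echo "state=$1 user_ticks=$(( $2 - u0 )) sys_ticks=$(( $3 - s0 ))"
}
```

With the PID from the top excerpt this would be run as
"cpu_split 6185". My assumption is that if sys_ticks
dominates, "strace -p 6185" should show which system calls
the process is looping on, and if user_ticks dominates,
attaching with "gdb -p 6185" and asking for a backtrace
("bt") would be the next thing to try.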

Thanks for any pointers.

