PR#21625: Smbd processes get stuck and consume 100% CPU.

Tue Aug 21 02:38:42 GMT 2001

Scott Moomaw wrote:
> 
> Fredrik and Jeremy,
> 
> After my submission last night to samba-technical regarding my locking
> observations, I found PR#21625 in the bugs database with your notes thus
> far.  I don't know if I'll be successful or not, but I've been working on
> adding code to open_mode_check to bump debug levels after a process has
> spun too long.  I'm trying to add sufficient debugging tools to determine
> what is happening.  Thus far, my tdbtool dumps haven't been extremely
> helpful because I've only caught the system after things get deadlocked
> totally.  I'll pass along any details that I get on this for your review.
> I'm hoping tomorrow to find a way to recreate this consistently.  If there
> is something that I can do to help, just let me know.  I've started to put
> some serious time into understanding what's going on here in an effort to
> help solve it.

I've just checked some changes into HEAD and 2.2 CVS to address
this problem. The first thing I did was to add a deliberate panic if
an smbd is trying to break its own oplock and cannot find a reference
to that file in its open file list. This will allow someone
to use a debugger to catch this in the act (with an additional
'panic action' script).
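
As a rough illustration of that first check, here is a toy Python model.
It is not the actual smbd code: the names SmbdPanic and break_own_oplock
and the dict-based open file list are all hypothetical, and the real
check lives in the C oplock handling.

```python
class SmbdPanic(RuntimeError):
    """Stands in for smbd's panic path (hypothetical name)."""


def break_own_oplock(open_files, dev, inode):
    """Release our own oplock on (dev, inode).

    open_files is a toy stand-in for the process's open file list:
    a list of dicts with "dev", "inode" and "oplock" keys.
    """
    for f in open_files:
        if f["dev"] == dev and f["inode"] == inode:
            f["oplock"] = None  # oplock released
            return f
    # Asked to break an oplock we supposedly hold, but we have no
    # record of the file: fail loudly so the state can be inspected.
    raise SmbdPanic(
        f"own oplock break for dev={dev} inode={inode}: "
        "file not in open file list"
    )
```

In the real daemon the panic would run whatever 'panic action' is
configured, which is what gives someone the chance to attach a debugger.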

The second is a robustness fix that (theoretically) shouldn't
be needed. It ensures that once an oplock break request has
returned, the record that triggered the break is removed by
the requesting process (if it still exists). This shouldn't be
necessary, because the receiving process should remove the
record itself, even if the client failed to respond to the break.
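
The idea behind the second change can be sketched with another toy
Python model. Again this is only an assumption-laden illustration, not
the checked-in code: the share_modes dict, the entry layout and the
send_break callback are hypothetical stand-ins.

```python
def request_break_and_cleanup(share_modes, key, holder_pid, send_break):
    """Ask holder_pid to break its oplock on key, then clean up.

    share_modes maps a file key to a list of entries, each a dict with
    "pid" and "oplock" keys (toy layout). send_break is a callable
    standing in for the oplock break request. Returns the number of
    stale entries the requester had to remove itself.
    """
    send_break(holder_pid, key)  # returns whether or not the holder cleaned up

    entries = share_modes.get(key, [])
    # Any oplocked entry still owned by the holder should already be gone.
    stale = [e for e in entries if e["pid"] == holder_pid and e["oplock"]]
    for e in stale:
        entries.remove(e)  # requester removes the leftover record
    return len(stale)
```

With a well-behaved holder the function returns 0 and changes nothing.
A non-zero return means the holder left its record behind, which is the
case the requester would otherwise keep retrying against.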

If people experiencing the 'spin' problem could CVS update
and test this change I'd appreciate it.

Thanks,

	Jeremy.