100% cpu utilization of Solaris x86

Neil Hoggarth neil.hoggarth at physiol.ox.ac.uk
Thu Nov 8 05:19:01 GMT 2001

On Wed, 7 Nov 2001, Richard Bollinger wrote:

> The code deleting "dead locks" in locking_init() and locking_end(),
> has a CPU load effect of order N^2, where N is the number of smbd
> processes running, assuming most processes hold at least a few
> locks; so it's a good place to start.
> I think the key fix will be in improving smbd process start up time
> so that the clients don't give up prematurely.

I can see the loading problem that the lock cleaning code is causing: we
ended up needing to defer our upgrade from 2.0.7 to 2.2 until we'd got
faster server hardware.
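
For anyone wanting to see why the load grows as N^2, here is a rough
back-of-envelope model in Python. It is a toy, not Samba code. It assumes
every smbd scans the whole lock table once at startup and once at exit,
and that each process holds a few records. The function name and the
default of 3 locks per process are made up for illustration.

```python
# Toy cost model, NOT Samba code. Assumptions (all hypothetical):
#   - each smbd scans every lock record once at startup and once at exit
#   - each smbd holds `locks_per_proc` records in the shared table
def cleanup_scan_cost(n_procs, locks_per_proc=3, scans_per_proc=2):
    """Total records examined when n_procs processes each scan the table."""
    records_in_table = n_procs * locks_per_proc
    return n_procs * scans_per_proc * records_in_table

# Doubling the number of processes quadruples the total work.
for n in (50, 100, 200):
    print(n, cleanup_scan_cost(n))
```

Under these assumptions, dropping one of the two scans halves the constant
but leaves the N^2 growth in place.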

BUT I'd just like to flag up that some of this lock cleaning stuff was
introduced to (successfully) tackle serious orphaned lock problems that
we and other sites were experiencing with v2.2.0.

"grep -i 'logic error'" on my log.smbd files suggests that this code is
in active use on my main server (SPARC Solaris 8 1/01, Samba 2.2.3-pre
of a couple of weeks ago), cleaning out orphaned locks on a frequent and
ongoing basis.
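
If you want a count rather than a page of grep output, something like the
following Python sketch does the same case-insensitive match. The helper
names are my own, and the sample lines are made-up placeholders rather
than real Samba log output.

```python
import re

# Case-insensitive match, equivalent in spirit to: grep -i 'logic error'
PATTERN = re.compile(r"logic error", re.IGNORECASE)

def count_logic_errors(lines):
    """Count the lines that mention 'logic error' in any letter case."""
    return sum(1 for line in lines if PATTERN.search(line))

def count_in_file(path):
    """Count matching lines in one log file (the path is up to the caller)."""
    with open(path, errors="replace") as handle:
        return count_logic_errors(handle)

# Placeholder lines for illustration only, not real log.smbd content.
sample = [
    "placeholder line mentioning a Logic Error",
    "unrelated placeholder line",
    "another placeholder: LOGIC ERROR again",
]
print(count_logic_errors(sample))
```

Running count_in_file() over each log.smbd file in turn would show how
often the cleanup is firing.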

Jeremy wrote:

> Theoretically the cleanup on exit is not needed for the locking db
> at least as it will be cleaned by other processes starting up.

If Richard is correct in his analysis that slow startup of an smbd can
cause problems due to runaway client retries (which seems plausible to
me based on problems that I've struggled with before), and it is felt
that doing cleanup on both startup and exit represents unnecessary
duplication, perhaps we ought to experiment with doing the lock cleanup
only on smbd exit (i.e. comment out the startup cleanup, rather than the
exit cleanup)?
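
To make the exit-only idea concrete, here is a toy Python model. It is
not Samba code: the lock table is just a dict from pid to lock count, and
all the names are invented for illustration. It shows one consequence
worth keeping in mind when experimenting: a process that crashes never
runs its exit cleanup, so its locks stay orphaned until some other smbd
exits cleanly and sweeps them.

```python
# Toy model, NOT Samba code. The lock table maps pid -> number of locks.
def remove_dead_locks(lock_table, live_pids):
    """Drop entries owned by pids that are no longer alive; return the count."""
    dead = [pid for pid in lock_table if pid not in live_pids]
    for pid in dead:
        del lock_table[pid]
    return len(dead)

def smbd_exit(pid, lock_table, live_pids):
    """Clean exit under an exit-only scheme: release own locks, then sweep."""
    live_pids.discard(pid)
    lock_table.pop(pid, None)
    return remove_dead_locks(lock_table, live_pids)

lock_table = {101: 2, 102: 1, 103: 4}
live_pids = {101, 102, 103}

# pid 102 crashes: it disappears, but no sweep runs, so its entry is orphaned.
live_pids.discard(102)
orphaned_before = 102 in lock_table

# pid 101 exits cleanly and sweeps the table on its way out.
swept = smbd_exit(101, lock_table, live_pids)
print(orphaned_before, swept, lock_table)
```

In this model, how long an orphan lingers depends on how often processes
exit cleanly, which is something the experiment would need to measure.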

Neil Hoggarth                                 Departmental Computer Officer
<neil.hoggarth at physiol.ox.ac.uk>                   Laboratory of Physiology
http://www.physiol.ox.ac.uk/~njh/                  University of Oxford, UK
