fcntl F_SETLKW64 failing on Solaris

Tristan Ball tristanb at vsl.com.au
Wed Jan 9 15:10:03 GMT 2002

On a vaguely related note:

We're now running 2.2.3pre from CVS, on Solaris 8. The one Samba instance
provides 3 different servers, via NetBIOS aliases and included config files.
That means we regularly have between 400 and 700 active Samba processes.

As some of you are aware, we have had quite a few troubles in the past with
fcntl locks on the TDBs, which seem to manifest as a very high contention
rate on one or more kernel mutexes, followed by a gradual descent into
madness: corrupt TDBs, and recently the semaphore timeout problem Romeril
was reporting.

On our previous release, 2.2.1a+patches, when mutex contention was high, the
load average skyrocketed and the box crawled. This was
compiled --with-spinlocks, and using nanosleep rather than sched_yield.
On our current release, 2.2.3pre, I've again used --with-spinlocks, but this
time I tried sched_yield. We had a short period yesterday when we were
getting 400-500 blocks on kernel mutexes per second (normal is about 100 for
us). This time the load average was stable at 3 (a little higher than normal
for us, but not much), and the box stayed responsive. Interestingly, the
spike occurred while I had debug level = 3. Reducing that to 1 and SIGHUPing
Samba returned the machine to normal.
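
For anyone unfamiliar with the two back-off styles, here is a rough sketch
of the difference. It is not the tdb spinlock code: the names
(demo_spinlock, demo_spin_lock, DEMO_YIELD, DEMO_NANOSLEEP), the spin count
of 100 and the 1 ms sleep are all made up for illustration, and it uses C11
atomics rather than whatever --with-spinlocks actually builds.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical lock type, for illustration only (not Samba's). */
typedef struct {
    atomic_flag held;
} demo_spinlock;

/* What a waiter does once a short busy-wait has failed. */
enum demo_backoff { DEMO_YIELD, DEMO_NANOSLEEP };

static void demo_spin_lock(demo_spinlock *l, enum demo_backoff how)
{
    int spins = 0;

    /* test-and-set returns the previous value: true means someone holds it */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (++spins < 100)
            continue;                       /* brief busy-wait first */
        spins = 0;
        if (how == DEMO_YIELD) {
            sched_yield();                  /* give up the CPU, stay runnable */
        } else {
            struct timespec ts = { 0, 1000000 };  /* assumed 1 ms sleep */
            nanosleep(&ts, NULL);           /* block in the kernel for a while */
        }
    }
}

static void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A caller would declare `demo_spinlock l = { ATOMIC_FLAG_INIT };` and bracket
the critical section with demo_spin_lock(&l, DEMO_YIELD) and
demo_spin_unlock(&l). To be shared between separate processes, the lock
would have to live in memory they all map.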

I don't think this is directly related to the Solaris fcntl problem, but
I've found that the more I can reduce the contention on the mutexes, the
better Samba behaves. Going to 2.2.3pre, and moving some moderately
CPU-intensive processes off the CPU, have improved things immensely for us.
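
On the missing errno in the quoted thread: a minimal sketch of a blocking
byte-range lock that reports errno when it fails, in case it helps anyone
capture the value. demo_lock_byte is a hypothetical helper, not tdb's own
locking routine, and retrying on EINTR is my assumption here, not
necessarily what Samba does. As far as I know, building in a large-file
environment is what turns F_SETLKW into F_SETLKW64.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: lock or unlock one byte of fd, waiting if needed.
 * type is F_RDLCK, F_WRLCK or F_UNLCK. Returns 0 on success, -1 on error
 * with errno preserved for the caller. */
static int demo_lock_byte(int fd, off_t offset, short type)
{
    struct flock fl;
    int ret;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = 1;

    /* Assumed policy: a signal interrupting the wait is not an error. */
    do {
        ret = fcntl(fd, F_SETLKW, &fl);
    } while (ret == -1 && errno == EINTR);

    if (ret == -1) {
        int saved = errno;      /* fprintf may clobber errno */
        fprintf(stderr,
                "fcntl(F_SETLKW) fd=%d offset=%lld type=%d failed: %s (errno %d)\n",
                fd, (long long)offset, (int)type, strerror(saved), saved);
        errno = saved;
    }
    return ret;
}
```

Note that the EINTR retry would defeat any timeout built on alarm(), so
code that relies on a signal to break out of a stuck lock would want to
drop that loop and log EINTR like any other failure.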


----- Original Message -----
From: "Jeremy Allison" <jra at samba.org>
To: <David.Collier-Brown at Sun.COM>
Cc: "Romeril, Alan" <a.romeril at ic.ac.uk>; <samba-technical at samba.org>
Sent: Wednesday, January 09, 2002 7:38 AM
Subject: Re: fcntl F_SETLKW64 failing on Solaris

> On Tue, Jan 08, 2002 at 02:42:52PM -0500, David Collier-Brown wrote:
> >
> > I suspect we're seeing a Solaris-specific bug, but without
> > the errno I'm puzzled as to what we should do about it.
> > ENOLCK would be easier to deal with than EOVERFLOW, and
> > harder than EIO or EINTR...
> Yeah I'm concerned about that, as people do use Samba on large
> Solaris servers mainly. Solaris fcntl lock code has had historical
> problems with mmapped files (doesn't work over NFS as I recall)
> and we really need this to work right for tdb.
> Any more info would help, even if it just gets us a workaround to
> a Solaris bug (not that I'm implying Solaris has a bug here, it's
> just a possibility :-) :-).
> Jeremy.
