FCNTL on Solaris
davecb at canada.sun.com
Mon Apr 22 10:05:01 GMT 2002
> Tridge found the (already noted) related bug on our system and conceded it
> was a design flaw. Apparently each new smbd process that starts, does a
> quick traversal of the tdb databases to clean out any stale entries, and on
> Solaris, these are taking too long.
I've found a bunch of fixed bugs on fcntl performance,
implying it's been even slower in the past (:-))
>Ok - discussed this with Andrew last night. It seems that this is only
>a problem on Solaris. Solaris seems to have *serious* issues with fcntl
>locks with multiple processes contending for locks. No other system we
>run on seems to have this problem (they have their own problems :-).
At the expense of not addressing the Sun side of the
problem, might I suggest that validation operations
be done without taking locks?
Throwing my mind into a past life with safety-critical
real-time, I opine that the check without locks will
1) succeed in bounded time dependent on the number
of structures traversed & checked
2) fail because the structures are invalid (in this case
stale) in bounded time, at which point one
chooses to take a lock and remove them.
3) fail in bounded time because the structures were
changed by a program using locking, and the
non-locked program is seeing changing data.
In this case we elect to try to take a lock,
fail because it's already held, wait interminably
for it to complete, get the lock, and
a) find it's done and exit
b) find it still needs to be done and do it.
The third is interesting because the other threads or
processes are delaying us some amount before we get to
do any work. This, you might imagine, is a problem when
you try to demonstrate correctness within time limits (;-))
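To make the three outcomes above concrete, here is a minimal sketch of a lock-free validation pass. Everything in it is an assumption for illustration: the record layout, the RECORD_MAGIC value, the owner_alive flag and the function name are hypothetical and are not taken from the tdb code. The point is only that the loop runs in time bounded by the number of records, and that the caller decides afterwards whether a lock is needed at all.

```c
#include <stddef.h>

/* All names here are hypothetical; none come from the tdb sources. */
#define RECORD_MAGIC 0x52454344u  /* arbitrary marker for a well-formed record */

struct record {
    unsigned int magic;  /* RECORD_MAGIC when the record is well-formed */
    int owner_alive;     /* 0 if the owning process has gone away (stale) */
};

enum check_result {
    CHECK_OK,       /* outcome 1: everything valid, no lock needed */
    CHECK_STALE,    /* outcome 2: stale entries found, lock and remove them */
    CHECK_CHANGING  /* outcome 3: inconsistent data, a writer is probably active */
};

/* Walk n records without taking any lock. The work done is bounded by n. */
static enum check_result check_without_locks(const struct record *recs, size_t n)
{
    enum check_result result = CHECK_OK;
    size_t i;

    for (i = 0; i < n; i++) {
        if (recs[i].magic != RECORD_MAGIC)
            return CHECK_CHANGING;  /* stop at the first inconsistent record */
        if (!recs[i].owner_alive)
            result = CHECK_STALE;   /* remember it, but keep validating */
    }
    return result;
}
```

Only CHECK_STALE and CHECK_CHANGING would lead to a lock request; in the common case of CHECK_OK a new process never touches fcntl at all.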
I haven't looked at the code, but if it uses F_SETLKW
you might want to do a trylock first, implemented via
F_GETLK or F_SETLK, as this would allow subsequent
processes to continue, knowing that someone's fixing
the tdb, and that they can access it later using the
normal locking regime.
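A rough sketch of the trylock idea, assuming a whole-file write lock on an already open descriptor. The helper names try_write_lock and release_lock are hypothetical and do not come from the tdb code. The sketch uses F_SETLK, which returns immediately with EACCES or EAGAIN when another process holds a conflicting lock. F_GETLK only reports a conflicting lock without acquiring anything, so a lock attempt would still have to follow it.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to take a write lock on the whole file
 * without blocking. Returns 1 if we now hold the lock, 0 if another
 * process holds a conflicting lock, -1 on any other error. */
static int try_write_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  /* zero length means "through end of file" */

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 1;
    if (errno == EACCES || errno == EAGAIN)
        return 0;  /* someone else is already working on the file */
    return -1;
}

/* Hypothetical helper: drop the whole-file lock taken above.
 * Returns 0 on success, -1 on error. */
static int release_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;

    return fcntl(fd, F_SETLK, &fl);
}
```

A starting process would call try_write_lock() before its cleanup pass: on 1 it cleans up and calls release_lock(), on 0 it skips the cleanup and carries on, falling back to the blocking F_SETLKW path only when it actually needs the data.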
> >Dave CB - can you investigate this within Sun please. This is a *critical*
> >part of Samba, we may have to look into a solaris-specific workaround and
> >this would be bad.
Bad is an understatement...
David Collier-Brown, | Always do right. This will gratify
Performance & Engineering | some people and astonish the rest.
Americas Customer Engineering, | -- Mark Twain
(905) 415-2849 | davecb at canada.sun.com