Fwd: Samba 3.0.32 - oplock code

itay dar itay.dar at gmail.com
Thu Dec 11 20:29:21 GMT 2008


> On Thu, Dec 11, 2008 at 9:26 PM, Jeremy Allison <jra at samba.org> wrote:

> On Thu, Dec 11, 2008 at 07:18:14PM +0200, Itay Dar wrote:
> > Hi,
> >
> > We are working on a cluster, so if the process will die in the middle of
> > the change phase you are proposing, a state corruption will occur.
> > So this makes life hard on us, as we need to start working with
> > transactions (I think the ctdb project will have the same issues, but I
> > am not sure, I plan to look on it also)
>

> Well if smbd dies inside any number of places
> then state corruption occurs. I'm not sure that
> is a deal-breaker. How likely is this problem
> to occur ?
>

Well, machines in a cluster do get rebooted now and then, either because of a
maintenance operation, a hardware failure, or just your everyday kernel
reboot.
The kernel OOM killer taking out the process is also something we saw in the
past and are trying to eradicate (it seems to happen mostly in QA). Last but
not least, my colleagues from QA like to test my system all day long by
rebooting nodes. We need a rock-steady system, so this is a must, and we make
every effort to make our system fail-safe.

> In fact thinking about this further, having
> the writing process hold the lock, send the
> messages, then iterate over the share mode
> data changing the type from level2 -> none
> is the simplest way to fix this, and means
> the least code changes I can see.

I agree, this was my initial thought also. I wanted to downgrade only the
process's own share entry to none, so that each new open request won't get a
level two oplock.
But I just don't know enough about Windows semantics to know if I can send the
break to the client from the async release function.
So I decided to stay as close as possible to the original Samba flow.
I do agree that an additional state is bad and should be avoided if
possible.
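
To make the "hold the lock, then change level2 -> none" idea concrete, here is
a rough sketch in C. It is not the Samba 3.0.32 code: the enum, the struct and
the function name are made up for illustration, the caller is assumed to hold
the lock on the share mode record already, and actually sending the break
messages is left to the caller, who gets back the list of pids to notify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical oplock states; not Samba's real constants. */
enum oplock_type { OPLOCK_NONE, OPLOCK_LEVEL2, OPLOCK_EXCLUSIVE };

/* Hypothetical, simplified share mode entry. */
struct share_entry {
    int owner_pid;
    enum oplock_type oplock;
};

/*
 * Assumption: the caller already holds the lock on the share mode record.
 * Every entry that holds a level2 oplock is changed to "none", and the pid
 * of its owner is stored in to_notify so the caller can send the break
 * messages. to_notify must have room for n pids.
 * Returns the number of entries that were downgraded.
 */
size_t downgrade_level2_to_none(struct share_entry *entries, size_t n,
                                int *to_notify)
{
    size_t changed = 0;

    assert(entries != NULL || n == 0);
    assert(to_notify != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (entries[i].oplock != OPLOCK_LEVEL2) {
            continue; /* exclusive and "none" entries are left untouched */
        }
        to_notify[changed++] = entries[i].owner_pid;
        entries[i].oplock = OPLOCK_NONE;
    }
    return changed;
}
```

With no extra state in the entry, a new open that arrives after this call
sees only "none" entries, so it has no level2 oplock to inherit.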

> How is this more dangerous than any
> other places where we hold the lock
> and iterate over the data and change it
> (when we're validating existing share
> modes on open, for example ) ?

> Under a cluster system you'll hold
> the lock and change the data under
> a transaction anyway, so no change
> there.

Validating is actually a read-only operation, so we are pretty safe there.

We do have a global scrubber which scrubs old share entry data out.
It has more knowledge than the everyday regular peer smbd process.
This is to solve cases where a process dies in an unclean manner.

We have adapted Samba so that each Samba process is allowed to touch only its
own share entries.
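
Roughly, the two rules look like the sketch below. This is only an
illustration and not our actual code: the struct layout, the function names,
and the idea that the scrubber is handed a list of live (node, pid) pairs are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identity of an smbd process in the cluster. */
struct owner {
    int node_id;
    int pid;
};

/* Hypothetical, simplified share entry. */
struct share_entry {
    struct owner owner;
    int oplock;
};

static bool same_owner(struct owner a, struct owner b)
{
    return a.node_id == b.node_id && a.pid == b.pid;
}

/* Owner-only rule: a process may change an entry only if it owns it. */
bool set_own_oplock(struct share_entry *e, struct owner self, int new_oplock)
{
    assert(e != NULL);
    if (!same_owner(e->owner, self)) {
        return false; /* someone else's entry: refuse to touch it */
    }
    e->oplock = new_oplock;
    return true;
}

/*
 * Scrubber: keep only the entries whose owner appears in the list of live
 * processes (assumed to come from some cluster membership service).
 * Compacts the array in place and returns the new number of entries.
 */
size_t scrub_dead_entries(struct share_entry *entries, size_t n,
                          const struct owner *live, size_t n_live)
{
    size_t kept = 0;

    assert(entries != NULL || n == 0);
    assert(live != NULL || n_live == 0);

    for (size_t i = 0; i < n; i++) {
        bool alive = false;
        for (size_t j = 0; j < n_live; j++) {
            if (same_owner(entries[i].owner, live[j])) {
                alive = true;
                break;
            }
        }
        if (alive) {
            entries[kept++] = entries[i];
        }
    }
    return kept;
}
```

Since only the owner and the scrubber ever write a given entry, a process that
dies halfway through leaves behind at most its own stale entries.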

So there is no real need for transactions at the moment in our system.
They are harder to write, so I would really love to avoid them.

Many thanks,
Itay Dar

P.S.
I joined two messages to save traffic. Is this OK with the Samba mailing
list rules?

