files locked forever -- logs! :)

David Lee t.d.lee at durham.ac.uk
Thu Jun 21 16:31:12 GMT 2001


On Thu, 21 Jun 2001, Syzop wrote:

> [...]
> == B. THIS IS WHEN NETSCAPE DOES NOT START ('file already in use
> blabla') ==
> [...]
> [2001/06/21 14:41:43, 3] lib/util.c:unix_clean_name(384)
>   unix_clean_name [internet/Netscape/Communicator/Program/netscape.exe]
> [2001/06/21 14:41:43, 5] smbd/open.c:open_mode_check(496)
>   open_mode_check: breaking oplock (3) on file
> internet/Netscape/Communicator/Program/netscape.exe, dev = 305, inode =
> 1648344
> [2001/06/21 14:41:43, 3] smbd/oplock.c:request_oplock_break(930)
>   request_oplock_break: sending a oplock break message to pid 18071 on
> port 2104 for dev = 305, inode = 1648344, tv_sec = 3b31cd15, tv_usec =
> 7554e
> [2001/06/21 14:41:43, 0] smbd/oplock.c:receive_local_message(126)
>   receive_local_message. Error in recvfrom. (Connection refused).
> [2001/06/21 14:41:43, 0] smbd/oplock.c:request_oplock_break(1016)
>   request_oplock_break: error in response received to oplock break
> request to pid 18071 on port 2104 for dev = 305, inode = 1648344, tv_sec
> = 3b31cd15, tv_usec = 7554e
>   Error was (Connection refused).
> [2001/06/21 14:41:43, 0] smbd/open.c:open_mode_check(509)
>   open_mode_check: FAILED when breaking oplock (4015) on file
> internet/Netscape/Communicator/Program/netscape.exe, dev = 305, inode =
> 1648344
> [...]
> 
> I think somebody who knows the whole oplock stuff / sharemode blah would
> see pretty fast what's wrong.

Well, I don't have a clue about oplocks, so what follows may be completely
wrong.

> Looks to me like some client was granted an exclusive oplock and doesn't
> respond to an oplock break request
> when another client wants to open that file. IIRC samba should then
> remove the oplock,
> but it looks like something else happens. Anyway... I'm not an expert,

It does indeed seem to be trying to grab the oplock from another process,
in this case pid 18071.

When this fault strikes, check whether that process (whatever its number) 
really exists.  (Indeed, if you have the log files from the above incident
still lying around, you might even be able to trace that incident.) 
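
A minimal sketch of that existence check, assuming a POSIX server with
Python available.  The helper name pid_exists is made up for illustration
and is not part of samba.  It relies on the standard trick of sending
signal 0, which performs the permission and existence checks without
delivering anything.

```python
import os


def pid_exists(pid: int) -> bool:
    """Return True if a process with this pid currently exists (POSIX only)."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but belongs to another user
    return True


# Checking our own pid always succeeds; substitute the pid named in the
# oplock break message (18071 in the log above) on the real server.
print(pid_exists(os.getpid()))  # True
```

Bear in mind that pids are reused, so a "True" answer long after the event
does not prove that the original smbd is still the process holding the
oplock.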

See if there is a set of messages such as: 

[2001/06/18 14:16:23, 0] ../lib/fault.c:fault_report(40)
  ===============================================================
[2001/06/18 14:16:23, 0] ../lib/fault.c:fault_report(41)
  INTERNAL ERROR: Signal 11 in pid 26673 (2.2.0)
  Please read the file BUGS.txt in the distribution
[2001/06/18 14:16:23, 0] ../lib/fault.c:fault_report(43)
  ===============================================================
[2001/06/18 14:16:23, 0] ../lib/util.c:smb_panic(1139)
  PANIC: internal error

In my case this was pid 26673 (your 18071) reporting its own untimely
demise.  As a result of this, I see other processes getting stuck when
trying to grab oplocks from that (now absent) process.
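
Searching the logs for such reports can be scripted.  The following is a
rough sketch, assuming the message wording is exactly as in the excerpt
above (other samba versions may phrase it differently); the function name
crashed_pids is hypothetical.

```python
import re

# Matches the fault_report line shown above, e.g.
#   INTERNAL ERROR: Signal 11 in pid 26673 (2.2.0)
PANIC_RE = re.compile(r"INTERNAL ERROR: Signal (\d+) in pid (\d+)")


def crashed_pids(log_text: str) -> dict:
    """Map each pid that reported a fatal signal to the signal number."""
    return {int(pid): int(sig) for sig, pid in PANIC_RE.findall(log_text)}


sample = """\
[2001/06/18 14:16:23, 0] ../lib/fault.c:fault_report(41)
  INTERNAL ERROR: Signal 11 in pid 26673 (2.2.0)
  Please read the file BUGS.txt in the distribution
[2001/06/18 14:16:23, 0] ../lib/util.c:smb_panic(1139)
  PANIC: internal error
"""

print(crashed_pids(sample))  # {26673: 11}
```

In practice you would read the text from wherever your smbd writes its
logs; that location depends on the local configuration.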

I understand (see the other thread running on this list for the last
couple of days) that there is a known problem of this nature in 2.2.0,
which is apparently corrected in the forthcoming release.

Apparently, a process holding an oplock goes away unexpectedly:  it leaves
behind the "======" log, but because of the nature of the exit it cannot
clear its oplocks, which are left dangling.  Later processes don't
detect this combination of circumstances (oplocks held by a deceased
process) and themselves trip over...
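
To see whether a given incident fits this pattern, the two kinds of log
message can be cross-checked.  This is only a sketch under assumptions:
the message wording is taken from the excerpts in this thread, the
function name classify_failed_breaks is invented for illustration, and
whitespace is collapsed first because the messages may be wrapped over
several lines.

```python
import re

FAILED_BREAK_RE = re.compile(
    r"error in response received to oplock break request to pid (\d+)"
)
PANIC_RE = re.compile(r"INTERNAL ERROR: Signal \d+ in pid (\d+)")


def classify_failed_breaks(log_text: str) -> dict:
    """For each pid that failed to answer an oplock break, report whether
    the same log text also contains that pid's crash report."""
    flat = re.sub(r"\s+", " ", log_text)  # undo line wrapping
    failed = {int(p) for p in FAILED_BREAK_RE.findall(flat)}
    crashed = {int(p) for p in PANIC_RE.findall(flat)}
    return {pid: pid in crashed for pid in sorted(failed)}


# The failing break request quoted earlier in this message.
excerpt = """\
[2001/06/21 14:41:43, 0] smbd/oplock.c:request_oplock_break(1016)
  request_oplock_break: error in response received to oplock break
request to pid 18071 on port 2104 for dev = 305, inode = 1648344, tv_sec
= 3b31cd15, tv_usec = 7554e
  Error was (Connection refused).
"""

print(classify_failed_breaks(excerpt))  # {18071: False}
```

A result of False only means that no crash report for that pid appears in
the text supplied; the report may sit in an older or different log file,
so feed in as much of the log history as you still have.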


> I've been looking at the samba source code
> for just two days or so.

See if it classifies as above.  If so, your choices are probably:

1. backtrack to 2.0.7 (ideally 2.0.9);

2. try to live with and firefight the problem for now (what we are doing);
   basically, that means restarting samba on your server;

3. (not recommended unless totally expert): update your source from CVS.

> I hope somebody can look at it, since it's a bit difficult for me to
> trace the error exactly (and to fix it).

As I say, I'm totally inexpert on oplocks.  But it sounds something like a
"known problem", which may be firefightable until the next release. 

Hope that helps.

-- 

:  David Lee                                I.T. Service          :
:  Systems Programmer                       Computer Centre       :
:                                           University of Durham  :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham                :
:  Phone: +44 191 374 2882                  U.K.                  :





More information about the samba-technical mailing list