[Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal

Hansjörg Maurer hansjoerg.maurer at dlr.de
Mon Jul 24 20:04:22 GMT 2006


Hi

We had a comparable issue with the GPFS cluster filesystem from IBM. In
11/2005 I posted about it on samba-technical (subject: tdb_lock problem
on gpfs filesystem). In that case, too, smbd sometimes went into the D
state. We mostly noticed the problem with the printer tdb files (the
Samba server was also acting as a print server).

I got the following information from the IBM gpfs list:
"Also, Samba uses fcntl locking extensively on these files and may be 
maintaining thousands of individual locks. GPFS specifically sets a 
limit on the number of fcntl ranges allowed on a file at one time (to 
prevent a runaway or deviant application from consuming large amounts of 
resources recording such locks). I expect you are exceeding this limit, 
but you can configure a larger value:
'mmchconfig maxFcntlRangesPerFile=10000'.
The default is 200 and the acceptable range is currently 10-200000"

Increasing this (undocumented) value to 10000 solved the problem in our
case.
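
To check whether a filesystem caps the number of fcntl ranges per file,
something like the following rough Python sketch could be used. It is
not part of Samba or GPFS; the function name count_lock_ranges and the
idea of probing a scratch file are my own assumptions. It takes
separate one-byte write locks until the filesystem answers ENOLCK.

```python
import errno
import fcntl
import os


def count_lock_ranges(path, wanted=1000):
    """Take up to `wanted` separate one-byte write locks on `path`.

    Returns the number of locks held when the filesystem first refused
    one with ENOLCK, or `wanted` if every request was granted.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for i in range(wanted):
            try:
                # Lock every second byte so that neighbouring ranges
                # cannot be merged into one larger range.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                            1, 2 * i, os.SEEK_SET)
            except OSError as exc:
                if exc.errno == errno.ENOLCK:
                    return i
                raise
        return wanted
    finally:
        # Closing the descriptor releases all locks taken above.
        os.close(fd)
```

Run it against a scratch file on the cluster filesystem, not against a
live tdb file. A result smaller than the requested count would suggest
a per-file limit like the one described above.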

Maybe there is a similar restriction in the Veritas filesystem.

Have you tried starting smbd under strace, for example

strace -e fcntl -f smbd

to track down the system call?
In our case it showed something like

fcntl(18, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=936, len=1}) =
-1 ENOLCK (No locks available)

which indicates a problem with the filesystem.
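
The same request can be repeated outside smbd. The Python sketch below
is only an illustration under my own assumptions (the names
try_write_lock and LockTimeout are made up, and the 10 second default
only mirrors the alarm value in the logs quoted below). It asks for one
blocking write lock at the offset from the strace line and reports
whether it was granted, timed out, or failed with an errno such as
ENOLCK.

```python
import errno
import fcntl
import os
import signal


class LockTimeout(Exception):
    """Raised by the SIGALRM handler when the lock wait takes too long."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def try_write_lock(path, start=936, length=1, timeout=10):
    """Request one blocking write lock and classify the outcome.

    Returns "locked", "timeout", or the errno name (e.g. "ENOLCK").
    Must be called from the main thread because it uses SIGALRM.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)
    try:
        # LOCK_EX without LOCK_NB is issued as F_SETLKW with F_WRLCK.
        fcntl.lockf(fd, fcntl.LOCK_EX, length, start, os.SEEK_SET)
        return "locked"
    except LockTimeout:
        return "timeout"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(fd)  # closing the descriptor drops the lock again
```

A "timeout" result would mean another process is holding the range,
while "ENOLCK" would point at the filesystem refusing the lock, as in
the strace output above.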

Greetings

Hansjörg

Pappas, Bill wrote:

>Jeremy,
>
>I was in a position (last night) to upgrade to 3.0.23a. 
>Again, I was using 3.0.21c.
>
>If smbd goes into the D state, we can at least eliminate the possibility
>that it is an unexpected 3.0.21c bug.   
>
>
>Thanks,
>Bill Pappas - System Integration Engineer - SAN 
>St. Jude Children's Research Hospital
>332 North Lauderdale
>Memphis, TN 38105
>Danny Thomas Tower - Room D1010
>Mail Stop 312
>
>-----Original Message-----
>From: Pappas, Bill 
>Sent: Saturday, July 22, 2006 4:01 PM
>To: Jeremy Allison
>Cc: samba at lists.samba.org
>Subject: RE: [Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal
>
>Jeremy Allison wrote:
>
>>>Then it might be an intermittent bug in Veritas. What system call is
>>>smbd hanging on? smbd should never hang in the D wait state unless
>>>it's a filesystem bug.
>
>I am beginning to believe that this could make sense. Let me emphasize
>that ./private/secrets.tdb is shared between two samba servers (via
>clustered vxfs) that are running independently.  Only one server runs
>nmbd at a time as veritas cluster server fails nmbd over between servers
>as needed.  I figured that keeping smbd running on both servers would
>reduce failover time.  I discovered that I had to share secrets.tdb to
>ensure that either samba server would remain as a domain member server.
>Is there another way to do what I am doing?  I'd gladly stop sharing
>this file if I could keep smbd up on both servers.  Does smbd need a
>lock on secrets.tdb? I thought (probably wrongly) that only nmbd relied
>on this file.
>
>Further below, you will find some more logs between clients and the
>server running nmbd and smbd (as the other was sitting idle with smbd
>running). SJMEMDC05 is a windows domain controller and the other clients
>are windows explorer clients. 
>
>When you see these logs, they appear to confirm that secrets.tdb is
>directly involved, but how would a locking issue with this file cause
>smbd to go to the D state (and stay)?
>
>log.hc-dfinkletest:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest:  tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-dfinkletest:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest:  tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-dfinkletest:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest:  tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1:  tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1:  tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>
>Thanks,
>Bill Pappas
>
>-----Original Message-----
>From: Jeremy Allison [mailto:jra at samba.org] 
>Sent: Saturday, July 22, 2006 10:56 AM
>To: Pappas, Bill
>Cc: jra at samba.org; samba at lists.samba.org
>Subject: Re: [Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal
>
>On Fri, Jul 21, 2006 at 06:17:09PM -0500, Pappas, Bill wrote:
>
>>I will say this works for weeks on end w/o a problem.  When you say
>>this will not work, why? I've had no real problems with the veritas
>>clustered fs.  It adheres to file locking and fcntl operations like any
>>normal local filesystem (ext3).
>
>Then it might be an intermittent bug in Veritas. What system call is
>smbd hanging on ? smbd should never hang in the D wait state unless
>it's a filesystem bug.
>
>Jeremy.