[Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal
Hansjörg Maurer
hansjoerg.maurer at dlr.de
Mon Jul 24 20:04:22 GMT 2006
Hi,
we had a comparable issue with the GPFS cluster filesystem from IBM. In
11/2005 I posted about it on samba-technical (subject: tdb_lock problem
on gpfs filesystem). smbd sometimes went into the D state in that case, too.
We mostly noticed the problem with the tdb files of the printers (the
Samba server was also acting as a print server).
I got the following information from the IBM gpfs list:
"Also, Samba uses fcntl locking extensively on these files and may be
maintaining thousands of individual locks. GPFS specifically sets a
limit on the number of fcntl ranges allowed on a file at one time (to
prevent a runaway or deviant application from consuming large amounts of
resources recording such locks). I expect you are exceeding this limit,
but you can configure a larger value: "mmchconfig
maxFcntlRangesPerFile=10000".
The default is 200 and the acceptable range is currently 10-200000"
Increasing this (undocumented) value to 10000 solved the problem in our
case.
Maybe there is a similar restriction in VeritasFS.
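
To check whether a filesystem enforces a limit like this, a small probe along the following lines might help. It is only a sketch: the function name probe_range_locks, the stride of 2 and the count of 300 are my own choices for illustration, and it only tells you how many separate one-byte fcntl ranges a single process can hold on one file before the filesystem refuses.

```python
import errno
import fcntl
import os
import tempfile


def probe_range_locks(path, count, stride=2):
    """Try to hold `count` separate one-byte write locks on `path`.

    Returns (locks_acquired, errno_or_None). Offsets are spaced `stride`
    bytes apart so that neighbouring ranges are not merged into one lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for i in range(count):
            try:
                # Non-blocking exclusive lock on one byte at offset i * stride.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                            1, i * stride, os.SEEK_SET)
            except OSError as exc:
                return i, exc.errno
        return count, None
    finally:
        os.close(fd)  # closing the descriptor releases all locks taken above


if __name__ == "__main__":
    # Hypothetical usage: replace the temp file with a scratch file that
    # lives on the clustered filesystem you want to test.
    with tempfile.NamedTemporaryFile() as scratch:
        acquired, err = probe_range_locks(scratch.name, 300)
        reason = errno.errorcode.get(err, err) if err else "no error"
        print(f"acquired {acquired} range locks ({reason})")
```

If the probe stops early with ENOLCK on the cluster filesystem but not on a local one, that would point to a per-file range limit like the one described above.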
Have you tried starting smbd under strace, e.g.
strace -e fcntl -f smbd
to track down the system call?
In our case it showed something like
fcntl(18, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=936, len=1}) = -1 ENOLCK (No locks available)
which indicates a problem with the filesystem.
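
If the strace output gets long, the failing lock calls can be filtered out with a few lines of Python. This is a rough sketch that assumes the line format shown above (optionally with the l_ field prefixes that some strace versions print); the names FAILED_FCNTL and failed_lock_calls are made up for the example.

```python
import re

# Matches failed fcntl lock calls in strace output, for example:
# fcntl(18, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=936, len=1}) = -1 ENOLCK (No locks available)
FAILED_FCNTL = re.compile(
    r"fcntl\((?P<fd>\d+), (?P<cmd>F_\w+), "
    r"\{(?:l_)?type=(?P<type>F_\w+), (?:l_)?whence=(?P<whence>\w+), "
    r"(?:l_)?start=(?P<start>\d+), (?:l_)?len=(?P<len>\d+)\}\)"
    r"\s*=\s*-1 (?P<errno>E\w+)"
)


def failed_lock_calls(lines):
    """Return one dict per failed fcntl lock call found in strace output."""
    failures = []
    for line in lines:
        # search() rather than match(), so a leading "[pid N]" from -f is fine.
        match = FAILED_FCNTL.search(line)
        if match:
            failures.append(match.groupdict())
    return failures
```

Feeding it the lines of a saved strace log (each call on one line) returns the descriptor, lock type, offset and errno name of every call that failed, so ENOLCK failures stand out from ordinary EAGAIN or EINTR results.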
Greetings
Hansjörg
Pappas, Bill wrote:
>Jeremy,
>
>I was in a position (last night) to upgrade to 3.0.23a.
>Again, I was using 3.0.21c.
>
>If smbd goes into the D state, we can at least eliminate the possibility
>that it is an unexpected 3.0.21c bug.
>
>
>Thanks,
>Bill Pappas - System Integration Engineer - SAN
>St. Jude Children's Research Hospital
>332 North Lauderdale
>Memphis, TN 38105
>Danny Thomas Tower - Room D1010
>Mail Stop 312
>
>-----Original Message-----
>From: Pappas, Bill
>Sent: Saturday, July 22, 2006 4:01 PM
>To: Jeremy Allison
>Cc: samba at lists.samba.org
>Subject: RE: [Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal
>
>Jeremy Allison wrote:
>
>
>>>Then it might be an intermittent bug in Veritas. What system call is
>>>smbd hanging on ? smbd should never hang in the D wait state unless
>>>it's a filesystem bug.
>>>
>>>
>
>I am beginning to believe that this could make sense. Let me emphasize
>that ./private/secrets.tdb is shared between two samba servers (via
>clustered vxfs) that are running independently. Only one server runs
>nmbd at a time as veritas cluster server fails nmbd over between servers
>as needed. I just figured I would keep smbd running on both servers to
>reduce failover time. I discovered that I had to share secrets.tdb to
>ensure that either samba server would remain as a domain member server.
>Is there another way to do what I am doing? I'd gladly stop sharing
>this file if I could keep smbd up on both servers. Does smbd need a
>lock on secrets.tdb? I thought (probably wrong) that only nmbd relied on
>this file?
>
>Further below, you will find some more logs between clients and the
>server running nmbd and smbd (as the other was sitting idle with smbd
>running). SJMEMDC05 is a windows domain controller and the other clients
>are windows explorer clients.
>
>When you see these logs, they appear to confirm that secrets.tdb is
>directly involved, but how would a locking issue with this file cause
>smbd to go to the D state (and stay)?
>
>log.hc-dfinkletest: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest: tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-dfinkletest: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest: tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-dfinkletest: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-dfinkletest: tdb_chainlock_with_timeout_internal: alarm (10)
>timed out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>log.hc-mwang1: tdb(/usr/local/samba-3.0.21c/private/secrets.tdb):
>tdb_lock failed on list 78 ltype=1 (Interrupted system call)
>log.hc-mwang1: tdb_chainlock_with_timeout_internal: alarm (10) timed
>out for key SJMEMDC05 in tdb
>/usr/local/samba-3.0.21c/private/secrets.tdb
>
>Thanks,
>Bill Pappas - System Integration Engineer - SAN
>
>-----Original Message-----
>From: Jeremy Allison [mailto:jra at samba.org]
>Sent: Saturday, July 22, 2006 10:56 AM
>To: Pappas, Bill
>Cc: jra at samba.org; samba at lists.samba.org
>Subject: Re: [Samba] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal
>
>On Fri, Jul 21, 2006 at 06:17:09PM -0500, Pappas, Bill wrote:
>
>
>>I will say this works for weeks on end w/o a problem. When you say
>>this will not work, why? I've had no real problems with the veritas
>>clustered fs. It adheres to file locking and fcntl operations like any
>>normal local filesystem (ext3).
>
>Then it might be an intermittent bug in Veritas. What system call is
>smbd hanging on ? smbd should never hang in the D wait state unless
>it's a filesystem bug.
>
>Jeremy.
>
>
>
>