replay cache mutex failure again

Dave Daugherty dave.daugherty at
Thu Jul 17 00:37:41 GMT 2008



The setup: a very busy Centrify-Samba 3.0.27a server on Solaris 10 with 400-500 users.


Periodically, tdbbackup is run from a cron job to back up secrets.tdb.


Winbind is not used.


After several months of operation, suddenly this:


  tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /etc/samba/private/secrets.tdb

[2008/07/08 14:00:13, 1] libads/kerberos_verify.c:ads_verify_ticket(384)

  ads_verify_ticket: unable to protect replay cache with mutex.

[2008/07/08 14:00:13, 1] smbd/sesssetup.c:reply_spnego_kerberos(316)

  Failed to verify incoming ticket with error NT_STATUS_LOGON_FAILURE!


... And now no one can authenticate to the server via Kerberos. Users who
were already authenticated continued working without problems.


I am told that a restart of Samba did not work (maybe different


I am wondering if perhaps a tdb_unlock failed silently, causing all
subsequent "replay cache mutex" lock attempts to fail.
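
To make that hypothesis concrete, here is a rough sketch. It is not Samba or tdb code. It assumes the chain lock is a POSIX fcntl byte-range lock, uses a scratch file in place of secrets.tdb, and polls with a non-blocking attempt rather than the alarm-based timeout shown in the log. The names try_lock and demo are hypothetical.

```python
import errno
import fcntl
import os
import signal
import tempfile
import time


def try_lock(fd, offset, timeout):
    """Poll for an exclusive one-byte lock; return False once `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, offset, os.SEEK_SET)
            return True
        except OSError as exc:
            # EAGAIN or EACCES means another process holds a conflicting lock.
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.01)


def demo(timeout=0.3):
    """Return (blocked, recovered) for a holder process that never unlocks."""
    fd, path = tempfile.mkstemp()  # scratch file standing in for the tdb
    os.write(fd, b"\0" * 16)
    rd, wr = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: take the lock, signal readiness, then never release it.
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
            os.write(wr, b"x")
            time.sleep(60)
        finally:
            os._exit(0)
    os.read(rd, 1)  # wait until the child really holds the lock
    blocked = not try_lock(fd, 0, timeout)    # every attempt times out
    os.kill(pid, signal.SIGKILL)              # holder goes away
    os.waitpid(pid, 0)
    recovered = try_lock(fd, 0, timeout)      # kernel has dropped its lock
    os.close(fd)
    os.close(rd)
    os.close(wr)
    os.unlink(path)
    return blocked, recovered
```

Under that assumption, a lock left behind by a missed unlock blocks every other process for as long as the holder is alive, and the kernel releases it when the holder exits.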


Any other ideas?


I wrote a new utility, "tdblocks", that will figure out which process, if
any, is holding the lock, and a torture test that spawns a bunch of
processes that lock and unlock "replay cache mutex" to see if I can
reproduce the issue. So far, no luck.
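
For readers who want to try something similar, a minimal sketch of such a torture test follows. It is not the program described above and does not use the tdb API. It assumes fcntl byte-range locking on a scratch file, and the names worker and torture are hypothetical. Each process increments a counter under the lock, so a lost update or a failed worker would show up in the result.

```python
import fcntl
import os
import struct
import tempfile


def worker(fd, iterations):
    """Repeatedly lock byte 0, bump the 8-byte counter at offset 0, unlock."""
    for _ in range(iterations):
        fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)  # blocking lock
        try:
            (count,) = struct.unpack("<Q", os.pread(fd, 8, 0))
            os.pwrite(fd, struct.pack("<Q", count + 1), 0)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN, 1, 0, os.SEEK_SET)


def torture(processes=8, iterations=200):
    """Fork `processes` workers; return (final_count, failed_workers)."""
    fd, path = tempfile.mkstemp()  # scratch file, not a real tdb
    os.pwrite(fd, struct.pack("<Q", 0), 0)
    pids = []
    for _ in range(processes):
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                worker(fd, iterations)
                status = 0
            finally:
                os._exit(status)  # never fall back into the parent's code
        pids.append(pid)
    failures = 0
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            failures += 1
    (count,) = struct.unpack("<Q", os.pread(fd, 8, 0))
    os.close(fd)
    os.unlink(path)
    return count, failures
```

If the locking is sound, torture(8, 200) should return a count equal to processes times iterations with no failed workers. A smaller count would point to a lost update.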

I will submit this program back to Samba soon. Feel free to use it.



Dave Daugherty




