svn commit: samba r20690 - in branches/SAMBA_3_0/source/libsmb: .

tridge at samba.org
Thu Jan 11 23:10:17 GMT 2007


Author: tridge
Date: 2007-01-11 23:10:16 +0000 (Thu, 11 Jan 2007)
New Revision: 20690

WebSVN: http://websvn.samba.org/cgi-bin/viewcvs.cgi?view=rev&root=samba&rev=20690

Log:

fix a bug that causes smbd to 'hang' intermittently.

The problem occurs like this:

  1) running smbd as a domain member without winbindd

  2) client1 connects, during auth smbd-1 calls update_trustdom_cache()

  3) smbd-1 takes the trustdom cache timestamp lock, then starts
     enumerate_domain_trusts

  4) enumerate_domain_trusts hangs for some unknown reason

  5) other clients connect, all block waiting for read lock on trustdom
     cache

  6) samba is now hung

The problem is the lock, and really it's just trying to avoid a race
where the cure is worse than the problem. A race in updating the
trustdom cache is not a big issue. So I've just removed the lock.
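
As a rough illustration of the lock-free approach, here is a minimal
standalone sketch. It is not the Samba code: cache_timestamp,
UPDATE_INTERVAL, update_cache() and the refresh callbacks are
hypothetical stand-ins for the gencache timestamp entry,
TRUSTDOM_UPDATE_INTERVAL, update_trustdom_cache() and
enumerate_domain_trusts(). The idea is to advance the timestamp before
the slow call, so other processes see a fresh timestamp and skip the
update, and to restore the old value if the call fails.

```c
#include <stdbool.h>
#include <time.h>

#define UPDATE_INTERVAL 900 /* hypothetical refresh interval, in seconds */

/* Hypothetical stand-in for the shared timestamp entry in the cache. */
time_t cache_timestamp = 0;

/* Callback standing in for the slow, possibly failing enumeration. */
typedef bool (*refresh_fn)(void);

/* Demo callbacks: one that succeeds, one that fails. */
bool refresh_ok(void)   { return true; }
bool refresh_fail(void) { return false; }

/* Returns true if a refresh was attempted, false if it was not yet due. */
bool update_cache(time_t now, refresh_fn refresh)
{
	time_t last_check = cache_timestamp;

	/* not time to update yet */
	if (now < last_check + UPDATE_INTERVAL)
		return false;

	/* Store the new timestamp first. No lock is held across the slow
	   call, so a stalled refresh cannot block anyone else. */
	cache_timestamp = now;

	if (!refresh()) {
		/* the refresh failed - restore the old timestamp so that a
		   later caller tries again */
		cache_timestamp = last_check;
	}

	return true;
}
```

The cost is that two processes may occasionally both refresh the cache
at the same time, which is harmless here.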

It is still an open question why enumerate_domain_trusts() can
hang. Unfortunately I'm not in a position to get a sniff at the site
that is affected. I suspect a full fix will involve ensuring that all
the rpc code paths have appropriate timeouts.
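
To show what a timeout on a blocking call can look like in general,
here is a small POSIX sketch. It does not reflect how the Samba rpc
code is structured: read_with_timeout() is a hypothetical helper that
waits with poll() for a bounded time before reading, rather than
blocking indefinitely on a peer that never answers.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Samba: read from fd, but give up
   after timeout_ms milliseconds. Returns the number of bytes read, or
   -1 with errno set (ETIMEDOUT if nothing arrived in time). */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd;
	int ready;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* Retry if interrupted by a signal. For simplicity each retry
	   restarts the full timeout. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready == -1 && errno == EINTR);

	if (ready == 0) {
		/* nothing to read within the timeout */
		errno = ETIMEDOUT;
		return -1;
	}
	if (ready == -1)
		return -1;

	return read(fd, buf, len);
}
```

A caller that gets ETIMEDOUT can fail the request and carry on, instead
of stalling along with everything that waits on it.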

Modified:
   branches/SAMBA_3_0/source/libsmb/trustdom_cache.c


Changeset:
Modified: branches/SAMBA_3_0/source/libsmb/trustdom_cache.c
===================================================================
--- branches/SAMBA_3_0/source/libsmb/trustdom_cache.c	2007-01-11 23:09:57 UTC (rev 20689)
+++ branches/SAMBA_3_0/source/libsmb/trustdom_cache.c	2007-01-11 23:10:16 UTC (rev 20690)
@@ -250,24 +250,6 @@
 }
 
 
-/*******************************************************************
- lock the timestamp entry in the trustdom_cache
-*******************************************************************/
-
-BOOL trustdom_cache_lock_timestamp( void )
-{
-	return gencache_lock_entry( TDOMTSKEY ) != -1;
-}
-
-/*******************************************************************
- unlock the timestamp entry in the trustdom_cache
-*******************************************************************/
-
-void trustdom_cache_unlock_timestamp( void )
-{
-	gencache_unlock_entry( TDOMTSKEY );
-}
-
 /**
  * Delete single trustdom entry. Look at the
  * gencache_iterate definition.
@@ -314,8 +296,7 @@
 	time_t now = time(NULL);
 	int i;
 	
-	/* get the timestamp.  We have to initialise it if the last timestamp == 0 */
-	
+	/* get the timestamp.  We have to initialise it if the last timestamp == 0 */	
 	if ( (last_check = trustdom_cache_fetch_timestamp()) == 0 ) 
 		trustdom_cache_store_timestamp(0, now+TRUSTDOM_UPDATE_INTERVAL);
 
@@ -325,11 +306,12 @@
 		DEBUG(10,("update_trustdom_cache: not time to update trustdom_cache yet\n"));
 		return;
 	}
+
+	/* note that we don't lock the timestamp. This prevents this
+	   smbd from blocking all other smbd daemons while we
+	   enumerate the trusted domains */
+	trustdom_cache_store_timestamp(now, now+TRUSTDOM_UPDATE_INTERVAL);
 		
-	/* lock the timestamp */
-	if ( !trustdom_cache_lock_timestamp() )
-		return;
-	
 	if ( !(mem_ctx = talloc_init("update_trustdom_cache")) ) {
 		DEBUG(0,("update_trustdom_cache: talloc_init() failed!\n"));
 		goto done;
@@ -338,20 +320,19 @@
 	/* get the domains and store them */
 	
 	if ( enumerate_domain_trusts(mem_ctx, lp_workgroup(), &domain_names, 
-		&num_domains, &dom_sids) ) 
-	{
+		&num_domains, &dom_sids)) {
 		for ( i=0; i<num_domains; i++ ) {
 			trustdom_cache_store( domain_names[i], NULL, &dom_sids[i], 
 				now+TRUSTDOM_UPDATE_INTERVAL);
-		}
-		
-		trustdom_cache_store_timestamp( now, now+TRUSTDOM_UPDATE_INTERVAL );
+		}		
+	} else {
+		/* we failed to fetch the list of trusted domains - restore the old
+		   timestamp */
+		trustdom_cache_store_timestamp(last_check, 
+					       last_check+TRUSTDOM_UPDATE_INTERVAL);
 	}
 
 done:	
-	/* unlock and we're done */
-	trustdom_cache_unlock_timestamp();
-	
 	talloc_destroy( mem_ctx );
 	
 	return;


