connections.tdb keeps getting corrupted. v2.2.1a - CAN REPRODUCE!

MCCALL,DON (HP-USA,ex1) don_mccall at hp.com
Fri Aug 31 17:46:10 GMT 2001


Hi Andrew, Jeremy, et al.,
I've been able to reproduce the tdb corruption pretty much at will on an
HP-UX 11.00 box with CVS Samba 2.2 from a couple of weeks ago. I start up
samba on rkm-nt, and then run the following script on another HP-UX box on
the same subnet to stress the connections.tdb access:

********************************
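#!/usr/contrib/bin/perl
# Launch 101 smbclient sessions in the background, each with a distinct
# NetBIOS name (test0 .. test100), to stress connections.tdb.
for ($i = 0; $i <= 100; $i++) {
	$test = "test" . $i;
	$command = "nohup /usr/local/samba/source/bin/smbclient //rkm-nt/tmp -Uddmc%passwd -d 0 -n $test &";
	system $command;
}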
*********************************

After this completes (sometimes I have to run it twice to get the
corruption), running tdbtool against the connections.tdb database gives
the following:

*************************************************
tdb> list
hash=1
 rec: offset=16816 next=0 rec_len=596 key_len=264 data_len=328 full_hash=0x6ff866cb magic=0x26011999
hash=7
 rec: offset=19296 next=0 rec_len=596 key_len=264 data_len=328 full_hash=0x6c3f09a1 magic=0x26011999
hash=47
 rec: offset=24256 next=1936 rec_len=8488 key_len=0 data_len=1074707504 full_hash=0x67 magic=0xd9fee666
 rec: offset=1936 next=696 rec_len=14856 key_len=264 data_len=328 full_hash=0x251289a1 magic=0xd9fee666
 rec: offset=696 next=0 rec_len=596 key_len=264 data_len=328 full_hash=0xbec866cb magic=0xd9fee666
hash=62
 rec: offset=27356 next=0 rec_len=5388 key_len=0 data_len=1074707504 full_hash=0x67 magic=0xd9fee666
ERROR: tailer does not match record! tailer=8512 totalsize=5412
hash=96
 rec: offset=1316 next=0 rec_len=596 key_len=264 data_len=328 full_hash=0x9c07bacb magic=0x26011999
freelist:
****************************************************
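One thing the listing itself seems to show: the two key_len=0 records
overlap. The sketch below redoes that arithmetic in Python. It assumes the
record header size is the 24 bytes implied by totalsize - rec_len in the
ERROR line, and that a record occupies header plus rec_len bytes starting
at its offset. The helper names are mine, not anything from tdb.

```python
# Header size implied by the ERROR line: totalsize - rec_len = 5412 - 5388.
HEADER = 5412 - 5388  # 24 bytes (assumption derived from the listing)

# (offset, rec_len) of the two key_len=0 records, copied from the listing.
rec_a = (24256, 8488)  # first record on the hash=47 chain
rec_b = (27356, 5388)  # record on the hash=62 chain


def extent(offset, rec_len, header=HEADER):
    """Return the half-open byte range [start, end) a record occupies."""
    return offset, offset + header + rec_len


def overlaps(x, y):
    """True if two half-open ranges share at least one byte."""
    return x[0] < y[1] and y[0] < x[1]


a = extent(*rec_a)
b = extent(*rec_b)
print("record at 24256 spans", a)
print("record at 27356 spans", b)
print("overlap:", overlaps(a, b))
print("total size of record at 24256:", HEADER + rec_a[1])
```

Under those assumptions both records end at offset 32768, and the total
size of the record at 24256 comes out as 8512, the same value tdbtool
reports as the mismatched tailer for the record at 27356. That would be
consistent with the smaller record sitting inside the larger one, though
it does not say how it got there.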

and there are entries like this in the log files from several of the
'clients':

../var/log.test98:  tdb(/stand/locks/connections.tdb): tdb_free: right free failed at 24256

I expect this is a result of data_len being this huge number, not a
cause in itself...
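Two of the odd values in the listing are easier to read in hex. The short
Python sketch below only restates numbers from the tdbtool output; the
constant and function names are mine. My understanding is that tdb marks
freed records with the complement of its normal magic, but treat that as
an assumption to check against the tdb sources.

```python
LIVE_MAGIC = 0x26011999  # magic on the healthy-looking records
ODD_MAGIC = 0xd9fee666   # magic on the suspect records
DATA_LEN = 1074707504    # data_len reported on both key_len=0 records


def complement32(value):
    """Bitwise NOT of value, truncated to 32 bits."""
    return ~value & 0xFFFFFFFF


print("data_len in hex:", hex(DATA_LEN))
print("odd magic is ~live magic:", complement32(LIVE_MAGIC) == ODD_MAGIC)
```

The data_len value is 0x400ebc30, which looks less like a length than
like leftover or uninitialised data, though that is only a guess. The
odd magic is exactly the 32-bit complement of 0x26011999, so if the
assumption above holds, the hash=47 and hash=62 chains are pointing at
records that have already been marked free.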


Now, I am doing this with connections.tdb because it's easy to reproduce
there, but there have been reports of the same sort of corruption on
HP-UX 9.05, 10.20 and 11.00 for messages.tdb, locking.tdb, etc. So I'm
beginning to suspect some sort of typing conflict in one or more of the
structures that the tdb code uses, where an assumption about the length
of a type is incorrect on HP-UX, so that we generate these huge numbers
when we fill some structure in corner cases...

I'm really having difficulty with the tdb code. Can either of you give me
some pointers on what data to collect to nail this down? I have not seen
this reported on any of the other platforms, like Linux or Sun, so far...

Thanks,
Don









