autobuild failure: ntdb failtest
rusty at samba.org
Wed Jun 20 06:05:32 MDT 2012
On Wed, 20 Jun 2012 13:57:54 +1000, Andrew Bartlett <abartlet at samba.org> wrote:
> I just had an autobuild fail with:
TLDR: fix pushed to autobuilder.
> WAF_MAKE=1 PATH=buildtools/bin:../../buildtools/bin:$PATH waf build
> WAF_MAKE=1 PATH=buildtools/bin:../../buildtools/bin:$PATH waf install
> WAF_MAKE=1 PATH=buildtools/bin:../../buildtools/bin:$PATH waf test
> Can you give me any clues as to why this might be, or look into it for me?
> ntdb-run-capabilities (test/run-capabilities.c) failed:
OK, bin/ntdb-run-capabilities died:
> Killed by signal 6:
It aborted(). That's bad.
> ntdb-run-capabilities: ../test/run-capabilities.c:98:tap_log_messages == 0
> ntdb-run-capabilities: ../test/run-capabilities.c:100:tap_log_messages == 0
> ntdb-run-capabilities: ../test/run-capabilities.c:114:tap_log_messages == 0
> ntdb-run-capabilities: ../test/run-capabilities.c:116:tap_log_messages == 0
> ntdb-run-capabilities: ../test/run-capabilities.c:133:tap_log_messages == 0
> ntdb-run-capabilities: ../test/run-capabilities.c:137:tap_log_messages == 1
It failed tests on lines 98, 100, 114, 116, 133 and 137.
> ntdb log level 0: Locking error: ntdb_brunlock failed (fd=18) at offset 2 rw_type=0 len=1: Resource deadlock avoided
And spat out an ntdb log message.
It passed some tests, at least...
> ntdb log level 2: Success: ntdb_check: database has unknown
> capability, cannot check.
Another ntdb log message.
Some more successes.
> To reproduce: --failpath=mmmorxowxxmmomfrmeffafefF
OK, it was an injected failure (thus the failed tests above). We can
re-run the test with this argument to walk the same failure path. (FYI:
lower case is success, upper is a fail. eg. that final F is a fcntl
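To make the case convention concrete, here is a minimal sketch that scans a failpath string for the first injected failure. It assumes only the rule described above (lower case means the call succeeded, upper case means it was made to fail); the helper names are hypothetical and are not part of failtest.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of failtest: return the zero-based index
 * of the first injected failure (first upper-case letter), or -1 if the
 * path contains no failure at all. */
static int first_failure(const char *failpath)
{
	size_t i;

	assert(failpath != NULL);
	for (i = 0; failpath[i] != '\0'; i++) {
		if (isupper((unsigned char)failpath[i]))
			return (int)i;
	}
	return -1;
}

/* Hypothetical helper: count the calls that were allowed to succeed
 * (lower-case letters) anywhere in the path. */
static size_t count_successes(const char *failpath)
{
	size_t i, n = 0;

	assert(failpath != NULL);
	for (i = 0; failpath[i] != '\0'; i++) {
		if (islower((unsigned char)failpath[i]))
			n++;
	}
	return n;
}
```

For the path in the report, first_failure() returns 24: the first 24 calls succeed and the 25th (the final F) is the one that was failed.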
Now, we run:
gdb --args bin/ntdb-run-capabilities --failpath=mmmorxowxxmmomfrmeffafefF
And of course, it doesn't abort!
This usually happens because ntdb puts a random seed into the hash.
This is how it made it through autobuild :(
So I put in a printf() to print out the random seed, and ran it until it
aborted. A seed of 0xb5f19495 triggered it, for example: sure enough,
it causes a hash clash which the primitive layout code doesn't handle.
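As a rough illustration of why the seed matters, here is a toy seeded hash. This is a sketch only: it uses an FNV-1a style mix, which is NOT ntdb's real hash function, and the names toy_hash, toy_bucket and keys_clash are made up for this example. The point is just that whether two keys land in the same bucket depends on the seed, so a random seed gives an occasional failure and a fixed seed gives the same answer on every run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy seeded hash in the style of FNV-1a. An assumption for illustration;
 * ntdb uses a different hash function. */
static uint32_t toy_hash(uint32_t seed, const char *key, size_t len)
{
	uint32_t h = seed ^ 2166136261u;
	size_t i;

	assert(key != NULL);
	for (i = 0; i < len; i++) {
		h ^= (unsigned char)key[i];
		h *= 16777619u;
	}
	return h;
}

/* Use the top `bits` bits of the hash as the bucket number. */
static uint32_t toy_bucket(uint32_t seed, const char *key, size_t len,
			   unsigned int bits)
{
	assert(bits > 0 && bits < 32);
	return toy_hash(seed, key, len) >> (32 - bits);
}

/* Return 1 if both keys fall into the same bucket under this seed. */
static int keys_clash(uint32_t seed,
		      const char *a, size_t alen,
		      const char *b, size_t blen,
		      unsigned int bits)
{
	return toy_bucket(seed, a, alen, bits)
		== toy_bucket(seed, b, blen, bits);
}
```

Calling keys_clash() twice with the same seed always gives the same result, which is what pinning the seed buys you in a test; calling it with a fresh random seed each run is what makes a clash show up only now and then.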
Thanks for the report, I've pushed the following fix to the autobuilder!
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Jun 20 21:31:21 2012 +0930
ntdb: fix occasional abort in testing.
Occasionally, the capability test inserts multiple used records and they
clash, but our primitive test layout engine doesn't handle hash clashes.
Force a seed value which we know doesn't clash.
Reported-by: Andrew Bartlett <abartlet at samba.org>
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
diff --git a/lib/ntdb/test/run-capabilities.c b/lib/ntdb/test/run-capabilities.c
index cb03746..6503214 100644
@@ -30,6 +30,12 @@ static void create_ntdb(const char *name,
struct ntdb_layout *layout;
struct ntdb_context *ntdb;
int fd, clen;
+ union ntdb_attribute seed_attr;
+ /* Force a seed which doesn't allow records to clash! */
+ seed_attr.base.attr = NTDB_ATTRIBUTE_SEED;
+ seed_attr.base.next = &tap_log_attr;
+ seed_attr.seed.seed = 0;
key = ntdb_mkdata("Hello", 5);
data = ntdb_mkdata("world", 5);
@@ -61,7 +67,7 @@ static void create_ntdb(const char *name,
/* We open-code this, because we need to use the failtest write. */
- ntdb = ntdb_layout_get(layout, failtest_free, &tap_log_attr);
+ ntdb = ntdb_layout_get(layout, failtest_free, &seed_attr);
fd = open(name, O_RDWR|O_TRUNC|O_CREAT, 0600);
if (fd < 0)