[TDB] Patches for file and memory usage growth issues
Rusty Russell
rusty at samba.org
Mon Apr 18 01:23:00 MDT 2011
On Thu, 14 Apr 2011 14:14:32 -0400, simo <idra at samba.org> wrote:
> On Wed, 2011-04-13 at 14:46 +0930, Rusty Russell wrote:
> Hi Rusty,
> unfortunately this still doesn't seem to really help.
>
> I've slightly modified my tests so I've rerun 3 tests:
> 1. plain
> 2. with your first 3 patches
> 3. with all 4 patches
>
> plain is the baseline and it includes my patches to use better
> heuristics as published in my git tree.
>
> With your patches I actually see both an increase in time spent and size
> of the memory footprint as well as final size of the tdb
> unfortunately.
>
>
> The strict repack looks certainly overkill with these tests, although
> the basic repack patches do not hit too hard on time spent although the
> final tdb is still 300MiB larger than without the patch.
Yes, let's ignore that overaggressive repack as a completely bad idea.
And this clearly reveals YA bug in my repack code:
> PLAIN tdb:
...
> Size of file/data: 1048993792/665083314
> Number of records: 76628
> Smallest/average/largest keys: 12/48/65
> Smallest/average/largest data: 43/8631/1289140
> Smallest/average/largest padding: 20/1283/322353
Vs:
> Rusty's TDB REPACK:
...
> Size of file/data: 1378013184/820552100
> Number of records: 89944
> Smallest/average/largest keys: 12/50/65
> Smallest/average/largest data: 43/9072/1289140
> Smallest/average/largest padding: 9/1847/322353
I have reproduced it, and am now tracking it down...
My test program is below for reference; I believe it very roughly
approximates your test at the TDB level.
Cheers,
Rusty.
PS. Oh, and dropping _PUBLIC_ was not deliberate. Will fix!
#include <ccan/tdb/tdb.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>
#include <err.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
	unsigned int i, j, users, groups;
	TDB_DATA idxkey, idxdata;
	TDB_DATA k, d, gk;
	char cmd[100];
	struct tdb_context *tdb;

	if (argc != 3) {
		printf("Usage: growtdb-bench <users> <groups>\n");
		exit(1);
	}
	users = atoi(argv[1]);
	groups = atoi(argv[2]);

	/* Used after each commit to print this process's memory usage. */
	sprintf(cmd, "cat /proc/%i/statm", getpid());

	tdb = tdb_open("/tmp/growtdb.tdb", 10000, TDB_DEFAULT,
		       O_RDWR|O_CREAT|O_TRUNC, 0600);
	if (!tdb)
		err(1, "opening /tmp/growtdb.tdb");

	idxkey.dptr = (unsigned char *)"User index";
	idxkey.dsize = strlen("User index");
	idxdata.dsize = 51;
	idxdata.dptr = calloc(idxdata.dsize, 1);

	/* Create users. */
	k.dsize = 48;
	k.dptr = calloc(k.dsize, 1);
	d.dsize = 64;
	d.dptr = calloc(d.dsize, 1);

	tdb_transaction_start(tdb);
	for (i = 0; i < users; i++) {
		memcpy(k.dptr, &i, sizeof(i));
		if (tdb_store(tdb, k, d, TDB_INSERT) != 0)
			errx(1, "tdb insert failed: %s", tdb_errorstr(tdb));

		/* This simulates a growing index record. */
		if (tdb_append(tdb, idxkey, idxdata) != 0)
			errx(1, "tdb append failed: %s", tdb_errorstr(tdb));
	}
	if (tdb_transaction_commit(tdb) != 0)
		errx(1, "tdb commit1 failed: %s", tdb_errorstr(tdb));
	system(cmd);

	/* Now put them all in groups: add 32 bytes to each record for
	 * a group. */
	gk.dsize = 48;
	gk.dptr = calloc(gk.dsize, 1);
	/* Last byte set so group keys never collide with user keys. */
	gk.dptr[gk.dsize-1] = 1;
	d.dsize = 32;

	for (i = 0; i < groups; i++) {
		tdb_transaction_start(tdb);

		/* Create the "group". */
		memcpy(gk.dptr, &i, sizeof(i));
		if (tdb_store(tdb, gk, d, TDB_INSERT) != 0)
			errx(1, "tdb insert failed: %s", tdb_errorstr(tdb));

		/* Now populate it. */
		for (j = 0; j < users; j++) {
			/* Append to the user. */
			memcpy(k.dptr, &j, sizeof(j));
			if (tdb_append(tdb, k, d) != 0)
				errx(1, "tdb append failed: %s",
				     tdb_errorstr(tdb));

			/* Append to the group. */
			if (tdb_append(tdb, gk, d) != 0)
				errx(1, "tdb append failed: %s",
				     tdb_errorstr(tdb));
		}
		if (tdb_transaction_commit(tdb) != 0)
			errx(1, "tdb commit2 failed: %s", tdb_errorstr(tdb));
		system(cmd);
	}
	return 0;
}