NTDB progress!
Rusty Russell
rusty at rustcorp.com.au
Mon Jun 4 05:41:34 MDT 2012
Hi all,
Sorry this has been delayed: two things happened. Firstly,
other duties involved me going to Hong Kong for a week. Secondly,
porting revealed an unacceptable slowdown for smaller databases going
from tdb to tdb2, so after much benchmarking, the format was simplified
to be closer to the original tdb. See benchmarks below taken from that
commit message; we still pay a slight penalty for 64 bit.
See my ntdb-wip head:
https://git.samba.org/rusty/samba.git/?p=rusty/samba.git;a=shortlog;h=refs/heads/ntdb-wip
So far:
All of source4/ is converted to ntdb, as is ldb (it handles the
switch internally). I've written a dbwrap_open_local() which switches
between the ntdb and tdb backends based on the 'use old tdb = yes'
configuration option for dbwrap users. If this isn't set, I plan to use
the tdb backend if a tdb file is there, otherwise use ntdb, but I
haven't implemented that.
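To make the intended selection rule concrete, here is a minimal sketch of the decision only. It is not the dbwrap_open_local() code: the enum and function names are made up for illustration, and treating an explicit "no" as "always ntdb" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; none of these exist in Samba. */
enum db_backend { BACKEND_TDB, BACKEND_NTDB };

/* State of the 'use old tdb' option as read from the configuration. */
enum old_tdb_opt { OLD_TDB_UNSET, OLD_TDB_NO, OLD_TDB_YES };

static enum db_backend choose_backend(enum old_tdb_opt opt, bool tdb_file_exists)
{
    switch (opt) {
    case OLD_TDB_YES:
        return BACKEND_TDB;
    case OLD_TDB_NO:
        /* Assumption: an explicit "no" always selects ntdb. */
        return BACKEND_NTDB;
    case OLD_TDB_UNSET:
        break;
    }
    /* Planned fallback: keep using an existing tdb file, otherwise ntdb. */
    return tdb_file_exists ? BACKEND_TDB : BACKEND_NTDB;
}
```

Passing the "does a tdb file exist" answer in as a flag keeps the rule itself separate from the filesystem check.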
The general rule of conversions has to be to rename databases to
".ntdb", so it's absolutely clear. The dbwrap_open_ntdb() will change
.tdb names to .ntdb names for the moment, though dbwrap_open_tdb() will
do the reverse mapping, so you can use either method (not yet
implemented).
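The name mapping amounts to swapping a filename suffix. A rough sketch follows, using a hypothetical helper (map_db_suffix() is not a Samba function, and the real code may well handle names differently):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Samba: if `name` ends in `from`,
 * write the name with that suffix replaced by `to` into `out`.
 * Otherwise copy `name` unchanged. Returns false if `out` is too small. */
static bool map_db_suffix(const char *name, const char *from, const char *to,
                          char *out, size_t outlen)
{
    size_t nlen = strlen(name), flen = strlen(from);
    int ret;

    if (nlen >= flen && strcmp(name + nlen - flen, from) == 0) {
        /* Keep everything before the old suffix, then append the new one. */
        ret = snprintf(out, outlen, "%.*s%s", (int)(nlen - flen), name, to);
    } else {
        ret = snprintf(out, outlen, "%s", name);
    }
    return ret >= 0 && (size_t)ret < outlen;
}
```

Calling it with (".tdb", ".ntdb") gives the forward mapping and with (".ntdb", ".tdb") the reverse one; a name with neither suffix passes through untouched.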
Everything not using dbwrap is being converted; CLEAR_IF_FIRST
or INTERNAL databases are fairly non-controversial. If something else
should not be converted, feel free to change it to use dbwrap.
Note that NTDB_DATA/struct ntdb_data is a synonym for TDB_DATA/struct
TDB_DATA if tdb.h is included before ntdb.h: without this, compatibility
becomes a nightmare, as these are used all over Samba.
To come:
There's a bit more source3 to convert, then lots of testing and
making sure the s3->s4 upgrade scripts work well. I'll be working on
this all this week.
BTW, here are the benchmarks which made me rework the NTDB hash code:
                      Insert  Re-ins   Fetch    Size    dbspeed
                      (nsec)  (nsec)  (nsec)    (Kb)  (ops/sec)
TDB (10000 hashsize):
  100 records:          3882    3320    1609      53     203204
  1000 records:         3651    3281    1571     115     218021
  10000 records:        3404    3326    1595     880     202874
  100000 records:       4317    3825    2097    8262     126811
  1000000 records:     11568   11578    9320   77005      25046
TDB2 (1024 hashsize, expandable):
  100 records:          3867    3329    1699      17     187100
  1000 records:         4040    3249    1639     154     186255
  10000 records:        4143    3300    1695    1226     185110
  100000 records:       4481    3425    1800   17848     163483
  1000000 records:      4055    3534    1878  106386     160774
NTDB (8192 hashsize):
  100 records:          4259    3376    1692      82     190852
  1000 records:         3640    3275    1566     130     195106
  10000 records:        4337    3438    1614     773     188362
  100000 records:       4750    5165    1746    9001     169197
  1000000 records:      4897    5180    2341   83838     121901
Analysis:
1) TDB wins on first insert on small databases, beating TDB2 by
~15%, NTDB by ~10% on dbspeed.
2) TDB starts to lose when hash chains get 10 long (fetch 10% slower
than TDB2/NTDB).
3) TDB does horribly when hash chains get 100 long (fetch 4x slower
than NTDB, 5x slower than TDB2, insert about 2-3x slower).
4) TDB2 databases are 40% larger than TDB1. NTDB is about 15% larger
than TDB1.
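For anyone wanting to check the chain-length figures: with TDB's fixed 10000-bucket hash, the average chain is simply records / hashsize, so 100000 records give chains of about 10 and 1000000 records give about 100. A small sketch of that arithmetic, assuming keys spread evenly over the buckets (the function names are just for illustration):

```c
#include <assert.h>

/* Average hash chain length for a fixed-size hash table,
 * assuming keys are spread evenly over the buckets. */
static double avg_chain_len(double records, double hashsize)
{
    return records / hashsize;
}

/* How many times slower `slow_nsec` is than `fast_nsec`. */
static double slowdown(double slow_nsec, double fast_nsec)
{
    return slow_nsec / fast_nsec;
}
```

Plugging in the 1000000-record fetch times from the table, slowdown(9320, 2341) is roughly 4 (TDB vs NTDB) and slowdown(9320, 1878) is roughly 5 (TDB vs TDB2).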
Cheers,
Rusty.