TDB: using the jenkins hash for non-persistent tdbs

Rusty Russell rusty at rustcorp.com.au
Mon Sep 20 20:21:16 MDT 2010


On Mon, 20 Sep 2010 07:11:08 pm Stefan (metze) Metzmacher wrote:
> Hi Rusty,
> 
> > On Fri, 17 Sep 2010 11:55:02 pm Stefan (metze) Metzmacher wrote:
> >> If the jenkins hash is used, we set the rwlocks field in the tdb header
> >> to a magic value.
> >> Current tdb versions expect this to be 0 and fail the open.
> > 
> > Unfortunately, this is an ABI break.  Someone using TDB and a non-default
> > hash won't be able to open their old TDBs any more.
> 
> I the caller specifies a hash function, we still use that one.
> So I don't see how this breaks existing users.

Sorry, I actually read your new patch now.  I really like that you put
the jenkins hash in the tdb code; it's a good idea to encourage it in all
tdb-using projects.  I especially like that it auto-detects; a real usability
win for new code!

Minor suggestions and one problem:

1) Can we make this more generic?  Perhaps move tdb_jenkins_hash to hash.c
   and expose it in the header so the user can pass it to tdb_open_ex.

2) We can't do this (in tdb_new_database()):

        } else {
-               tdb->hash_fn = default_tdb_hash;
-               hash_alg = "default";
+               if (tdb_flags & TDB_CLEAR_IF_FIRST) {
+                       tdb->hash_fn = tdb_jenkins_hash;
+                       hash_alg = "jenkins";
+               } else {
+                       tdb->hash_fn = default_tdb_hash;
+                       hash_alg = "default";
+               }
        }

  A program relying on sharing data can no longer upgrade tdb if old code will
  be reading the tdb.  We could do it for internal TDBs if we want.

3) I think we should add a new flag TDB_INCOMPATIBLE_HASHING, which puts the
   example hash into the rwlocks field.  That way the user has to deliberately
   choose to break old code for safety.  Of course, the new code would insist
   that the rwlock field be 0, *or* match the example hash field.
   In fact, this could change the default hash to jenkins automatically, like
   your code does.

4) We need to use tdb_jenkins_hashlittle() for endian compatibility (or,
   we could switch depending on tdb endianness, which maximizes speed in
   return for a little code bloat).

> > The only way we could do this is as a separate flag to tdb_open, eg.
> > TDB_INCOMPATIBLE_HASH or something.  I'm not sure that's worthwhile though?
> 
> I'd like to avoid that.

Me too.  But I really don't want to break some random TDB-relying program :(

Ideally, we would have put this check in when we added user-defined hashes
originally, but we didn't really care about TDB ABI/API back in those days...

Thanks,
Rusty.
PS.  I know this is frustrating, but thanks for spending the time on this!!


More information about the samba-technical mailing list