tdb and file-per-hash-chain

Wed Feb 1 22:45:21 GMT 2006

On Wed, 2006-02-01 at 23:03 +1100, tridge at samba.org wrote:
> James,
> 
> I can see how the file per hash chain will be better for some
> situations than a plain tdb on a clustered filesystem, but I don't
> think it will approach anything like what is needed to make a
> clustered smbd really scalable. As soon as you get multiple nodes
> hitting the same hash bucket you will be back with poor performance
> again. That might be rare with netbench, but with loads that have
> shared files it will be quite nasty.

That's true. However, I don't know whether this will be a problem in
practice. For most cluster technologies, multiple nodes accessing the
same files is a pessimal case because it requires the most cross-cluster
consistency checking. If you add some poor TDB scalability to this
already dubious workload, then I'm not sure that the situation really
gets that much worse.
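
To make the contention case concrete, here is a rough sketch of when two nodes end up on the same hash chain. It is illustrative only: the hash function, the chain count and the names chain_for and contended_chains are assumptions made up for this example, not tdb's real hashing or on-disk layout.

```python
# Illustrative only: not tdb's real hash function or file layout.
NUM_CHAINS = 128  # hypothetical number of hash chains (one file per chain)


def chain_for(key: bytes) -> int:
    """Map a key to a hash chain using a simple multiplicative hash."""
    h = 0
    for b in key:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h % NUM_CHAINS


def contended_chains(accesses):
    """Return the set of chains touched by more than one node.

    accesses is an iterable of (node, key) pairs. With one file per
    chain, these are the chains whose files need cross-node locking.
    """
    nodes_per_chain = {}
    for node, key in accesses:
        nodes_per_chain.setdefault(chain_for(key), set()).add(node)
    return {c for c, nodes in nodes_per_chain.items() if len(nodes) > 1}


# Two nodes working on the same key always share a chain.
shared = contended_chains([("node1", b"shared.doc"), ("node2", b"shared.doc")])

# Two nodes working on keys that hash to different chains never do.
private = contended_chains([("node1", b"a"), ("node2", b"b")])
```

In this model, a workload where nodes mostly touch their own files only contends when keys happen to collide, while a shared file forces every node onto the same chain no matter how many chains there are.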

I don't have any numbers to indicate whether this is a common workload.
Do you have any?

> I was talking to Brian Aker and Stewart Smith from MySQL at LCA'06
> about this problem last week, and they pointed me at something called
> 'ndb' which is a clustered database that apparently has a suitable
> API to build a key-value pair database much like tdb. It's used to
> build a clustered version of MySQL. Stewart told me the performance
> numbers are in the ballpark of what we need for Samba. He sent me the
> API docs, but I haven't had a chance to look through them yet.

Hmmm. I had a quick poke around mysql.com and it's not clear whether
this is easily separable from MySQL. I guess you could always use MySQL
cluster as a backend to the TDB API, but that would require running
both your SAN cluster and the MySQL cluster. There's a lot of scope here
for misunderstandings and poor integration between discrete clustering
technologies.

> Meanwhile, as Metze mentioned, it would certainly be worthwhile to have
> a look at the tdb work in Samba4. As part of the transactions work I
> abstracted out some of the IO routines, so a transaction can
> override the read/write functions. It isn't broken out by hash chain
> like you have done, but at least some of the abstraction is done.

Yes, you would need to abstract the entire TDB API to be able to change
the storage mechanism in a meaningful way.
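
As a minimal sketch of what abstracting at the API level (rather than at the read/write level) might look like: the front end only ever talks to a backend interface, so the storage could be a single file, a file per hash chain, or a clustered database. Every name here (Backend, MemoryBackend, Database) is hypothetical and only loosely modelled on tdb's fetch/store/delete operations; none of it is Samba code.

```python
# Hypothetical sketch: none of these class names come from Samba or tdb.
from abc import ABC, abstractmethod
from typing import Dict, Optional


class Backend(ABC):
    """Storage operations a key-value front end would need."""

    @abstractmethod
    def fetch(self, key: bytes) -> Optional[bytes]:
        """Return the value for key, or None if the key is absent."""

    @abstractmethod
    def store(self, key: bytes, value: bytes) -> None:
        """Insert or overwrite the value for key."""

    @abstractmethod
    def delete(self, key: bytes) -> bool:
        """Remove key; return True if it existed."""


class MemoryBackend(Backend):
    """Stand-in backend that keeps records in a dict."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def fetch(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def store(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: bytes) -> bool:
        return self._data.pop(key, None) is not None


class Database:
    """Front end that callers use; it never touches storage directly."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def fetch(self, key: bytes) -> Optional[bytes]:
        return self._backend.fetch(key)

    def store(self, key: bytes, value: bytes) -> None:
        self._backend.store(key, value)

    def delete(self, key: bytes) -> bool:
        return self._backend.delete(key)


# Swapping the storage mechanism means passing a different Backend.
db = Database(MemoryBackend())
db.store(b"lock/file1", b"node1")
owner = db.fetch(b"lock/file1")
```

The point of the sketch is only that callers depend on the three operations and not on how records are laid out, which is what would let a different storage mechanism be substituted.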

-- 
James Peach | jpeach at samba.org