tdb and file-per-hash-chain

Volker Lendecke Volker.Lendecke at SerNet.DE
Wed Feb 1 13:04:32 GMT 2006


On Wed, Feb 01, 2006 at 11:03:43PM +1100, tridge at samba.org wrote:
> I can see how the file per hash chain will be better for some
> situations than a plain tdb on a clustered filesystem, but I don't
> think it will approach anything like what is needed to make a
> clustered smbd really scalable. As soon as you get multiple nodes
> hitting the same hash bucket you will be back with poor performance
> again. That might be rare with netbench, but with loads that have
> shared files it will be quite nasty.

That's why I've done a more drastic one... A file per key.

In particular, if the cluster fs is configured for large blocks, something that
is needed for high write rates, the internal fs lock granularity _really_
becomes a problem if we write 10 bytes here and there.

A file per key might be excessive, but nicely offloads all the lock management
etc into the cluster fs, so you can point at someone else if your performance
sucks :-)
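
To make the file-per-key idea concrete, here is a minimal sketch in Python. It is not the actual code and not tdb's API: the class name FilePerKeyStore, its methods, and the hex-encoded file naming are all made up for illustration. It assumes non-empty keys short enough to fit in a file name, and a file system with working POSIX fcntl locks.

```python
import fcntl
import os


class FilePerKeyStore:
    """Hypothetical key-value store that keeps each record in its own file."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hex-encode the key so arbitrary bytes become a safe file name.
        # Assumption: keys are non-empty and the hex form fits NAME_MAX.
        return os.path.join(self.directory, key.hex())

    def store(self, key, value):
        # Creating the file is what stresses directory and inode creation.
        fd = os.open(self._path(key), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX)  # exclusive lock on this record only
            os.ftruncate(fd, 0)
            os.write(fd, value)
        finally:
            os.close(fd)  # closing the descriptor also drops the lock

    def fetch(self, key):
        try:
            fd = os.open(self._path(key), os.O_RDONLY)
        except FileNotFoundError:
            return None
        try:
            fcntl.lockf(fd, fcntl.LOCK_SH)  # shared lock, readers do not block each other
            return os.read(fd, os.fstat(fd).st_size)
        finally:
            os.close(fd)

    def delete(self, key):
        # Sketch only: does not handle a concurrent store() racing with unlink.
        try:
            os.unlink(self._path(key))
        except FileNotFoundError:
            pass
```

Two keys never share a file here, so two nodes only contend when they touch the very same record. All the locking is left to whatever the file system does underneath fcntl.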

In the clustered case we have the advantage that, in the non-contended case, the
creator of a share mode entry is the only one who has to deal with it. Ideally
the entry does not need to be migrated anywhere else during the lifetime of the
file handle. This puts high stress on the directory and inode creation code, but
this is the design space we have to explore and possibly adapt to different
clustered file systems.

> I was talking to Brian Aker and Stewart Smith from MySQL at LCA'06
> about this problem last week, and they pointed me at something called
> 'ndb' which is a clustered database that apparently has a suitable
> API to build a key-value pair database much like tdb. It's used to
> build a clustered version of MySQL. Stewart told me the performance
> numbers are in the ballpark of what we need for Samba. He sent me the
> API docs, but I haven't had a chance to look through them yet.

Except it's C++, but this might be something we can live with for a clustered
file server.

Volker