tdb and file-per-hash-chain

Wed Feb 1 22:59:44 GMT 2006

On Thu, 2006-02-02 at 09:24 +1100, tridge at samba.org wrote:
> Volker,
> 
>  > A file per key might be excessive, but nicely offloads all the lock management
>  > etc into the cluster fs, so you can point at someone else if your performance
>  > sucks :-)
> 
> I know you are partly joking, but you need to also think about whether
> a file per key would be fast for the simplest case of a single
> node. 
> 
> When designing a cluster solution you need to have a design that will
> run acceptably fast when run on a local filesystem without
> clustering. If it doesn't run acceptably fast in that case then you
> have little chance of getting good performance when you have it spread
> across multiple nodes. 
> 
> To test this, you could hack tdb to do fake file IOs per call, then
> run this on a local filesystem and benchmark the result. My guess is
> that it will be pretty slow. If you can show that it isn't then maybe
> the file per key method is worth trying in a cluster.
> 
> Also note that it still suffers from the same problem I mentioned to
> James. When you do have contention (multiple clients operating on the
> same file) then you will be hitting the same key on multiple nodes in
> the cluster. That will raise the same bad performance problems you are
> trying to avoid.

I guess my view is that a shared write workload will always be bad in a
clustered environment irrespective of how clever the Samba locking is.
Shared reads are a different story - they are something that must be fast.
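
For what it's worth, the single-node test suggested above could be
approximated without touching tdb at all. The following is only a rough
sketch, not tdb code: it compares one open/write/close per store (the
file-per-key pattern) against one pwrite() per store into a single file
that is already open. The function names, record size and key count are
all made up for illustration, and Python adds its own overhead, so only
the relative numbers mean anything.

```python
import os
import tempfile
import time


def bench_file_per_key(directory, keys, value):
    """Store each key in its own file: one open/write/close per store."""
    start = time.perf_counter()
    for key in keys:
        path = os.path.join(directory, key)
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
        try:
            os.write(fd, value)
        finally:
            os.close(fd)
    return time.perf_counter() - start


def bench_single_file(path, keys, value):
    """Store fixed-size records in one open file: one pwrite per store."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for slot, _key in enumerate(keys):
            os.pwrite(fd, value, slot * len(value))
    finally:
        os.close(fd)
    return time.perf_counter() - start


def run_benchmark(n_keys=1000, record_size=64):
    """Run both variants in a scratch directory and return the timings."""
    keys = ["key%06d" % i for i in range(n_keys)]
    value = b"x" * record_size
    with tempfile.TemporaryDirectory() as scratch:
        per_key = bench_file_per_key(scratch, keys, value)
        single = bench_single_file(
            os.path.join(scratch, "single.db"), keys, value)
    return per_key, single


if __name__ == "__main__":
    per_key, single = run_benchmark()
    print("file per key: %.4fs  single file: %.4fs" % (per_key, single))
```

Pointing the scratch directory at a cluster filesystem mount instead of
a local one would give the second half of the comparison.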

>  > In the clustered case we have the advantage that for the non-contended case the
>  > share mode entry creator is the one who has to do business with that, ideally
>  > during the file handle lifetime this does not need to be migrated anywhere
>  > else. This puts high stress on the directory and inode creation code, but this
>  > is the design space we have to explore and possibly adapt to different
>  > clustered file systems.
> 
> As I think I've mentioned to you before, I think that the clustered
> tdb approach is only good as a proof of concept. Once past that stage
> I think you must move the knowledge of clustering up a level, so that
> you have a clustered solution for share modes, rather than a clustered
> solution for tdb. The tdb model is pretty good for a local filesystem,
> but it was designed knowing what the relative costs of system calls
> are on a local filesystem. The design looks pretty bad when you change
> those relative costs, as happens for the clustered case.

The really nice aspect of doing this at the TDB level is that the
cluster takes care of any relocation, membership and failover tasks.
It's certainly not the most elegant solution, however.
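
To make the file-per-hash-chain variant from the subject line concrete,
here is a minimal sketch of what I have in mind. It is an illustration
only: the chain count, the CRC32 hash and the "chain-NNN" file naming are
assumptions made up for this example, not what tdb does. Each key hashes
to one of a fixed number of chain files, and a whole-file fcntl lock on
that file stands in for the per-chain lock, so the filesystem (clustered
or not) arbitrates between lockers.

```python
import fcntl
import os
import zlib
from contextlib import contextmanager

N_CHAINS = 128  # hypothetical number of hash chains


def chain_path(directory, key):
    """Map a key (bytes) to the file backing its hash chain."""
    chain = zlib.crc32(key) % N_CHAINS
    return os.path.join(directory, "chain-%03d" % chain)


@contextmanager
def locked_chain(directory, key):
    """Hold an exclusive fcntl lock on the chain file for this key."""
    fd = os.open(chain_path(directory, key), os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the chain is free
        yield fd
    finally:
        os.close(fd)  # closing the descriptor drops the fcntl lock
```

A caller would wrap each record update in "with locked_chain(dir, key)
as fd:" and do its reads and writes on fd. Keys on different chains never
touch the same file, which is the property that matters for the
non-contended case.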

-- 
James Peach | jpeach at samba.org