tdb and file-per-hash-chain

James Peach jpeach at samba.org
Wed Feb 1 22:50:39 GMT 2006


On Wed, 2006-02-01 at 14:04 +0100, Volker Lendecke wrote:
> On Wed, Feb 01, 2006 at 11:03:43PM +1100, tridge at samba.org wrote:
> > I can see how the file per hash chain will be better for some
> > situations than a plain tdb on a clustered filesystem, but I don't
> > think it will approach anything like what is needed to make a
> > clustered smbd really scalable. As soon as you get multiple nodes
> > hitting the same hash bucket you will be back with poor performance
> > again. That might be rare with netbench, but with loads that have
> > shared files it will be quite nasty.
> 
> That's why I've done a more drastic one... A file per key.
> 
> In particular if the cluster fs is configured for large blocks, something that
> is needed for high write rates, the internal fs lock granularity _really_
> becomes a problem if we write 10 bytes here and there.
> 
> A file per key might be excessive, but nicely offloads all the lock management
> etc into the cluster fs, so you can point at someone else if your performance
> sucks :-)

Yes, a file per key is the best case for minimising token contention,
but it does have disadvantages. It's wasteful of disk space and you can
run into performance issues if you have to iterate over very large
directories. The latter problem can be solved by adding another layer of
subdirectories, but the former is more painful than it first appears. 
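
To make the subdirectory idea concrete, here is a minimal sketch of a
file-per-key layout with one extra directory level. It is written in
Python for brevity rather than Samba's C, and it is not tdb code: the
names key_path, store and fetch, the levels and width parameters, and
the choice of SHA-1 for naming are all assumptions for illustration.
Locking is deliberately left out, since the point of the scheme is
that the cluster fs handles it per file.

```python
import hashlib
import os
import tempfile


def key_path(root, key, levels=1, width=2):
    """Map a record key (bytes) to a file path under root.

    The hex digest of the key is the file name. Its leading
    width-character slices name `levels` nested subdirectories, so no
    single directory has to hold every record. (Hypothetical layout.)
    """
    digest = hashlib.sha1(key).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, digest)


def store(root, key, value, levels=1):
    """Write one record into its own file, creating parent dirs as needed."""
    path = key_path(root, key, levels)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(value)


def fetch(root, key, levels=1):
    """Return the record for key, or None if no file exists for it."""
    try:
        with open(key_path(root, key, levels), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    store(root, b"example-key", b"0123456789")
    print(key_path(root, b"example-key"))
    print(fetch(root, b"example-key"))
```

With width=2 each level fans out into at most 256 subdirectories, so
iteration cost per directory drops, but every 10-byte record still
occupies its own inode and at least one filesystem block unless the
filesystem can pack small files. That is the disk space cost
mentioned above.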

> In the clustered case we have the advantage that for the non-contended case the
> share mode entry creator is the one who has to do business with that, ideally
> during the file handle lifetime this does not need to be migrated anywhere
> else. This puts high stress on the directory and inode creation code, but this
> is the design space we have to explore and possibly adapt to different
> clustered file systems.

Yes, we will need a certain level of configurability to be able to play to
the strengths of different cluster types.

-- 
James Peach | jpeach at samba.org


