[clug] cachefs question for meeting tonight

Thu Nov 27 11:08:55 EST 2003

I'd like to put a question on notice for the meeting tonight:

How can something akin to Sun's cachefs be kludged in Linux?

Kludging is optional but the solution needs to provide
 a way of using local disk space as a cache of a network file system.

The CSIRO Bioinformatics Facility has two 400 Gig filesystems on a server
 that need to be available to 66 cluster nodes.  Simply NFS exporting
 from one server to so many nodes leaves the cluster NFS bound.

Nodes only need read-only access and files change slowly (less often than weekly).
Jobs come in big batches, all based on the same few files.

Each node has 27 Gig of free space, which is heaps for a working set;
 I just don't want to have to groom what is in this working set.
Nor do I want to have to prepend an rsync to each batch.

So it's calling for a cache, which means an incomplete local FS
 with on-demand loading and a simple flushing algorithm.
Flushing outdated copies could be done by checking against the master 
 but could be as crude as broadcasting an "expire all cached" command.
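
To make the idea concrete, here is a rough user-space sketch in Python of
the behaviour I am after: on-demand loading, a freshness check against
the master, and a crude "expire all cached" flush.  It is only an
illustration, not an existing tool or Sun's cachefs: the class name
FileCache, its methods and the two root directories are all made up for
this example, and it does not try to cap the cache at the free space on
a node.

```python
import os
import shutil


class FileCache:
    """Hypothetical read-only, on-demand cache of a master tree on local disk."""

    def __init__(self, master_root, cache_root):
        # master_root: e.g. the NFS mount of the server (assumed path)
        # cache_root:  a directory on the node's local disk (assumed path)
        self.master_root = master_root
        self.cache_root = cache_root

    def _paths(self, relpath):
        return (os.path.join(self.master_root, relpath),
                os.path.join(self.cache_root, relpath))

    def is_fresh(self, relpath):
        """Check the cached copy against the master by size and mtime."""
        master, local = self._paths(relpath)
        if not os.path.exists(local):
            return False
        m, l = os.stat(master), os.stat(local)
        return m.st_size == l.st_size and int(m.st_mtime) == int(l.st_mtime)

    def fetch(self, relpath):
        """Return a local path for relpath, copying from the master on demand."""
        master, local = self._paths(relpath)
        if not self.is_fresh(relpath):
            os.makedirs(os.path.dirname(local), exist_ok=True)
            tmp = local + ".tmp"
            shutil.copy2(master, tmp)   # copy2 also preserves the mtime
            os.replace(tmp, local)      # rename into place in one step
        return local

    def expire_all(self):
        """The crude flush: throw away every cached copy."""
        shutil.rmtree(self.cache_root, ignore_errors=True)
        os.makedirs(self.cache_root, exist_ok=True)
```

A job would then open cache.fetch("some/relative/path") instead of the
NFS path.  The catch is that every job has to go through this wrapper,
which is exactly why I would prefer something at the filesystem level.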

It would be nice if it could export an existing trusted filesystem
 like ext3 or reiser rather than being its own FS like Coda or AFS.
(They are ugly, using ACLs rather than Unix file permissions,
 and can't handle the size I need.)

But I can't find cachefs or anything like it for Linux.  ||8^(
 Tridge, anyone: what's the best way forward?

michaelj

Alternate Solution:
Our experience suggests that an NFS server can easily feed 6 clients,
 so an intermediate solution would use NFS cross mounting
 to group the storage of 6 nodes.  Having 150 Gig
 of "almost local" space would meet the present need
 but is not as elegant as 800 Gig with cache.
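
The arithmetic behind the grouping can be checked with a few lines of
Python.  This is just a sketch of the numbers above: the function name
plan_groups is made up, nodes are numbered 0 to 65 for convenience, and
the pooled figure is raw free space before any filesystem overhead.

```python
def plan_groups(node_count, group_size, free_gig_per_node):
    """Split nodes into cross-mounting groups and total each group's free space."""
    nodes = list(range(node_count))
    groups = [nodes[i:i + group_size]
              for i in range(0, node_count, group_size)]
    # Pair each group with the raw space it pools across its members.
    return [(group, len(group) * free_gig_per_node) for group in groups]


plan = plan_groups(node_count=66, group_size=6, free_gig_per_node=27)
print(len(plan), "groups,", plan[0][1], "Gig raw per group")
# prints: 11 groups, 162 Gig raw per group
```

So 66 nodes fall neatly into 11 groups of 6, and 162 Gig raw per group
is in the same ballpark as the 150 Gig quoted above.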

-- 
Michael James                         michael.james at csiro.au
System Administrator                    voice:  02 6246 5040
CSIRO Bioinformatics Facility             fax:  02 6246 5166