[clug] aggregating disks across multiple machines
Daniel Pittman
daniel at rimspace.net
Wed Oct 28 03:28:55 MDT 2009
Paul Wayper <paulway at mabula.net> writes:
> On 26/10/09 17:49, Daniel Pittman wrote:
>> Michael James<michael at james.st> writes:
>>
>>> I've been asked to recommend a setup for a group of high end workstations.
>>> They each have dual 4 core processors, 32 Gig of ram and 2 x 1.5 TB disks.
>>> Nice.
>>>
>>> At present each machine has a separate 1.5TB /home and /data partition.
>>> Bad:
>>> no redundancy, data will be lost from a single disk failure.
>>> files will be copied around as jobs dictate => confusion and waste.
>>> when partitions start to fill up, files will get put where they fit
>>> not where they should go => even greater confusion.
>>
>> Mmmm. Some of that sounds like a user training and control issue, not a disk
>> layout issue, to me. Assuming y'all do provide a central server:
>
> I think Michael was talking about using both machines as some kind of
> distributed storage, rather than a 'central server'. I want to find out about
> this too. The key problem is that a lot of the cluster storage that works
> like this assumes that each machine is accessing the same backend store. This
> is convenient for those that have infiniband or fiberchannel cards lying
> around and SAN units sitting in their cupboards, but for those of us with just
> standard machines I haven't found any obvious candidates.
>
> Anyone seen something that makes a bunch of disks spread across multiple
> machines act like a big communal block device?

Yeah: GlusterFS. It does exactly this, and is almost certainly what you
want. The alternatives tend to look like Hadoop: a dedicated storage
layer for a data processing system, not a general-purpose filesystem.

You would probably want the unify translator[1], and perhaps the BDB-backed
store[2], for this; perhaps AFR[3] if you really felt enthused, but it is
still not quite where I would like a replicated storage device to be.

Apparently, though, the latest release hides all that behind a sane
interface. Go, GlusterFS developers.
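
If the difference between unify and AFR is not obvious, here is a toy model
in Python. It is only a sketch: Brick, Unify and Mirror are made-up names, not
GlusterFS APIs, and placing files by hashing the path is an assumption made
for the example, not a description of how GlusterFS actually places files.

```python
# Toy model only: these classes are hypothetical, not GlusterFS APIs.
import zlib


class Brick:
    """One machine's local disk, modelled as a dict of path -> bytes."""

    def __init__(self, name):
        self.name = name
        self.files = {}


class Unify:
    """Single namespace: each file lives on exactly one brick."""

    def __init__(self, bricks):
        self.bricks = bricks

    def _pick(self, path):
        # Stable placement by hashing the path (an assumption for this sketch).
        return self.bricks[zlib.crc32(path.encode()) % len(self.bricks)]

    def write(self, path, data):
        self._pick(path).files[path] = data

    def read(self, path):
        return self._pick(path).files[path]

    def listing(self):
        # Clients see one merged tree, whichever brick holds each file.
        return sorted(p for b in self.bricks for p in b.files)


class Mirror:
    """AFR-style mirroring: every file is written to every brick."""

    def __init__(self, bricks):
        self.bricks = bricks

    def write(self, path, data):
        for brick in self.bricks:
            brick.files[path] = data

    def read(self, path):
        # Any surviving copy will do.
        for brick in self.bricks:
            if path in brick.files:
                return brick.files[path]
        raise FileNotFoundError(path)
```

Unify gives you the sum of the disks but loses files when a brick dies;
Mirror survives a dead brick at the cost of storing everything once per brick.
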
Daniel
Footnotes:
[1] Single namespace over multiple machines.
[2] Stores small files in a BDB spool, excellent for many small files, still
looks like a POSIX filesystem to the client.
[3] Mirroring, basically.
--
✣ Daniel Pittman ✉ daniel at rimspace.net ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons
Looking for work? Love Perl? In Melbourne, Australia? We are hiring.