[Beowulf] Big storage

Jeffrey B. Layton laytonjb at charter.net
Tue Aug 28 10:34:55 PDT 2007


Peter St. John wrote:
> Jeff,
> I was musing about the (to me, novel) idea of multi RAID on one box 
> (probably the head node of a cluster or subcluster), by analogy with 
> CPU cache levels; RAID0 with a few 10K rpm drives for speed, RAID6 
> with many 7200 drives for size and reliability; then I'd be able to go 
> fast until a failure, then rollback to a checkpoint served by the 
> RAID6. This is just a gedankenexperiment.
> I don't have any particular configuration in mind, I don't have a 
> cluster and it will probably be a small number of superannuated Alpha 
> boxes if I do it at all this summer.
> Peter

Well, it's an interesting experiment anyway. Here are my thoughts on
file systems for small systems:

- I would think of 2, maybe 3 file systems: /home, /scratch, and maybe
  /data.
- I would use PVFS2 for /scratch because you could get faster than NFS
speed (depends on network, number of servers, etc.). Otherwise I would
probably use RAID-10 for /scratch. It gets you some speed from the
RAID-0 part, but if a disk fails, then it will at least continue to operate.
Yes, it may be expensive in terms of the number of disks (at least 4
disks, but only the capacity of 2 of them). If money is a constraint, then
I would have a single disk or RAID-1 for /scratch. Another alternative
is to just let users run out of /home.
- For /home I would use something like RAID-6, or if I had enough
money RAID-61. For RAID-6 you need at least 4 disks so for
RAID-61 you need at least 8 disks (could get a little pricey). RAID-6 is
also a bit slower than RAID-5, particularly for writes. So if speed is an
issue (or cost), I would go with RAID-5 or RAID-51. RAID-5 requires
at least 3 disks, so RAID-51 requires 6 disks. It's better than RAID-5
but it still may be pricey.
- If speed is more important, you could try RAID-60 for /home. It gives
you some redundancy, but you also get RAID-0 to gain some performance
back.
- Finally, I think of /data as a directory structure where you park data
after you are finished with it. So, I tend to think of capacity and
reliability instead of performance. This could mean something like
RAID-51 or RAID-61 if you like.
- BTW - if you want good performance on the file system itself, I would
recommend XFS. I've seen really good throughput results with it (but
I don't have any personal experience).
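
To make the disk counts above concrete, here is a rough sketch of the
arithmetic for the smallest version of each layout. The 0.5 TB disk size
and the helper name usable_tb are just assumptions for illustration, and
the nested levels (RAID-51, RAID-61) are taken to be a mirror of two
minimum-size RAID-5 or RAID-6 sets.

```python
# Smallest configuration for each RAID level mentioned above:
# level -> (minimum number of disks, disks' worth of usable capacity)
MIN_CONFIG = {
    "RAID-1":  (2, 1),   # two-way mirror
    "RAID-10": (4, 2),   # stripe across two mirrored pairs
    "RAID-5":  (3, 2),   # one disk's worth of parity
    "RAID-6":  (4, 2),   # two disks' worth of parity
    "RAID-51": (6, 2),   # mirror of two 3-disk RAID-5 sets (assumed layout)
    "RAID-61": (8, 2),   # mirror of two 4-disk RAID-6 sets (assumed layout)
}


def usable_tb(level, disk_tb):
    """Return (disks needed, usable capacity in TB) for the minimum layout."""
    min_disks, usable_disks = MIN_CONFIG[level]
    return min_disks, usable_disks * disk_tb


if __name__ == "__main__":
    disk_tb = 0.5  # hypothetical disk size, not a recommendation
    for level in MIN_CONFIG:
        n, cap = usable_tb(level, disk_tb)
        raw = n * disk_tb
        print(f"{level:8s} {n} disks -> {cap:.1f} TB usable ({cap / raw:.0%} of raw)")
```

The "% of raw" column is the quick way to see why RAID-61 gets pricey:
you buy 8 disks and get the capacity of 2.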

Ultimately, however you choose to configure your file systems, be sure
to think about the purpose of each file system. What is stored in the
file system and how valuable is it? Can I tolerate the file system being
down for some period of time? Can I restore the file system from backup?
Did I make a backup? :)

Then once you have an idea of what you want to do, how are you going
to get the file system to the compute nodes? Are you going to use NFS?
PVFS2? Lustre? Something else? Once you decide this, you can make
some estimates of the performance the file system is likely to see. For
example, with NFS you can estimate the throughput the server can deliver.

Once you know this estimate, you can then determine how much
IO you need from the disks to balance the system. For example, if you
think the throughput of NFS is X, then I would make sure I have about
X throughput from the underlying storage. Otherwise the storage can become
a bottleneck. If you have too much throughput on the storage side, then
NFS becomes the bottleneck. In my opinion it's all about
balance.
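
As a rough sketch of that balance check: compare your estimate for the
network file system against your estimate for the disks behind it and see
which side is slower. The function name, the 10% tolerance, and the MB/s
figures below are all made up for illustration; plug in your own estimates
or measurements.

```python
def bottleneck(nfs_mb_s, disk_mb_s, tolerance=0.1):
    """Name the slower side, or 'balanced' if the two are within tolerance.

    nfs_mb_s  -- estimated throughput of the network file system (MB/s)
    disk_mb_s -- estimated throughput of the disks behind it (MB/s)
    tolerance -- fraction of the larger value treated as "close enough"
    """
    if abs(nfs_mb_s - disk_mb_s) <= tolerance * max(nfs_mb_s, disk_mb_s):
        return "balanced"
    return "disks" if disk_mb_s < nfs_mb_s else "network file system"


if __name__ == "__main__":
    # Hypothetical estimates, in MB/s
    print(bottleneck(nfs_mb_s=100, disk_mb_s=60))
    print(bottleneck(nfs_mb_s=100, disk_mb_s=300))
    print(bottleneck(nfs_mb_s=100, disk_mb_s=95))
```

If the answer is "disks", more spindles or a faster RAID level may help; if
it is "network file system", extra disk speed is money you could spend
elsewhere.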

With that said, if you're building the system at home and the management
has given a fairly severe budget limit, I would use RAID-1 in the master
node with fairly large disks and leave it at that :)  My management has
restricted me, so this is what I do.

Jeff



