[Beowulf] Big storage
Jeffrey B. Layton laytonjb at charter.net
Tue Aug 28 10:34:55 PDT 2007
Peter St. John wrote:
> Jeff,
> I was musing about the (to me, novel) idea of multi RAID on one box
> (probably the head node of a cluster or subcluster), by analogy with
> CPU cache levels; RAID0 with a few 10K rpm drives for speed, RAID6
> with many 7200 drives for size and reliability; then I'd be able to go
> fast until a failure, then rollback to a checkpoint served by the
> RAID6. This is just a gedankenexperiment.
> I don't have any particular configuration in mind, I don't have a
> cluster and it will probably be a small number of superannuated Alpha
> boxes if I do it at all this summer.
> Peter

Well, it's an interesting experiment anyway. Here are my thoughts on
file systems for small systems:

- I would think of 2, maybe 3 file systems: /home, /scratch, and maybe
  /data.
- I would use PVFS2 for /scratch because you could get faster than NFS
  speed (depends on network, number of servers, etc.). Otherwise I would
  probably use RAID-10 for /scratch. It gets you some speed from the
  RAID-0 part, and if a disk fails it will at least continue to
  operate. Yes, it may be expensive in terms of the number of disks (at
  least 4 disks, but only the capacity of 2 of them). If money is a
  constraint, then I would have a single disk or RAID-1 for /scratch.
  Another alternative is to just let users run out of /home.
- For /home I would use something like RAID-6, or if I had enough money
  RAID-61. For RAID-6 you need at least 4 disks, so for RAID-61 you need
  at least 8 disks (could get a little pricey). RAID-6 is also a bit
  slower than RAID-5, particularly for writes. So if speed is an issue
  (or cost), I would go with RAID-5 or RAID-51. RAID-5 requires at least
  3 disks, so RAID-51 requires at least 6 disks. RAID-51 is more
  reliable than RAID-5, but it still may be pricey.
- If speed is more important, you could try RAID-60 for /home. It gives
  you some redundancy, but you also get RAID-0 to gain some performance
  back.
- Finally, I think of /data as a directory structure where you park data
  after you are finished with it. So, I tend to think of capacity and
  reliability instead of performance. This could mean something like
  RAID-51 or RAID-61 if you like.
- BTW, if you want good performance on the file system itself, I would
  recommend XFS. I've seen really good throughput results with it (but I
  don't have any personal experience).

Ultimately, however you choose to configure your file systems, be sure
to think about the purpose of each file system. What is stored in the
file system and how valuable is it? Can I tolerate the file system being
down for some period of time? Can I restore the file system from backup?
Did I make a backup? :)

Then once you have an idea of what you want to do, how are you going to
get the file system to the compute nodes? Are you going to use NFS?
PVFS2? Lustre? Something else? Once you decide this, you can make some
estimates of the performance the file system is likely to see. For
example, with NFS you can estimate the throughput of NFS. Once you know
this estimate, you can then determine how much IO you need from the
disks to balance the system. For example, if you think the throughput of
NFS is X, then I would make sure I have X throughput on my file system.
Otherwise the file system can become a bottleneck. If you have too much
throughput on the storage side, then NFS becomes the bottleneck. In my
opinion it's all about balance.

With that said, if you're building the system at home and the management
has given a fairly severe budget limit, I would use RAID-1 in the master
node with fairly large disks and leave it at that :) My management has
restricted me, so this is what I do.

Jeff
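
The disk counts and the balance argument in the message above can be
sketched in a few lines of Python. This is only an illustration, not
something from the original post: the names RAID_LEVELS, usable_disks,
and bottleneck are hypothetical, equal-sized disks are assumed, RAID-51
and RAID-61 are assumed to be a mirror of two identical RAID-5 or RAID-6
sets, and RAID-1 is assumed to mirror every disk in the set.

```python
# Minimum disk count and usable capacity (in units of one disk) for the
# RAID levels mentioned in the post. Assumes equal-sized disks.
RAID_LEVELS = {
    # name: (minimum disks, usable capacity as a function of disk count n)
    "RAID-1":  (2, lambda n: 1),           # every disk holds the same data
    "RAID-5":  (3, lambda n: n - 1),       # one disk's worth of parity
    "RAID-6":  (4, lambda n: n - 2),       # two disks' worth of parity
    "RAID-10": (4, lambda n: n // 2),      # stripe across mirrored pairs
    "RAID-51": (6, lambda n: n // 2 - 1),  # mirror of two RAID-5 sets
    "RAID-61": (8, lambda n: n // 2 - 2),  # mirror of two RAID-6 sets
}

# Levels built from two equal halves need an even number of disks.
NEEDS_EVEN = {"RAID-10", "RAID-51", "RAID-61"}


def usable_disks(level, n_disks):
    """Return usable capacity, counted in whole disks, for n_disks."""
    min_disks, usable = RAID_LEVELS[level]
    if n_disks < min_disks:
        raise ValueError(f"{level} needs at least {min_disks} disks")
    if level in NEEDS_EVEN and n_disks % 2:
        raise ValueError(f"{level} needs an even number of disks")
    return usable(n_disks)


def bottleneck(network_mb_s, storage_mb_s):
    """Say which side limits throughput to the compute nodes.

    network_mb_s is the estimated throughput of NFS (or whatever serves
    the file system); storage_mb_s is what the disks can sustain.
    """
    if storage_mb_s < network_mb_s:
        return "storage"
    if storage_mb_s > network_mb_s:
        return "network"
    return "balanced"


if __name__ == "__main__":
    for level, (min_disks, _) in RAID_LEVELS.items():
        print(level, "min disks:", min_disks,
              "usable at minimum:", usable_disks(level, min_disks))
    # Hypothetical numbers, chosen only to show both cases.
    print(bottleneck(network_mb_s=100, storage_mb_s=60))
    print(bottleneck(network_mb_s=100, storage_mb_s=200))
```

With the minimum disk counts from the post, RAID-10 on 4 disks and
RAID-61 on 8 disks both give the capacity of 2 disks, which is why the
mirrored variants get pricey quickly.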
