[Beowulf] distributing storage amongst compute nodes
Mark Hahn hahn at mcmaster.ca
Mon Oct 22 08:17:37 PDT 2007
- Previous message: [Beowulf] distributing storage amongst compute nodes
- Next message: [Beowulf] distributing storage amongst compute nodes
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
>> [...] commodity disks are plenty reliable
>> and are not a significant source of uptime problems.
>
> C|N>K (i.e. coffee piped through nose into keyboard)

sorry!

> That's not quite a general truth. 8^)

I mean that in the experience of my organization, the mundane maxtor and seagate disks that we get with our mostly HP hardware are extremely reliable. surprisingly so - certainly we were expecting worse, based on the published studies.

we have ~20 clusters online totalling >8k cores. most nodes (2-4 cores/node) have 2 sata disks, which have had a very low failure rate (probably < 1% afr over 2-3 years of service).

in addition, we have four 70TB storage clusters built from arrays of 9+2 raids of commodity 250G sata disks, as well as a 200TB cluster (10+2x500G disks iirc). the failure rate of these disks has been quite low as well (I'm guessing actually lower than the in-node disks, even though the storage-cluster disks are much more heavily used.)

here's my handwaving explanation of this: in-node disks are hardly used, since they just hold the OS, and nodes spend most of their time running apps. disks in the storage clusters are more heavily used, but even for a large cluster, we simply do not generate enough load to keep them busy. (I'm not embarrassed by that - remember cheap disks sustain 50 MB/s these days, so if you have a 70 TB Lustre filesystem, you'd have to sustain > 10 GB/s to actually keep the disks busy. in other words, bigger storage is generally less active...)

>> but maybe it makes sense not to fight the tide of disturbingly cheap
>> and dense storage. even a normal 1U cluster node could often be configured
>> with several TB of local storage. the question is: how to make use of it?
>
> Some people are running dCache pools on their cluster nodes.

that's cool to know. how do users like it? performance comments?

thanks, mark hahn.
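
The back-of-envelope figure in the message (a 70 TB filesystem needing more than 10 GB/s to keep its disks busy) can be checked with a rough sketch. This is only an illustration under stated assumptions, not something from the original post: the function name and parameters are hypothetical, 70 TB is taken as usable capacity in decimal units, and every disk (parity included) is assumed to stream at the quoted 50 MB/s at the same time.

```python
import math

def streaming_bandwidth(usable_tb, disk_gb, data_per_raid, parity_per_raid, disk_mb_s):
    """Estimate the disk count and aggregate streaming rate of a storage cluster.

    Assumptions (not from the original post): decimal units (1 TB = 1000 GB),
    usable_tb counts data disks only, and all disks stream concurrently.
    """
    # Data disks needed to provide the usable capacity.
    data_disks = math.ceil(usable_tb * 1000 / disk_gb)
    # Whole RAID sets needed to hold that many data disks.
    raid_sets = math.ceil(data_disks / data_per_raid)
    # Each set carries its data disks plus its parity disks.
    total_disks = raid_sets * (data_per_raid + parity_per_raid)
    # Aggregate rate in GB/s if every disk sustains disk_mb_s.
    aggregate_gb_s = total_disks * disk_mb_s / 1000
    return total_disks, aggregate_gb_s

# The 70 TB cluster described in the message: 9+2 RAIDs of 250 GB disks at 50 MB/s.
disks, gb_s = streaming_bandwidth(70, 250, 9, 2, 50)
print(f"{disks} disks, about {gb_s:.1f} GB/s to keep them all busy")
```

Under these assumptions the estimate lands well above the 10 GB/s lower bound quoted in the message, which is consistent with the point that large commodity storage is rarely driven hard enough to saturate its disks.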
