[Beowulf] Mature open source hierarchical storage management

Nifty Tom Mitchell niftyompi at niftyegg.com
Tue Oct 27 18:02:03 PDT 2009


On Fri, Oct 23, 2009 at 04:12:11PM +1100, Carl Thomas wrote:
> Date: Fri, 23 Oct 2009 16:12:11 +1100
>    We are currently in the midst of planning a major refresh of our existing
>    HPC cluster.

Carl,

Do add "PowerFile" to your research list.

    http://www.powerfile.com/

My back-of-the-email-envelope view is that what you are building should
have fast local cluster disks for binaries, libraries, swap, /scratch and
/tmp, plus a largish RAID-backed NFS filesystem with an archival back end.
Perhaps also a large staging RAID of slow spinning disks, in the middle or
off to the side.

There are multiple "delta equations" that
you need to evaluate.  I know I have missed some:

   - delta file change (GB/day)
   - performance delta at each layer
   - cost delta at each layer
   - management cost delta
   - operational cost delta
   - cost of compliance -- what the law requires, by method
   - cost of physical storage on and off site, including handling and shipping
   - cost of user training delta
   - cost of expansion delta
   - cost of necessary bandwidth, by layer
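
As a rough illustration of how a few of these deltas can be turned into
numbers, here is a minimal Python sketch.  The function names and every
figure in it are made up for the example; substitute your own measurements.
It assumes a constant daily change rate and decimal units (1 GB = 1000 MB).

```python
def days_until_full(capacity_gb, used_gb, delta_gb_per_day):
    """Days until a storage layer fills at a constant daily change rate."""
    if delta_gb_per_day <= 0:
        return float("inf")  # no growth, so the layer never fills
    return (capacity_gb - used_gb) / delta_gb_per_day


def required_bandwidth_mb_s(delta_gb_per_day, window_hours):
    """Sustained MB/s needed to move one day's delta within the window."""
    return delta_gb_per_day * 1000 / (window_hours * 3600)


def cost_delta_per_gb(upper_cost_per_gb, lower_cost_per_gb):
    """Saving per GB from keeping data on the lower (cheaper) layer."""
    return upper_cost_per_gb - lower_cost_per_gb


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(days_until_full(10000, 4000, 200))   # scratch layer headroom, days
    print(required_bandwidth_mb_s(360, 10))    # nightly migration, MB/s
    print(cost_delta_per_gb(2.0, 0.5))         # currency units per GB
```

Running the same arithmetic for each layer (scratch, NFS RAID, staging,
archive) shows which layer fills first and which link needs the bandwidth.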

Clusters are unique in that they have the potential
of hosting their own distributed RAID (Lustre, Gluster, ZFS),
and with a sufficient archival back end, life could be good.
So select systems that have room for a second disk.

Choice of filesystem can help too (see DMAPI and friends).

Have fun.
mitch


