[Beowulf] Project Planning: Storage, Network, and Redundancy Considerations

Mon Mar 19 12:49:11 PDT 2007

Brian,

(Threads like this can get confusing: which Brian? :)

Brian D. Ropers-Huilman wrote:
> Brian,
>
> There are usually three or four categories of storage:
>
> 1) /home - small, just enough to keep source files and compile code
> 2) /scratch/local - distributed disks within a cluster for local
> writing (think Gaussian)
> 3) /scratch/global - a high-performance (and higher cost) parallel
> file system accessible by all nodes
> 4) /archive - a very large pool of spinning disks which receives data
> from /scratch/global when a run (or set of consecutive runs) is
> "complete." The idea is to clear off the expensive parallel system for
> other run-time use, but that you still want to hold the data for some
> future need.
We have 1-3.  Our equivalent of 4 is our 1 (/home), but we make it the 
user's responsibility to move their data there.  I like your idea, though.
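
As a rough sketch of how the sweep in your item 4 might be automated 
(this is not something we run; the marker file name RUN_COMPLETE, the 
seven-day grace period, and the function name are all made up for 
illustration):

```python
import shutil
import time
from pathlib import Path


def sweep_to_archive(scratch_root, archive_root, marker="RUN_COMPLETE",
                     min_age_days=7, now=None):
    """Move finished run directories from scratch to archive.

    A run counts as finished when it contains a marker file (hypothetical
    convention) that is at least min_age_days old. Returns the list of
    destination paths that were created.
    """
    scratch_root = Path(scratch_root)
    archive_root = Path(archive_root)
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400

    moved = []
    for run_dir in sorted(p for p in scratch_root.iterdir() if p.is_dir()):
        flag = run_dir / marker
        if not flag.is_file():
            continue  # run still in progress, or never flagged as complete
        if flag.stat().st_mtime > cutoff:
            continue  # finished too recently; leave it for follow-up runs
        dest = archive_root / run_dir.name
        if dest.exists():
            continue  # never overwrite something already archived
        archive_root.mkdir(parents=True, exist_ok=True)
        shutil.move(str(run_dir), str(dest))
        moved.append(dest)
    return moved
```

Something like sweep_to_archive("/scratch/global", "/archive") from a 
nightly cron job would be the obvious way to use it, with the marker 
written by the job script when a run (or set of runs) finishes.
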
>
> I would keep your /home and /scratch/global separate.
I've thought about this, and keeping them separate makes sense on a 
couple of levels.  a) A lot of the data written to /scratch/global is 
fairly transient in nature.  Some results a user might keep; many others 
they discard.  If /home == /scratch/global, then chances are our backup 
tapes will be littered with data that nobody wants.  b) Separate 
volumes avoid a single point of failure.  However, I think there are 
some advantages if you can merge the two.  You only have one storage 
device to administer, and all of your efforts for fault tolerance, 
monitoring, and maintenance can be focused on that device.  When you're 
a one-man-cluster-army, handling the sysadmin work along with 
maintaining, testing, developing, and deploying codes, you learn to 
appreciate consolidation of this nature.  Sure, it may appear to be a 
single point of failure, but the plan also includes an offsite backup 
volume which can be VLAN'ed into the cluster's network.  If the local 
array dies, the offsite array can take its place (albeit with 
significantly reduced performance) until repairs can be made to the 
main array.  The offsite array should also be able to be physically 
moved (fairly quickly) to our datacenter as a drop-in replacement.
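
To make the failover idea a bit more concrete, here is a minimal sketch 
of picking whichever array is currently answering.  It assumes the 
arrays are exported over NFS (hence the probe on TCP port 2049), which 
is only an assumption for illustration, and the host names in the usage 
note are hypothetical:

```python
import socket


def is_reachable(host, port=2049, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Port 2049 is the standard NFS port; using NFS at all is an
    assumption made for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_storage_server(candidates, probe=is_reachable):
    """Return the first candidate host that answers the probe, else None.

    candidates is ordered by preference: local array first, offsite
    backup array second.
    """
    for host in candidates:
        if probe(host):
            return host
    return None
```

For example, pick_storage_server(["array-local", "array-offsite"]) 
returns the local array while it is up and the offsite one otherwise; 
remounting on the nodes would still have to be handled separately.
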
>
> The /scratch/global solution you pick will very much depend on how you
> want it connected to your clusters. By definition (of your cluster
> suite) you cannot have a system that relies on IB as not all of your
> systems have IB. This leaves GbE as the only global means of
> connection. If at all possible, I would dedicate a GbE interface on
> all nodes who access /scratch/global.
>
Yes, this is unfortunate.  Fortunately, though, very few problems 
running on the current system need disk access at the level provided by 
an IB-connected storage device.  It would be good to have later, but we 
can do without it for now.  I agree with putting storage traffic on a 
separate network as well; I've heard the same advice elsewhere.

Thanks for the advice!

Brian

-- 
--------------------------------------------------------
+ Brian R. Smith                                       +
+ HPC Systems Analyst & Programmer                     +
+ Research Computing, University of South Florida      +
+ 4202 E. Fowler Ave. LIB618                           +
+ Office Phone: 1 (813) 974-1467                       +
+ Mobile Phone: 1 (813) 230-3441                       +
+ Organization URL: http://rc.usf.edu                  +
--------------------------------------------------------