[Beowulf] HPC and SAN

Guy Coates gmpc at sanger.ac.uk
Sun Dec 19 03:37:26 PST 2004


> I remain skeptical on the value proposition for a SAN in a cluster.
> In short, you need to avoid single points of information flow within
> clusters.

True, and the grown-up cluster filesystems (GPFS, Lustre) let you avoid
those. You take N storage nodes with locally attached disk (IDE, SCSI or
FC), export them to the cluster over a LAN, and glue it all together with
a cluster filesystem.  The larger you make N, the faster your IO goes, as
the filesystem automatically stripes IO across all the storage nodes.
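
To illustrate the striping idea, here is a toy C sketch (not GPFS or
Lustre code; the stripe size and node count are made-up numbers):
consecutive blocks of a file land on different storage nodes, so the
aggregate bandwidth grows with N.

    /* Toy illustration of striping: which storage node holds the stripe
     * containing a given file offset?  Not real GPFS/Lustre code. */
    #include <stdio.h>

    #define STRIPE_SIZE (1024 * 1024)   /* 1 MB stripe unit (hypothetical) */
    #define N_STORAGE_NODES 8           /* hypothetical number of IO nodes */

    static int node_for_offset(long long offset)
    {
        return (int)((offset / STRIPE_SIZE) % N_STORAGE_NODES);
    }

    int main(void)
    {
        /* Walk through the first 8 stripes and show where each one lives. */
        for (long long off = 0; off < 8LL * STRIPE_SIZE; off += STRIPE_SIZE)
            printf("offset %lld MB -> storage node %d\n",
                   off / (1024 * 1024), node_for_offset(off));
        return 0;
    }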

The speed of the individual disks attached to your storage nodes doesn't
actually matter too much, so long as you have enough of them. On our
clusters, the limiting factor for single-client GPFS access is how fast
the client can drive its gigabit card, and the limiting factor for
multiple clients is how much non-blocking LAN bandwidth we can put
between the storage nodes and the clients.
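
To put rough, illustrative numbers on that: a gigabit card tops out at
about 100-110 MB/s after protocol overhead, so a single client will never
see much more than ~100 MB/s no matter how many storage nodes sit behind
it, whereas 8 storage nodes with a gigabit link each can in principle
serve ~800 MB/s in aggregate to enough concurrent clients, provided the
switch fabric between them is non-blocking.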

The only time SAN-attached storage helps is when a storage node fails, as
you then have redundant paths between the storage nodes and the disks.
(You can set up redundant IO nodes even without a SAN.) Whether that
matters to you depends on what QoS you are trying to maintain.

The other big win is that we can also achieve these IO rates under
production conditions: users run unmodified binaries and code and get the
benefit of massive IO without having to rewrite their apps against
specific APIs such as MPI-IO or PVFS.
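
For example, plain POSIX IO like the sketch below gets the striped,
parallel IO underneath with no source changes at all (the mount path is
hypothetical):

    /* Ordinary POSIX IO; on a GPFS or Lustre mount the filesystem does
     * the striping, no MPI-IO or PVFS-specific calls needed. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "plain POSIX IO, no special API\n";
        int fd = open("/gpfs/scratch/example.dat",   /* hypothetical path */
                      O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return 1;
        write(fd, msg, strlen(msg));
        close(fd);
        return 0;
    }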

Cheers,

Guy Coates

-- 
Dr. Guy Coates,  Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199