the need for storage area networks [was: Shared diskspace between nodes]

Matthew O'Keefe okeefe at brule.borg.umn.edu
Sun Jan 13 11:08:45 PST 2002


Jon,

Others have provided good suggestions, but I think the ultimate
solution to your problem is a storage area network (SAN) between your
Beowulf nodes and a pool of shared storage devices.  This approach
allows efficient partitioning and sharing of storage between the
Beowulf nodes.

A cluster file system like GFS can be used to map a shared
file system (one that all nodes can mount directly) onto the 
shared storage devices.  This approach completely removes the
problem of trying to map your data evenly across many nodes
when the data needs on each node can grow or shrink in
unexpected ways.  It also allows you to manage one file system
instead of 40.
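
To make the point concrete, here is a rough sketch of the difference,
using the figures from the quoted message (40 nodes, roughly 15GB free
on each).  The function names and the example data set sizes are made up
for illustration, and the pooled case ignores file system overhead and
redundancy, so treat it as an idealized comparison rather than a sizing
tool.

```python
# Hypothetical illustration: per-node disks versus one shared pool.
# Node count and free space come from the quoted message; everything
# else (names, data set sizes) is assumed for the example.
NODES = 40
FREE_PER_NODE_GB = 15


def fits_on_separate_disks(datasets_gb, nodes=NODES, free_gb=FREE_PER_NODE_GB):
    """First-fit placement of whole data sets onto individual node disks."""
    free = [free_gb] * nodes
    for size in datasets_gb:
        for i, room in enumerate(free):
            if size <= room:
                free[i] -= size
                break
        else:
            # No single node has enough room for this data set.
            return False
    return True


def fits_in_shared_pool(datasets_gb, nodes=NODES, free_gb=FREE_PER_NODE_GB):
    """With one shared file system only the total capacity matters."""
    return sum(datasets_gb) <= nodes * free_gb


datasets = [20, 8, 8, 8]  # sizes in GB; the 20GB set exceeds any one node
print(fits_on_separate_disks(datasets))  # False
print(fits_in_shared_pool(datasets))     # True
```

A single data set that outgrows one node's disk cannot be placed
anywhere in the first case, even though most of the 600GB sits idle.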

Some may object that SANs are expensive, but that is changing.
IP-based SANs are now becoming available, and a cluster of
NFS servers with shared storage and a cluster file system can
also be used to share data across a Beowulf without the full
expense of a SAN.  For details see the white paper I wrote 
on "Accelerating Technical Computing..."  at the Sistina web site
(www.sistina.com).

When running complex parallel applications in production
on a Beowulf cluster (for example, Oracle Real Application Clusters), 
a storage area network and cluster file system greatly
simplify your life.  

Matt O'Keefe




On Mon, Jan 07, 2002 at 02:36:42PM -0500, Jon E. Mitchiner wrote:
> Greetings!
> 
> I presently run a 40-node cluster, Dual 1GHz with 20GB hard drive on each
> system.  This gives me roughly 15GB (safe estimate) after the OS, installed
> programs, some data, etc on each machine.  This gives me roughly 600GB of
> space that I am not currently utilizing on 40 nodes.
> 
> Right now, we are saving data on various nodes, and moving it around when
> space gets tight on a machine.  This is getting time consuming as some of us
> have to look on different nodes to find out where our data is currently
> residing.  I am considering saving all directory names in a database and
> then making a web-based GUI so it's easy to find the location of
> data directories, rather than looking for it (especially if someone moved my
> directory to another machine without letting me know).
> 
> I am curious if there is a program out there that might be able to utilize
> the space that we are not utilizing -- such as linking the file space
> between nodes so that way I can set up a "large" data partition sharable by
> all nodes.  Some redundancy would be nice.  I'm curious if there is a software
> solution (either GPL licensed, or commercial) to utilize the space better.
> 
> Optimally, it would be nice to see all "shared" drives as one large
> partition to be mounted to all nodes and all the data is handled by a daemon
> or something like that.
> 
> Does anyone have any ideas, suggestions, or programs that might be able to
> do something similar?
> 
> Thanks!
> 
> Regards,
> 
> Jon E. Mitchiner
> Minotaur Technologies
> http://www.minotaur.com
> AOL IM [http://www.aol.com/aim] MinotaurT
> 
> 
> 
> 


