Any comments on GFS as applies to a Beowulf?
Matthew O'Keefe okeefe at borg.umn.edu
Sat Feb 17 10:15:04 PST 2001
Hi,

We originally wrote GFS to solve a problem I had with the large-scale, parallel fluid dynamics and electromagnetics calculations I was working on. The problem was very simple: getting the simulation data off the parallel machine and over to the graphics machines so we could visualize and make sense of it all. Doing this across Ethernet and TCP/IP was not feasible, and since new shared storage networking technologies were appearing, having the graphics machine and supercomputer share the data gave us speed and efficiency (no data replication!).

There is a white paper on how GFS lets you develop Storage Clusters, and how GFS applies to supercomputing, at http://www.sistina.com/Pages/publications.html; check out the paper "Storage Clusters for Linux".

I think Beowulfs work very well for parallel applications (like Monte Carlo) that are not IO-intensive, of which there are many. However, IO is currently a weakness, in particular when one tries to migrate data off the Beowulf. GFS allows Beowulf nodes to share and pool disks to significantly improve IO scalability by increasing:

* extendibility: you can add more nodes and more disks to your Beowulf to increase its storage and computational capacity.

* availability: by decoupling storage devices from compute nodes, you don't lose access to storage when a compute node dies; in addition, you can add more compute nodes and storage while the Beowulf is running, and *on-line* resize both the volume manager and GFS, again while the Beowulf is running.

* manageability: GFS allows you to create a single pool of storage that is more efficient than server-centric storage, and that can be much more easily managed. In addition, in combination with server virtualization technologies like bproc from Scyld, load balancing across Beowulf nodes becomes much more efficient.

* affordability: GFS runs on Linux and PCs and is media-independent: you can use Fibre Channel, Myrinet, or whatever as your shared media for storage. You can use different kinds of storage devices and networking equipment from a variety of vendors to build low-cost GFS clusters.

* efficiency: GFS allows you to efficiently load-balance applications, consolidate storage, and quickly transfer data from your Beowulf cluster.

GFS is a 64-bit, production-ready, journaled cluster file system that allows fast recovery from node failures and supports large files, directories, and file systems. It is GPL'ed code, currently available on Linux 2.2, and will soon be available on 2.4 (< 2 weeks). Within 6 months, it will be integrated with a new cluster version of the Linux Logical Volume Manager to provide integrated file and volume cluster services.

GFS will change the way you compute and run your servers. It allows Linux to leapfrog nearly every other UNIX by providing a cluster file system that scales and makes Beowulfs and Linux HA clusters *manageable*. It is being used at leading NASA labs, web sites, and feature film shops. In 2001, it will be ported to FreeBSD as well.

So.... go get it! www.sistina.com/gfs/

Matt O'Keefe

On Thu, Feb 08, 2001 at 08:41:26AM -0800, JParker at coinstar.com wrote:
> G'Day !
>
> http://news.linuxprogramming.com/news_story.php3?ltsn=2001-02-08-002-05-CD
>
> cheers,
> Jim Parker
>
> Sailboat racing is not a matter of life and death .... It is far more
> important than that !!!
