Questions and Sanity Check
Ray Jones rjones at merl.com
Mon Feb 26 14:48:20 PST 2001
- Previous message: Q: Node selection with mpirun under Scyld
- Next message: Questions and Sanity Check
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I'm involved in a group putting together a proposal for building a Beowulf for our research lab (www.merl.com). We feel like we've achieved a reasonable level of confidence in our design, but I wanted to run it past the list for a final sanity check, as well as tack on a few questions that we still need to answer.

Hardware configuration questions:

Our proposed system:
  128 nodes - 64 RS-1200 servers from Racksaver
  Each node:
    1 GHz AMD Thunderbird
    10 GB IDE hard drive
    512 MB CS2 memory
    floppy
    Intel Etherexpress 8460 NIC
  Switch: D-Link DES-6000 w/ 8 6003 16-port blades
  OS: Scyld (most likely)

I realize that it's an ill-formed question, but does anyone see anything horribly wrong with the above?

Fuzzier, cost of ownership questions:

We have about 10 researchers that would be interested in using the system. They fall almost exclusively into two categories:
- Matlab users
- Users with embarrassingly parallel problems (tree search, graphics rendering, ...)

For the Matlab users, we plan to use Matlab*p (aka MITMATLAB, aka Parallel Problems Server) to provide them access to the system. The others will probably receive a bit of an introduction to MPI and a bit of handholding while they get used to running parallel batch jobs. How much they'll need is one of the questions below.

Open questions for anyone with experience with supporting multiple user access to Beowulf systems. I realize most of these are even more vague than my question above, but any input (no matter how anecdotal) would be helpful.

1- How much scheduling will we have to do? Will we see a graceful degradation of the system if multiple users ignore each other and run their jobs simultaneously? How will this affect things like Matlab*p and ScaLAPACK?

2- How many people are we going to need to dedicate to the software side of maintaining the cluster and helping researchers solve their problems, given that most of them are either doing batch parallelism or using tools (Matlab*p) that just make things magically happen? Is it going to be a full-time job to support 10 researchers that don't want to learn parallel programming?

Specific questions:
More information about the Beowulf mailing list
