32 + 8 nodes beowulf cluster design.
Jared Hodge jared_hodge at iat.utexas.edu
Wed Feb 14 06:42:02 PST 2001
- Previous message: MPICH hangs in redhat kernel-2.2.14-5.0
- Next message: 32 + 8 nodes beowulf cluster design.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Paul Maragakis wrote:
>
> Hi everyone,
>
> I have just drafted a proposal listing the hardware components and their
> prices for a 40 nodes beowulf using 1.2 GHz Athlons, 8*1.5 GB + 32*512 MB
> memory, a double fast-ethernet network, and a front-end for storage and
> administration. The draft is at:
> http://hdsc.deas.harvard.edu/~plm/beowulf.pdf
> All comments are highly appreciated, especially those regarding the
> network, the UPS and the price estimates.
>
> Take care,
>
> Paul
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

Paul,

As someone who has recently had to write a proposal for a Beowulf cluster, I thought I'd check yours out to see if I could learn anything and make a few suggestions from what I've learned. This is kind of long, so if you're on the Beowulf list and not interested in this, you may want to just skip this e-mail.

OK, first off, I know some of your prices are a little high. Our local vendor of choice listed 512MB memory (for PIII) at just over $500, and you may be able to find it cheaper elsewhere. It would definitely be worth the time to shop around, since memory is the predominant cost of your cluster. Like you said, though, you want to leave a little room for unexpected costs.

The cluster I've been working with has been using Pentium (II currently, III proposed) processors and Myrinet, so I don't know all the details of your setup, but it seems to me that your networking plan is ... unique. Are you just going to be using the hubs and bootable NICs (I assume this means Wake-On-LAN) for booting? If so, could you not just use the Cisco switch and bootable NICs for booting?
If you are going to use the hubs and extra NICs for channel bonding, this introduces some interesting segmentation problems in the network (having two hubs), and of course you'll have to deal with extra packet collisions on the hubs. If you are going for channel bonding, it might be worth getting a second Cisco switch. I'll agree with Joseph's $0.02 in that you should at least check out switches with one or two gigabit ports to go to the server, although the need for this would be dictated by the code you're running.

Speaking of the code, I'm not sure exactly how PGI's licensing agreement works, but I think the only machine you would have to buy their compilers for is the server (lead node), since this is where you'll be doing your compilation. If you only buy the license for one user, only one user can compile a program at a time, but if you don't have many users logged in at once, this may not be a problem.

I didn't quite understand all of your reasoning for choosing Debian over other distributions of Linux (I like Red Hat, although I'd like to try Scyld), but it sounds like you've got more experience than I do in this area. I'd say the exact distribution is mostly a matter of preference anyway (i.e. go with what you know).

Oh yeah, going back to booting the cluster, is there any particular reason you want to boot your cluster from the network? Your proposal sounded like the only other alternative was booting from a floppy, but you do have those 20GB hard drives sitting there. Maybe you could spare a few megabytes and install your OS there (I think you can still use WOL but boot from the local disk).

Also, I understand the desire for a large scratch space on nodes, but I've found with Myrinet that it is MUCH faster to use the network to get to another node's (or the server's) memory than to go to your own hard drive, although this isn't always a reasonable option. I don't think this is true with ethernet, but I haven't tested it.

On to the server...
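As an aside on the hub concern above, here is a rough back-of-the-envelope sketch of why a shared hub can hurt under load. It assumes (this is not from the proposal) that the 40 nodes are split evenly across the two hubs and that all nodes transmit at once. The function name and the even split are hypothetical, and the hub figure is only an upper bound because collisions are ignored.

```python
# Rough per-node bandwidth on a shared hub versus a switch.
# Assumptions (not from the proposal): the 40 nodes are split evenly
# across the two hubs, and every node is sending at the same time.

FAST_ETHERNET_MBIT = 100.0  # nominal Fast Ethernet line rate


def per_node_mbit(nodes_on_device: int, shared_medium: bool) -> float:
    """Best-case share of bandwidth per node, in Mbit/s.

    A hub is a single collision domain, so all attached nodes share one
    100 Mbit/s medium. A switch gives each port its own 100 Mbit/s link.
    Collisions on a busy hub push the real figure below this estimate.
    """
    if nodes_on_device <= 0:
        raise ValueError("need at least one node")
    if shared_medium:
        return FAST_ETHERNET_MBIT / nodes_on_device
    return FAST_ETHERNET_MBIT


nodes_per_hub = 40 // 2  # hypothetical even split over two hubs
hub_share = per_node_mbit(nodes_per_hub, shared_medium=True)
switch_share = per_node_mbit(nodes_per_hub, shared_medium=False)

print(f"hub:    {hub_share:.1f} Mbit/s per node (upper bound)")
print(f"switch: {switch_share:.1f} Mbit/s per node (nominal)")
```

Whether this matters in practice depends on how much the code communicates, which is the same caveat that applies to the gigabit uplink suggestion.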
Do you really want a separate UPS for the server? Personally, if my cluster is going to go down, I'd just as soon have the whole thing go down rather than all the nodes and not the server. It makes the "why isn't my parallel job running?" question have a much more obvious answer: you can't even log in to the server. If the nodes go down and not the server, you're equally in a "who cares, the cluster doesn't work" situation. I guess if you're at the max of the big UPS, or if the server and cluster are in different places, this might make sense.

Have you looked into just using EIDE drives on the server, maybe with a RAID controller (there was a thread on RAID stuff on Monday, I think)? From my limited experience, SCSI is much more expensive and much more trouble than it's worth, although again this depends on your code. For the $2400 you're spending on it, though, it might be worth looking into an integrated RAID solution. I would still do the tape backups even if you do go with RAID, though. Stick to your guns on that one.

Also, on the server motherboard, I've heard that it's better to use exactly the same type of motherboard in your server as in the nodes. I believe this is partly because some compilers will optimize for the architecture that they're compiling on. I don't know if there will be enough difference between these boards to cause a noticeable difference in performance, and you can change an option somewhere in your compiler, but that is one more thing to worry about.

Also, it sometimes helps if your server is configured nearly identically to (at least some of) your nodes. That way, if someone runs a reasonably big single job interactively (for testing their code) and it does run, they can be confident it will run correctly on the nodes. For this reason, and because of compiling and multi-user overhead, I'd go to 1.5GB RAM on your server if you can, like your "large" nodes.
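To put the memory figures from this thread in rough numbers, the sketch below totals the node memory (8*1.5 GB + 32*512 MB) and prices it at the roughly $500 per 512 MB quoted above. That price was for PIII memory, so applying it to Athlon nodes, and pricing larger amounts as multiples of 512 MB, are assumptions; the variable and function names are made up for illustration.

```python
# Rough memory budget using the figures quoted in this thread.
# Assumption: every 512 MB of RAM costs about the same as the ~$500
# quoted for a 512 MB PIII module; Athlon memory pricing may differ.

PRICE_PER_512MB_USD = 500  # "just over $500" from the local vendor quote
MODULE_MB = 512


def modules_needed(ram_mb: int) -> int:
    """Number of 512 MB units needed to reach ram_mb, rounded up."""
    return -(-ram_mb // MODULE_MB)  # ceiling division


large_nodes, large_mb = 8, 1536   # 8 nodes with 1.5 GB each
small_nodes, small_mb = 32, 512   # 32 nodes with 512 MB each
server_mb = 1536                  # suggested server RAM, same as a large node

node_modules = (large_nodes * modules_needed(large_mb)
                + small_nodes * modules_needed(small_mb))
total_node_gb = (large_nodes * large_mb + small_nodes * small_mb) / 1024

node_cost = node_modules * PRICE_PER_512MB_USD
server_cost = modules_needed(server_mb) * PRICE_PER_512MB_USD

print(f"node memory: {total_node_gb:.0f} GB in {node_modules} x 512 MB units")
print(f"node memory cost at quoted price: ${node_cost}")
print(f"1.5 GB server memory at quoted price: ${server_cost}")
```

Under these assumptions the server memory is a small addition next to the node memory, which is where shopping around for a better price would pay off most.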
Well, I know I've said a lot here, so I better quit before I think of something else (oh, too late: don't forget to check the spelling in your proposal by hand before you send it off to the Money People, there "were" a few errors). Good luck to you on your cluster, and let me know what you think of the suggestions. It would be good for me to hear other people's input on these things so I can better plan my own upgrades.

--
Jared Hodge
Institute for Advanced Technology
The University of Texas at Austin
3925 W. Braker Lane, Suite 400
Austin, Texas 78759
Phone: 512-232-4460
FAX: 512-471-9096
Email: Jared_Hodge at iat.utexas.edu
