[Beowulf] [jak@uiuc.edu: Re: [APPL:Xgrid] [Xgrid] megaFlops per Dollar?]
Eugen Leitl eugen at leitl.org
Tue May 17 02:50:20 PDT 2005
- Previous message: [Beowulf] [jak@uiuc.edu: [Xgrid] Re: megaFlopsper Dollar? real world requirements]
- Next message: [Beowulf] Re: Beowulf Digest, Vol 15, Issue 35
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
----- Forwarded message from "Jay A. Kreibich" <jak at uiuc.edu> -----

From: "Jay A. Kreibich" <jak at uiuc.edu>
Date: Mon, 16 May 2005 01:22:25 -0500
To: "Holder, Darryl" <Darryl.Holder at us.army.mil>
Cc: Xgrid-users at lists.apple.com
Subject: Re: [APPL:Xgrid] [Xgrid] megaFlops per Dollar?
User-Agent: Mutt/1.4.2.1i
Reply-To: jak at uiuc.edu

On Wed, May 11, 2005 at 02:44:38PM -0500, Holder, Darryl scratched on the wall:

> Greetings all;
>
> I will this summer be assembling a "nano-Xgrid" system to perform a
> large antenna modeling calculation. I will be buying all of the hardware
> new of maximum clock speed, number of CPUs, (number of cores per CPU?)
> for this project. My question is:
>
> What is the best Mac platform for the maximum "megaFlops per Dollar"?

This is an exceptionally difficult question to answer. There are many factors involved, and the "right" choice can be heavily influenced by political, management, business, and technical decisions. It is a topic that you could write half a book about (actually a whole book, if the publisher doesn't kill the project half-way through; but that's a different story).

While I could easily go off on a multi-hundred line discussion about it, I realize that nobody would really care to read it. So let me just say that the real question is one of "USABLE megaFlops for TOTAL cost."

First, USABLE. Clusters are a compromise between the monolithic architecture of 1980s-style Crays and cost. Larger singular systems are easier to program, easier to debug, and easier to utilize (not universally, but nearly so). Distributed computing (i.e. clusters) takes on additional "costs" (often non-monetary) in these areas in exchange for the fact that a cluster often costs a tiny fraction of a more traditional monolithic supercomputer. This is just good business and engineering sense, and there are good reasons why the majority of the systems on the Top-500 list are clusters.

But, it is important to recognize that (for most programs) there are "distribution" costs, and they are a compromise. For most (but not all!) applications, it is easier to get better performance on a lower number of more powerful nodes. If your application efficiency is reduced by 20% each time you double the node count, saving a few bucks by buying a much larger number of less powerful systems is going to be a losing proposition.

Next, TOTAL. Only looking at the price of the machines in a cluster is like only looking at the price of the machine in a desktop workstation. You don't get a very accurate assessment of costs unless you include the monitor, keyboard, possible printer, and all of the software (for example, Office, Photoshop, whatever) to make the computer useful. There's also the whole "cost of ownership" issue that I'll just gloss over. If you're looking at optimizing performance per dollar, you need to look at your total dollars, not just what the machines cost.

Many of the costs involved in building a cluster are "per node" costs, and quickly dilute cost savings derived from choosing a large number of *very* inexpensive systems. For example, if you need a high-speed network, you need to pay for one (or more) ports per node, not "per GFLOP" or "per $5K of computing hardware." The same is true of items like large memory upgrades. If your application requires every node to have a full copy of the data set, and that means 4GB of RAM per node, that can get extremely expensive if you have a large number of nodes.

The biggest killer of all is often software costs. If you are running commercial software (which, at the very least, includes the OS (although this argument in relation to the OS is reduced now that Tiger is on shipping machines)) you need to pay per-node.

There are also costs that are related to node count, although not directly, such as space, power, and cooling.

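A rough sketch of the "USABLE megaFlops for TOTAL cost" comparison described above, in Python. Every number here is made up for illustration (node speeds, prices, per-node extras). The efficiency model, multiplying by 0.8 for each doubling of node count, is only one reading of the 20% example, and real applications will scale differently.

```python
import math


def parallel_efficiency(nodes, loss_per_doubling=0.20):
    """Fraction of peak performance retained on `nodes` nodes.

    Assumption: a fixed fractional loss each time the node count doubles.
    """
    return (1.0 - loss_per_doubling) ** math.log2(nodes)


def naive_mflops_per_dollar(nodes, mflops_per_node, machine_cost):
    """Peak MFLOPS divided by machine price only (the misleading figure)."""
    return (nodes * mflops_per_node) / (nodes * machine_cost)


def usable_mflops_per_dollar(nodes, mflops_per_node, machine_cost,
                             per_node_extras, fixed_cost=0.0,
                             loss_per_doubling=0.20):
    """Usable MFLOPS divided by total cost.

    per_node_extras: network port, RAM upgrade, software license, etc.
    fixed_cost: anything that does not scale with node count.
    """
    usable = (nodes * mflops_per_node
              * parallel_efficiency(nodes, loss_per_doubling))
    total = nodes * (machine_cost + per_node_extras) + fixed_cost
    return usable / total


# Hypothetical configurations with the same peak (64,000 MFLOPS).
few_big = dict(nodes=8, mflops_per_node=8000, machine_cost=3000)
many_small = dict(nodes=32, mflops_per_node=2000, machine_cost=600)
extras = 1000  # hypothetical per-node extras, in dollars

for name, cfg in (("few big", few_big), ("many small", many_small)):
    naive = naive_mflops_per_dollar(**cfg)
    usable = usable_mflops_per_dollar(per_node_extras=extras, **cfg)
    print(f"{name:>10}: naive {naive:.2f}, usable/total {usable:.2f} MFLOPS/$")
```

With these made-up figures, the many-small-nodes option looks better when only machine prices and peak numbers are compared, but falls well behind once the scaling loss and per-node extras are included.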
While there is some "economy of scale" for admin costs, system administration time becomes a much bigger issue with a larger number of systems.

In short, most of the time there is a pressure to move towards a smaller number of more powerful systems. Obviously this isn't absolute-- if carried to its logical extreme we'd be back to large single monolithic systems-- but given that cost/performance is generally linear (or close to it), most clusters will benefit from the most powerful systems you can (practically) purchase.

> By this, I mean Mac desktop versus Mac 1U server chassis.

If the decision is purely an engineering one, and you don't have other business questions (like what to do with the machines when the cluster has completed its useful lifetime) I'd go with the Xserve systems in a second. Yes, you can get slightly more powerful desktop systems (which, I realize, I just got done saying is a good idea), but the space, power, and heat costs are considerable for a large(r) number of nodes.

> I have no real space/cooling/power constraints,

Careful. Cooling, followed by power, are the two areas where the biggest mistakes are made when building clusters. Xserve systems put out about 1000+ BTUs per hour when running all out. A G5 PowerMac can put out twice that. You can heat a small house (and consume its whole electrical capacity) with a 32-node PowerMac cluster. PowerMacs also take up about 5x the space. Again, not a big deal for four or five, but a huge deal for 20 or 30.

Other issues aside, one other consideration that I didn't see mentioned is that the Xserve systems come with OS X Server, while the PowerMacs do not. You don't really need Server on the compute nodes, but if you are running a cluster of four or more systems without Server on the head-node, you're insane (and wasting money and time). The ability to image compute nodes via NetBoot/NetInstall and provide a centralized Directory for all account and preference information is extremely valuable for a cluster of any size.

Humm...
I'm still over 100 lines, but not by much.

   -j

-- 
Jay A. Kreibich              | CommTech, Emrg Net Tech Svcs
jak at uiuc.edu              | Campus IT & Edu Svcs
<http://www.uiuc.edu/~jak>   | University of Illinois at U/C

----- End forwarded message -----
