[Beowulf] 512 nodes Myrinet cluster Challanges
Robert G. Brown rgb at phy.duke.edu
Mon May 1 17:50:54 PDT 2006
On Mon, 1 May 2006, David Kewley wrote:

>>> For clusters with more than perhaps 16 nodes, or EVEN 32 if you're
>>> feeling masochistic and inclined to heartache:
>>
>> with all respect to rgb, I don't think size is a primary factor in
>> cluster building/maintaining/etc effort. certainly it does eventually

Aw, Mark, you ARE joking, aren't you? C'mon... ;-)

Sizes in general, along with rates and maybe prices, are the ONLY factors that enter into quantitative cluster engineering, and the size of the cluster (number of nodes) in particular is the prime variable in all sorts of areas of cluster design from Amdahlian scaling laws to engineering of the spaces designed to handle the clusters. Many of which are fundamentally nonlinear relationships.

Of COURSE there are different issues one has to confront building a cluster with 16 nodes compared to building one with 1600 nodes (other than aggregate MTBF, which certainly is an issue I agree). Like building your own small power plant and having AC units the size of a tractor trailer outside a warehouse-sized space with carefully delivered power and cooling vs sticking them in the office down the hall that happens to be first in the AC delivery system and everybody thinks it is too cold anyway. Like ensuring that your IPC system can enable acceptable speedup on the task the cluster is designed to do. Like that.

>> become a concern, but that's primarily a statistical result of
>> MTBF/nnodes. it's quite possible to choose hardware to maximize MTBF and
>> configuration risk.

Ah, I remember well my halcyon days when I too truly believed this. I also well remember the Tyan motherboards and in a SEPARATE incident Taiwanese capacitors that permanently changed my mind, scarring my psyche deeply in the process.
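A minimal sketch of the MTBF/nnodes arithmetic quoted above, assuming node failures are independent and occur at a constant rate. That is an idealization: it ignores exactly the kind of correlated failure the capacitor episode represents. The function name and the 100,000-hour per-node MTBF are hypothetical, chosen only for illustration.

```python
def cluster_mtbf_hours(node_mtbf_hours: float, n_nodes: int) -> float:
    """Mean time between failures somewhere in the cluster.

    Assumes independent nodes with constant failure rates, so the
    failure rates simply add and the aggregate MTBF is node MTBF / N.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    if node_mtbf_hours <= 0:
        raise ValueError("node_mtbf_hours must be positive")
    return node_mtbf_hours / n_nodes


if __name__ == "__main__":
    NODE_MTBF = 100_000.0  # hypothetical per-node MTBF in hours (assumption)
    for n in (16, 512):
        print(n, cluster_mtbf_hours(NODE_MTBF, n))
```

With that hypothetical per-node figure, a 16-node cluster sees a failure about every 6250 hours and a 512-node cluster about every 195 hours. The scaling is the point here, not the particular numbers.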
The point is ultimately that there is a nasty nonlinearity in the impact of a "catastrophe" -- one not necessarily linked to how carefully you pick your hardware -- where a happy resolution ultimately depends on who pays to fix it and how rapidly it gets fixed if it doesn't work out.

So sure, it is possible to choose hardware wisely. Or not. If you DO choose it wisely, it is still quite possible that it breaks 13 months in when everything is out of manufacturer's warranty, maybe even ALL of it breaks quite rapidly (as in fact happened with the infamous Taiwanese Capacitor, something that affected "good" and "bad" motherboards alike as far as I could tell -- mine were certainly fine and worked well right up to the minute they blew capacitor sputum all over the inside of my cases).

Then the issue is "who's going to pay to fix it?", especially when fixing it will cost a significant fraction of what the cluster cost in the first place. If you've bought commercial nodes with 4 year contracts, the answer is "they are", and the cost is a few days of downtime and a bitterly cursing vendor and if anything, your users are impressed with your foresightfulness in making service a part of the up-front cost of the nodes. If the answer is "we are", well, that is a bad answer I can tell you, unless you happen to have a whole bunch of money you've carefully reserved to pay for the hardware required, and even then it is STILL a bad answer because of all the work required, and people hate you in the meantime.

Of course this is true for large and small clusters alike. The only advantage of a small cluster is that it is a lot more likely that you have 16x$100 "lying around" to fix 16 nodes, and replacing 16 motherboards might take a day of human time for somebody armed with an electric screwdriver and a pretty clear idea of how to recable. 1-2 days for a single person and a "miscellaneous" class repair budget was the basis for my estimate of 16-32 nodes or less.
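The back-of-envelope repair estimate above can be sketched as follows. The $100 per board and 16 boards per person-day are the rough figures from the paragraph above; the function name is hypothetical, and the linear scaling to larger clusters is an assumption that is probably optimistic for big installations.

```python
def repair_estimate(n_nodes: int,
                    part_cost: float = 100.0,
                    boards_per_person_day: float = 16.0) -> tuple[float, float]:
    """Return (parts cost in dollars, labor in person-days) for replacing
    one motherboard in every node, assuming both scale linearly."""
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    if boards_per_person_day <= 0:
        raise ValueError("boards_per_person_day must be positive")
    parts = n_nodes * part_cost
    person_days = n_nodes / boards_per_person_day
    return parts, person_days


if __name__ == "__main__":
    for n in (16, 32, 512):
        parts, days = repair_estimate(n)
        print(f"{n} nodes: ${parts:,.0f} in parts, {days:g} person-days")
```

At 16 to 32 nodes this stays within a "miscellaneous" budget and a day or two of one person's time. At 512 nodes the same arithmetic gives $51,200 in parts and 32 person-days, which is no longer something one finds lying around.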
So it's really just an issue of insuring the risk more than MTBF per se. Insurance companies exist because there are bad things that can happen that on aggregate are very cheap but when they happen to you they are very expensive. It is easy enough to budget for insurance in the form of service contracts -- you just build it into the cost estimates of your cluster in the first place. But how can you budget adequately for disasters that you cannot predict, because if you could predict them you'd avoid them?

And yet sure, you're right, there is a matter of judgement here and in addition to very small clusters, for very large clusters and projects with a large ongoing cash flow one can achieve both the numbers required to make self-insurance viable and the cash flow to make it possible without holding out part of one's budget (effectively defeating the point of self-insuring).

So let's compromise with a standard Caveat Engineer and YMMV.

> Ah, so my opinion is midway between Mark's & RGB's. A very nice place to
> sit. :)

Yeah, but don't forget, the guy in the middle has to buy the beer.

Right, Mark?

   rgb

-- 
Robert G. Brown    http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525  email:rgb at phy.duke.edu
