[Beowulf] Register article on Opteron - disagree
Robert G. Brown rgb at phy.duke.edu
Mon Nov 22 00:04:16 PST 2004
- Previous message: [Beowulf] Register article on Opteron - disagree
- Next message: [Beowulf] Register article on Opteron - disagree
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Sun, 21 Nov 2004 john.hearns at clustervision.com wrote:

> The fact that there are fewer Opteron based systems in the Top 500 is
> irrefutable (I didn't know this) but it makes me uneasy to extrapolate
> this to the impending death of a CPU.
>
> I DO agree (and let's have some debate here) that Nocona is bound to make
> big inroads.

Sure. We can start with the fact that the Top 500 list is irrelevant. It is a hardware vendor pissing contest. I criticized it pretty strongly in a recent CWM column. Let's see:

a) It lists identical hardware configurations as many times as they are submitted. Thus "Geotrace" occupies positions 109 through 114 with identical hardware. If one ran uniq on the list, it would probably end up being the top 200. At most. Arguably 100, since some clusters differ at most by the number of processors. What's the point, if the site's purpose is to encourage and display alternative engineering? None at all.

b) It focuses on a single benchmark useful only to a (small) fraction of all HPC applications. I mean, c'mon. Linpack? Have we learned >>nothing<< from the last twenty years of developing benchmarks? Don't get me wrong -- linpack is doubtless a useful measure to at least some folks. However, why not actually present a full suite of tests instead of just one? Vendors would hate this, just like they hate Larry McVoy's benchmark suite, because it makes it so difficult to cook up a cluster that does just one thing (runs Linpack) well...

c) It totally neglects the cost of the clusters. If you have to ask, you can't afford to play, I suppose. Is it any surprise that the list is dominated by the goldest-plated of the gold-plated vendors? Obviously many of the folks who build and buy the systems that make the list aren't troubled by the spectre of cost-benefit. So we have absolutely no idea what BlueGene/L costs for the R it produces compared to SGI's Altix 1.5 GHz cluster that comes in at number 2.
Even if we did know their cost, we would be unlikely to know their true cost -- at best the cost after the vendor discounted the system heavily for the advertising benefit of getting a system into the top whatever.

I could go on. I mean, look at the banner ads on the site. Vendors love this site. If it didn't exist, they'd go and invent it.

If they want me to take the top500 list seriously, they could start by de-commercializing it, running a pretty stringent unique-ing process on the submissions and accepting only the first of a given design or architecture, especially for clusters that are more or less turnkey and mass produced. Then they could run a SERIOUS suite of benchmarkS (note plural) on the clusters, one which (like SPEC) attempts to provide useful information about things like latency, bandwidth, interconnect saturation for various communications patterns, speed for a variety of actual applications including several with very different computation/communication patterns (ranging from embarrassingly parallel to fine grained synchronous). Scaling CURVES (rather than a single silly number) would be really useful.

I mean, this site is "sponsored" by some presumably serious computer science and research groups (although you'd never know it to look at all the little flashy things blinking Myrinet, Atipa, Tyan, IBM from the sides of the listings compared to the tiny little corner where the sponsoring institutions are listed). If they want to do us a real public service, they could do some actual computer science and see if they couldn't come up with some measure a bit richer than just R_max and R_peak....

Now, with that said (and it needed to be said, it did it did) the only thing most real cluster computer buyers care about is price/performance. To be more specific, price/performance on their particular application(s).
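As an aside, the unique-ing step suggested above (keep only the first entry of a given design) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` fields, the `unique_designs` helper and the sample rows are all made up for the example and are not real Top 500 data or any official procedure. Unlike plain `uniq`, which only collapses adjacent identical lines, the sketch drops repeats wherever they appear in the list.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    rank: int
    site: str
    hardware: str     # hypothetical free-text label for the design
    processors: int


def unique_designs(entries, ignore_size=False):
    """Keep only the best-ranked entry for each distinct design.

    With ignore_size=True, systems that differ only in processor
    count are treated as the same design.
    """
    seen = set()
    kept = []
    for e in sorted(entries, key=lambda e: e.rank):
        key = e.hardware if ignore_size else (e.hardware, e.processors)
        if key not in seen:
            seen.add(key)
            kept.append(e)
    return kept


# Made-up entries for illustration only.
entries = [
    Entry(1, "site-A", "design-X", 1024),
    Entry(2, "site-B", "design-Y", 512),
    Entry(3, "site-B", "design-Y", 512),
    Entry(4, "site-C", "design-Y", 256),
    Entry(5, "site-D", "design-X", 1024),
]

strict = unique_designs(entries)                    # same design AND size
loose = unique_designs(entries, ignore_size=True)   # same design only

print([e.rank for e in strict])
print([e.rank for e in loose])
```

The `ignore_size` switch corresponds to the two estimates in point a): counting a design once per processor count, or once overall.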
At a guess, some 2/3 to 3/4 of all cluster computer users are doing something fairly coarse grained that doesn't use anything at all that linpack is relevant to as a measure of performance. This is probably true even on many of the clusters in the top 500.

AMD has more or less "owned" the price/performance sweet spot for the last two years. If you have LIMITED money to spend and want to get the most work done for your money, you buy Opterons, at least at this particular moment and for most of the applications I've heard of that have been compared. Could Nocona change this? Sure, if Intel drops its margins, but historically they've avoided doing this. It is also WAY early to see if AMD's road map "beats" Intel's or vice versa -- there are a lot of changes lined up in both architectures where getting to 64 bits was only the first step. So Nocona could easily end up being both more expensive and slower, at least in the medium run.

I personally am currently very fond of my Opteron-based systems and would cheerfully buy a lot more if only somebody would give me the money to do so. In a year, though, or two or four, who knows what I'd buy? I don't pick Opterons because I'm "fond" of AMD or "hate" Intel. I'm equally fond of both and hate neither one of them. I will, however, buy the price/performance winner because it is my work that will suffer if I don't.

The only good reason to deliberately pick a more expensive architecture is if there are issues with reliability (either software or hardware). At the moment I'm unaware of any issues at all with Opterons -- they run "perfectly" with FC2 and later, and I'm still waiting for our first hardware failure from our Opteron stack after close to a year under nearly continuous load.

So I'd have to say that I doubt that the authors of the article were particularly well informed, and that AMD is likely to be around and kicking for a few years yet.
Look, even the Power series hasn't disappeared and it has almost no top 500 presence at all, if you discount BG itself as IBM showing its marketing clout and finding a use for 700 MHz CPUs in Very Large Quantities...

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
