[Beowulf] Benchmark between Dell Poweredge 1950 And 1435

Robert G. Brown rgb at phy.duke.edu
Thu Mar 8 06:25:16 PST 2007

On Tue, 6 Mar 2007, Juan Camilo Hernandez wrote:

> Hello..
> I would like to know what server has the best performance for HPC systems
> between The Dell Poweredge 1950 (Xeon) And 1435SC (Opteron). Please send me
> suggestions...
> Here are the complete specifications for both servers:
> Poweredge 1435SC
> Dual Core AMD Opteron 2216 2.4GHz
> 3GB RAM 667MHz, 2x512MB and 2x1GB Single Ranked DIMMs
> Poweredge 1950
> Dual Core Intel Xeon  5130 2.0Ghz
> 2GB 533MHz (4x512MB), Single Ranked DIMMs

Almost certainly the opteron.  For a variety of reasons, but higher
clock certainly helps -- it would probably have been faster at
equivalent clock anyway.  Now that I've "answered", let me tell you why
you shouldn't believe me and what you should actually do to answer your
own question. There is a standard litany we like to chant on this list:

   "Your Mileage May Vary"
   "A benchmark in hand is worth any number of anecdotal reports"
   "The best benchmark is your own application"
   "What do you plan to do with it?"
   "It depends..."
   "In particular, it depends on your application (mix), its memory and
disk and network requirements, the topology and type of your network,
the communication and memory access pattern used by your application
(mix), the compiler and library used, and a few dozen other variables
major and minor, which is why nobody is going to tell you one is always
better than the other even if >>they<< think it is true..."
   "And then there is the cost -- the REAL question is which one has the
better cost-benefit, not which one is the cheapest or fastest
independently.  Ask yourself the question -- with a fixed budget to
spend, which architecture lets me get the most done in the least time."

So if you like, I wouldn't be doing you a favor by telling you
>>definitely<< the opteron, and only partly because it might not be true.  If
you believed me (because I sound so glib and because you don't know that
AMD once sent me a cool tee-shirt and Intel hasn't, although I do have a
pair of these cool little contamination-suited Intel dude keychains that
come close) then you might be tempted to skip the CORRECT cluster
engineering step(s) of:

   a) Study your application (mix) -- figure out in at least general
terms its (their) communication patterns, its (their) memory
requirements (size and access pattern), its (their) CPU requirements.
Some applications are "I/O bound" -- run at a speed determined by the
access speed of disk, for example.  Some applications are "memory bound"
-- they spend all of their time fetching data from memory, relatively
little on actually doing something to it.  Some applications (especially
parallel cluster applications) are "network bound" and run at a rate
that is determined by the latency or bandwidth of a network connection,
further complicated in the case of real parallel code by the
communication PATTERNS which can cause bottlenecks outside of the system
altogether.  Some applications (the happiest ones, I tend to think:-)
run at a rate that is limited by CPU clock, clock, clock and nothing but
clock, although different CPU architectures (e.g. Xeon and Opteron, 32
or 64 bit) have a different BASE performance at any given clock.

   b) If at all possible, and it nearly always is possible, beg, borrow,
steal, buy, or rent a system or two in your competing architectures and
run YOUR CODE compiled with YOUR PLANNED COMPILER on those systems and
just measure its performance.  This is actually a whole lot easier than
the stuff in step a) and a lot more likely to be accurate, but I still
don't advise skipping a).  If you are planning on buying more than a
handful of systems, it is actually often worth your while to >>buy<< one
of each of two or three or even four candidate systems, test them, and
then buy the other 127 or however many nodes you plan to put in your
cluster of the winner, instead of buying 128 of the wrong kind.  You can
recycle the losers as servers, really powerful desktops, whatever.  A
really good vendor will often loan you systems (or network access to
systems) to do this testing.  A really good compiler vendor (e.g.
pathscale) will even/often lend you a compiler for a trial period to do
the testing.  Or there may be list humans who own a system who will set
up a trial account for you -- it's a pretty friendly list;-)

   c) Don't lock yourself in to only Dells (or any single distributor)
while looking over systems.  I personally do not dislike Dell, although
I know people that do.  Their hardware is not the most reliable that
I've ever used -- far from it, actually -- but their service plans tend
to be very good, their cost is reasonable, and they aren't linux-averse
although I think that they're still working on becoming actively
linux-friendly.  However, there are a number of other tier 1 and tier 2
vendors you should be considering with hardware that is as good or
better (in my opinion MUCH better) and with equally attractive prices
and service deals.  IBM, for example, is also linux-friendly and tends
to make excellent if gold-plated hardware.  Penguin Computing is my own
personal favorite, largely because with the exception of one DOA system
out of a good size stack of Altus's we've gotten (no doubt the one that
"fell off the truck" and likely not Penguin's fault) I have yet to see
an Altus fail in harness.  Seriously.  Pretty extraordinary, really,
given that they run at full load pretty much 24x7 for as long as years
at this point.  I've heard that their service is really good -- maybe
one day I'll have a chance to find out...:-).  Penguin will almost
certainly let you prototype on their systems.

   d) When you've done all your research above, then DO THE COST BENEFIT
ANALYSIS.  If your application is network bound, don't worry so much
about system clock and speed, worry about getting a really high speed
cluster network to match (which is expensive, so you may want to get
CHEAPER SLOWER nodes if the app isn't CPU bound anyway).  If your
application is memory bound, you may want to skip the dual cores and get
two single cores or quad single cores -- otherwise you might just be
using two cores at a time while the other cores are waiting in line to
get at memory, wasting all the money you spent on the dual cores in the
first place.  If your application is disk bound then look more closely
at disk and less at CPU -- what kind of bus, what kind of disk
subsystem, what are the bottlenecks (per system) and the costs of
minimizing them.

As you can hopefully now see, the RIGHT question to have asked isn't
which of two particular systems out of twenty on the market is "best" in
some amorphous way, it is which of the twenty systems in the two
thousand possible ways of configuring them with network and disk and
memory and CPU and compiler will get the most work done for your
investment of a fixed amount of money.  Answer that, and then make your
purchase with confidence.

I'm sure that other list-humans have experiences or suggestions to share
here.  If you are very unsure of your abilities to carry out the list of
chores above, there are at least 2 or 3 professional cluster consultants
on the list who would probably help you for a moderate fee -- ask them
to contact you offline if you are interested as they generally won't
spam the list beyond maybe letting you know that they exist while
helping to answer your original question.  They can do anything from
helping you with the prototyping and analysis to providing you with a
cost-competitive turnkey cluster, depending on your needs and cluster
management skills.

I myself provide the kind of dear-abby advice above on-list and charge
only beer (should we ever meet).  Mind you, at this point if I ever
actually received the beer due me according to this rule, I would die in
a gutter somewhere inside six months with my liver in complete failure,
so it is probably just as well that I generally don't go to cluster
meetings and so forth...;-)


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
