[Beowulf] [AMD64] Gentoo or Fedora

Robert G. Brown rgb at phy.duke.edu
Fri Aug 31 13:15:33 PDT 2007

On Fri, 31 Aug 2007, Larry Stewart wrote:

> Mark Hahn wrote:
>>> For X86 architectures, it actually makes a very noticeable performance
>>> change (30%?) to use build switches that are appropriate for your
>>> machines, rather than the generic switches chosen by Red Hat or
>> interesting.  30% on what operations?
> I should absolutely know not to report a number without a citation.
> I was thinking of this report: http://www.linux.com/articles/41348
> but it is actually comparing Linux vs other Unixlike OSs.
> The Linux happens to be Gentoo, and the comments cite
> Gentoo's "unfair advantage" because it is compiled with optimization!

Fair enough, but Mark's surprise is due to the fact that most numerical
applications tend to be rate limited by only a very few things:  CPU
clock and architecture; memory speed and type; interconnect; compiler
quality; and program organization.  One would therefore be surprised at
even a 5% performance difference for a binary built using the same
versions of the same libraries (built, to be sure, to match the
architecture in question -- i686 vs i386 binaries might well differ in
performance),
although one might not be at all surprised at a larger benefit if one
changed compilers or libraries altogether.  After all, when a task is on
the CPU it is pretty much running as fast as it can run.
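
As a rough illustration of the argument above, Amdahl's law bounds the
whole-program gain when only part of the runtime is affected by a change
in build switches.  This is a minimal sketch; the function name and the
fractions and the 1.3x local speedup below are hypothetical numbers
chosen for illustration, not measurements from this thread.

```python
def overall_speedup(fraction_affected, local_speedup):
    """Amdahl's law: whole-program speedup when only part of the runtime gets faster."""
    return 1.0 / ((1.0 - fraction_affected) + fraction_affected / local_speedup)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    for fraction in (0.1, 0.5, 1.0):
        gain = overall_speedup(fraction, 1.3) - 1.0
        print(f"{fraction:.0%} of runtime affected -> {gain:.1%} faster overall")
```

Under these assumptions, a 30% faster library that accounts for 10% of
the runtime yields only a little over 2% overall; a 30% overall gain
would need essentially all of the runtime to speed up by that much.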

> So I shouldn't have claimed a performance advantage against other
> distributions.  Might be true, but not supported by this study.

For some applications I wouldn't be surprised -- mysql in particular is
almost certainly something that can be significantly improved because of
what it does and because it isn't exactly the cleanest code in the
Universe from what I understand.  Improved caching or buffering,
rearranging a few loops, and the like might make a big difference for
it.  I also don't know exactly what Super Smack does as a benchmark, but
I do know a bit about benchmarking, and one perennial problem with
comparative benchmarking is establishing a uniform system state.
Running it on "a small and easily cached data set" doesn't give me a lot
of confidence that the author will achieve replicable results that
reflect the real world even for mysql, nor am I convinced that the test
run in this way is CPU/OS/memory bound.
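
One common way to approach a uniform system state is to discard a few
warm-up runs and then report the spread across repeated runs, so that a
difference between two builds can be compared against run-to-run noise.
The following is only a sketch under that assumption: the names
time_trials, summarize, and example_workload are hypothetical, and the
workload is a stand-in, not Super Smack or mysql.

```python
import statistics
import time


def time_trials(workload, warmup=3, repeats=10):
    """Run workload() repeatedly and return per-run wall-clock times in seconds."""
    for _ in range(warmup):
        workload()  # discarded runs: let caches and buffers reach a steady state
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return (median, relative spread) so run-to-run noise is visible."""
    median = statistics.median(times)
    spread = (max(times) - min(times)) / median
    return median, spread


def example_workload():
    # Hypothetical CPU-bound stand-in; replace with the real application.
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    median, spread = summarize(time_trials(example_workload))
    print(f"median {median:.4f} s, spread {spread:.1%} of median")
```

If the spread on a single system is comparable to the difference being
claimed between two systems, the comparison says little either way.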

I tend to agree with Mark and Gerry's earlier remarks -- most
differences in linux distros these days are ESSENTIALLY window dressing.
apt-get vs yum.  gnome vs kde.  Sure, some run slightly more advanced
kernels, or slightly newer versions of libraries, but for the most part
they share a common code base.  In the context of building a cluster
node, differences are further minimized, as one tends to install just
what one needs, which tends to be nonspecialized -- the common code base
that is shared by most programs (and hence most linuces).  Compiling
with the Intel reference compiler, or pathscale, might well make a big
difference relative to compiling with gcc (or not).  Tuning the
application for the cache size and memory architecture might make a
difference.  I wouldn't EXPECT running Debian, or Fedora, or Red Hat, or
Slackware to make a big difference, except insofar as one or another of
them is way ahead or way behind in their basic libraries.  So RHEL when
the libraries are 2+ years old might well be slow compared to Fedora
Core with the latest versions of the basic libraries (although 30% is a
big difference to expect, period), but RHEL when it is first released and
its libraries "are" FC really shouldn't be.


Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
