[Beowulf] HPL Benchmarking and Optimization

Ellis Wilson xclski at yahoo.com
Fri Apr 4 16:48:18 PDT 2008

Tom Elken wrote:
>> -----Original Message-----
>> From: beowulf-bounces at beowulf.org
>> [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ellis Wilson
>> I'll likely try MKL soon for the Intel processors I'm interested in.
> Good idea.
> You might also want to try "Goto BLAS" (Google that to find the free
> download site).  It can be compiled for a different architecture a lot
> quicker than ATLAS, and provides very good performance for both Intel
> and AMD architectures.
> As you may have already found, once you are using a good BLAS library
> with HPL, various compilers or compiler options won't make much
> difference in performance.
> -Tom

Hey Tom,

Thanks for your suggestions. I had already begun testing Goto BLAS when
I got your email, and so far it has been the most beneficial library for
my particular application, which resides on a CD-ROM.  MKL proved to be
far too heavyweight (and I try to avoid closed source as often as
possible).  The only difficulties have come with compiling Goto BLAS (or
anything, for that matter) on a static system such as a LiveCD.  To keep
the LiveCD's total size and loaded initrd size as low as possible, I do
not include the portage tree, so nothing can be emerged.  This has led
me to pursue a number of solutions.  The first was to copy the full
version of the LiveCD system directly into the tmpfs from a USB pen
drive, uncompress it, chroot into that environment, compile on the
desired architecture, exit the chroot, recompress, and put it back on
the USB drive for later burning.  Obviously, this requires a ton of
work, so I came up with an easier fix that has interesting repercussions
I'd like to hear from this list on:

I mount an NFS directory on my system, chroot into it, compile Goto-BLAS
or ATLAS there, and exit the chroot.  Since the directory remains on my
development system (which does use a hard drive), I have no issues with
running out of RAM, moving things from one place to another, etc.
However, when I compiled Goto-BLAS on an older P4 without HT and with
256MB RAM, the build reported "clock-skew" warnings that I've never seen
previously.  Is this due to the NFS mount?  And if so, will it hurt my
optimization of Goto BLAS or ATLAS?  I still achieved 4GFlops on the P4
I used that methodology on, which was way above my previous findings
using the reference library (obviously), but I am still concerned that
local compilation might yield better optimization.

Anyone think that's true/false?
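
One way to check whether the NFS server's clock is behind the
"clock-skew" warnings is to create a file on the mounted directory and
compare its modification time with the local clock.  make typically
prints this kind of warning when it finds a file whose modification time
lies in the future relative to the machine running the build.  The
following is only a minimal sketch, not part of the original build: the
function name measure_mtime_skew is hypothetical, and it assumes the
server stamps newly created files with its own clock.

```python
import os
import tempfile
import time


def measure_mtime_skew(directory):
    """Return (new file's mtime - local clock) in seconds.

    Assumption: on an NFS mount, the timestamp of a freshly written
    file comes from the server, so a large nonzero result suggests
    that the client and server clocks disagree.
    """
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory, prefix="skewcheck-")
    try:
        # Write one byte so the file gets a fresh modification time.
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"x")
        mtime = os.stat(path).st_mtime
    finally:
        os.remove(path)
    after = time.time()
    # Use the midpoint of the local timestamps as the reference time.
    local = (before + after) / 2.0
    return mtime - local


if __name__ == "__main__":
    # Replace the temp directory with the NFS mount point to test it.
    target = tempfile.gettempdir()
    print("mtime minus local clock: %.3f s" % measure_mtime_skew(target))
```

A result close to zero on a local directory but several seconds off on
the NFS mount would point at unsynchronized clocks rather than at the
compilation itself.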



