[Beowulf] HPL Benchmarking and Optimization

Donnie Berkholz dberkholz at gentoo.org
Fri Apr 4 19:14:22 PDT 2008

On 16:48 Fri 04 Apr, Ellis Wilson wrote:
> Thanks for your suggestions. I had already begun testing Goto BLAS
> when I got your email, and so far it has been the most beneficial
> option for my particular application, which resides on a CD-ROM. MKL
> proved far too heavyweight (and I try to avoid closed source as often
> as possible). The only difficulties have come with compiling Goto
> BLAS (or anything, for that matter) on a static system such as a
> LiveCD. To keep the LiveCD's total size and loaded initrd size as
> small as possible, I do not include the Portage tree, so nothing can
> be emerged.

You could include only the small subset of the tree you need.
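
A rough sketch of what that could look like, assuming the usual tree
layout of category/package directories plus shared eclass and profiles
directories. The function name, the paths and the package name are made
up for illustration, and any dependencies of the packages you want would
have to be listed as well.

```python
import shutil
from pathlib import Path

# Assumption: these shared directories are needed by any ebuild you keep.
SHARED_DIRS = ("eclass", "profiles")


def copy_tree_subset(tree_root, dest_root, packages):
    """Copy the shared dirs plus the listed category/package dirs.

    Returns the relative paths that were actually copied.
    """
    tree_root, dest_root = Path(tree_root), Path(dest_root)
    copied = []
    for rel in list(SHARED_DIRS) + list(packages):
        src = tree_root / rel
        if not src.is_dir():
            continue  # skip missing entries instead of aborting the copy
        # copytree creates any missing parent directories of the target
        shutil.copytree(src, dest_root / rel, dirs_exist_ok=True)
        copied.append(rel)
    return copied
```

For example, copy_tree_subset("/usr/portage", "/mnt/livecd/usr/portage",
["sci-libs/example-blas"]) would leave only eclass, profiles and that
one (hypothetical) package directory in the LiveCD's copy of the tree.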

> This has required me to pursue a number of solutions. The first was
> to copy the full version of it directly into the tmpfs from a USB
> pen, uncompress it, chroot into that environment, compile on the
> desired architecture, exit the chroot, recompress, and put it back on
> the USB pen for later burning. Obviously, this requires a ton of
> work, so I came up with an easier fix that has interesting
> repercussions I'd like to hear from this list on:
> I mount an NFS directory on my system, chroot into it, compile
> Goto BLAS or ATLAS there, and exit the chroot. Since the directory
> remains on my development system (which does use a hard drive), I
> have no issues with running out of RAM, shuffling files from one
> place to another, etc. However, when I compiled Goto BLAS on an
> older P4 without HT and with 256 MB of RAM, it reported "clock-skew"
> warnings that I had never seen before. Is this due to the NFS
> mount? And if so, will it hurt my optimization of Goto BLAS or ATLAS?
> I still achieved 4 GFlops on the P4 I used that method on, which was
> way above my previous findings using the reference library
> (obviously), but I am still concerned that local compilation might
> give better optimization.
> Does anyone think that's true or false?

In the case of Goto, the optimization will be a function of the compiler 
and compiler flags you use, not whether you compile it on a local or 
remote disk. For ATLAS, I suppose the difference in I/O could affect its 
automatic tuning if for some reason it's not doing it all in memory, but 
that seems unlikely.
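
On the clock-skew warnings: as far as I know, make prints those when a
file's modification time is later than the local clock, which can happen
on NFS if the server and client clocks disagree. That affects make's
rebuild decisions, not how the compiler optimizes. A minimal sketch for
checking the offset, assuming Python is available on the build host; the
function name is made up for illustration:

```python
import os
import tempfile
import time


def mtime_skew_seconds(directory):
    """Return (mtime of a freshly written file) - (local clock), in seconds.

    A clearly positive value means new files appear to come from the
    future as seen from this machine.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"x")  # the write makes the filesystem stamp the mtime
        local_now = time.time()
        return os.stat(path).st_mtime - local_now
    finally:
        os.remove(path)  # clean up the probe file
```

Run it against the NFS-mounted directory; a result of more than a second
or so either way suggests the two clocks should be synchronized (with
NTP, for instance) before building.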

