Anyone have information on latest LSU beowulf?
Craig Tierney ctierney at hpti.com
Tue Oct 8 09:54:47 PDT 2002
On Tue, Oct 08, 2002 at 01:35:27PM +0100, Daniel Kidger wrote:
>
> On Mon, 23 Sep 2002, Craig Tierney wrote:
> > http://www.phys.lsu.edu/faculty/tohline/capital/beowulf.html
> >
> > Their HPL result is 2.2 Tflops! Very impressive.
> >
> > (lines deleted)
> > What HPL settings did they use to achieve their result?
>
> I think that many people would be interested in their settings for xhpl,
> in particular what percentage peak did they manage for a single CPU run ?
>
> A 1.8GHz P4 has a theoretical peak of 3.6GFlops/s, but so far I have only
> seen figures of around 60% of this for linpack. Compare this with 75%+ for
> Alpha nodes (and of course 95%+ for vector processors).
>
> So, in terms of single processor performance:
>
> Which compiler did they use ? (icc version 7 perhaps)
> And which compiler options ?
> Did they use mkl or Atlas for the BLAS ?

Probably Atlas. I tested both and got about 70-80% more speed
out of Atlas compiled with gcc-2.91 (recommended).

> What value of NB did they settle on ? (80 and 160 seem common choices)
> any other non-default values in HPL.dat ?

Why are 80 and 160 common choices? I do know that they used
160 for their run. I also retested my setup at 160 and it is
much slower than 64. I was told by someone at UTK that the size
of NB should be a multiple of the L1 cache and that double is good.
So NB = sqrt(8kb * 1024/8)=32 for P4 Xeon. I tried 64 and that has
been the best for a single node run.

The one thing I have not yet determined is that maybe the larger
number is more effective when you start running on large node counts.
I cycled through the NB values from 32 to 160 on 64 processors and
NB=64 was still the best. I wonder if having more memory (1 GB vs.
2 GB per node) could drastically improve scaling. Anyone know?

Craig

--
Craig Tierney (ctierney at hpti.com)

>
> Yours,
> Daniel.
>
> --------------------------------------------------------------
> Dr. Dan Kidger, Quadrics Ltd. daniel.kidger at quadrics.com
> One Bridewell St., Bristol, BS1 2AA, UK 0117 915 5505
> ----------------------- www.quadrics.com --------------------
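
For anyone who wants to check the arithmetic in this message, here is a minimal sketch in Python. It assumes the rule of thumb means "the largest square block of 8-byte doubles that fits in the L1 data cache", and that the 3.6 GFlops peak corresponds to 2 floating-point operations per cycle at 1.8 GHz (inferred from the figures quoted, not stated in the thread). The function names are made up for illustration and are not part of HPL.

```python
import math

def nb_from_l1(l1_kb, bytes_per_double=8):
    """Side length of the largest square block of doubles that fits in L1.

    Assumption: the L1 size is given in KB and the block is NB x NB doubles.
    """
    doubles_in_l1 = l1_kb * 1024 // bytes_per_double
    return math.isqrt(doubles_in_l1)

def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak in GFlops for a single CPU."""
    return clock_ghz * flops_per_cycle

# 8 KB L1 data cache, as used in the message for the P4 Xeon.
base_nb = nb_from_l1(8)
# The base value and its double, the two candidates discussed above.
candidates = [base_nb, 2 * base_nb]

# 1.8 GHz P4; 2 flops/cycle is an assumption inferred from the 3.6 figure.
peak = peak_gflops(1.8, 2)
# Roughly what "around 60% of peak" means in absolute terms.
sixty_percent = 0.60 * peak

print("NB candidates:", candidates)
print("peak GFlops:", peak, "60% of peak:", round(sixty_percent, 2))
```

This only reproduces the numbers; which NB is actually fastest still has to be measured, as the 64 vs. 160 comparison above shows.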
More information about the Beowulf mailing list
