[Beowulf] running the Linpak -HPL benchmark.
Gustavo Correa gus at ldeo.columbia.edu
Sun Jan 17 18:03:25 PST 2010
Hi Rahul

I've got Rmax/Rpeak around 84% on the cluster (AMD Opteron Shanghai, IB on a single switch).
I didn't have the cluster available to play with HPL for too long, so there was not too much tuning; I had to move to production mode.
Some folks on mailing lists said they'd get 90%, but the topmost group in Top500 get less (as of mid-2009 it was ~75%, IIRR), probably because of their big networks with stacked switches and communication overhead.

To optimize on a single node, apply the same formula for Nmax, using only that node's RAM.
P and Q (block matrix decomposition) tend to be optimal when they are close to each other.

With Nehalem you may have to consider the extra complexity of simultaneous multi-threading (SMT, a.k.a. hyperthreading), and whether it makes or doesn't make a difference on very regular problems like HPL, with big loops and not much branching/ifs.
(Your real world computational chemistry problems probably are not like that.)
Have you tried HPL with and without SMT/hyperthreading?
It may be worth testing on a single node at least.

I hope this helps.
Gus Correa

On Jan 16, 2010, at 10:02 PM, Rahul Nabar wrote:

> On Thu, Jan 14, 2010 at 7:25 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
>
>>
>> First, to test, run HPL in a single node or a few nodes,
>> using small values of N, say 1000 to 20000.
>>
>> The maximum value of N can be approximated by
>> Nmax = sqrt(0.8*Total_RAM_on_ALL_nodes_in_bytes/8).
>> This uses all the RAM, but doesn't get into memory paging.
>>
>> Then run HPL on the whole cluster with the Nmax above.
>> Nmax pushes the envelope, and is where your
>> best performance (Rmax/Rpeak) is likely to be reached.
>> Try several P/Q combinations for Nmax (see the TUNING file).
>>
>
> Thanks Gus! That helps a lot. I have Linpak running now on just a
> single server and am trying to tune and hit the Rpeak.
>
> I'm getting 62 Gflops but I think my peak has to be around 72 (2.26
> GHz 8 cores Nehalem). On a single server test do you manage to hit the
> theoretical peak? What's a good Rmax / Rpeak to shoot for while tuning?
>
> Once I am confident I'm well tuned on one server I'll try and extend
> it to the whole cluster.
>
> --
> Rahul
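
For anyone who wants to play with the numbers in this thread, here is a rough Python sketch of the arithmetic: the Nmax rule of thumb quoted above, the near-square P/Q grids, and the Rmax/Rpeak ratio. The function names are made up for illustration. Rounding N down to a multiple of the block size NB is a common habit and an assumption here, not something stated in the thread. The 4 double-precision flops per cycle per core for Nehalem is also an assumption, though it reproduces the "around 72" Gflops peak Rahul mentions.

```python
import math


def hpl_nmax(total_ram_bytes, fraction=0.8, block_size=None):
    """Nmax = sqrt(fraction * RAM_bytes / 8), with 8 bytes per double.

    If block_size (HPL's NB) is given, round N down to a multiple of it.
    That rounding is an assumption of this sketch, not part of the formula.
    """
    n = int(math.sqrt(fraction * total_ram_bytes / 8))
    if block_size:
        n -= n % block_size
    return n


def pq_grids(nprocs):
    """All P x Q factorizations of nprocs with P <= Q, most square first."""
    grids = [(p, nprocs // p)
             for p in range(1, math.isqrt(nprocs) + 1)
             if nprocs % p == 0]
    return sorted(grids, key=lambda g: g[1] - g[0])


def rpeak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in Gflops: cores * clock (GHz) * flops per cycle."""
    return cores * ghz * flops_per_cycle


if __name__ == "__main__":
    # Hypothetical single node with 24 GiB of RAM (the thread gives no RAM size).
    print("Nmax:", hpl_nmax(24 * 2**30, block_size=192))
    # 8 MPI processes on one node: candidate grids, closest to square first.
    print("P x Q candidates:", pq_grids(8))
    # Rahul's node: 8 cores at 2.26 GHz, assuming 4 DP flops/cycle/core.
    peak = rpeak_gflops(8, 2.26, 4)
    print("Rpeak: %.2f Gflops, Rmax/Rpeak: %.1f%%" % (peak, 100 * 62 / peak))
```

Under those assumptions, 62 Gflops on the single Nehalem server works out to somewhere in the mid-80s percent of peak, in the same range as the 84% reported for the Opteron cluster.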
