[Beowulf] How Would You Test Infiniband in New Cluster?
Bill Broadley bill at cse.ucdavis.edu
Tue Nov 17 10:33:17 PST 2009
Jon Forrest wrote:
> Let's say you have a brand new cluster with
> brand new Infiniband hardware, and that
> you've installed OFED 1.4 and the
> appropriate drivers for your IB
> HCAs (i.e. you see ib0 devices
> on the frontend and all compute nodes).
> The cluster appears to be working
> fine but you're not sure about IB.
>
> How would you test your IB network
> to make sure all is well?

My first suggested sanity test would be to measure latency and bandwidth to ensure you are getting IB numbers. So 80-100 MB/sec and 30-60 us for a small packet would imply GigE, meaning the traffic is not actually going over IB. 6-8 times that bandwidth certainly would imply SDR or better. Latency varies quite a bit among implementations, so I'd try to get within 30-40% of the advertised latency numbers.

Then I'd try a workload that kept all nodes busy with something communications intensive. Pathscale has mpi_nxnlatbw, which works reasonably well to identify ports/nodes that are slower than expected.

After that works I'd suggest a production MPI workload with a known answer.
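As a rough illustration of the rules of thumb above, here is a minimal Python sketch that sorts a measured bandwidth/latency pair into "looks like GigE" or "looks like IB", checks a latency against an advertised figure, and flags nodes that are much slower than their peers. The function names, the 6 x 80 MB/sec cutoff, and the 1.5 x median tolerance are my own assumptions for the example; it does not parse mpi_nxnlatbw output, so you would have to feed it numbers from whatever benchmark you ran.

```python
import statistics

# Rules of thumb taken from the post; the exact cutoffs are assumptions.
GIGE_BW_MBPS = (80.0, 100.0)  # bandwidth range that suggests GigE
GIGE_LAT_US = (30.0, 60.0)    # small-packet latency range that suggests GigE
IB_BW_FACTOR = 6.0            # "6-8 times the bandwidth" suggests SDR or better
LATENCY_SLACK = 0.40          # aim for within 30-40% of advertised latency


def classify_link(bw_mbps, lat_us):
    """Return 'ib', 'gige', or 'unclear' for one measured node pair."""
    if bw_mbps >= IB_BW_FACTOR * GIGE_BW_MBPS[0]:
        return "ib"
    if bw_mbps <= GIGE_BW_MBPS[1] and lat_us >= GIGE_LAT_US[0]:
        return "gige"
    return "unclear"


def latency_ok(measured_us, advertised_us, slack=LATENCY_SLACK):
    """True if the measured latency is within `slack` of the advertised one."""
    return measured_us <= advertised_us * (1.0 + slack)


def slow_nodes(latency_by_node, tolerance=1.5):
    """Return the nodes whose latency exceeds tolerance * median latency.

    latency_by_node is a hypothetical dict of node name -> average latency
    in microseconds, filled in from your own benchmark results.
    """
    median = statistics.median(latency_by_node.values())
    return sorted(n for n, v in latency_by_node.items() if v > tolerance * median)
```

For example, classify_link(90, 45) falls in the GigE range, while a node reporting three times the median latency of its peers would show up in slow_nodes(). Anything that comes back "unclear" deserves a closer look by hand rather than a verdict from a script.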
