[Beowulf] How Would You Test Infiniband in New Cluster?
Bill Broadley bill at cse.ucdavis.edu
Tue Nov 17 14:46:43 PST 2009
Jon Forrest wrote:
> Bill Broadley wrote:
>
>> My first suggested sanity test would be to test latency and bandwidth
>> to ensure you are getting IB numbers. So 80-100MB/sec and 30-60us for
>> a small packet would imply GigE. 6-8 times the bandwidth certainly
>> would imply SDR or better. Latency varies quite a bit among
>> implementations; I'd try to get within 30-40% of advertised latency
>> numbers.
>
> For those of us who aren't familiar with IB utilities,
> could you give some examples of the commands you'd use
> to do this?
>
> Thanks,
> Jon

Here are two that I use:

  http://cse.ucdavis.edu/bill/relay.c
  http://cse.ucdavis.edu/bill/mpi_nxnlatbw.c

To compile, assuming a sane environment:

  mpicc -O3 relay.c -o relay

The command to run an MPI program varies with the MPI implementation and
the batch queue environment (especially with tight integration). It
should be something close to:

  mpirun -np <number of nodes> -machinefile <list of nodes> ./relay 1
  mpirun -np <number of nodes> -machinefile <list of nodes> ./relay 1024
  mpirun -np <number of nodes> -machinefile <list of nodes> ./relay 8192

You should see something like:

  c0-8 c0-22 size= 1, 16384 hops, 2 nodes in 0.75 sec ( 45.97 us/hop) 85 KB/sec
  c0-8 c0-22 size= 1024, 16384 hops, 2 nodes in 2.00 sec (121.94 us/hop) 32803 KB/sec
  c0-8 c0-22 size= 8192, 16384 hops, 2 nodes in 6.21 sec (379.05 us/hop) 84421 KB/sec

So basically, on a tiny packet you get 45us of latency (normal for GigE),
and on a large packet 84MB/sec or so (also normal for GigE).

I'd start with 2 nodes, then if you are happy try it with all nodes.

Now for InfiniBand you should see something like:

  c0-5 c0-4 size= 1, 16384 hops, 2 nodes in 0.03 sec ( 1.72 us/hop) 2274 KB/sec
  c0-5 c0-4 size= 1024, 16384 hops, 2 nodes in 0.16 sec ( 9.92 us/hop) 403324 KB/sec
  c0-5 c0-4 size= 8192, 16384 hops, 2 nodes in 0.50 sec ( 30.34 us/hop) 1054606 KB/sec

Note that the latency is some 25 times lower and the bandwidth some 10+
times higher.
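If you want to check the relay numbers automatically rather than by eye, something along these lines might do. This is only a minimal sketch: it assumes the output format shown in the samples above, the function names are made up for illustration (they are not part of relay.c), and the thresholds are simply the rule of thumb quoted earlier (30-60us and 80-100MB/sec suggest GigE, 6 times that bandwidth or more suggests SDR or better). Treating 1 MB as 1000 KB is also an assumption.

```python
import re

# Matches lines in the format of the sample relay output, e.g.
# "c0-8 c0-22 size= 1, 16384 hops, 2 nodes in 0.75 sec ( 45.97 us/hop) 85 KB/sec"
RELAY_LINE = re.compile(
    r"size=\s*(?P<size>\d+),\s*(?P<hops>\d+)\s+hops,\s*(?P<nodes>\d+)\s+nodes"
    r"\s+in\s+(?P<secs>[\d.]+)\s+sec\s*\(\s*(?P<us>[\d.]+)\s+us/hop\)"
    r"\s*(?P<kb>\d+)\s+KB/sec"
)

# Assumed thresholds, derived from the rule of thumb in the thread.
GIGE_MIN_LATENCY_US = 30.0   # small-packet latency at or above this looks like GigE
GIGE_MAX_MB_PER_SEC = 100.0  # large-packet bandwidth at or below this looks like GigE
IB_MIN_MB_PER_SEC = 6 * 80.0 # "6-8 times the bandwidth" of GigE


def parse_relay(text):
    """Return one dict per relay result line found in text."""
    results = []
    for line in text.splitlines():
        m = RELAY_LINE.search(line)
        if m:
            results.append({
                "size": int(m.group("size")),
                "us_per_hop": float(m.group("us")),
                "kb_per_sec": int(m.group("kb")),
            })
    return results


def classify(results):
    """Guess the interconnect from the smallest and largest message sizes."""
    if not results:
        return "no relay lines found"
    small = min(results, key=lambda r: r["size"])
    large = max(results, key=lambda r: r["size"])
    latency_us = small["us_per_hop"]
    mb_per_sec = large["kb_per_sec"] / 1000.0  # assumes 1 MB = 1000 KB
    if latency_us < GIGE_MIN_LATENCY_US and mb_per_sec >= IB_MIN_MB_PER_SEC:
        return "IB-like (SDR or better)"
    if latency_us >= GIGE_MIN_LATENCY_US or mb_per_sec <= GIGE_MAX_MB_PER_SEC:
        return "GigE-like"
    return "unclear"
```

Capture the output of the three relay runs into one string (or file) and pass it to parse_relay, then classify. A "GigE-like" answer on a cluster that is supposed to be using InfiniBand usually means MPI fell back to the Ethernet interface.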
Note that the hostnames are different: don't run multiple copies on the
same node unless you intend to. Running 4 copies on a 4 CPU node doesn't
test InfiniBand.

Once you get what you expect, I'd suggest something a bit more
comprehensive. Something like:

  mpirun -np <number of nodes> -machinefile <list of nodes> ./mpi_nxnlatbw

I'd expect some difference in latency and bandwidth between nodes, but
not any big differences. Something like:

  [0<->1] 1.85us 1398.825264 (MillionBytes/sec)
  [0<->2] 1.75us 1300.812337 (MillionBytes/sec)
  [0<->3] 1.76us 1396.205242 (MillionBytes/sec)
  [0<->4] 1.68us 1398.647324 (MillionBytes/sec)
  [1<->0] 1.82us 1375.550155 (MillionBytes/sec)
  [1<->2] 1.69us 1397.936020 (MillionBytes/sec)
  ...

Once those numbers are consistent and where you expect them (both
latency and bandwidth), I'd follow up with a production code that
produces a known answer and is likely to provide much wider MPI coverage.
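With more than a handful of nodes the pairwise listing gets long, so a small script to flag the odd pairs can help. The following is a rough sketch, assuming the mpi_nxnlatbw output format shown above. The function names and the 25% tolerance around the median are illustrative assumptions, not anything the program itself defines; pick a tolerance that matches what you consider a "big difference" on your fabric.

```python
import re
from statistics import median

# Matches entries such as "[0<->1] 1.85us 1398.825264 (MillionBytes/sec)"
PAIR_ENTRY = re.compile(
    r"\[(\d+)<->(\d+)\]\s*([\d.]+)us\s+([\d.]+)\s*\(MillionBytes/sec\)"
)


def parse_pairs(text):
    """Return (rank_a, rank_b, latency_us, million_bytes_per_sec) tuples."""
    return [
        (int(a), int(b), float(lat), float(bw))
        for a, b, lat, bw in PAIR_ENTRY.findall(text)
    ]


def flag_outliers(pairs, tolerance=0.25):
    """Return pairs whose latency is well above, or bandwidth well below,
    the median over all pairs. The tolerance is an assumed fraction."""
    if not pairs:
        return []
    median_latency = median(p[2] for p in pairs)
    median_bandwidth = median(p[3] for p in pairs)
    flagged = []
    for a, b, latency, bandwidth in pairs:
        too_slow = latency > median_latency * (1 + tolerance)
        too_narrow = bandwidth < median_bandwidth * (1 - tolerance)
        if too_slow or too_narrow:
            flagged.append((a, b, latency, bandwidth))
    return flagged
```

Using the median rather than the mean keeps one bad link or cable from dragging the reference value toward itself. A rank that shows up in most of the flagged pairs is the first node to look at.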
