[Beowulf] NAMD/CUDA scaling: QDR Infiniband sufficient?

Nifty Tom Mitchell niftyompi at niftyegg.com
Mon Feb 9 15:51:28 PST 2009


On Mon, Feb 09, 2009 at 03:37:06PM -0500, Dow Hurst DPHURST wrote:
> Subject: [Beowulf] NAMD/CUDA scaling: QDR Infiniband sufficient?
> 
>    Has anyone tested scaling of NAMD/CUDA over QLogic or ConnectX QDR
>    interconnects for a large number of IB cards and GPUs?  I've listened
>    to John Stone's presentation on VMD and NAMD CUDA acceleration.  The
>    consensus I brought away from the presentation was that one QDR per GPU
>    would probably be necessary to scale efficiently.  The 60 node, 60 GPU,
>    DDR IB enabled cluster that was used for initial testing was saturating
>    the interconnect.  Later tests on the new GT200 based cards show even
>    more performance gains for the GPUs.  One GPU performing the work of 12
>    CPU cores, or 8 GPUs equaling 96 cores, were the numbers I saw.  So with
>    a ratio of 1 GPU to 12 cores, interconnect performance will be very
>    important.
>    Thanks,
>    Dow

Because of the GPU aspect, I doubt any vendor has tested this.  The chassis
is also a consideration: it needs multiple PCI Express 2.0 x16 slots for the
GPU cards, an additional PCIe x16 slot for the IB card, and memory system
bandwidth to match.
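
For a rough sense of the numbers, here is a back-of-the-envelope sketch in
Python.  It assumes peak signaling rates with 8b/10b encoding (which PCIe 2.0
and DDR/QDR InfiniBand both use) and ignores protocol overhead, so real
throughput will be lower.  The helper names (data_rate_gbytes,
node_io_gbytes) and the "one HCA per node" layout are only illustrative
assumptions, not measurements from any cluster.

```python
# Peak one-way link bandwidths, assuming 8b/10b encoding and no protocol
# overhead. These are upper bounds, not measured throughput.

def data_rate_gbytes(lanes, gbaud_per_lane):
    """Peak one-way data rate in GB/s: 8 data bits per 10 line bits, 8 bits per byte."""
    return lanes * gbaud_per_lane * 8 / 10 / 8

LINKS = {
    "IB 4x DDR":    data_rate_gbytes(4, 5.0),    # 20 Gb/s signaling
    "IB 4x QDR":    data_rate_gbytes(4, 10.0),   # 40 Gb/s signaling
    "PCIe 2.0 x16": data_rate_gbytes(16, 5.0),   # 5 GT/s per lane
}

def node_io_gbytes(n_gpus):
    """Hypothetical worst case: every GPU slot plus one QDR HCA running at peak."""
    return n_gpus * LINKS["PCIe 2.0 x16"] + LINKS["IB 4x QDR"]

for name, rate in LINKS.items():
    print(f"{name:13s} {rate:4.1f} GB/s")

for n in (1, 2, 4):
    print(f"{n} GPU(s) + 1 QDR HCA: {node_io_gbytes(n):4.1f} GB/s peak I/O demand")
```

The point of the second loop is the "memory system bandwidth to match" part:
even before asking whether one QDR link per GPU is enough, the host has to
be able to feed every slot at once.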

There is more information in:
  http://www.ks.uiuc.edu/Research/gpu/files/nvision2008compbio_stone.pdf
  http://www.ncsa.uiuc.edu/Projects/GPUcluster/
  http://www.ncsa.uiuc.edu/~kindr/papers/lci09_paper.pdf






-- 
	T o m  M i t c h e l l 
	Found me a new hat, now what?



