[Beowulf] bandwidth to GPU

Igor Kozin i.n.kozin at googlemail.com
Fri May 21 08:15:06 PDT 2010

Hello everyone,

 I'm quite curious about the bandwidth to GPUs people are getting, especially
with NVIDIA C1060 or Fermi on Intel hosts with two 5520 chipsets. Using
bandwidthTest from the CUDA SDK and averaging the results over all cores and
GPUs (we have S1070), I'm getting with memory=pageable 3672 MB/s host to
device and 3023 MB/s device to host. With memory=pinned the numbers increase
to 5499 MB/s and 5291 MB/s respectively, which look okay to me.

On a two-chipset host, 1) there is an obvious asymmetry, giving low and
high numbers depending on affinity, and 2) worryingly, the pinned bandwidth
is a bit too low.


host to device: 3702/3716 MB/s
device to host: 2880/1807 MB/s

host to device: 5751/4709 MB/s
device to host: 3264/1873 MB/s
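
To put a number on the asymmetry, here is a minimal Python sketch that
expresses the slower figure of each pair above as a fraction of the faster
one. The "group 1"/"group 2" keys are only placeholder labels for the two
sets of numbers in the order they appear, and the asymmetry() helper is a
made-up name, not anything from bandwidthTest.

```python
# Bandwidth pairs in MB/s, copied from the two groups above in order.
# "group 1"/"group 2" are placeholder labels, not names from bandwidthTest.
pairs = {
    ("group 1", "host to device"): (3702, 3716),
    ("group 1", "device to host"): (2880, 1807),
    ("group 2", "host to device"): (5751, 4709),
    ("group 2", "device to host"): (3264, 1873),
}


def asymmetry(a, b):
    """Return the slower figure as a fraction of the faster one."""
    return min(a, b) / max(a, b)


ratios = {key: asymmetry(*values) for key, values in pairs.items()}

for (group, direction), ratio in ratios.items():
    print(f"{group}, {direction}: slow/fast = {ratio:.2f}")
```

A ratio close to 1 means affinity hardly matters for that direction; the
further below 1, the stronger the dependence on which core/GPU pairing is
used.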

If you happen to have numbers for ATI GPUs and/or AMD-based hosts, please
post them too.


