<div dir="ltr">I would not draw too many conclusions: the SPEC ACCEL score is just telling you the quality of the OpenACC compiler and the quality of the port.<div>For example, if you look at the results for CloverLeaf (I am familiar with this application and have other reference points), you have:</div><div>AMD/Pathscale: 3.13 specaccel_peak</div><div>NVIDIA/PGI: 3.45 specaccel_peak</div><div><br></div><div><br></div><div>Keeping the HW constant and changing the software (adding CUDA C and CUDA Fortran to the mix) gives you, for the 3840x3840 grid, the following average times per cell (measured in 10^-8 s):</div><div>OpenACC loops: 1.92</div><div>OpenACC kernels: 1.78</div><div>CUDA Fortran: 1.33</div><div>CUDA C: 1.25</div><div><br></div><div>Timing is on a K20c, but we are interested in the relative performance: the CUDA C/Fortran versions are roughly 30% faster than the OpenACC ones.</div><div>There is also an OpenCL implementation of CloverLeaf, but I don't have the results. It is probably in the same ballpark.<br></div><div>This is a "simple" CFD code with a regular access pattern, so a directive-based port gives you decent results.</div><div>You could try to run the OpenCL code on the AMD card and see how far the Pathscale compiler is from it, but I am expecting something similar.</div><div><br></div><div>OpenACC is an interesting option for people looking for high-level programming, but you usually pay a penalty. 
</div><div>How big the penalty is will depend on a lot of factors, and it is very difficult to generalize.</div><div><br></div><div>M</div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 4, 2015 at 12:26 PM, C Bergström <span dir="ltr"><<a href="mailto:cbergstrom@pathscale.com" target="_blank">cbergstrom@pathscale.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Thu, Mar 5, 2015 at 3:10 AM, Craig Tierney - NOAA Affiliate<br>

<<a href="mailto:craig.tierney@noaa.gov">craig.tierney@noaa.gov</a>> wrote:<br>

><br>

> It appears to me that the numbers posted on that page for the card you are<br>

> testing are with ECC off?  I know you are asking the question "what if", but<br>

> the current test isn't even apples-to-apples.<br>

<br>

</span>SPEC does allow you 1:1 comparisons. In this case we're not yet<br>

showing the gains I know we can achieve. I'm mostly trying to stir the<br>

pot to see the level of interest.<br>

<br>

Here's NVIDIA's best published result<br>

<a href="http://spec.org/accel/results/res2014q1/accel-20140303-00018.html" target="_blank">http://spec.org/accel/results/res2014q1/accel-20140303-00018.html</a><br>

compared to ours<br>

<a href="http://spec.org/accel/results/res2015q1/accel-20150218-00045.html" target="_blank">http://spec.org/accel/results/res2015q1/accel-20150218-00045.html</a><br>

<br>

The specific Intel CPU is less a factor if you're concerned about<br>

that. I could put this card in the exact same system NVIDIA used and<br>

show some decent performance. (That 3.8 GHz boost may in fact help more<br>

than anything)<br>

<div class="HOEnZb"><div class="h5">_______________________________________________<br>

Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>

To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf" target="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</a><br>

</div></div></blockquote></div><br></div>