Mark,<div>I am messing with a GPU-oriented cluster.</div><div><br class="webkit-block-placeholder"></div><div>I am now traveling to ISC, where I will show a sustained teraflop on a workstation with 4 Tesla cards, using VMD to do ion placement (list members going to Dresden: stop by the NVIDIA booth to see the demo in action). This computation used to take 100 CPU hours on an Altix and is now done in a matter of minutes. Yes, the whole system probably consumes 900W (the TDP of a Tesla is 170W, not 220W), but I can assure you that is nothing compared to a big Altix machine, and you can put it under your desk and do some real science.

</div><div><br class="webkit-block-placeholder"></div><div>Several groups are building GPU-oriented clusters. Once mine is completed (8 compute nodes, each with 2 Tesla boards), it should be accessible for testing to academic and research groups. People interested in testing their CUDA codes on a cluster can drop me an email.

</div><div><br class="webkit-block-placeholder"></div><div>On a side note, it is interesting to see all the speculation from people who have never used CUDA (and most of the time don't have a clue...) and at the same time to see quality software (mostly open source, like VMD, NAMD, SOFA) achieving pretty impressive results and enabling new science.

</div><div><br class="webkit-block-placeholder"></div><div><br class="webkit-block-placeholder"></div><div>Massimiliano</div><div>PS:  Usual disclaimer, I work in the GPU Computing group at NVIDIA. </div><div><br class="webkit-block-placeholder">

</div><div><br><br><div><span class="gmail_quote">On 6/21/07, <b class="gmail_sendername">Mark Hahn</b> &lt;<a href="mailto:hahn@mcmaster.ca">hahn@mcmaster.ca</a>&gt; wrote:</span><blockquote class="gmail_quote" style="margin:0;margin-left:0.8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi all,<br>is anyone messing with GPU-oriented clusters yet?<br><br>I'm working on a pilot which I hope will be something<br>like 8x workstations, each with 2x recent-gen gpu cards.<br>the goal would be to host cuda/rapidmind/ctm-type gp-gpu development.

<br><br>part of the motive here is just to create a gpu-friendly<br>infrastructure into which commodity cards can be added and<br>refreshed every 8-12 months.  as opposed to "investing" in<br>quadro-level cards which are too expensive to toss when obsoleted.

<br><br>nvidia's 1U tesla (with two g80 chips) looks potentially attractive,<br>though I'm guessing it'll be premium/quadro-priced - not really in<br>keeping with the hyper-moore's-law mantra...<br><br>if anyone has experience with clustered gp-gpu stuff, I'm interested

<br>in comments on particular tools, experiences, configuration of the host<br>machines and networks, etc.  for instance, is it naive to think that<br>gp-gpu is most suited to flops-heavy-IO-light apps, and therefore doesn't

<br>necessarily need a hefty (IB, 10Geth) network?<br><br>thanks, mark hahn.<br>_______________________________________________<br>Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a><br>To change your subscription (digest mode or unsubscribe) visit 

<a href="http://www.beowulf.org/mailman/listinfo/beowulf">http://www.beowulf.org/mailman/listinfo/beowulf</a><br></blockquote></div><br></div>