[Beowulf] GPU diagnostics?
Joe Landman landman at scalableinformatics.com
Mon Mar 30 16:09:31 PDT 2009
Greg Lindahl wrote:
> On Mon, Mar 30, 2009 at 06:31:17PM -0400, Joe Landman wrote:
>
>> This said, there really isn't a memory checker for GPUs just yet. Could
>> be done, and probably should be ...
>
> But will it be like memtest86, which isn't as good as HPL at finding
> problems? If you've got DGEMM for your GPU, you're there.

Heh... I erased the paragraph where I tore into using memtest* as anything other than a gross checker ... felt it wasn't too relevant.

We run a few parallel codes as our testers. Beats the heck out of the system (you can hear the fans spin up on variable speed systems). Specifically, we purposefully (computationally) overload the unit and make sure we don't throw EDACs/MCEs.

Yeah, *GEMM is good (some GPU cards don't do DGEMMs on them though ... older nVidia/ATI don't). Too bad Cuda won't run on the ATIs. Would really make maintaining this thing easy.

If people can live with SGEMMs, and other FFT-like things, we can probably leverage (and make available) an older code we used a while ago. Actually, for another project, we just did a DGETF and a few other ports.

Let me know if you want me to clean it up and make it available.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
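
For readers who want a concrete picture of the "use *GEMM as a diagnostic" idea discussed above, here is a minimal sketch. It is not the code Joe refers to. It repeats one matrix multiply on fixed inputs and flags any iteration whose result differs bit-for-bit from the first, which is one simple way a compute-heavy kernel can expose flaky hardware. The names `make_matrix`, `gemm`, `digest`, and `stress` are hypothetical, and the pure-Python `gemm` is only a stand-in: on a real system the `multiply` argument would wrap a device SGEMM/DGEMM call. The check also assumes that the multiply is bitwise reproducible from run to run, which may not hold for every GPU library configuration.

```python
import hashlib
import random
import struct


def make_matrix(n, seed):
    """Build an n x n matrix of reproducible pseudo-random floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def gemm(a, b):
    """Naive C = A * B; a CPU stand-in for a device SGEMM/DGEMM call."""
    bt = list(zip(*b))  # transpose B so each column can be walked as a tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def digest(c):
    """Hash the exact bit patterns of the result so a single flipped bit shows up."""
    h = hashlib.sha256()
    for row in c:
        h.update(struct.pack(f"{len(row)}d", *row))
    return h.hexdigest()


def stress(n=32, iterations=5, seed=1234, multiply=gemm):
    """Repeat the same multiply; return the iterations that disagree with the first run."""
    a = make_matrix(n, seed)
    b = make_matrix(n, seed + 1)
    reference = digest(multiply(a, b))
    return [i for i in range(1, iterations) if digest(multiply(a, b)) != reference]


if __name__ == "__main__":
    bad = stress()
    print("mismatching iterations:", bad if bad else "none")
```

A check like this only catches errors that change the computed result. It complements, rather than replaces, watching for EDAC/MCE reports while the machine is under load.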
