[Beowulf] hpl - large problems fail
Guy Coates gmpc at sanger.ac.uk
Thu Mar 10 14:11:37 PST 2005
- Previous message: [Beowulf] hpl - large problems fail
- Next message: Clarification: [Beowulf] hpl - large problems fail
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Thu, 10 Mar 2005, Paul Johnson wrote:

> All:
>
> I have a 4 node cluster(dont snicker :) )

Everyone starts off small.

> and Im trying to do some
> benchmarking with HPL. I want to test 2 of the nodes with 1Gb of
> ram each. I calculated the maximum problem size that can fit in 2Gb
> and still allow for memory for the operating system. That came out to
> be around 14500x14500. When I run that size of a test it always fails.
> The largest problem that I can test and not have it fail on me is
> 12500x12500.
> What is the reason behind this? Im confused on what is going on here.
> Thanks for any help.

Do you know what actually caused the failure?

If your problem size was too big, and you really are out of memory, you
should see messages in the system log saying that the out-of-memory
killer was activated and HPL was zapped.

If you know your machines were not actually out of memory, then you have
broken hardware on one of your nodes. Run memtest86+ or memtest86 on your
nodes (possibly the world's most useful pieces of diagnostic software).

http://www.memtest86.com
http://www.memtest.org

If you haven't seen it, IBM have a redpaper on tuning HPL, which gives
some good starting parameters, problem-sizing tips and an overview of
the different BLAS libraries you can compile against to get that extra
few Gflops of performance.

Cheers,

Guy

-- 
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
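As a rough check on the problem sizes quoted above, the arithmetic can be sketched in a few lines of Python. HPL factors a dense N x N matrix of 8-byte doubles, so the matrix alone needs about 8*N^2 bytes spread across the nodes. The function names and the 0.8 memory fraction below are assumptions for illustration (a commonly used rule of thumb), not figures from this thread, and HPL needs some workspace beyond the matrix itself.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL works on a double-precision matrix


def matrix_bytes(n):
    """Bytes needed to hold an n x n matrix of doubles (matrix only)."""
    return n * n * BYTES_PER_DOUBLE


def max_problem_size(total_bytes, fraction=0.8):
    """Largest n whose matrix fits in `fraction` of total_bytes.

    `fraction` is an assumed rule-of-thumb value that leaves room for
    the operating system and HPL workspace; tune it for your own nodes.
    """
    usable_doubles = int(total_bytes * fraction) // BYTES_PER_DOUBLE
    return math.isqrt(usable_doubles)


if __name__ == "__main__":
    total = 2 * 1024**3  # two nodes with 1 GiB each (assumed binary units)
    for n in (12500, 14500):
        share = matrix_bytes(n) / total
        print(f"N={n}: {matrix_bytes(n)} bytes, {share:.0%} of total RAM")
    print("N at 80% of RAM:", max_problem_size(total))
```

Under these assumptions N=14500 needs about 1.68e9 bytes, roughly 78% of 2 GiB, while N=12500 needs 1.25e9 bytes, roughly 58%. The larger size sits close to the point where the OS and other processes may no longer fit, which is one reason to check the system log before suspecting hardware.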
