[Beowulf] hang-up of HPC Challenge
Chris Samuel csamuel at vpac.org
Sun Sep 7 01:58:32 PDT 2008
- Previous message: [Beowulf] Q: AMD Opteron (Barcelona) 2356 vs Intel Xeon 5460
- Next message: [Beowulf] Re: Re: GPU boards and cluster servers.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
----- "Mikhail Kuzminsky" <kus at free.net> wrote:

Hi Mikhail,

Sorry for the delay in getting back to you, work has been keeping me very occupied!

> In message from Chris Samuel <csamuel at vpac.org> (Wed, 20 Aug 2008
> 11:12:52 +1000 (EST)):
>
> >Does the code crash, does it just stop & idle, does it
> >busy loop, does the node oops, does it lockup, etc ?
>
> I beleive that program crash is not hangup. When I wrote
> about Linux hangup, I means that Linux don't response to
> any interrupts - from keyboard, from ssh client requests etc.

That really sounds like you're hitting either a kernel or a hardware issue. It might be worth trying out the BreakIn tool that Jason posted about elsewhere on the list:

http://www.advancedclustering.com/software/breakin.html

> I use 2.6.22.5-31 kernel from SuSE 10.3 distribution.

That's pretty old now. I'd strongly suggest trying out the current mainline kernel on there; this works pretty well on our SuperMicro based Barcelona cluster.

cheers!
Chris

--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
