[Beowulf] Problems with a JS21 - Ah, the networking...
john.hearns at streamline-computing.com
Sat Sep 29 02:22:47 PDT 2007
On Fri, 2007-09-28 at 17:43 -0300, Ivan Paganini wrote:
> Hello everybody,
> I am beginning to take care of an IBM JS21. The cluster consists of
> The Myrinet connection was working correctly, but sometimes a user program
> just got stuck: one of the processes was sleeping while all the others
> were running, and then the program hung.
> Any suggestions?
Contact Myricom support?
BTW, if you are doing the debugging by yourself, start from the bottom.
Take two machines and run mx_info, mx_endpoint (it should show nothing if
no programs are running) and mx_counters on each.
Then do your pingpong and further stress tests as in the README.
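
A rough sketch of that first step, in case it helps to script it. It assumes
passwordless ssh to both nodes and that the MX tools are on the remote PATH;
the host names are made up, and the tool names are simply the ones mentioned
above, run with no options, so adjust them to whatever your MX install
actually provides.

```python
import shlex
import subprocess

# Tool names as written in this post; run without arguments because the
# available options depend on the installed MX release (assumption).
MX_TOOLS = ["mx_info", "mx_endpoint", "mx_counters"]


def build_checks(hosts, tools=MX_TOOLS):
    """Return one ssh command (as an argv list) per host/tool pair."""
    return [["ssh", host, tool] for host in hosts for tool in tools]


def run_checks(hosts, dry_run=True):
    """Print each command; with dry_run=False also run it and show its output."""
    results = []
    for argv in build_checks(hosts):
        print("$ " + " ".join(shlex.quote(a) for a in argv))
        if dry_run:
            continue
        proc = subprocess.run(argv, capture_output=True, text=True)
        print(proc.stdout or proc.stderr)
        results.append((argv, proc.returncode))
    return results


if __name__ == "__main__":
    # Hypothetical host names; dry run only prints the commands.
    run_checks(["node01", "node02"])
```

Run it once with dry_run=True to see what would be executed, then with
dry_run=False on an idle pair of nodes and compare the two nodes' output
before moving on to the pingpong test.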