[Beowulf] debugging

Matt Funk mafunk at nmsu.edu
Mon Apr 9 10:30:32 PDT 2007


Hi,

I hope this is the right mailing list to post to...

Anyway, I was wondering if I could get some advice/direction on how to debug 
my MPICH program. I am running on a Scyld configuration. What I am trying 
right now is the following:

mpirun -dbg=gdb -nolocal -np 32 exec

This starts the debugger, in which I type
run args

which then starts the program. However, it doesn't get very far before it just 
sits there. When I run ps, all the processes show up as defunct.

When I do the same thing with a single process, i.e.
mpirun -dbg=gdb -nolocal -np 1 exec
and run it in the debugger, the program starts running well.

The reason I want to run on 32 processors, though, is that it takes several 
hours (on 32 procs) until my program crashes. Also, I would like to keep the 
conditions under which it crashes intact as much as possible 
(i.e. run on 32 procs rather than 1).

Does anyone have any advice? I am open to trying out other things as well if 
possible. I am just starting to learn debugging techniques for parallel 
programs.

thanks
mat


