Parallel debug tools for Beowulf
lusk at mcs.anl.gov
Fri Dec 15 11:25:29 PST 2000
| I believe that the only *parallel* debugger that's possible is TotalView,
| which is a commercial product. There is a free debugger called P2D2 but I
| don't know any details about it. Of course, it's always possible to use gdb
| on any node if you know which one's going to crash.
The mpd process manager distributed with MPICH, together with the ch_p4mpd
device in MPICH, can cooperate to make a *sort* of parallel debugger for MPI
programs out of multiple gdb's by managing stdio. If you say
    mpigdb -np 5 a.out
instead of
    mpirun -np 5 a.out
then you will get five gdb's started, each debugging a.out. The 'z' command
(the only single letter not used by gdb) can be used to switch stdin from
broadcasting to all the gdb's to being directed at a specific rank. Thus
you can single step them all together or separately, and look at variable
values (they will be labelled with the rank of the process), etc., using
normal gdb commands. We find this hack useful enough to actually debug with,
at least at small scale. The output merging necessary before this can be
scalable is in the works. Highly preliminary, but we like it.
It doesn't compare with TotalView, which is a *real* parallel debugger.
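
To make the stdin switching concrete, here is a minimal Python sketch of the routing idea: each typed line goes either to every rank or to one selected rank, and output comes back labelled with the rank it came from. This is not the mpd code. The names StdinRouter and label are hypothetical, and the exact 'z' syntax used here ("z <rank>" to pick a rank, bare "z" to go back to broadcasting) is an assumption made for illustration.

```python
from typing import List, Optional, Tuple


class StdinRouter:
    """Route typed lines to all ranks (broadcast) or to one selected rank."""

    def __init__(self, nranks: int) -> None:
        self.nranks = nranks
        self.target: Optional[int] = None  # None means broadcast to every rank

    def route(self, line: str) -> List[Tuple[int, str]]:
        """Return the (rank, line) deliveries for one line of user input."""
        words = line.split()
        if words and words[0] == "z":
            # Assumed syntax: "z <rank>" selects a rank, bare "z" restores broadcast.
            if len(words) == 1:
                self.target = None
            else:
                rank = int(words[1])
                if not 0 <= rank < self.nranks:
                    raise ValueError(f"no such rank: {rank}")
                self.target = rank
            return []  # the switch itself is never forwarded to a debugger
        ranks = range(self.nranks) if self.target is None else [self.target]
        return [(r, line) for r in ranks]


def label(rank: int, output: str) -> str:
    """Prefix every line of one rank's output with that rank."""
    return "\n".join(f"{rank}: {text}" for text in output.splitlines())


if __name__ == "__main__":
    router = StdinRouter(5)
    for cmd in ["break main", "z 3", "print x", "z", "step"]:
        print(cmd, "->", [r for r, _ in router.route(cmd)])
```

In a real tool each delivery would be written to the stdin of the gdb attached to that rank, and each gdb's stdout would pass through something like label() before reaching the terminal.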