[Beowulf] debugging

Matt Funk mafunk at nmsu.edu
Thu Apr 12 16:42:11 PDT 2007


Thanks for all the replies, first of all.

I don't know the exact Scyld distribution. However, I am running MPICH 1.2.5.
When I run my program (stripped down to a mere MPI_INIT(...) call) and test it 
with valgrind, I get something like:


==21799== Use of uninitialised value of size 8
==21799==    at 0x56F0252: vfprintf (in /lib64/libc-2.3.2.so)
==21799==    by 0x570C844: vsprintf (in /lib64/libc-2.3.2.so)
==21799==    by 0x56F7B69: sprintf (in /lib64/libc-2.3.2.so)
==21799==    by 0x4F81755: net_create_slave (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F810B8: create_remote_processes (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F7D37A: p4_startup (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F7D1CC: p4_create_procgroup (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F9383B: MPID_P4_Init (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F9271B: MPID_CH_InitMsgPass (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4F87691: MPID_Init (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4FA32B3: MPIR_Init (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)
==21799==    by 0x4FA2F68: PMPI_Init (in /usr/lib64/MPICH/p4/gnu/libmpich-gnu.so.1.0)

I think this is a problem with the MPI distribution. Does anyone have any 
experience with building a new MPI library on a Scyld machine? Should it be 
straightforward?
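
If the report really does originate inside libmpich rather than in user code, one way to keep it from cluttering the valgrind output while debugging the actual program is a suppression entry. The following is only a sketch: the frame names are copied from the trace above, while the entry name and the file it would live in are made up for illustration. It hides the report and does not fix whatever the library is doing.

```
# Hypothetical suppression for the uninitialised-value report raised
# during MPICH p4 startup. Frames are listed innermost first, matching
# the order of the trace.
{
   mpich_p4_startup_uninit_value8
   Memcheck:Value8
   fun:vfprintf
   fun:vsprintf
   fun:sprintf
   fun:net_create_slave
   fun:create_remote_processes
   fun:p4_startup
}
```

Saved to a file, this can be passed to valgrind through its --suppressions=<file> option. The entry only matches stacks whose innermost frames are exactly these, so uninitialised-value reports from other call paths still show up.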

thanks
mat



