[Beowulf] Explanation of error message in MPICH-1.2.7
Bill Rankin wrankin at ee.duke.edu
Mon Oct 16 06:07:01 PDT 2006
I often see this error when a MPI_Barrier() call is not placed in front of the MPI_Finalize(). One of the processes exits early and MPICH doesn't like that too much.

-b

On Oct 6, 2006, at 3:50 PM, Jeffrey B. Layton wrote:

> Afternoon cluster fans,
>
> I'm working with a CFD code using the PGI 6.1 compilers and
> MPICH-1.2.7. The code runs fine for a while but I get an error
> message that I've never seen before:
>
> [2] MPI Internal Aborting program Deep nest in Check_incoming
> [2] Deep nest in Check_incoming
>
> This error message is in the error file from PBS. The output from
> the code gives the following:
>
> p2_15458: p4_error: : 1
> p5_21530: p4_error: net_recv read: probable EOF on socket: 1
> p7_21548: p4_error: net_recv read: probable EOF on socket: 1
> p6_21539: p4_error: net_recv read: probable EOF on socket: 1
> rm_l_6_21544: (95.492188) net_send: could not write to fd=5, errno = 32
> rm_l_2_15464: (95.835938) net_send: could not write to fd=5, errno = 32
> rm_l_5_21535: (95.574219) net_send: could not write to fd=5, errno = 32
> rm_l_7_21553: (95.410156) net_send: could not write to fd=5, errno = 32
>
> The code runs fine with other MPI implementations (Scali,
> MVAPICH, etc.) My googling efforts haven't yielded anything.
> Does anyone have any input on this?
>
> Thanks!
>
> Jeff
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
