p4_error: net_recv read:

Gerard Gorman g.gorman at ic.ac.uk
Mon May 14 04:48:53 PDT 2001


Hi,

I'm seeing the following error on our cluster while using MPICH 1.2
(Alphas running OSF/1, connected via a switch):

rm_l_5_3390:  p4_error: net_recv read:  probable EOF on socket: 1

I have been searching the archives and the net for insight into the
problem, but all I have found is people reporting the same problem
(under Linux).

The problem arises only when I run on more than 4 processors. The
processes are all *reading* the same file, which is NFS-mounted across
the nodes (I have included the usual perror checks).
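
For illustration, checks of this kind might look roughly like the sketch
below. It is not the actual code from the failing program: the helper
name read_whole_file and its signature are assumptions made up for the
example, and it only shows open/read/close being checked with perror,
with short reads and EINTR handled.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the original program): read up to `len`
 * bytes from `path` into `buf`. Returns the number of bytes read, or -1
 * after reporting the failing call via perror. */
ssize_t read_whole_file(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry the read */
            perror("read");
            close(fd);
            return -1;
        }
        if (n == 0)
            break;              /* end of file reached */
        total += (size_t)n;     /* short read: keep going */
    }

    if (close(fd) < 0) {
        perror("close");
        return -1;
    }
    return (ssize_t)total;
}
```

Each process would call this with the path of the NFS-mounted file and a
buffer; a return value of -1 means perror has already printed the reason.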

Has anyone experienced similar problems, or does anyone know what I
should be trying to fix?

All help appreciated,
g


----------------------------------------------------------
Gerard Gorman (PhD Student)     
Applied Modelling and Computation Group
T. H. Huxley School             
Imperial College
Prince Consort Road             Tel. 00 44 (0)207 594 9323
London SW7 2BP                  Fax. 00 44 (0)207 594 9321
U.K.                    o~o A good slogan beats a good solution.
-----------------------w-v-w------------------------------




