[Beowulf] TCP connect error: ECONNREFUSED.


Jörg Saßmannshausen jorg.sassmannshausen at strath.ac.uk
Mon Mar 30 06:14:50 PDT 2009


Dear all,

I am having a rather annoying problem with the parallel execution of 
one of our programs (GAMESS, US version) on our cluster. The error 
message is:

  TCP connect error: ECONNREFUSED.
  TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
  A fatal error occurred on DDI Process 0.
  TCP connect error: ECONNREFUSED.
  TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
  A fatal error occurred on DDI Process 60.
  TCP connect error: ECONNREFUSED.
  TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
  A fatal error occurred on DDI Process 2.
  TCP connect error: ECONNREFUSED.

[ ... ]

Eventually, ddikick tips over and the whole thing crashes. The 
program is using rsh (yes, I know, security, I did not install the 
cluster!) and I can rsh comp10 -> comp02. There is no firewall 
installed between the nodes (at least, not that I am aware of). Running 
the same job with the same number of nodes fails X times and then 
suddenly works on attempt X+1. I could not work out a pattern for that 
(other than that I get exponentially annoyed). Right now, there is only 
one gigabit network connecting the cluster, so NFS, MPI etc. are all 
running over one interface (again, I did not set up the cluster).

I have run out of ideas of where to look. I checked some nodes with 
netstat (as quickly as possible), and the ddikick program is actually 
running. Changing to ssh did not solve the problem.
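
One way to put numbers on the "fails X times, works at X+1" behaviour 
would be to probe a TCP port on the target node repeatedly and record 
the outcome of each attempt. The following is only a minimal sketch in 
Python, not part of GAMESS or DDI: the function name probe and its 
parameters are made up for illustration, and since the port in the 
error message (42208) is presumably assigned per run, the host and port 
are placeholders for whatever service is known to be listening.

```python
import errno
import socket
import time


def probe(host, port, attempts=5, delay=0.5, timeout=2.0):
    """Try to open a TCP connection `attempts` times.

    Returns one string per attempt: "ok" on success, otherwise the
    symbolic errno name (e.g. "ECONNREFUSED") or the exception class
    name when no errno is available (e.g. on a timeout).
    """
    results = []
    for i in range(attempts):
        try:
            # create_connection resolves the host name and connects.
            with socket.create_connection((host, port), timeout=timeout):
                results.append("ok")
        except OSError as exc:
            fallback = type(exc).__name__
            if exc.errno is None:
                results.append(fallback)
            else:
                results.append(errno.errorcode.get(exc.errno, fallback))
        if i < attempts - 1:
            time.sleep(delay)
    return results


# Hypothetical usage from comp10, with host and port as placeholders:
#   print(probe("comp02.chem.strath.ac.uk", 42208, attempts=20))
```

A mix of "ok" and "ECONNREFUSED" for the same host and port would 
suggest that the listener comes and goes (or is not yet up when the 
connection is attempted), whereas timeouts would point more towards 
the network itself.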

I would appreciate any feedback, as it is highly annoying to wait Y days 
for the job to start running, only for it to crash.

All the best from Glasgow!

Jörg


-- 
*************************************************************
Jörg Saßmannshausen
Research Fellow
University of Strathclyde
Department of Pure and Applied Chemistry
295 Cathedral St.
Glasgow
G1 1XL

email: jorg.sassmannshausen at strath.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html





