[Beowulf] TCP connect error: ECONNREFUSED.
Jörg Saßmannshausen jorg.sassmannshausen at strath.ac.uk
Mon Mar 30 06:14:50 PDT 2009
Dear all,

I am having a rather annoying problem with the parallel execution of one of our programs (GAMESS, US version) on our cluster. The error message is:

    TCP connect error: ECONNREFUSED.
    TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
    A fatal error occurred on DDI Process 0.
    TCP connect error: ECONNREFUSED.
    TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
    A fatal error occurred on DDI Process 60.
    TCP connect error: ECONNREFUSED.
    TCP: Connect failed. comp10 -> comp02.chem.strath.ac.uk:42208.
    A fatal error occurred on DDI Process 2.
    TCP connect error: ECONNREFUSED.
    [ ... ]

Eventually ddikick tips over and the whole thing crashes.

The program is using rsh (yes, I know, security; I did not install the cluster!). I can rsh from comp10 to comp02, and there is no firewall installed between the nodes (at least, not that I am aware of).

Trying to run the same job with the same number of nodes will fail X times and then suddenly work at attempt X+1. I could not work out a pattern for that (other than that I get exponentially annoyed). Right now there is only one gigabit network connecting the cluster, so NFS, MPI etc. are all running over one interface (again, I did not set up the cluster).

I have run out of ideas of where to look. I checked some nodes with netstat (as quickly as possible), and the ddikick program is actually running. Changing to ssh did not solve the problem.

I would appreciate any feedback, as it is highly annoying to wait Y days for the job to start running, only for it to crash.

All the best from Glasgow!

Jörg

-- 
*************************************************************
Jörg Saßmannshausen
Research Fellow
University of Strathclyde
Department of Pure and Applied Chemistry
295 Cathedral St.
Glasgow G1 1XL

email: jorg.sassmannshausen at strath.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
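
One way to check whether the refusals are intermittent at the TCP level, independently of GAMESS, might be to attempt the same connection repeatedly and tally the results. The sketch below is only an illustration: the function name `probe` and its parameters are hypothetical, and it assumes something is expected to be listening on the target port while the test runs (the DDI port numbers appear to change between runs, so the port has to be taken from a live job or a test listener). ECONNREFUSED generally means the remote host answered but nothing was accepting connections on that port at that moment.

```python
import errno
import socket
import time
from collections import Counter


def probe(host, port, attempts=20, delay=0.1, timeout=2.0):
    """Try to open a TCP connection `attempts` times and tally the outcomes.

    Returns a Counter keyed by "ok" or by the symbolic errno name
    (for example "ECONNREFUSED"); errors without an errno are keyed
    by the exception class name.
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            # create_connection resolves the host and connects; the socket
            # is closed again as soon as the with-block exits.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            outcomes[name] += 1
        time.sleep(delay)
    return outcomes


# Hypothetical usage, run from comp10 while a listener is up on comp02:
#   print(probe("comp02.chem.strath.ac.uk", 42208))
```

A mix of "ok" and "ECONNREFUSED" entries in the result would suggest the listener is not always up when the connection is attempted, rather than a name-resolution or routing problem.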
