[Beowulf] picking out a job scheduler

Nathan Moore ntmoore at gmail.com
Tue Jan 2 20:55:10 PST 2007


Torque was really easy to install, but it seems like my /etc/hosts  
file must be screwed up, as I can't get the cluster nodes to  
respond.  Specifically, within a cluster of 3 machines, each having  
an /etc/hosts file of:

	127.0.0.1       localhost.localdomain   localhost
	199.17.152.17   runner
	199.17.152.135  muscovey
	199.17.152.13   pekin
	(( other workstations follow ))

Now, when I have the pbs_server running on runner, and the pbs_mom
daemons running on muscovey, pekin, and runner, I get the following
status message:

	[root@runner torque-2.1.6]# pbsnodes -a
	pekin
	     state = down
	     np = 1
	     ntype = cluster

	muscovey
	     state = down
	     np = 1
	     ntype = cluster

	runner
	     state = down	
	     np = 1
	     ntype = cluster

I realize this is a pretty low-level question, but what the heck is  
wrong with my /etc/hosts file?
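
One way to sanity-check a hosts file like the one quoted above is a short script. The sketch below is only an illustration, not a Torque tool: the function names `parse_hosts` and `check_nodes` are made up for this example, and it checks just three things (each node has an entry, only one address, and that address is not loopback). The file as posted passes all three checks, so these tests alone do not explain the "down" state.

```python
import ipaddress

# Hosts entries copied from the message above.
HOSTS_TEXT = """\
127.0.0.1       localhost.localdomain   localhost
199.17.152.17   runner
199.17.152.135  muscovey
199.17.152.13   pekin
"""


def parse_hosts(text):
    """Map each hostname or alias to the list of addresses it appears under."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(address)
    return table


def check_nodes(table, nodes):
    """Return a list of human-readable problems for the given node names."""
    problems = []
    for node in nodes:
        addresses = table.get(node, [])
        if not addresses:
            problems.append(f"{node}: no entry")
        elif len(set(addresses)) > 1:
            problems.append(f"{node}: conflicting addresses {sorted(set(addresses))}")
        elif ipaddress.ip_address(addresses[0]).is_loopback:
            problems.append(f"{node}: mapped to loopback {addresses[0]}")
    return problems


if __name__ == "__main__":
    table = parse_hosts(HOSTS_TEXT)
    problems = check_nodes(table, ["runner", "muscovey", "pekin"])
    for problem in problems or ["no problems found"]:
        print(problem)
```

A common failure this would catch is a node's own name listed on the 127.0.0.1 line, which makes the node resolve itself to loopback.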

regards,

NT


P.S. The troubleshooting message given by Torque is:

	[root@runner torque-2.1.6]# momctl -d 3

	Host: runner/runner   Version: 2.1.6
	WARNING:  server not specified (set $pbsserver)
	PID:                    30531
	HomeDirectory:          /var/spool/torque/mom_priv
	MOM active:             2518 seconds
	Server Update Interval: 45 seconds
	LOGLEVEL:               0 (use SIGUSR1/SIGUSR2 to adjust)
	Communication Model:    RPP
	TCP Timeout:            20 seconds
	NOTE:  no prolog configured
	Alarm Time:             0 of 10 seconds
	Trusted Client List:    199.17.152.17,127.0.0.1
	Configured to use /usr/bin/scp -rpB
	NOTE:  no local jobs detected

	diagnostics complete
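
The diagnostic output above mixes plain settings with WARNING and NOTE lines, and the single WARNING (server not specified, with the hint to set $pbsserver) is easy to miss. As a rough sketch, the hypothetical helper below (`scan_diagnostics` is a name invented here, not part of Torque) separates the three kinds of line, assuming the simple "KEY: value" layout seen in this one sample.

```python
# A few lines copied from the momctl output above.
MOMCTL_OUTPUT = """\
Host: runner/runner   Version: 2.1.6
WARNING:  server not specified (set $pbsserver)
PID:                    30531
HomeDirectory:          /var/spool/torque/mom_priv
NOTE:  no prolog configured
Trusted Client List:    199.17.152.17,127.0.0.1
Configured to use /usr/bin/scp -rpB
NOTE:  no local jobs detected

diagnostics complete
"""


def scan_diagnostics(text):
    """Split 'KEY: value' lines into warnings, notes, and other settings."""
    warnings, notes, settings = [], [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines and free-form lines without a key
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if key == "WARNING":
            warnings.append(value)
        elif key == "NOTE":
            notes.append(value)
        else:
            settings[key] = value
    return warnings, notes, settings


if __name__ == "__main__":
    warnings, notes, settings = scan_diagnostics(MOMCTL_OUTPUT)
    for warning in warnings:
        print("WARNING:", warning)
    print("trusted clients:", settings.get("Trusted Client List"))
```

Run on the sample, this surfaces the one warning and shows that the trusted client list holds only runner's own address and loopback.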



- - - - - - - - - - - - - - - - - - - - - - -

Nathan Moore
Physics
Winona State University
nmoore at winona.edu
AIM:nmoorewsu

- - - - - - - - - - - - - - - - - - - - - - -


On Jan 2, 2007, at 7:23 PM, Chris Samuel wrote:

On Wednesday 03 January 2007 08:06, Chris Dagdigian wrote:

> Both should be fine although if you are considering *PBS you should
> look at both Torque (a fork of OpenPBS I think)

That's correct, it (and ANU-PBS, another fork) seem to be the de facto
queuing systems in the state and national HPC centers down here.

Torque is just *so* much better than OpenPBS used to be (not that it was
particularly hard).

cheers,
Chris
-- 
  Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
  Victorian Partnership for Advanced Computing http://www.vpac.org/
  Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf



