[Beowulf] Fwd: warewulf - cannot log into nodes

Gus Correa gus at ldeo.columbia.edu
Tue Nov 27 10:56:10 PST 2012


On 11/27/2012 01:52 PM, Gus Correa wrote:
> On 11/27/2012 02:14 AM, Duke Nguyen wrote:
>> On 11/27/12 1:44 PM, Christopher Samuel wrote:
>>>
>>> On 27/11/12 15:51, Duke Nguyen wrote:
>>>
>>>> Thanks! Yes, I am trying to get the system to work with
>>>> Torque/Maui/OpenMPI now.
>>> Make sure you build Open-MPI with support for Torque's TM interface,
>>> that will save you a lot of hassle as it means mpiexec/mpirun will
>>> find out directly from Torque what nodes and processors have been
>>> allocated for the job.
>>
>> Christopher, how would I check that? I got Torque/Maui/OpenMPI up,
>> working with root (not with normal user yet :( !!!), tried mpirun and it
>> worked fine:
>>

PS - Do 'qsub myjob' as a regular user, not as root.

>> # /usr/lib64/openmpi/bin/mpirun -pernode --hostfile
>> /home/mpiwulf/.openmpihostfile /home/mpiwulf/test/mpihello
>> Hello world! I am process number: 3 on host node0118
>> Hello world! I am process number: 1 on host node0104
>> Hello world! I am process number: 0 on host node0103
>> Hello world! I am process number: 2 on host node0117
>>
>> Thanks,
>>
>> D.
>
> D.
>
> Try to omit the hostfile from your mpirun command line,
> put it inside a Torque/PBS script, and submit it with qsub.
> Like this:
>
> *********************************
> myPBSScript.tcsh
> *********************************
> #! /bin/tcsh
> # Assuming your Torque 'nodes' file has np=8:
> #PBS -l nodes=2:ppn=8
> #PBS -q batch@mycluster.mydomain
> #PBS -N hello
> @ NP = `cat $PBS_NODEFILE | wc -l`
> mpirun -np ${NP} ./mpihello
> *********************************
>
> $ qsub myPBSScript.tcsh
>
>
> If OpenMPI was built with Torque support,
> the job will run on the nodes/processors allocated by Torque.
> [The nodes/processors are listed in $PBS_NODEFILE,
> but you don't need to refer to it in the mpirun line if
> OpenMPI was built with Torque support. If OpenMPI lacks
> Torque support, then you can use $PBS_NODEFILE as your hostfile:
> mpirun -hostfile $PBS_NODEFILE.]
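
The choice described in the bracketed note above can be sketched as a small shell helper. This is only an illustration, not part of the original script: the function names count_slots and build_mpirun_cmd and the yes/no flag are made up here, and the helper only prints the mpirun line instead of running it.

```shell
#!/bin/sh
# Sketch: build the mpirun command line from a Torque node file.
# Assumption: the caller passes "yes" as the third argument when
# Open MPI was built with Torque (tm) support, anything else otherwise.

# Count the slots Torque allocated: the node file has one line per processor.
count_slots() {
    # Reading via stdin makes wc print only the number; tr strips any padding.
    wc -l < "$1" | tr -d ' '
}

# Print (do not run) the mpirun command for a node file and a program.
build_mpirun_cmd() {
    nodefile=$1
    program=$2
    with_tm=$3
    np=$(count_slots "$nodefile")
    if [ "$with_tm" = "yes" ]; then
        # Open MPI gets the allocation from Torque itself; no hostfile needed.
        echo "mpirun -np $np $program"
    else
        # No tm support: hand the node file to mpirun explicitly.
        echo "mpirun -np $np -hostfile $nodefile $program"
    fi
}
```

Inside a job script one would call it as build_mpirun_cmd "$PBS_NODEFILE" ./mpihello yes and check the printed line before running it for real.
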
>
> If Torque was installed in a standard place, say under /usr,
> then OpenMPI configure will pick it up automatically.
> If not in a standard location, then add
> --with-tm=/torque/directory
> to the OpenMPI configure line.
> [./configure --help is your friend!]
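
For the non-standard-location case, a rough sketch of how one might produce that configure flag. It assumes Torque's TM header sits at include/tm.h under the install prefix; the function name tm_configure_flag and the path /opt/torque are made up for illustration.

```shell
#!/bin/sh
# Sketch: pick the Open MPI configure flag for a Torque install prefix.
# Assumption: Torque's TM header is installed as <prefix>/include/tm.h.

# Print "--with-tm=<prefix>" if the prefix looks like a Torque install;
# otherwise print nothing and return non-zero.
tm_configure_flag() {
    prefix=$1
    if [ -f "$prefix/include/tm.h" ]; then
        echo "--with-tm=$prefix"
    else
        return 1
    fi
}

# Typical use from the Open MPI source directory
# (/opt/torque is a hypothetical path):
#   ./configure $(tm_configure_flag /opt/torque)
```

If the function prints nothing, Torque is probably installed somewhere else, and ./configure --help remains the place to look.
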
>
> Another check:
>
> $ ompi_info | grep tm
> [ompi_info prints tons of output; grep it for "tm" to see
> if Torque was picked up.]
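
A slightly stricter version of that check, as a sketch. It assumes ompi_info lists the Torque components on lines like "MCA ras: tm (...)" or "MCA plm: tm (...)"; the exact layout may differ between Open MPI versions, and the function name has_tm_support is made up here.

```shell
#!/bin/sh
# Sketch: decide from ompi_info output whether Open MPI has Torque (tm) support.
# Assumption: tm components appear on lines such as "MCA ras: tm (...)" or
# "MCA plm: tm (...)"; the layout may vary between Open MPI versions.

# Reads ompi_info output on stdin; succeeds if a tm component is listed.
has_tm_support() {
    grep -Eq 'MCA (ras|plm): tm( |$)'
}

# Typical use on a machine with Open MPI installed:
#   ompi_info | has_tm_support && echo "Torque support found"
```

Matching the component lines rather than any "tm" substring avoids false hits from unrelated lines in the output.
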
>
> I hope this helps,
> Gus Correa




