
[Beowulf] first cluster [was [OMPI users] trouble using openmpi under slurm]


Douglas Guptill douglas.guptill at dal.ca
Tue Jul 13 15:05:38 PDT 2010


Hello Gus, list:

On Fri, Jul 09, 2010 at 07:06:05PM -0400, Gus Correa wrote:
> Douglas Guptill wrote:
>> On Thu, Jul 08, 2010 at 09:43:48AM -0400, Gus Correa wrote:
>>> Douglas Guptill wrote:
>>>> On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
>>>>
>>>>> No....afraid not. Things work pretty well, but there are places
>>>>> where things just don't mesh. Sub-node allocation in particular is
>>>>> an issue as it implies binding, and slurm and ompi have conflicting
>>>>> methods.
>>>>>
>>>>> It all can get worked out, but we have limited time and nobody cares
>>>>> enough to put in the effort. Slurm just isn't used enough to make it
>>>>> worthwhile (too small an audience).
>>>> I am about to get my first HPC cluster (128 nodes), and was
>>>> considering slurm.  We do use MPI.
>>>>
>>>> Should I be looking at Torque instead for a queue manager?
>>>>
>>> Hi Douglas
>>>
>>> Yes, works like a charm along with OpenMPI.
>>> I also have MVAPICH2 and MPICH2, no integration w/ Torque,
>>> but no conflicts either.
>>
>> Thanks, Gus.
>>
>> After some lurking and reading, I plan this:
>>   Debian (lenny)
>>   + fai                   - for compute-node operating system install
>>   + Torque                - job scheduler/manager
>>   + MPI (Intel MPI)       - for the application
>>   + MPI (OpenMPI)         - alternative MPI
>>
>> Does anyone see holes in this plan?
>>
>> Thanks,
>> Douglas
>
>
> Hi Douglas
>
> I never used Debian, fai, or Intel MPI.
>
> We have two clusters with cluster management software, i.e.,
> mostly the operating system install stuff.
>
> I made a toy Rocks cluster out of old computers.
> Rocks is a minimum-hassle way to deploy and maintain a cluster.
> Of course you can do the same from scratch, or do more, or do better,
> which makes some people frown at Rocks.
> However, Rocks works fine, particularly if your network(s)
> is (are) Gigabit Ethernet,
> and if you don't mix different processor architectures (i.e. only i386  
> or only x86_64, although there is some support for mixed stuff).
> It is developed/maintained by UCSD under an NSF grant (I think).
> It's been around for quite a while too.
>
> You may want to take a look, perhaps experiment with a subset of your
> nodes before you commit:
>
> http://www.rocksclusters.org/wordpress/

I am sure Rocks suits many, but at first glance it is not for me.  I am
too much of a tinkerer.  That comes, partially, from starting in this
business too early; my first computer was a Univac II - vacuum
tubes, no operating system.

> What is the interconnect/network hardware you have for MPI?
> Gigabit Ethernet?  Infiniband?  Myrinet? Other?

Infiniband - QLogic 12300-BS18

> If Infiniband you may need to add the OFED packages,

Gotcha.  Thanks.

> If you are going to handle a variety of different compilers, MPI  
> flavors, with various versions, etc, I recommend using the
> "Environment module" package.

My one user has requested that.
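
As a rough illustration of what an Environment Modules setup could look like with several MPI flavors installed side by side, here is a minimal Python sketch that generates one Tcl modulefile per installation. The install prefixes, the version numbers, the `mpi/` directory layout, and the helper names are all assumptions made for illustration; none of them describe this cluster.

```python
from pathlib import Path


def render_modulefile(name: str, version: str, prefix: str) -> str:
    """Return the text of a Tcl modulefile for one MPI installation.

    `prefix` is a hypothetical install location such as /opt/openmpi/1.4.2.
    """
    lines = [
        "#%Module1.0",
        f'module-whatis "{name} {version}"',
        # Put this installation's compiler wrappers and mpirun first on PATH.
        f"prepend-path PATH {prefix}/bin",
        # Make the matching shared libraries visible at run time.
        f"prepend-path LD_LIBRARY_PATH {prefix}/lib",
        f"prepend-path MANPATH {prefix}/share/man",
    ]
    return "\n".join(lines) + "\n"


def write_modulefile(root: Path, name: str, version: str, prefix: str) -> Path:
    """Write the modulefile to <root>/mpi/<name>-<version> and return its path."""
    target = root / "mpi" / f"{name}-{version}"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_modulefile(name, version, prefix))
    return target


if __name__ == "__main__":
    # Hypothetical installations; adjust to whatever is actually installed.
    installs = [
        ("openmpi", "1.4.2", "/opt/openmpi/1.4.2"),
        ("mvapich2", "1.4.1", "/opt/mvapich2/1.4.1"),
    ]
    for name, version, prefix in installs:
        print(write_modulefile(Path("modulefiles"), name, version, prefix))
```

Provided the output directory is on MODULEPATH, a user would then select a flavor with something like `module load mpi/openmpi-1.4.2`.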

> I hope this helps.

A big help.  Much appreciated.

Douglas.
-- 
  Douglas Guptill                       voice: 902-461-9749
  Research Assistant, LSC 4640          email: douglas.guptill at dal.ca
  Oceanography Department               fax:   902-494-3877
  Dalhousie University
  Halifax, NS, B3H 4J1, Canada



