[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Ashley Pittman apittman at concurrent-thinking.com
Wed Aug 13 03:29:05 PDT 2008


On Tue, 2008-08-12 at 12:09 -0600, Craig Tierney wrote:
> Chris Samuel wrote:
> > ----- "I Kozin (Igor)" <i.kozin at dl.ac.uk> wrote:

> > But that assumes you're not sharing a node with other
> > jobs that may well be doing I/O.
> > 
> I am wondering, who shares nodes in cluster systems with
> MPI codes?

In my experience, almost everyone.  In practice, though, most jobs ask
for even numbers of CPUs, so larger jobs rarely get scheduled this way.

>  We never have shared nodes for codes that need
> multiple cores since we built our first SMP cluster
> in 2001.  The contention for shared resources (like memory
> bandwidth and disk IO) would lead to unpredictable code performance.

Unpredictable maybe, but if the alternative is to not run at all then
it's still a win.  What you wouldn't want is a small number of
processes in a big job sharing a node with a resource-hogging job and
slowing down the entire big job.  However, I've never seen this happen
in the wild.

> Also, a poorly behaved program can cause the other codes on
> that node to crash (which we don't want).

It goes without saying that this shouldn't be able to happen.

Ashley.



