[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Craig Tierney Craig.Tierney at noaa.gov
Tue Aug 12 11:09:28 PDT 2008


Chris Samuel wrote:
> ----- "I Kozin (Igor)" <i.kozin at dl.ac.uk> wrote:
> 
>>> Generally speaking, MPI programs will not be fetching/writing data
>>> from/to storage at the same time they are doing MPI calls so there
>>> tends to not be very much contention to worry about at the node
>>> level.
>> I tend to agree with this. 
> 
> But that assumes you're not sharing a node with other
> jobs that may well be doing I/O.
> 
> cheers,
> Chris

I am wondering, who shares nodes in cluster systems with
MPI codes?  We have never shared nodes for codes that need
multiple cores since we built our first SMP cluster
in 2001.  Contention for shared resources (like memory
bandwidth and disk I/O) would lead to unpredictable code
performance.  Also, a poorly behaved program can cause the
other codes on that node to crash (which we don't want).

Even at TACC (62000+ cores) with 16 cores per node, nodes
are dedicated to jobs.

Craig

-- 
Craig Tierney (craig.tierney at noaa.gov)


