[Beowulf] Varying performance across identical cluster nodes.
bcostescu at gmail.com
Wed Sep 13 22:59:15 PDT 2017
On Fri, Sep 8, 2017 at 8:41 PM, Prentice Bisbal <pbisbal at pppl.gov> wrote:
> I have a dozen servers that are all identical hardware: SuperMicro servers
> with AMD Opteron 6320 processors. Ever since we upgraded to CentOS 6, the
> users have been complaining of wildly inconsistent performance across these
> 12 nodes. I ran LINPACK on these nodes, and was able to duplicate the
> problem, with performance varying from ~14 GFLOPS to 64 GFLOPS.
Are all of these applications using MPI? And is /tmp also part of the
NFS root? If so, try moving /tmp to a local filesystem, or direct the
MPI library to use a local directory instead (e.g. by setting the
TMPDIR environment variable on all nodes).
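
To check quickly where temporary files actually land on a node, something
like the following Python sketch might help. It assumes Linux nodes that
expose the mount table in /proc/mounts; the helper names (fs_type_for,
is_on_network_fs, check_tmpdir) and the set of filesystem types treated as
"network" are my own illustration, not part of any MPI library.

```python
import os
import tempfile

# Assumption: only these types count as network filesystems here;
# extend the set if your site uses something else.
NETWORK_FS_TYPES = {"nfs", "nfs4"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    "<device> <mount point> <fs type> <options> ...", one mount per line.
    Mount points with escaped spaces (\\040) are not handled.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same point shadow an earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_on_network_fs(path, mounts_text):
    """True if path resolves to a mount whose type is in NETWORK_FS_TYPES."""
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES


def check_tmpdir(mounts_path="/proc/mounts"):
    """Report where the default temp directory lives on this node."""
    tmp = tempfile.gettempdir()  # honours TMPDIR if it is set
    with open(mounts_path) as f:
        mounts_text = f.read()
    return tmp, fs_type_for(tmp, mounts_text)
```

Calling check_tmpdir() on each node, once with TMPDIR unset and once with it
pointing at a local directory, should show whether the temp directory is
still on the NFS root. It only inspects the mount table and does not measure
performance.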