[Beowulf] Odd Infiniband scaling behaviour
tom.elken at qlogic.com
Mon Oct 8 09:32:55 PDT 2007
> -----Original Message-----
> [mailto:beowulf-bounces at beowulf.org] On Behalf Of Chris Samuel
> Sent: Sunday, October 07, 2007 10:25 PM
> To: beowulf at beowulf.org
> Subject: [Beowulf] Odd Infiniband scaling behaviour
> Hi fellow Beowulfers..
> We're currently building an Opteron based IB cluster, and are
> seeing some rather peculiar behaviour that has had us puzzled
> for a while.
To give us more info about your "scaling" problem, can you tell us:
1) the elapsed run-time of the four scenarios you mention (or relative
2) how you measured the CPU usage?
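On (2), one way to get numbers that are comparable across the four scenarios is to compare CPU time against wall-clock time for each process, rather than reading a percentage off a monitoring tool. Below is a minimal sketch, assuming Python is available on the node. The names cpu_utilisation, spin and nap are hypothetical, made up for illustration. This is not how NAMD or MVAPICH report usage.

```python
import time


def cpu_utilisation(work, *args):
    """Run work(*args) and return (result, percent).

    percent is CPU time (user + system) consumed by this process
    divided by elapsed wall-clock time, times 100.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = work(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    percent = 100.0 * cpu_used / wall_used if wall_used > 0 else 0.0
    return result, percent


def spin(seconds):
    """Hypothetical CPU-bound stand-in: busy-loop for about `seconds`."""
    end = time.perf_counter() + seconds
    count = 0
    while time.perf_counter() < end:
        count += 1
    return count


def nap(seconds):
    """Hypothetical idle stand-in: sleep without using the CPU."""
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    for name, fn in (("spin", spin), ("nap", nap)):
        _, pct = cpu_utilisation(fn, 0.5)
        print(f"{name}: {pct:.1f}% CPU")
```

A process that is being starved of its core shows a low percentage together with a long wall-clock time, so reporting both figures for each of the four scenarios would answer (1) and (2) together.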
> If I take a CPU bound application, like NAMD, I can run an 8 CPU job
> on a single node and it pegs the CPUs at 100% (this is built using
> Charm++ configured as an MPI system and using MVAPICH 0.9.8p3
> with the Portland Group Compilers).
> If I then run 2 x 4 CPU jobs of the *same* problem, they all
> run at 50% CPU.
> If I run 4 x 2 CPU jobs, again the same problem, they run at 25%..
> ..and yes, if I run 8 x 1 CPU jobs they run at around 12-13% CPU!
> I then replicated the same problem with the example MPI cpi.c
> program, to rule out some odd behaviour in NAMD.
> What really surprised me was when testing CPI built using
> OpenMPI (which doesn't use IB on our system) the problem
> vanished and I could run 8 x 1 CPU jobs, each using 100%!
> So (at the moment) it looks like we're seeing some form of
> contention on the Infiniband adapter..
> 07:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost
> III Lx HCA] (rev a0)
> Subsystem: Mellanox Technologies MT25204 [InfiniHost
> III Lx HCA]
> Flags: fast devsel, IRQ 19
> Memory at feb00000 (64-bit, non-prefetchable) [size=1M]
> Memory at fd800000 (64-bit, prefetchable) [size=8M]
> Capabilities:  Power Management version 2
> Capabilities:  Vital Product Data
> Capabilities:  Message Signalled Interrupts:
> 64bit+ Queue=0/5 Enable-
> Capabilities:  MSI-X: Enable- Mask- TabSize=32
> Capabilities:  Express Endpoint IRQ 0
> We see this problem with the standard CentOS kernel, with the
> latest stable kernel (18.104.22.168) and with 2.6.23-rc9-git5
> (which completely rips out and replaces the CPU scheduler
> with Ingo Molnar's CFS).
> This is on a SuperMicro based system with AMD's Barcelona
> quad core CPU (1.9GHz), but I see the same behaviour (scaled
> down) on dual core Opterons too.
> I've looked at what "modinfo ib_mthca" says are the tuneable
> options, but the few I've played with ("msi_x" and
> "tune_pci") haven't made any noticeable difference, sadly..
> Has anyone else run into this or got any clues they could
> pass on please ?
> Christopher Samuel - (03) 9925 4751 - Systems Manager The
> Victorian Partnership for Advanced Computing P.O. Box 201,
> Carlton South, VIC 3053, Australia VPAC is a not-for-profit
> Registered Research Agency