[Beowulf] 8p 16 core x86_64 systems
doug.lattman at L-3com.com
Tue Aug 12 11:35:42 PDT 2014
I have several hundred processes from a simulation that I send out to all the cores via MPI.
Presently, we have a fleet of computers, each providing a boatload of cores.
VMware adds a layer of abstraction, which does not get me anything.
I need actual cores, all running the same OS (netbooted, with shares via NFS).
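A diskless netboot setup of this kind typically pairs PXE/TFTP with an NFS-exported root filesystem. A minimal sketch (the paths and addresses below are illustrative, not from this thread):

```
# /etc/exports on the head node: export a read-only root for the diskless nodes
/srv/nfsroot  192.168.1.0/24(ro,no_root_squash,no_subtree_check)

# pxelinux.cfg/default entry served over TFTP to the netbooting nodes
LABEL linux
  KERNEL vmlinuz
  APPEND initrd=initrd.img root=/dev/nfs nfsroot=192.168.1.1:/srv/nfsroot ip=dhcp ro
```

Every node then boots the identical kernel and root image, which is what keeps the cluster transparent to the end user.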
The head node sends out to all cores through the mpi interface and all the nodes are transparent to the end user.
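The head-node/worker dispatch pattern described above would normally be written with MPI (e.g. mpi4py or C with `MPI_Send`/`MPI_Recv`); the sketch below uses Python's `multiprocessing` as a stand-in so it runs without an MPI launcher. The task function is a placeholder, not Doug's actual simulation:

```python
# Sketch of the head-node dispatch pattern: one coordinating process
# hands simulation tasks out to worker processes, one per core.
from multiprocessing import Pool

def run_case(task_id):
    # Placeholder for one simulation process; real work goes here.
    return task_id * task_id

if __name__ == "__main__":
    # In practice the pool size matches the core count of the cluster.
    with Pool(4) as pool:
        results = pool.map(run_case, range(8))
    print(results)
```

With real MPI, rank 0 plays the role of the `Pool` owner and each worker rank receives its task over the interconnect, so the same pattern scales across netbooted nodes.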
All the computers which netboot are diskless, so if I go from 4 processors to 8 processors and from 2U to 4U, I don't mind doubling my rack units to get 128 cores. At that point the energy savings, speed gains, reduced network traffic, memory efficiency, etc. are huge.
The 64-core computer which replaced a 16-core computer is a huge savings in memory overhead alone. The energy savings are also huge: we are now powering 64 cores with the same power we used for 16 cores.
My recollection was that there was a HyperTransport (HT) connector which would allow us to bundle two motherboards so that 8 cores could work together under one Red Hat OS. Apparently, if I get the right NUMA chip and connectors, I can tie more boards together and get many more than 8?
From: "C. Bergström" [mailto:cbergstrom at pathscale.com]
Sent: Tuesday, August 12, 2014 2:14 PM
To: Joshua Mora
Cc: Lattman, Doug @ SSG - CAC; beowulf at beowulf.org
Subject: Re: [Beowulf] 8p 16 core x86_64 systems
On 08/13/14 01:10 AM, Joshua Mora wrote:
> Certainly my assumption/interpretation has been "what is the
> availability of the cheapest and largest SMP solution with full
> coherence in hardware that you can build".
> Notice I mention hardware based coherence since there are software
> based solutions available as well.
> If you need just plenty of cores at the highest core count density
> that you can get with small memory footprint per OS_instance/core(for
> instance, for consolidation/virtualization reasons), then you do not
> need coherence and a much wider range of solutions are available that
> can use other interconnects/fabrics.
Is consolidation/virtualization really applicable at all to HPC? I thought VirtualBox/VMware and all the other friends don't allow direct access to the x86 extensions. With those disabled, the overhead of any layer on top would, I think, cause an unacceptable performance hit.
(Someone please correct me if I'm wrong.) The closest thing I can think of which would play nice would be Solaris zones, which allow fine-grained control of resources but don't hide/limit the hardware-level capability.
I'm curious to see what Doug had in mind..