[Beowulf] Haswell as supercomputer microprocessors

Prentice Bisbal prentice.bisbal at rutgers.edu
Mon Aug 3 08:10:43 PDT 2015


The processor in the IBM BG/Q is actually a POWER A2.[1] I never 
understood why Top500 listed them as BQC. The POWER A2 processor 
actually has 18 cores: 16 for computations, 1 for the OS itself, and 1 
'spare'. I believe the spare is not a hot spare, but is there to 
increase the yield in chip manufacturing. If there are 18 usable cores 
on the chip, one is disabled. If one core is not usable, well, they 
still have the 17 they were hoping for. (This is what I heard, but I 
don't remember who the source was or how credible it was. If this is 
wrong, someone please correct me!)

I wouldn't call the core for the OS redundant. It actually improves the 
performance of the total system, as documented by the well-known 'ASCI 
Q' paper [2].

Now to answer your question, the answer is yes. I highly recommend you 
read [2] for a good explanation of why (the authors did a better job 
explaining it than I can in a quick e-mail). However, the improvement in 
performance increases with the size of the cluster, so it probably won't 
be noticeable on small clusters.
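
To give a rough feel for why the effect grows with cluster size, here is a 
toy model in Python. It is my own simplification, not the analysis from [2]: 
it assumes a bulk-synchronous job in which every node must finish a step 
before any node can continue, and that each node is independently 
interrupted by OS activity during a step with probability p_node. Both the 
function name and the value 0.001 are made up for illustration.

```python
def prob_step_delayed(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n_nodes is interrupted in a step.

    Toy model: interruptions are assumed independent across nodes, and a
    single interrupted node delays the whole synchronized step.
    """
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability between 0 and 1")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    # 0.001 is an arbitrary per-node, per-step interruption probability.
    for n in (1, 16, 256, 4096):
        print(f"{n:5d} nodes: P(step delayed) = {prob_step_delayed(0.001, n):.3f}")
```

Under these assumptions a rare interruption on one node becomes a common 
delay once thousands of nodes wait on each other, which is consistent with 
the benefit being hard to see on a small cluster.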

In addition to dedicating a single core to the OS, you also want to 
reduce OS 'noise' (also called 'jitter') as much as possible by 
reducing services on the compute nodes. You can do this by turning off or 
uninstalling unnecessary services and building a custom kernel that has 
only the services and hardware support needed by your cluster. This is 
the idea behind the very minimal compute-node kernel (CNK) of the 
Blue Gene nodes. Lightweight kernels are an active area of research, with 
many different groups working on them:

https://en.wikipedia.org/wiki/Lightweight_Kernel_Operating_System
https://en.wikipedia.org/wiki/Compute_Node_Linux
http://www.mcs.anl.gov/research/projects/zeptoos/
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=323279
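
As for how to keep a compute job off the last two cores under Linux, one 
simple approach is CPU affinity. The sketch below uses Python's 
os.sched_getaffinity and os.sched_setaffinity (Linux only). The helper 
names and the choice to reserve the two highest-numbered logical CPUs are 
my assumptions; with Hyper-Threading enabled, logical CPU numbers do not 
map one-to-one onto physical cores, so check your topology first.

```python
import os


def compute_cores(available, reserved=2):
    """Return the CPUs to compute on, leaving the `reserved`
    highest-numbered CPUs in `available` free for the OS."""
    ordered = sorted(available)
    if reserved < 0:
        raise ValueError("reserved must not be negative")
    if reserved >= len(ordered):
        raise ValueError("not enough CPUs left for computation")
    return set(ordered[:len(ordered) - reserved])


def pin_to_compute_cores(reserved=2):
    """Restrict the calling process (and children it starts later)
    to the compute CPUs. Linux only."""
    allowed = os.sched_getaffinity(0)   # CPUs this process may use now
    target = compute_cores(allowed, reserved)
    os.sched_setaffinity(0, target)     # pid 0 means the calling process
    return target


if __name__ == "__main__":
    if hasattr(os, "sched_setaffinity"):
        try:
            print("computing on CPUs:", sorted(pin_to_compute_cores()))
        except ValueError as exc:
            print("not pinning:", exc)
    else:
        print("CPU affinity calls are not available on this platform")
```

Note that this only confines the job. It does not stop system daemons or 
interrupts from running on the compute cores. Keeping the OS itself on the 
reserved cores needs something more, such as the isolcpus kernel boot 
parameter or cpusets, and I have not measured how much either helps on 
Haswell.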

[1] http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=SP&infotype=PM&appname=STGE_DC_DC_USEN&htmlfid=DCD12345USEN&attachment=DCD12345USEN.PDF

[2] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1592958


Prentice Bisbal
Systems Programmer/Administrator
Office of Instructional and Research Technology
Rutgers University
http://oirt.rutgers.edu

On 08/03/2015 05:06 AM, Mikhail Kuzminsky wrote:
> New special supercomputer microprocessors (like IBM Power BQC and 
> Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), 
> where 2 last cores are redundant, not for computations, but only for 
> other work w/Linux or even for replacing of failed computational core.
>
> Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores.  Is 
> there some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), 
> and use only 16 Haswell cores for parallel computations ? If the 
> answer is "yes", then how to use this way under Linux ?
>
> Mikhail Kuzminsky,
> Zelinsky Institute of Organic Chemistry RAS,
> Moscow
