From stuartb at 4gh.net Sat Aug 1 15:24:18 2015 From: stuartb at 4gh.net (Stuart Barkley) Date: Sat, 1 Aug 2015 18:24:18 -0400 (EDT) Subject: [Beowulf] Scheduler question -- non-uniform memory allocation to MPI In-Reply-To: <55BA4414.7050601@harvill.net> References: <55A91292.1030303@rutgers.edu> <55BA4414.7050601@harvill.net> Message-ID: On Thu, 30 Jul 2015 at 11:34 -0000, Tom Harvill wrote: > We run SLURM with cgroups for memory containment of jobs. When > users request resources on our cluster many times they will specify > the number of (MPI) tasks and memory per task. The reality of much > of the software that runs is that most of the memory is used by MPI > rank 0 and much less on slave processes. This is wasteful and > sometimes causes bad outcomes (OOMs and worse) during job runs. I'll note that this problem also can occur in Grid Engine and OpenMPI. We would get user reports of random job failures. Sometimes the job would run and other times it would fail. We normally run allowing shared node access and the cases I've seen with problems were with a highly fragmented cluster with tasks spread 1-2 per node. Having the job request exclusive nodes (8 cores) was generally enough to consolidate the qrsh processes from ~200 to ~50 which provided enough headroom on the master process. The times I've observed have been due to the MPI startup process which spawns a qrsh/ssh login from the master node to each of the slave nodes (multiple MPI ranks on a slave share the same qrsh connection). The memory for all of these qrsh processes on the master node can eventually add up to be enough to cause out of memory conditions. This "solution" (workaround) has been good enough for our impacted users so far. Eventually without other changes this problem will return and not have as simple a solution. Stuart -- I've never been lost; I was once bewildered for three days, but never lost! -- Daniel Boone From mikky_m at mail.ru Mon Aug 3 02:06:27 2015 From: mikky_m at mail.ru (=?UTF-8?B?TWlraGFpbCBLdXptaW5za3k=?=) Date: Mon, 03 Aug 2015 12:06:27 +0300 Subject: [Beowulf] =?utf-8?q?Haswell_as_supercomputer_microprocessors?= In-Reply-To: References: Message-ID: <1438592787.955484558@f398.i.mail.ru> New special supercomputer microprocessors (like IBM Power BQC and Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), where 2 last cores are redundant, not for computations, but only for other work w/Linux or even for replacing of failed computational core. Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores.? Is there some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use only 16 Haswell cores for parallel computations ? If the answer is "yes", then how to use this way under Linux ? Mikhail Kuzminsky, Zelinsky Institute of Organic Chemistry RAS, Moscow -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.sassmannshausen at ucl.ac.uk Mon Aug 3 03:59:00 2015 From: j.sassmannshausen at ucl.ac.uk (=?iso-8859-15?q?J=F6rg_Sa=DFmannshausen?=) Date: Mon, 3 Aug 2015 11:59:00 +0100 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438592787.955484558@f398.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> Message-ID: <201508031159.03015.j.sassmannshausen@ucl.ac.uk> Hi Mikhail, I would guess your queueing system could take care of that. With SGE you can define how many cores each node has. Thus, if you only want to use 16 out of the 18 cores you simply define that. 
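Something along these lines (just a sketch from memory, assuming a stock SGE/Grid Engine setup; "all.q" and "node001" are placeholder names, not your real queue or host):

    # limit the queue to 16 slots on that host
    qconf -aattr queue slots "[node001=16]" all.q

    # or cap the slots complex on the execution host itself
    qconf -mattr exechost complex_values slots=16 node001
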
Alternatively, at least OpenMPI allows you to underpopulate the nodes as well. Having said that, is there a good reason why you want to purchase 18 cores and then only use 16? The only thing I can think of why one needs to / wants to do that is if your job requires more memory which you got on the node. For memory intensive work I am still thinking that less cores and more nodes are beneficial here. My 2 cents from a sunny London J?rg On Monday 03 Aug 2015 10:06:27 Mikhail Kuzminsky wrote: > New special supercomputer microprocessors (like IBM Power BQC and Fujitsu > SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), where 2 last > cores are redundant, not for computations, but only for other work w/Linux > or even for replacing of failed computational core. > > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is there > some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use > only 16 Haswell cores for parallel computations ? If the answer is "yes", > then how to use this way under Linux ? > > Mikhail Kuzminsky, > Zelinsky Institute of Organic Chemistry RAS, > Moscow -- ************************************************************* Dr. J?rg Sa?mannshausen, MRSC University College London Department of Chemistry Gordon Street London WC1H 0AJ email: j.sassmannshausen at ucl.ac.uk web: http://sassy.formativ.net Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 230 bytes Desc: This is a digitally signed message part. URL: From samuel at unimelb.edu.au Mon Aug 3 06:12:09 2015 From: samuel at unimelb.edu.au (Chris Samuel) Date: Mon, 03 Aug 2015 23:12:09 +1000 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438592787.955484558@f398.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> Message-ID: <1672352.DTB073LGgN@quad> On Mon, 3 Aug 2015 12:06:27 PM Mikhail Kuzminsky wrote: > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is there some > sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use only 16 > Haswell cores for parallel computations ? If the answer is "yes", then how > to use this way under Linux ? Doing this with Linux predates BGQ for instance - the whole cpuset idea came from SGI and was used on their Itanic Altix systems to provide a boot CPU set that would have all system processes confined into and then the rest of the cores were available for jobs. When we used to use Torque I agitated for cpuset support, and for it to be done in a way that would allow this. We use Slurm now, but I've never looked at how easy to make it work in the boot cpuset type mode - it's probably just a matter of telling it there are N-1 cores per node and ensuring that it doesn't try and claim the same core you're using as the boot cpuset. :-) Best of luck! 
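PS: for the simple "advertise fewer cores" variant, the node line in slurm.conf would be roughly (untested sketch; node names and memory figure are placeholders):

    NodeName=node[01-10] CPUs=16 RealMemory=64000 State=UNKNOWN

i.e. tell Slurm about 16 of the 18 cores and leave the rest to the OS. Recent Slurm versions can also reserve cores explicitly with CoreSpecCount=2 on the node line rather than hiding them.
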
Chris -- Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci From John.Hearns at xma.co.uk Mon Aug 3 06:28:17 2015 From: John.Hearns at xma.co.uk (John Hearns) Date: Mon, 3 Aug 2015 13:28:17 +0000 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1672352.DTB073LGgN@quad> References: <1438592787.955484558@f398.i.mail.ru> <1672352.DTB073LGgN@quad> Message-ID: <3004B1DE9C157E4585DD4B35D316EDFDACD27C@ALXEXCHMB01.xma.co.uk> On Mon, 3 Aug 2015 12:06:27 PM Mikhail Kuzminsky wrote: > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is > there some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), > and use only 16 Haswell cores for parallel computations ? If the > answer is "yes", then how to use this way under Linux ? Doing this with Linux predates BGQ for instance - the whole cpuset idea came from SGI and was used on their Itanic Altix systems to provide a boot CPU set that would have all system processes confined into and then the rest of the cores were available for jobs. When we used to use Torque I agitated for cpuset support, and for it to be done in a way that would allow this. We use Slurm now, but I've never looked at how easy to make it work in the boot cpuset type mode - it's probably just a matter of telling it there are N-1 cores per node and ensuring that it doesn't try and claim the same core you're using as the boot cpuset. :-) Plus one to Chris with cpusets. Cpusets not only on Itanium - I used them on a large memory UV system. I Can see more and more people speccing high memory x86 systems these days, and they certainly should be looking at using cpusets. I have often though we should have 'donkey engine' CPUs for HPC. I thought these were the small enginers which powered up very large shipboard engines. I may have that wrong! https://en.wikipedia.org/wiki/Steam_donkey As Mikhail says, you run the OS and the batch system daemons on there, leaving the rest of the CPUs for 100% flat out HPC work. ##################################################################################### Scanned by MailMarshal - M86 Security's comprehensive email content security solution. ##################################################################################### Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). 
Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP From landman at scalableinformatics.com Mon Aug 3 06:37:19 2015 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 3 Aug 2015 09:37:19 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438592787.955484558@f398.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> Message-ID: <55BF6E8F.9040709@scalableinformatics.com> On 08/03/2015 05:06 AM, Mikhail Kuzminsky wrote: > New special supercomputer microprocessors (like IBM Power BQC and > Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), > where 2 last cores are redundant, not for computations, but only for > other work w/Linux or even for replacing of failed computational core. > > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is there > some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use > only 16 Haswell cores for parallel computations ? If the answer is > "yes", then how to use this way under Linux ? Its possible to do this with some taskset incantation with cpuset filesystem bits (burnt offerings generally not needed). I don't think there are "redundant" cores in the Intel product. Its left as an exercise to the reader to implement though ... More seriously, you can do some of this also with cgroups https://en.wikipedia.org/wiki/Cgroups which is actually what Docker et al. do (in part). There are many ways to attack this problem. If you are trying to isolate the OS from the computation, say to reduce OS jitter impacts upon processes, you might also like work on setting interrupt affinity, as well as start working with memory placement directly (to minimize QPI usage). The issue you will encounter is that most of the HPC systems with a single HCA/NIC will require IO to/from a remote (in a NUMA sense) node. Which means going over QPI. Unless you have the Intel Infinipath (or Omnipath ... I am not as up on the new naming as I should be) or a multi-rail config set up specifically to put one NIC/HCA on each socket. The point I am trying (subtly) to make here is that you can possibly spend more time and effort on optimization here. The question is (and for the above) the relative value of this. For various codes, OS jitter is very important, and you should seek to eliminate it. For others ... not so much. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics, Inc. e: landman at scalableinformatics.com w: http://scalableinformatics.com t: @scalableinfo p: +1 734 786 8423 x121 c: +1 734 612 4615 From prentice.bisbal at rutgers.edu Mon Aug 3 08:10:43 2015 From: prentice.bisbal at rutgers.edu (Prentice Bisbal) Date: Mon, 03 Aug 2015 11:10:43 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438592787.955484558@f398.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> Message-ID: <55BF8473.3050802@rutgers.edu> The processor in the IBM BG/Q is actually a POWER A2.[1] I never understood why Top500 listed them as BQC. The POWER A2 processor actually has 18 cores: 16 for computations, 1 for the OS itself, and 1 'spare'. I believe the spare is not a hot spare, but is there to increase the yield in chip manufacturing. If there are 18 usable cores on the chip, one is disabled. If one core is not usable, well, they still have the 17 they were hoping for. (This is what I heard, but I don't remember who the source was or how credible it was. If this is wrong, someone please correct me!). 
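(As an aside, on a plain Linux box the cpuset/cgroup approach mentioned earlier in the thread boils down to something like the following - the paths and core numbers are purely illustrative, assuming a cgroup-v1 style mount:

    # create a "system" cpuset confined to the last two cores
    mkdir /sys/fs/cgroup/cpuset/system
    echo 16-17 > /sys/fs/cgroup/cpuset/system/cpuset.cpus
    echo 0     > /sys/fs/cgroup/cpuset/system/cpuset.mems
    # move existing processes into it; kernel threads that refuse to move are harmless
    for p in $(ps -eo pid=); do echo $p > /sys/fs/cgroup/cpuset/system/cgroup.procs; done 2>/dev/null

which is the manual version of what SGI's boot cpuset and the batch schedulers' cgroup plugins automate.)
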
I wouldn't call the core reserved for the OS redundant. It actually improves the performance of the total system, as documented by the well-known 'ASCI Q' paper [2]. Now to answer your question, the answer is yes. I highly recommend you read [2] for a good explanation of why (the authors did a better job explaining it than I can in a quick e-mail). However, the improvement in performance increases with the size of the cluster, so it probably won't be noticeable on small clusters.

In addition to dedicating a single core to the OS, you also want to reduce OS 'noise' (also called 'jitter') as much as possible by reducing services on the compute nodes. You can do this by turning off or uninstalling unnecessary services and building a custom kernel that has only the services and hardware support needed by your cluster. This is the idea behind the very minimal compute-node kernel (CNK) of the Blue Gene nodes. This is an active area of research with many different groups working in this area:

https://en.wikipedia.org/wiki/Lightweight_Kernel_Operating_System
https://en.wikipedia.org/wiki/Compute_Node_Linux
http://www.mcs.anl.gov/research/projects/zeptoos/
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=323279

[1] http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=SP&infotype=PM&appname=STGE_DC_DC_USEN&htmlfid=DCD12345USEN&attachment=DCD12345USEN.PDF
[2] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1592958

Prentice Bisbal
Systems Programmer/Administrator
Office of Instructional and Research Technology
Rutgers University
http://oirt.rutgers.edu

On 08/03/2015 05:06 AM, Mikhail Kuzminsky wrote: > New special supercomputer microprocessors (like IBM Power BQC and > Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), > where 2 last cores are redundant, not for computations, but only for > other work w/Linux or even for replacing of failed computational core. > > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is > there some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), > and use > only 16 Haswell cores for parallel computations ? If the > answer is > "yes", then how to use this way under Linux ? > > Mikhail Kuzminsky, > Zelinsky Institute of Organic Chemistry RAS, > Moscow > > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From kilian.cavalotti.work at gmail.com Mon Aug 3 09:18:09 2015 From: kilian.cavalotti.work at gmail.com (Kilian Cavalotti) Date: Mon, 3 Aug 2015 09:18:09 -0700 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438592787.955484558@f398.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> Message-ID:

Hi Mikhail,

That's something you can achieve with Slurm, using what they call "Core Specialization". See http://slurm.schedmd.com/core_spec.html for details.

Cheers, -- Kilian

On Mon, Aug 3, 2015 at 2:06 AM, Mikhail Kuzminsky wrote: > New special supercomputer microprocessors (like IBM Power BQC and Fujitsu > SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd), where 2 last > cores are redundant, not for computations, but only for other work w/Linux > or even for replacing of failed computational core. > > Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores.
Is there some > sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use only 16 > Haswell cores for parallel computations ? If the answer is "yes", then how > to use this way under Linux ? > > Mikhail Kuzminsky, > Zelinsky Institute of Organic Chemistry RAS, > Moscow > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Kilian

From mikky_m at mail.ru Tue Aug 4 03:38:48 2015 From: mikky_m at mail.ru (=?UTF-8?B?TWlraGFpbCBLdXptaW5za3k=?=) Date: Tue, 04 Aug 2015 13:38:48 +0300 Subject: [Beowulf] =?utf-8?q?Haswell_as_supercomputer_microprocessors?= In-Reply-To: References: <1438592787.955484558@f398.i.mail.ru> Message-ID: <1438684728.117628822@f300.i.mail.ru>

In my opinion, PowerPC A2 should more exactly be used as the name of the *core*, not of the IBM BlueGene/Q *processor chip*. The "Power BQC" name is used in TOP500, GREEN500, in a lot of Internet data, and in the IBM journal - see:

Sugavanam K. et al. Design for low power and power management in IBM Blue Gene/Q // IBM Journal of Research and Development, 2013, v. 57, no. 1/2, p. 3:1-3:11.

PowerPC A2 is the core, see
//en.wikipedia.org/wiki/Blue_Gene
//en.wikipedia.org/wiki/PowerPC A2

Mikhail

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From John.Hearns at xma.co.uk Tue Aug 4 06:34:19 2015 From: John.Hearns at xma.co.uk (John Hearns) Date: Tue, 4 Aug 2015 13:34:19 +0000 Subject: [Beowulf] CAP Theorem Message-ID: <3004B1DE9C157E4585DD4B35D316EDFDACD7ED@ALXEXCHMB01.xma.co.uk>

I have been reading two interesting articles on Docker:

http://blog.circleci.com/its-the-future/
http://blog.circleci.com/it-really-is-the-future/

The first one is a good laugh and was meant as a parody.

I guess there may have been discussions on CAP Theorem with relevance to HPC, and especially exascale systems. However the term is new to me. https://en.wikipedia.org/wiki/CAP_theorem

I realise that it is relevant to distributed databases, but how about distributed computation?

________________________________ Scanned by MailMarshal - M86 Security's comprehensive email content security solution. ________________________________ Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mndoci at gmail.com Tue Aug 4 06:52:25 2015 From: mndoci at gmail.com (Deepak Singh) Date: Tue, 4 Aug 2015 06:52:25 -0700 Subject: [Beowulf] CAP Theorem In-Reply-To: <3004B1DE9C157E4585DD4B35D316EDFDACD7ED@ALXEXCHMB01.xma.co.uk> References: <3004B1DE9C157E4585DD4B35D316EDFDACD7ED@ALXEXCHMB01.xma.co.uk> Message-ID: If you ever want to dive deep into how well various systems handle partitions look no further than Aphyr's Jepsen series https://aphyr.com/tags/jepsen > On Aug 4, 2015, at 06:34, John Hearns wrote: > > I have been reading two interesting articles on Docker: > > http://blog.circleci.com/its-the-future/ > http://blog.circleci.com/it-really-is-the-future/ > > The first one is a good laugh and was meant as a parody. > > I guess there may have have been discussions on CAP Theorem with relevance to HPC, and especially exascale systems. > However the term is new to me. https://en.wikipedia.org/wiki/CAP_theorem > > I realise that it is relevant to distributed databases, but how about distributed computation? > > Scanned by MailMarshal - M86 Security's comprehensive email content security solution. > > Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From prentice.bisbal at rutgers.edu Tue Aug 4 07:52:00 2015 From: prentice.bisbal at rutgers.edu (Prentice Bisbal) Date: Tue, 04 Aug 2015 10:52:00 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <1438684728.117628822@f300.i.mail.ru> References: <1438592787.955484558@f398.i.mail.ru> <1438684728.117628822@f300.i.mail.ru> Message-ID: <55C0D190.3010105@rutgers.edu> Seriously? Why does IBM have to make everything so difficult? Take GPFS. It was originally called MMFS for Multimedia filesystem, then GPFS for General Parallel Filesystem. A couple of years ago they decided to market it as a hardware/software solution called GPFS Storage Server, or GSS. Apparently, that didn't have enough buzzwordiness to it, so they changed it to ESS, for Elastic Storage Server. As if that wasn't enough, then they had to confuse their current and future customers by changing the name yet again to Spectra-scale. And yes, I am annoyed by all this! What's really ironic is that IBM is one of the leaders brand management/corporate identity, so you'd think they'd see the value in sticking with a name. Rant over. Prentice On 08/04/2015 06:38 AM, Mikhail Kuzminsky wrote: > By my opinion, PowerPC A2 more exactly should be used as name for > *core*, not for IBM BlueGene/Q *processor chip*. 
> "Power BQC" name is used in TOP500, GREEN500, in a lot of Internet > data, in IBM journal - see: > > Sugavanam K. et al. Design for low power and power management in IBM > Blue Gene/Q //IBM Journal of Research and > Development. ? 2013. ?v. 57. ? ?. 1/2. ? p. 3: 1-3: 11. > > PowerPC A2 is the core, see //en.wikipedia.org/wiki/Blue_Gene > //en.wikipedia.org/wiki/PowerPC A2 > > Mikhail > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuel at unimelb.edu.au Tue Aug 4 17:50:19 2015 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Wed, 05 Aug 2015 10:50:19 +1000 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <55C0D190.3010105@rutgers.edu> References: <1438592787.955484558@f398.i.mail.ru> <1438684728.117628822@f300.i.mail.ru> <55C0D190.3010105@rutgers.edu> Message-ID: <55C15DCB.9040003@unimelb.edu.au> On 05/08/15 00:52, Prentice Bisbal wrote: > Seriously? Why does IBM have to make everything so difficult? As I understand it Power BQC is the SoC (so CPU, networking, etc), whereas A2 is the CPU core & instruction set. So it's a fair distinction (though one that is often glossed over in practice). -- Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci From prentice.bisbal at rutgers.edu Wed Aug 5 07:38:09 2015 From: prentice.bisbal at rutgers.edu (Prentice Bisbal) Date: Wed, 05 Aug 2015 10:38:09 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <55C15DCB.9040003@unimelb.edu.au> References: <1438592787.955484558@f398.i.mail.ru> <1438684728.117628822@f300.i.mail.ru> <55C0D190.3010105@rutgers.edu> <55C15DCB.9040003@unimelb.edu.au> Message-ID: <55C21FD1.5010704@rutgers.edu> On 08/04/2015 08:50 PM, Christopher Samuel wrote: > On 05/08/15 00:52, Prentice Bisbal wrote: > >> Seriously? Why does IBM have to make everything so difficult? > As I understand it Power BQC is the SoC (so CPU, networking, etc), > whereas A2 is the CPU core & instruction set. So it's a fair > distinction (though one that is often glossed over in practice). Okay. That makes perfect sense, but I will still argue that if that correct, using that terminology in the Top500 list doesn't make sense. An SoC is equivalent to a motherboard, but for regular Intel/AMD systems, they list the processor model, not the motherboard, so to list BQC instead of POWER A2 for the BG systems is inconsistent. As an example, compare the Sequoia description from the Top500 list to Stampede, which is just a regular x86 system: *Sequoia*: BlueGene/Q, Power BQC 16C 1.60 GHz, Custom, IBM *Stampede*: PowerEdge C8220, Xeon E5-2680 8C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P, Dell Prentice -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jason at lovesgoodfood.com Fri Aug 7 07:38:59 2015 From: jason at lovesgoodfood.com (Jason Riedy) Date: Fri, 07 Aug 2015 10:38:59 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors References: <1438592787.955484558@f398.i.mail.ru> <1438684728.117628822@f300.i.mail.ru> <55C0D190.3010105@rutgers.edu> <55C15DCB.9040003@unimelb.edu.au> <55C21FD1.5010704@rutgers.edu> Message-ID: <871tffjh8c.fsf@qNaN.sparse.dyndns.org> And Prentice Bisbal writes: > Okay. That makes perfect sense, but I will still argue that if that > correct, using that terminology in the Top500 list doesn't make > sense. I less care about the terminology than that the linpack results are cut and paste between identical configurations rather than actually run on them. But I'm sure no large system has a poor cable or connection in the mix that would be detected... From prentice.bisbal at rutgers.edu Fri Aug 7 13:34:02 2015 From: prentice.bisbal at rutgers.edu (Prentice Bisbal) Date: Fri, 07 Aug 2015 16:34:02 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <871tffjh8c.fsf@qNaN.sparse.dyndns.org> References: <1438592787.955484558@f398.i.mail.ru> <1438684728.117628822@f300.i.mail.ru> <55C0D190.3010105@rutgers.edu> <55C15DCB.9040003@unimelb.edu.au> <55C21FD1.5010704@rutgers.edu> <871tffjh8c.fsf@qNaN.sparse.dyndns.org> Message-ID: <55C5163A.3010300@rutgers.edu> On 08/07/2015 10:38 AM, Jason Riedy wrote: > And Prentice Bisbal writes: >> Okay. That makes perfect sense, but I will still argue that if that >> correct, using that terminology in the Top500 list doesn't make >> sense. > I less care about the terminology than that the linpack results > are cut and paste between identical configurations rather than > actually run on them. But I'm sure no large system has a poor > cable or connection in the mix that would be detected... > > Well, the terminology helps us to make sure we're comparing apples to apples! -- Prentice From hakon.bugge at gmail.com Sat Aug 8 06:08:09 2015 From: hakon.bugge at gmail.com (=?utf-8?B?SMOla29uIEJ1Z2dl?=) Date: Sat, 08 Aug 2015 15:08:09 +0200 Subject: [Beowulf] =?utf-8?q?Haswell_as_supercomputer_microprocessors?= Message-ID: <55c5ff39.a181700a.5edf9.6934@mx.google.com> Sorry for top posting. Jason has more than a valid point. At least in former times, I do know that cut&paste from not only _identical_ configurations happened. For example, the system submitted to the list was equipped with eth NICs, whereas the performance quoted was from a similar system, but with a proprietary HPC interconnect. So much for the apples-to-apples. I favour the SPEC suites when it comes to comparing systems, but with the caveat that vendors or customers of the larger system show little interrest. H?kon Sendt fra min HTC ----- Reply message ----- Fra: "Prentice Bisbal" Til: Emne: [Beowulf] Haswell as supercomputer microprocessors Dato: fre., aug. 7, 2015 22:34 On 08/07/2015 10:38 AM, Jason Riedy wrote: > And Prentice Bisbal writes: >> Okay. That makes perfect sense, but I will still argue that if that >> correct, using that terminology in the Top500 list doesn't make >> sense. > I less care about the terminology than that the linpack results > are cut and paste between identical configurations rather than > actually run on them. But I'm sure no large system has a poor > cable or connection in the mix that would be detected... > > Well, the terminology helps us to make sure we're comparing apples to apples! 
-- Prentice _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuel at unimelb.edu.au Sat Aug 8 06:41:07 2015 From: samuel at unimelb.edu.au (Chris Samuel) Date: Sat, 08 Aug 2015 23:41:07 +1000 Subject: [Beowulf] Haswell as supercomputer microprocessors In-Reply-To: <871tffjh8c.fsf@qNaN.sparse.dyndns.org> References: <55C21FD1.5010704@rutgers.edu> <871tffjh8c.fsf@qNaN.sparse.dyndns.org> Message-ID: <3080805.ZaGZyPBPp7@quad> On Fri, 7 Aug 2015 10:38:59 AM Jason Riedy wrote: > I less care about the terminology than that the linpack results > are cut and paste between identical configurations rather than > actually run on them. But I'm sure no large system has a poor > cable or connection in the mix that would be detected... IIRC (not at work to check) HPL is actually part of the BGQ diagnostics; BGQ also has some very useful cable diagnostics that it monitors and flags broken wires up proactively (and has spares to work around them). But not really a beowulf system.. :-) -- Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci From jason at lovesgoodfood.com Sat Aug 8 14:04:18 2015 From: jason at lovesgoodfood.com (Jason Riedy) Date: Sat, 08 Aug 2015 17:04:18 -0400 Subject: [Beowulf] Haswell as supercomputer microprocessors References: <55C21FD1.5010704@rutgers.edu> <871tffjh8c.fsf@qNaN.sparse.dyndns.org> <3080805.ZaGZyPBPp7@quad> Message-ID: <87h9o9mqzx.fsf@qNaN.sparse.dyndns.org> And Chris Samuel writes: > IIRC (not at work to check) HPL is actually part of the BGQ diagnostics; BGQ > also has some very useful cable diagnostics that it monitors and flags broken > wires up proactively (and has spares to work around them). And part of most acceptance tests, but those aren't the results reported on the list. The variance in commercial systems' results could be a useful reliability-like metric. From John.Hearns at xma.co.uk Thu Aug 20 06:24:14 2015 From: John.Hearns at xma.co.uk (John Hearns) Date: Thu, 20 Aug 2015 13:24:14 +0000 Subject: [Beowulf] Accelio Message-ID: <3004B1DE9C157E4585DD4B35D316EDFDAD251D@ALXEXCHMB01.xma.co.uk> I saw this mentioned on the Mellanox site. Has anyone come across it: http://www.accelio.org/ Looks interesting. Dr. John Hearns Principal HPC Engineer Product Development T: M: F: 01727 201 800 07432 647 511 01727 201 814 Visit us at www.xma.co.uk Follow us @WeareXMA XMA 7 Handley Page Way Old Parkbury Lane Colney Street St. Albans Hertfordshire AL2 2DQ [We are XMA.] [XMA] ________________________________ Scanned by MailMarshal - M86 Security's comprehensive email content security solution. ________________________________ Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. 
The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 4814 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.png Type: image/png Size: 2187 bytes Desc: image002.png URL: From e.scott.atchley at gmail.com Thu Aug 20 11:22:06 2015 From: e.scott.atchley at gmail.com (Scott Atchley) Date: Thu, 20 Aug 2015 14:22:06 -0400 Subject: [Beowulf] Accelio In-Reply-To: <3004B1DE9C157E4585DD4B35D316EDFDAD251D@ALXEXCHMB01.xma.co.uk> References: <3004B1DE9C157E4585DD4B35D316EDFDAD251D@ALXEXCHMB01.xma.co.uk> Message-ID: They are using this as a basis for the XioMessenger within Ceph to get RDMA support. On Thu, Aug 20, 2015 at 9:24 AM, John Hearns wrote: > I saw this mentioned on the Mellanox site. Has anyone come across it: > > > > http://www.accelio.org/ > > > > Looks interesting. > > > > > > > > Dr. John Hearns > Principal HPC Engineer > Product Development > > T: > M: > F: > > > 01727 201 800 > 07432 647 511 > 01727 201 814 > > > Visit us at www.xma.co.uk > Follow us @WeareXMA > > > *XMA* > 7 Handley Page Way > Old Parkbury Lane > Colney Street > St. Albans > Hertfordshire > AL2 2DQ > > > > [image: We are XMA.] > > > > [image: XMA] > > > > > > ------------------------------ > > Scanned by *MailMarshal* - M86 Security's comprehensive email content > security solution. > > ------------------------------ > Any views or opinions presented in this email are solely those of the > author and do not necessarily represent those of the company. Employees of > XMA Ltd are expressly required not to make defamatory statements and not to > infringe or authorise any infringement of copyright or any other legal > right by email communications. Any such communication is contrary to > company policy and outside the scope of the employment of the individual > concerned. The company will not accept any liability in respect of such > communication, and the employee responsible will be personally liable for > any damages or other liability arising. XMA Limited is registered in > England and Wales (registered no. 2051703). Registered Office: Wilford > Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 4814 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image002.png Type: image/png Size: 2187 bytes Desc: not available URL: From jason at lovesgoodfood.com Thu Aug 20 13:32:46 2015 From: jason at lovesgoodfood.com (Jason Riedy) Date: Thu, 20 Aug 2015 16:32:46 -0400 Subject: [Beowulf] Accelio References: <3004B1DE9C157E4585DD4B35D316EDFDAD251D@ALXEXCHMB01.xma.co.uk> Message-ID: <87zj1l1z0x.fsf@qNaN.sparse.dyndns.org> And John Hearns writes: > I saw this mentioned on the Mellanox site. Has anyone come across it: > http://www.accelio.org/ Why have one when you can have many? http://www.openucx.org/ From jcownie at gmail.com Thu Aug 20 14:13:45 2015 From: jcownie at gmail.com (James Cownie) Date: Thu, 20 Aug 2015 22:13:45 +0100 Subject: [Beowulf] Accelio In-Reply-To: <87zj1l1z0x.fsf@qNaN.sparse.dyndns.org> References: <3004B1DE9C157E4585DD4B35D316EDFDAD251D@ALXEXCHMB01.xma.co.uk> <87zj1l1z0x.fsf@qNaN.sparse.dyndns.org> Message-ID: > On 20 Aug 2015, at 21:32, Jason Riedy wrote: > > And John Hearns writes: >> I saw this mentioned on the Mellanox site. Has anyone come across it: >> http://www.accelio.org/ > > Why have one when you can have many? http://www.openucx.org/ Indeed, though maybe at a slightly lower level : http://ofiwg.github.io/libfabric/ -- Jim James Cownie Mob: +44 780 637 7146 http://skiingjim.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuel at unimelb.edu.au Sat Aug 29 04:07:00 2015 From: samuel at unimelb.edu.au (Chris Samuel) Date: Sat, 29 Aug 2015 21:07 +1000 Subject: [Beowulf] glibc 2.22 includes a vector math library (x86_64 initially) Message-ID: <4314980.u8PKEVSjLZ@quad> Hi all, Don't know if many people noticed this, but this looks like a handy new feature for glibc to get (from the release announcement): https://www.sourceware.org/ml/libc-alpha/2015-08/msg00609.html #* Added vector math library named libmvec with the following vectorized # x86_64 implementations: cos, cosf, sin, sinf, sincos, sincosf, log, logf, # exp, expf, pow, powf. More info on the glibc website: https://sourceware.org/glibc/wiki/libmvec # Libmvec is vector math library added in Glibc 2.22. # # Vector math library was added to support SIMD constructs of OpenMP4.0 # (#2.8 in http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf) by # adding vector implementations of vector math functions. # # Vector math functions are vector variants of corresponding scalar math # operations implemented using SIMD ISA extensions (e.g. SSE or AVX for # x86_64). They take packed vector arguments, perform the operation on # each element of the packed vector argument, and return a packed vector # result. Using vector math functions is faster than repeatedly calling the # scalar math routines. All the best, Chris -- Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci From nick.c.evans at gmail.com Mon Aug 31 19:54:19 2015 From: nick.c.evans at gmail.com (Nick Evans) Date: Tue, 1 Sep 2015 12:54:19 +1000 Subject: [Beowulf] Diagnosing Discovery issue xCat Message-ID: Hi All, I am sure i am just doing something silly as i haven't had an issue in the past getting nodes discovered via the switch port lookup method. Currently the newly booting node goes through the following steps Get IP Get the "xcat/xnba.kpxe" file Download the Genisis discovery environment and boot into it re-request IP get certificate initiate discovery This then loops never actually discovering. 
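As I understand it, the switch-based discovery boils down to xCAT walking the BRIDGE-MIB / Q-BRIDGE-MIB forwarding table on the switch, so I guess a rough by-hand check of that lookup from the management node would be something like

    snmpwalk -v 1 -c public ms-h25-mgtobm-1g-40 1.3.6.1.2.1.17.4.3.1.2

where "public" is only a placeholder for the real community string; the numeric OID there is dot1dTpFdbPort, so no vendor MIB files should be needed just for this.
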
I have also attached the output of the messages file from the management node. Hardware is IBM dx360m4 node attached to Cisco WS-C3750G-48PS-S switch Any pointers on where to look for anything that might shed some light on this issue will be helpful. Also do i need to specifically get the MIBS file for the switch as i don't recall needing to to this in the past? Thanks in advance Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Sep 1 12:45:51 mgt dhcpd: DHCPDISCOVER from 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:52 mgt dhcpd: DHCPOFFER on 10.10.200.1 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:53 mgt dhcpd: DHCPREQUEST for 10.10.200.1 (10.10.100.79) from 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:53 mgt dhcpd: DHCPACK on 10.10.200.1 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:53 mgt in.tftpd[28421]: RRQ from 10.10.200.1 filename xcat/xnba.kpxe Sep 1 12:45:53 mgt in.tftpd[28421]: tftp: client does not accept options Sep 1 12:45:53 mgt in.tftpd[28422]: RRQ from 10.10.200.1 filename xcat/xnba.kpxe Sep 1 12:45:53 mgt dhcpd: DHCPDISCOVER from 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:54 mgt dhcpd: DHCPOFFER on 10.10.200.2 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:54 mgt dhcpd: DHCPREQUEST for 10.10.200.2 (10.10.100.79) from 40:f2:e9:04:79:e2 via em3 Sep 1 12:45:54 mgt dhcpd: DHCPACK on 10.10.200.2 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:47:07 mgt dhcpd: DHCPDISCOVER from 40:f2:e9:04:79:e2 via em3 Sep 1 12:47:08 mgt dhcpd: DHCPOFFER on 10.10.200.3 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:47:08 mgt dhcpd: DHCPREQUEST for 10.10.200.3 (10.10.100.79) from 40:f2:e9:04:79:e2 via em3 Sep 1 12:47:08 mgt dhcpd: DHCPACK on 10.10.200.3 to 40:f2:e9:04:79:e2 via em3 Sep 1 12:47:10 mgt xcat[33070]: xCAT: Allowing getcredentials x509cert Sep 1 12:47:46 mgt xcat[25330]: xcatd: Processing discovery request from 10.10.200.3 Sep 1 12:47:47 mgt xcat[39277]: Error communicating with ms-h25-mgtobm-1g-40: Unable to get MAC entries via either BRIDGE or Q-BRIDE MIB Sep 1 12:47:53 mgt xcat[25330]: xcatd: Processing discovery request from 10.10.200.3 Sep 1 12:47:59 mgt xcat[25330]: xcatd: Processing discovery request from 10.10.200.3 Sep 1 12:48:05 mgt xcat[25330]: xcatd: Processing discovery request from 10.10.200.3 Sep 1 12:48:11 mgt xcat[25330]: xcatd: Processing discovery request from 10.10.200.3 Sep 1 12:48:11 mgt xcat[39294]: Error communicating with ms-h25-mgtobm-1g-40: Unable to get MAC entries via either BRIDGE or Q-BRIDE MIB From samuel at unimelb.edu.au Mon Aug 31 20:43:11 2015 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 1 Sep 2015 13:43:11 +1000 Subject: [Beowulf] Diagnosing Discovery issue xCat In-Reply-To: References: Message-ID: <55E51ECF.50604@unimelb.edu.au> Hi Nick, On 01/09/15 12:54, Nick Evans wrote: > Any pointers on where to look for anything that might shed some light > on this issue will be helpful. Also do i need to specifically get the > MIBS file for the switch as i don't recall needing to to this in the > past? I'm just bringing up a new cluster with xCAT and found that I was having issues with xCAT talking to the switches for discovery of the blade chassis. It turned out that whilst the documentation said that xCAT defaults to using SNMPv1 by default it actually takes the default of the underlying library and that now is SNMPv3. 
So we did:

# tabdump switches
#switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,protocol,switchtype,comments,disable
"sw18","SNMPv1",,,,,,,,,"BNT",,

You can tell for certain with wireshark or tcpdump.

If that is the case for you, you can just set it as above (of course you'll want "Cisco" instead of "BNT" for yours).

Best of luck!
Chris
--
Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.org.au/ http://twitter.com/vlsci

From nick.c.evans at gmail.com Mon Aug 31 21:21:21 2015 From: nick.c.evans at gmail.com (Nick Evans) Date: Tue, 1 Sep 2015 14:21:21 +1000 Subject: [Beowulf] Diagnosing Discovery issue xCat In-Reply-To: <55E51ECF.50604@unimelb.edu.au> References: <55E51ECF.50604@unimelb.edu.au> Message-ID:

Hi Chris,

Thanks for the insight. My switch table is as follows:

#switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,protocol,switchtype,comments,disable
"ms-h25-data-10g-42","SNMPv1",,,,,,,,,"Cisco",,
"ms-h25-mgtobm-1g-40","SNMPv1",,,,,,,,,"Cisco",,

I originally had just "2c" for the snmpversion and have now tried 1, 2c, SNMPv1, SNMPv2c... all with no luck. Will have to get Wireshark onto it and find out what is happening.

Thanks
Nick

On 1 September 2015 at 13:43, Christopher Samuel wrote: > Hi Nick, > > On 01/09/15 12:54, Nick Evans wrote: > > > Any pointers on where to look for anything that might shed some light > > on this issue will be helpful. Also do i need to specifically get the > > MIBS file for the switch as i don't recall needing to to this in the > > past? > > I'm just bringing up a new cluster with xCAT and found that I was having > issues with xCAT talking to the switches for discovery of the blade > chassis. > > It turned out that whilst the documentation said that xCAT defaults to > using SNMPv1 by default it actually takes the default of the underlying > library and that now is SNMPv3. > > So we did: > > # tabdump switches > > #switch,snmpversion,username,password,privacy,auth,linkports,sshusername,sshpassword,protocol,switchtype,comments,disable > "sw18","SNMPv1",,,,,,,,,"BNT",, > > You can tell for certain with wireshark or tcpdump. > > If that is the case for you, you can just set it as above (of course > you'll want "Cisco" instead of "BNT" for yours). > > Best of luck! > Chris > -- > Christopher Samuel Senior Systems Administrator > VLSCI - Victorian Life Sciences Computation Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.org.au/ http://twitter.com/vlsci > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: