From cap at nsc.liu.se Thu Apr 1 00:59:10 2010 From: cap at nsc.liu.se (Peter Kjellstrom) Date: Thu, 1 Apr 2010 08:59:10 +0100 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> Message-ID: <201004010959.15152.cap@nsc.liu.se> On Thursday 01 April 2010, Kilian CAVALOTTI wrote: > On Wed, Mar 31, 2010 at 10:51 PM, Bill Broadley wrote: > >> I would say that the 2x6-cores Magny-Cours probably has to be compared > >> to Nehalem-EX. > > > > Why? > > Maybe first because that's where the core spaces from AMD and Intel > intersect (8-cores Beckton and 8-cores Magny-Cours). I'm not sure it's > really significant to compare performance between a 6-cores Westmere > and a 12-cores Magny-Cours. I feel it makes more sense to compare > apples to apples, ie. same core count. I'm not convinced, is the number of cores more important that agg. performance and price? Also, if you turn on SMT/HT on a 6-core westmere it may appear very similar to a 12-core Magnycour (performance, appearance, price, ...). My experience is that in HPC it always boils down to price/performance and that would in my eyes make apples out of Magnycour and Westmere. /Peter -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From eugen at leitl.org Thu Apr 1 01:39:18 2010 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 1 Apr 2010 10:39:18 +0200 Subject: [Beowulf] CFP of the Energy-Aware High Performance Computing (EnA-HPC 2010) Conference Message-ID: <20100401083918.GH1964@leitl.org> ----- Forwarded message from Rolf Rabenseifner ----- From: Rolf Rabenseifner Date: Thu, 1 Apr 2010 10:07:57 +0200 (CEST) To: eugen at leitl.org Subject: CFP of the Energy-Aware High Performance Computing (EnA-HPC 2010) Conference -------------------------------------------------------------------------------- CALL FOR PAPERS 1st International Conference on Energy-Aware High Performance Computing (EnA-HPC 2010) http://www.ena-hpc.org/ Hamburg, Germany, September 8-10th, 2010 Submission deadline: May 2nd, 2010 -------------------------------------------------------------------------------- Power provisioning and energy consumption become major challenges in the field of high performance computing. Energy costs over the lifetime of an HPC installation are in the range of the acquisition costs. Green IT became the latest hype and promises to solve the problem, however, it is beyond the realm of HPC. The greening of HPC is a new research field that attracts many scientists. Up to now we see different approaches on different abstraction levels in an HPC environment. For example, vendors work on power efficient processor architectures and software developers on mechanisms of how to trigger them. However, there is no integrated approach yet that would show ways of how to operate an HPC environment in an energy efficient way. The First Conference on Energy-Aware High Performance Computing (EnA-HPC) aims at bringing together researchers, developers, and users to discuss the energy issue in HPC and to present novel solutions to tackle the problem of energy efficiency. Through the presentation of contributed papers, poster presentations, and invited talks, attendees will have the opportunity to share ideas and experiences to contribute to the improvement of energy efficiency in high performance computing. 
Topics of interest for the conference include, but are not limited to: * Modelling: How can we model the overall energy consumption of an HPC environment for given applications? * Simulation: How can we simulate the behavior of energy saving concepts for a given HPC environment and a given program? * Benchmarking: How can we benchmark the program/architecture energy efficiency? * Measurement: How can we measure relevant data in the hardware/software environment? * Analysis: How can we understand the measured data and deduce means to mitigate the energy problem? * Deployment of mechanisms: How can we reduce energy consumption by changing the HW/SW-environment? * Deployment of new hardware: Energy optimized processors, network components, storage components etc. * Facility issues: How can we optimze our computer room for optimal power efficiency? IMPORTANT DATES Full Paper Submission May 2nd, 2010 Acceptance Notification May 20th, 2010 Camera-Ready Submission June 5th, 2010 Poster Submission to be announced EnA-HPC Conference Sep 8-10th, 2010 For further Information please see the conference website: http://www.ena-hpc.org/ General Chair: Thomas Ludwig, University of Hamburg, Germany Program Committee: Cosimo Anglano, Universita del Piemonte Orientale, Alessandria, Italy Wu-chun Feng, Virginia Tech, Blacksburg, VA, USA Laurent Lefevre, INRIA, Ecole Normale Superieure, Lyon, France Matthias Mueller, Technische Universitaet Dresden, Germany Jean-Marc Pierson, Universit?t Paul Sabatier, Tolouse, France Erich Strohmeier, Lawrence Berkeley National Laboratory, Berkeley, CA, USA Local Organizing Comittee Michaele Hensel, German Climate Computing Centre (DKRZ), Hamburg, Germany Julian Kunkel, German Climate Computing Centre (DKRZ), Hamburg, Germany Michael Kuhn, University of Hamburg, Germany Timo Minartz, University of Hamburg, Germany Olga Mordvinova, SAP, Walldorf, Germany ----- End forwarded message ----- -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From john.hearns at mclaren.com Thu Apr 1 03:27:36 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Thu, 1 Apr 2010 11:27:36 +0100 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com><4BB3B5C6.8060001@cse.ucdavis.edu> Message-ID: <68A57CCFD4005646957BD2D18E60667B0FEFA17A@milexchmb1.mil.tagmclarengroup.com> > > > Various vendors try various strategies to differentiate products > based > > on features. ?For the most part HPC types care about performance per > $, > > performance per watt, and reliability. ?I'd be pretty surprised to > see large > > HPC cluster built out of Nehalem-EX chips. Look at the announcement yesterday of the SGI UV 10 - 4xNehalem EX and 512Gbytes memory in a 4U box. There will be similar spec boxes from other vendors. I can see this being a very attractive workgroup solution. There's a very good recent Linux Journal article by Doug Eadline - where he discusses the future direction of clusters (*) Many workgroups have codes which scale to these 32- and 48-core sizes - why have a humungous cluster with expensive interconnects when you can run a 32-way job on an SMP machine with a decent amount of RAM? So my present dream system - a rack of 10 Ultraviolets, connected by 10gig Ethernet to a Blade systems rack top switch. 
In a 42 U rack that leaves me with a 1U for a batch master/login/PXE boot node. Connect it across to a rack of Panasas shelves, similarly with a 10gig racktop switch and you have a pretty powerful system - set your scheduler up to farm out jobs to each of these fat SMP nodes. If you do have a call for a bigger core count you can run as a cluster over the 10gig links. (*)http://www.linux-mag.com/id/7731 "These numbers are confirmed by a poll from ClusterMoney.net where 55% of those surveyed used 32 or less cores for their applications. When the clouds start forming around 48-core servers using the imminent Magny Cours processor from AMD many applications may fit on one server and thus eliminate the variability of server-to-server communication." The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From kilian.cavalotti.work at gmail.com Thu Apr 1 06:13:28 2010 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Thu, 1 Apr 2010 15:13:28 +0200 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <4BB36C0E.1010502@atlanticlinux.ie> References: <4BB369E1.60906@cora.nwra.com> <4BB36C0E.1010502@atlanticlinux.ie> Message-ID: On Wed, Mar 31, 2010 at 5:36 PM, stephen mulcahy wrote: > http://www.anandtech.com/show/2978/amd-s-12-core-magny-cours-opteron-6174-vs-intel-s-6-core-xeon/10 And an other one here: http://www.bit-tech.net/hardware/cpus/2010/03/31/amd-opteron-6174-vs-intel-xeon-x5650-review/ Cheers, -- Kilian From tom.elken at qlogic.com Thu Apr 1 08:58:41 2010 From: tom.elken at qlogic.com (Tom Elken) Date: Thu, 1 Apr 2010 08:58:41 -0700 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <4BB3BBAE.5070802@pathscale.com> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <4BB3BBAE.5070802@pathscale.com> Message-ID: <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> C. Bergstrom wrote: > While for my own selfish reasons I'm happy AMD may have some chance at > a comeback, but... I caution everyone to please ignore SPEC* as any > indicator of performance.. The non-profit Standard Performance Evaluation Corporation and volunteers who work for it (including me) would be pretty dismayed that anyone believes that it has no relevance to performance evaluation. I think Killion's reference to " Some SPEC results are being posted on http://www.spec.org/cpu2006/results/res2010q1/ " was entirely appropriate and useful for this thread about CPU performance. In this forum, people rightly and most-often quote SPEC rate metrics (e.g. SPECfp_rate_base2006 ) which test the performance of all sockets/cores on a system. The speed metrics (SPECfp_peak2006, SPECint_peak2006) are more difficult to interpret, since with many of the codes you are evaluating single-core, single-thread performance. But auto-parallel compiler flags are allowed, and not all compiler vendors have been equally successful in making those parallel flags work on the SPEC suite. So on the peak speed metrics you may be comparing one core of one CPU brand with multiple cores of another CPU brand if you are not careful. Certainly the SPEC benchmarks are not perfect. There is a certain time period needed to develop a new version of the SPEC CPU suite, as you see from the names of the most recent versions: CPU95, CPU2000, and CPU2006. 
As you get more years since the last version, the CPU and compiler vendors have had more time to optimize to the current suite of benchmarks. Those vendors who have more resources can spend more on this optimization to the suite. But a new benchmark suite is in development (I am not part of that effort this time, so I don't know when it will be available ... I know they are doing some ambitious things), and it will, once again, be more of an indication of CPU/compiler performance without years of optimization opportunity. But still, for a broad-based applications-based benchmark suite to evaluate single-system & CPU performance, it's tough to beat SPEC CPU2006. And for multiple-node, cluster performance, I would like to plug SPEC MPI2007. Certainly if you have the time and resources to benchmark your apps. with several systems and compilers, that is the most relevant data. That's not a luxury everyone has. And even if you can do that, the SPEC results help to narrow the set of CPUs/systems you might want to evaluate in more depth. -Tom > This will be especially true for any > benchmarks based on AMD's compiler. Your code will always be the best > benchmark and I'm happy to assist anyone offlist that needs help > getting > unbiased numbers. > > Best, > > ./C > > > #pathscale - irc.freenode.net > CTOPathScale - twitter > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From joshua_mora at usa.net Thu Apr 1 02:28:20 2010 From: joshua_mora at usa.net (Joshua mora acosta) Date: Thu, 01 Apr 2010 04:28:20 -0500 Subject: [Beowulf] AMD 6100 vs Intel 5600 Message-ID: <939oDaJbu2726S12.1270114100@cmsweb12.cms.usa.net> It does not make sense to come up with a general/wide statement of product A better than product B and or product C. Each architecture/solution has its strong points and its weak points wrt others _for_a_given_feature. There is also certain level of overlapping of features between those solutions, hence the challenge in coming up with a unique solution to your needs. The fastest way without much analysis to decide what is the best is by running your own application or kernel on A,B and C on configurations that meet your requirements (usually 1st performance, 2nd price, 3rd capacity, 4rth power consumption,...order actually varies and combinations of those are also used). Each company though tries to do as clear as possible the positioning of their products by competitive analysis (benchmarketing) and business success cases. Understanding well each product it is very easy to come up with a specific benchmark or a "family" of benchmarks around the same way of stressing the system, for each of those products that make it look unbeatable. But that result can't be extrapolated nor is representative of the whole thing. With respect to SPEC benchmarks, there is in my opinion a tremendous effort in several directions that I consider valuable: "it tries" to come up with a set of representative real workloads so users can identify their own application's behavior on one or two benchmarks. Looking at the single final number/score isn't helpful for a customer that is running a single application, but it could be meaningful if you are running a rich variety of applications. 
SPEC is also a very fair assessment of architectures through careful benchmark designs and the review of results are done by a group where it gets to be analyzed how it correlates the performance of each benchmark with the features of the architectures. The benchmarks evolve with the architectures in order to show off the new features and to provide meaningful information to the decision makers. It also shows off the software technologies (compilers,OS,math libraries,...system settings) that allow you to exploit those systems in its best way. SPEC (rate,openmp,mpi,power..) is going to give you some good amount of information, HPCC is quite simple and provides "extreme stressing", kind of providing some boundaries on performance on a given direction. HPL for instance gives you a boundary on power consumption. Again, for the decision maker, you need to run your own workload and nail it down to the <2% error on each metric you are interested. Therefore, let the decision maker run their own benchmark and if they want to do the exercise to correlate their own benchmark with other things, I am sure there will be some good learning on why it is being used a certain benchmark on a given architecture and the statements that can be claimed and under what restrictions. Regards, Joshua ------ Original Message ------ Received: 03:09 AM CDT, 04/01/2010 From: Peter Kjellstrom To: beowulf at beowulf.orgCc: Subject: Re: [Beowulf] AMD 6100 vs Intel 5600 > On Thursday 01 April 2010, Kilian CAVALOTTI wrote: > > On Wed, Mar 31, 2010 at 10:51 PM, Bill Broadley > wrote: > > >> I would say that the 2x6-cores Magny-Cours probably has to be compared > > >> to Nehalem-EX. > > > > > > Why? > > > > Maybe first because that's where the core spaces from AMD and Intel > > intersect (8-cores Beckton and 8-cores Magny-Cours). I'm not sure it's > > really significant to compare performance between a 6-cores Westmere > > and a 12-cores Magny-Cours. I feel it makes more sense to compare > > apples to apples, ie. same core count. > > I'm not convinced, is the number of cores more important that agg. performance > and price? Also, if you turn on SMT/HT on a 6-core westmere it may appear > very similar to a 12-core Magnycour (performance, appearance, price, ...). > > My experience is that in HPC it always boils down to price/performance and > that would in my eyes make apples out of Magnycour and Westmere. > > /Peter > > --------------------------------------------- > Attachment:?signature.asc > MIME Type:?application/pgp-signature > --------------------------------------------- From jerker at Update.UU.SE Thu Apr 1 05:34:59 2010 From: jerker at Update.UU.SE (Jerker Nyberg) Date: Thu, 1 Apr 2010 14:34:59 +0200 (CEST) Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <201004010959.15152.cap@nsc.liu.se> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> Message-ID: On Thu, 1 Apr 2010, Peter Kjellstrom wrote: > My experience is that in HPC it always boils down to price/performance and > that would in my eyes make apples out of Magnycour and Westmere. I just ordered two desktop systems with Intel i7-860 2.8 GHz QC and 16 GB RAM for evaluation, to run as computation nodes for our CPU-bound batchlike application. I'll figure out the performance/price later but it seems to be significantly better than Xeons from the same vendor. It feels like 15 years ago all over again. 
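(As a throwaway illustration of that performance/price bookkeeping, every number below is a placeholder rather than a quote or a measurement; the throughput figures would come from your own batch runs:

awk 'BEGIN { printf "desktop node: %.0f USD per job/hour\n", 1400 / 7.0;
             printf "xeon node   : %.0f USD per job/hour\n", 2600 / 7.5 }'

The same two-line comparison works for whatever metric your application actually cares about.)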
I guess the major drawback is the lack of ECC RAM, so maybe they get their second life as ordinary desktops sooner rather than later... Regards, Jerker Nyberg. From bill at cse.ucdavis.edu Thu Apr 1 10:21:08 2010 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Thu, 01 Apr 2010 10:21:08 -0700 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> Message-ID: <4BB4D604.7000107@cse.ucdavis.edu> On 04/01/2010 05:34 AM, Jerker Nyberg wrote: > On Thu, 1 Apr 2010, Peter Kjellstrom wrote: > >> My experience is that in HPC it always boils down to price/performance >> and >> that would in my eyes make apples out of Magnycour and Westmere. > > I just ordered two desktop systems with Intel i7-860 2.8 GHz QC and 16 > GB RAM for evaluation, to run as computation nodes for our CPU-bound > batchlike application. I'll figure out the performance/price later but > it seems to be significantly better than Xeons from the same vendor. It > feels like 15 years ago all over again. At least the premium is smaller these days. i7-860 has a market price of around $280. The equiv xeon is 2.66 GHz and costs $270 and supports ECC. From bill at cse.ucdavis.edu Thu Apr 1 10:30:34 2010 From: bill at cse.ucdavis.edu (Bill Broadley) Date: Thu, 01 Apr 2010 10:30:34 -0700 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <201004010959.15152.cap@nsc.liu.se> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> Message-ID: <4BB4D83A.5080007@cse.ucdavis.edu> On 04/01/2010 12:59 AM, Peter Kjellstrom wrote: > I'm not convinced, is the number of cores more important that agg. performance > and price? Also, if you turn on SMT/HT on a 6-core westmere it may appear > very similar to a 12-core Magnycour (performance, appearance, price, ...). I'd be interested in the details if it works out that way. I've seen HT help on real codes on the order of 10-20%. I've not seen it help anywhere close to a factor of 2. The spec results posted use hypethreading and still showed a substantial Magnycour advantage. The main advantage I've seen is that a dual socket intel node with 2 sockets can handle 16 jobs with 1.5G each or 8 jobs with 3.0 GB each without a significant throughput penalty. On AMD systems you lose approximately half the throughput. > My experience is that in HPC it always boils down to price/performance and > that would in my eyes make apples out of Magnycour and Westmere. Agreed. Large number of sockets or large NUMA systems seem to be more specialized tools because of their higher cost. Justified by some applications, but not my general HPC workloads. From orion at cora.nwra.com Thu Apr 1 15:26:35 2010 From: orion at cora.nwra.com (Orion Poplawski) Date: Thu, 01 Apr 2010 16:26:35 -0600 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <4BB4D604.7000107@cse.ucdavis.edu> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> <4BB4D604.7000107@cse.ucdavis.edu> Message-ID: <4BB51D9B.7080801@cora.nwra.com> On 04/01/2010 11:21 AM, Bill Broadley wrote: > On 04/01/2010 05:34 AM, Jerker Nyberg wrote: >> I just ordered two desktop systems with Intel i7-860 2.8 GHz QC and 16 >> GB RAM for evaluation, to run as computation nodes for our CPU-bound >> batchlike application. I'll figure out the performance/price later but >> it seems to be significantly better than Xeons from the same vendor. 
It >> feels like 15 years ago all over again. > > At least the premium is smaller these days. i7-860 has a market price of > around $280. The equiv xeon is 2.66 GHz and costs $270 and supports ECC. So the "premium" now may be the difference between a simple tower case and PS and a nice high density rack configuration. -- Orion Poplawski Technical Manager 303-415-9701 x222 NWRA/CoRA Division FAX: 303-415-9702 3380 Mitchell Lane orion at cora.nwra.com Boulder, CO 80301 http://www.cora.nwra.com From hahn at mcmaster.ca Thu Apr 1 21:48:54 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 2 Apr 2010 00:48:54 -0400 (EDT) Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <4BB3BBAE.5070802@pathscale.com> <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> Message-ID: >> While for my own selfish reasons I'm happy AMD may have some chance at >> a comeback, but... I caution everyone to please ignore SPEC* as any >> indicator of performance.. > > The non-profit Standard Performance Evaluation Corporation and volunteers > who work for it (including me) would be pretty dismayed that anyone > believes that it has no relevance to performance evaluation. SPEC self-limits its relevance by refusing to recognize that it should be open-source. being open-hostile means that it has very limited numbers of data points, very minimalistic UI (let alone data mining tools), and perhaps most importantly, slow adaptation to changes in how machines are used (memory footprint, etc). From sabujp at gmail.com Fri Apr 2 04:45:34 2010 From: sabujp at gmail.com (Sabuj Pattanayek) Date: Fri, 2 Apr 2010 05:45:34 -0600 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <4BB3BBAE.5070802@pathscale.com> <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> Message-ID: > SPEC self-limits its relevance by refusing to recognize that it should be > open-source. ?being open-hostile means that it has very limited numbers > of data points, very minimalistic UI (let alone data mining tools), and > perhaps most importantly, slow adaptation to changes in how machines > are used (memory footprint, etc). I quite like the phoronix test suite http://www.phoronix-test-suite.com/ From mathog at caltech.edu Fri Apr 2 09:15:11 2010 From: mathog at caltech.edu (David Mathog) Date: Fri, 02 Apr 2010 09:15:11 -0700 Subject: [Beowulf] test network link quality? Message-ID: Is there a common method for testing the quality of a network link between two networked machines? This is for situations where the link works 99.99% of the time, but should work 99.99999% of the time, with the failures being dropped packets or whatever. This would be used for tracking down slightly defective patch cables, switch ports, NICs, and the like. Is ping used like this: ping -i .0001 -c 1000000 -f targetmachine adequate? It gives decent statistics, but doesn't seem like a very good simulation of a typical network load, in particular, the packet contents aren't varying. 
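A minimal sketch of one way to make the payload vary, using ping's -s (packet size) and -p (pad pattern) switches; the host name and counts are placeholders, and flood mode generally needs root:

#!/bin/bash
# sweep a few payload sizes and bit patterns, print only the loss summary line
HOST=node02                       # placeholder target
for size in 56 512 1472; do       # 1472 data bytes fills a 1500-byte MTU
    for pat in 00 ff aa 55; do    # all-zero, all-one, alternating-bit padding
        echo -n "size=$size pattern=$pat: "
        ping -f -q -c 100000 -s $size -p $pat $HOST | grep 'packet loss'
    done
done

It still is not a line-rate stress test the way an iperf or netpipe run is, but at least it puts different packet sizes and bit patterns on the wire.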
Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From tom.elken at qlogic.com Fri Apr 2 09:25:17 2010 From: tom.elken at qlogic.com (Tom Elken) Date: Fri, 2 Apr 2010 09:25:17 -0700 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <4BB3BBAE.5070802@pathscale.com> <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> Message-ID: <35AAF1E4A771E142979F27B51793A48887371CBFAE@AVEXMB1.qlogic.org> > > The non-profit Standard Performance Evaluation Corporation and > volunteers > > who work for it (including me) would be pretty dismayed that anyone > > believes that it has no relevance to performance evaluation. > > SPEC self-limits its relevance by refusing to recognize that it should > be open-source. being open-hostile means that it has very limited > numbers > of data points, Yup, only 9,335 submissions indexed on this page: http://www.spec.org/cpu2006/results/cpu2006.html I'm not getting into a debate. I'm glad there are open source benchmarks available too. I'll just provide some facts and let readers decide for themselves. > very minimalistic UI (let alone data mining tools), It is a limited text-based UI -- that runs on Linux, Windows, and proprietary Unixes. Portability was/is a major goal of SPEC. The search form: http://www.spec.org/cgi-bin/osgresults?conf=cpu2006&op=form is useful for data mining. > and perhaps most importantly, slow adaptation to changes in how > machines > are used (memory footprint, etc). True. But SPEC MPI2007 v2.0 is a 2.4 GB package of software [larger datasets, and a 128 GB minimum RAM / 64 cores (min) requirement to run the Large suite; the Medium suite (16 GB, ~8 core minimum) is still part of 2.0]. I just downloaded Phoronix, and the tarball was ~450KB. It looks like a good collection of tests. Like HPC Challenge, and NAS Parallel, it does not provide a single number as a metric of performance. There are always compromises and knashing of teeth in coming up with a formula for that single number, but SPEC and the Linpack/Top500 maintainers have found that people like it. -Tom From jmdavis1 at vcu.edu Fri Apr 2 09:51:06 2010 From: jmdavis1 at vcu.edu (Mike Davis) Date: Fri, 02 Apr 2010 12:51:06 -0400 Subject: [Beowulf] test network link quality? In-Reply-To: References: Message-ID: <4BB6207A.2060208@vcu.edu> David, I use a combination of ping and ifconfig as my first line of network troubleshooting. Sometimes, though it's a detective game. -- Mike Davis Technical Director (804) 828-3885 Center for High Performance Computing jmdavis1 at vcu.edu Virginia Commonwealth University "Never tell people how to do things. Tell them what to do and they will surprise you with their ingenuity." George S. Patton From hahn at mcmaster.ca Fri Apr 2 11:06:28 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 2 Apr 2010 14:06:28 -0400 (EDT) Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <35AAF1E4A771E142979F27B51793A48887371CBFAE@AVEXMB1.qlogic.org> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <4BB3BBAE.5070802@pathscale.com> <35AAF1E4A771E142979F27B51793A48887371CBE9D@AVEXMB1.qlogic.org> <35AAF1E4A771E142979F27B51793A48887371CBFAE@AVEXMB1.qlogic.org> Message-ID: >> SPEC self-limits its relevance by refusing to recognize that it should >> be open-source. 
being open-hostile means that it has very limited >> numbers >> of data points, > > Yup, only 9,335 submissions indexed on this page: > http://www.spec.org/cpu2006/results/cpu2006.html I think SPEC is worthwhile, just don't understand why the organization has persisted in self-limiting its accessibility/relevance. that number is also a bit disingenuous, as it lumps together int/fp and the rate versions. further, you see many cases where the results are shown for systems that differ only in packaging (who would guess that a dell 1950 performs the same as a 2950 when configured identically!) actually, the latter is one of SPEC's uses: allowing a vendor to confirm in a public way that a particular model is not broken. it's also nice to see a particular model/config with a range of different CPUs. and it's sometimes possible to compare across vendors, as well. these are all wonderful, and I value SPEC for them. what I don't understand is why it helps SPEC or consumers of SPEC to keep the rest of the world from running the tests. >> very minimalistic UI (let alone data mining tools), > > It is a limited text-based UI -- that runs on Linux, Windows, and >proprietary Unixes. Portability was/is a major goal of SPEC. sure, everyone understands portability. but wider availability of the source (and the vast increase in data that would result) would make it far more interesting to do real data mining. > The search form: > http://www.spec.org/cgi-bin/osgresults?conf=cpu2006&op=form > is useful for data mining. I use it, but let's not kid ourselves - it's not data mining by any meaningful definition. and I definitely mean no slight to whoever put together the interface - it's nice given its mandate. >> and perhaps most importantly, slow adaptation to changes in how >> machines >> are used (memory footprint, etc). > > True. But SPEC MPI2007 v2.0 is a 2.4 GB package of software [larger > datasets, and a 128 GB minimum RAM / 64 cores (min) requirement to run the > Large suite; the Medium suite (16 GB, ~8 core minimum) is still part of > 2.0]. I meant SPECCPU, of course. actually, I'd like to ask you how to think about SPECMPI results. I spent some time staring at them just now, and am not sure how to draw conclusions. for instance, with SPECCPU, one of the first things you have to do is trim the results: http://www.spec.org/cpu2006/results/res2008q2/cpu2006-20080328-03888.html that cactusADM result is not informative. also, of the 187 SPECMPI results, there are only a handful of vendors: given that there are hundreds of CPU centers that would love to be able to profile their clusters wrt other clusters, don't you think that being open-source would fundamentally change the value of the benchmark? > Like HPC Challenge, and NAS Parallel, it does not provide a single number > as a metric of performance. There are always compromises and knashing of > teeth in coming up with a formula for that single number, but > SPEC and the Linpack/Top500 maintainers have found that people like it. I guess it's a question of what your goals are. the single scalar result is good for the marketing folk (and I would claim that SPEC is pretty driven by them). for customers, I don't think the single scalar is much used or wanted, since it's too hard to tell what it means. with top500, it's perfectly clear what the number means: raw incache flops with a minor adjustment for interconnect. 
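A sketch of that kind of trimming: drop an outlier such as cactusADM and recompute a geometric mean over the remaining per-benchmark ratios. The input format here is an assumption, one "benchmark ratio" pair per line that you extract yourself from a published result:

# ratios.txt: one "benchmark ratio" pair per line, e.g. "433.milc 21.7"
grep -v '436.cactusADM' ratios.txt |
  awk '{ s += log($2); n++ }
       END { printf "trimmed geomean of %d ratios: %.1f\n", n, exp(s/n) }'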
yes, it's entirely possible to extract the raw component-level data from SPEC and produce your own (trimmed, perhaps more discipline-focused) metric. but how many people do it? I did it for our last major hardware refresh ~5 years ago, but if it was possible to have community involvement (variety, eyes, fertilization) in SPEC, the benchmarks could be an entirely different kind of garden kind of garden. From mathog at caltech.edu Fri Apr 2 12:13:37 2010 From: mathog at caltech.edu (David Mathog) Date: Fri, 02 Apr 2010 12:13:37 -0700 Subject: [Beowulf] test network link quality? Message-ID: > we've noticed that occasionally cables of dubious quality > pop up and cause issues with packet flow in either one or both > directions. That's one sort of error that would be nice to be able to detect, especially in cases where the error is subtle. > Using iperf we can see nodes that don't perform properly, > and we notice a lot of problems that don't show up with simple ping > tests. Iperf is not exactly what I was looking for, but close. An iperf report looks like: [ 3] 0.0-10.0 sec 1.25 MBytes 1.05 Mbits/sec 0.002 ms 0/ 893 (0%) For comparison of (low) error rates, all that's useful is the 0/893. The numbers in this run are too small to distinguish between 99.99 and 99.9999, and there is probably no variation in packet size or content to help push the test to the edge. Think of this another way - imagine you want to do quality tests on a bunch of cables, and you don't have the proper test tools to make direct electrical measurements. You might be comparing samples between different brands, or looking for slightly defective cables in a large batch. What program could you run that would give a useful metric for transmission quality? Granted, it may be that all of the cables are "more than good enough", so that there is no correspondence between cable quality and dropped/corrupted packets, in which case no difference could be measured. That's an OK outcome. But if there is such a correspondence, and it is small, what tool would you use to see it? One that would at the end of the process allow you to say "this cable is unacceptable because it measures >4 sigmas below the mean for the batch". Such a cable would probably be perfectly acceptable for a desktop connection, it just isn't the one you want shuffling data day and night in a cluster. (Same argument for NICs and switch ports.) Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From tegner at renget.se Fri Apr 2 12:15:21 2010 From: tegner at renget.se (Jon Tegner) Date: Fri, 02 Apr 2010 21:15:21 +0200 Subject: [Beowulf] test network link quality? In-Reply-To: References: Message-ID: <4BB64249.2060806@renget.se> What about netpipe? www.scl.ameslab.gov/netpipe/ /jon David Mathog wrote: > Is there a common method for testing the quality of a network link > between two networked machines? This is for situations where the link > works 99.99% of the time, but should work 99.99999% of the time, with > the failures being dropped packets or whatever. This would be used for > tracking down slightly defective patch cables, switch ports, NICs, and > the like. Is ping used like this: > > ping -i .0001 -c 1000000 -f targetmachine > > adequate? It gives decent statistics, but doesn't seem like a very good > simulation of a typical network load, in particular, the packet contents > aren't varying. 
> > Thanks, > > David Mathog > mathog at caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From mathog at caltech.edu Fri Apr 2 14:15:05 2010 From: mathog at caltech.edu (David Mathog) Date: Fri, 02 Apr 2010 14:15:05 -0700 Subject: [Beowulf] which 24 port unmanaged GigE switch? Message-ID: Which of these would be good for a cluster? Reliability is more important here than speed, not that I'm looking for slow. It should not burn up if the jobs move data through all ports near full capacity for hours or days at a time! NETGEAR JGS524 D-Link DGS-1024D SMC EZ Switch SMCGS24 HP J9078A The D-link and Netgear are within $10 of each other after the rebate on the latter, both are about $60 less than the SMC and HP. Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From landman at scalableinformatics.com Fri Apr 2 14:32:46 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 02 Apr 2010 17:32:46 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: Message-ID: <4BB6627E.7000809@scalableinformatics.com> David Mathog wrote: > Which of these would be good for a cluster? Reliability is more > important here than speed, not that I'm looking for slow. It should not > burn up if the jobs move data through all ports near full capacity for > hours or days at a time! > > NETGEAR JGS524 > D-Link DGS-1024D > SMC EZ Switch SMCGS24 > HP J9078A > > The D-link and Netgear are within $10 of each other after the rebate on > the latter, both are about $60 less than the SMC and HP. I am biased. Get the HP. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From ahoward at purdue.edu Fri Apr 2 10:53:07 2010 From: ahoward at purdue.edu (Andrew Howard) Date: Fri, 2 Apr 2010 13:53:07 -0400 Subject: [Beowulf] test network link quality? In-Reply-To: References: Message-ID: iperf can also be a nice way to diagnose problems. In our 10gig networks, we've noticed that occasionally cables of dubious quality pop up and cause issues with packet flow in either one or both directions. Using iperf we can see nodes that don't perform properly, and we notice a lot of problems that don't show up with simple ping tests. -- Andrew Howard ahoward at purdue.edu Assistant Research Programmer Rosen Center for Advanced Computing, Purdue University On Fri, Apr 2, 2010 at 12:15 PM, David Mathog wrote: > Is there a common method for testing the quality of a network link > between two networked machines? ?This is for situations where the link > works 99.99% of the time, but should work 99.99999% of the time, with > the failures being dropped packets or whatever. ?This would be used for > tracking down slightly defective patch cables, switch ports, NICs, and > the like. ?Is ping used like this: > > ?ping -i .0001 -c 1000000 -f targetmachine > > adequate? ?It gives decent statistics, but doesn't seem like a very good > simulation of a typical network load, in particular, the packet contents > aren't varying. 
> > Thanks, > > David Mathog > mathog at caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From mdidomenico4 at gmail.com Mon Apr 5 06:39:40 2010 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Mon, 5 Apr 2010 09:39:40 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: <4BB6627E.7000809@scalableinformatics.com> References: <4BB6627E.7000809@scalableinformatics.com> Message-ID: I would have to agree. I have Netgears in my lab now and for light use they seem to be okay, but once you run a communications heavy MPI job over them they seem to fall down On Fri, Apr 2, 2010 at 5:32 PM, Joe Landman wrote: > David Mathog wrote: >> >> Which of these would be good for a cluster? ?Reliability is more >> important here than speed, not that I'm looking for slow. ?It should not >> burn up if the jobs move data through all ports near full capacity for >> hours or days at a time! >> >> NETGEAR JGS524 D-Link DGS-1024D SMC EZ Switch SMCGS24 HP J9078A >> >> The D-link and Netgear are within $10 of each other after the rebate on >> the latter, both are about $60 less than the SMC and HP. > > I am biased. ?Get the HP. > > -- > Joseph Landman, Ph.D > Founder and CEO > Scalable Informatics Inc. > email: landman at scalableinformatics.com > web ?: http://scalableinformatics.com > ? ? ? http://scalableinformatics.com/jackrabbit > phone: +1 734 786 8423 x121 > fax ?: +1 866 888 3112 > cell : +1 734 612 4615 > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > From landman at scalableinformatics.com Mon Apr 5 06:49:08 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Mon, 05 Apr 2010 09:49:08 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: <4BB6627E.7000809@scalableinformatics.com> Message-ID: <4BB9EA54.4020007@scalableinformatics.com> Michael Di Domenico wrote: > I would have to agree. I have Netgears in my lab now and for light > use they seem to be okay, but once you run a communications heavy MPI > job over them they seem to fall down I seem to remember that the Dell switches are rebadged SMC or Netgear units. They are great for office level work, not sure how they are for heavy MPI work, though I have heard not such great things. (note: we aren't paid agents of HP, or resellers of them, etc ... the switches are just good in clusters). > > On Fri, Apr 2, 2010 at 5:32 PM, Joe Landman > wrote: >> David Mathog wrote: >>> Which of these would be good for a cluster? Reliability is more >>> important here than speed, not that I'm looking for slow. It should not >>> burn up if the jobs move data through all ports near full capacity for >>> hours or days at a time! >>> >>> NETGEAR JGS524 D-Link DGS-1024D SMC EZ Switch SMCGS24 HP J9078A >>> >>> The D-link and Netgear are within $10 of each other after the rebate on >>> the latter, both are about $60 less than the SMC and HP. >> I am biased. Get the HP. >> >> -- >> Joseph Landman, Ph.D >> Founder and CEO >> Scalable Informatics Inc. 
>> email: landman at scalableinformatics.com >> web : http://scalableinformatics.com >> http://scalableinformatics.com/jackrabbit >> phone: +1 734 786 8423 x121 >> fax : +1 866 888 3112 >> cell : +1 734 612 4615 >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From mathog at caltech.edu Mon Apr 5 12:40:06 2010 From: mathog at caltech.edu (David Mathog) Date: Mon, 05 Apr 2010 12:40:06 -0700 Subject: [Beowulf] which 24 port unmanaged GigE switch? Message-ID: Michael Di Domenico > I would have to agree. I have Netgears in my lab now and for light > use they seem to be okay, but once you run a communications heavy MPI > job over them they seem to fall down Please define "fall down". One test I have applied to a switch (only 100baseT) to see if it could handle "full traffic" was running the script below on all nodes: #!/bin/bash TINFO=`topology_info` NEXT=`echo $TINFO | extract -mt -cols [3]` if [ $NEXT != "none" ] then TIME=`accudate -t0` dd if=/dev/zero bs=4096 count=1000000 | rsh $NEXT 'cat - >/dev/null' accudate -ds $TIME >/tmp/elapsed_${HOSTNAME}.txt fi Where topology_info defines a linear chain through all nodes, and what ends up in the elapsed_HOSTNAME.txt files is transmission time from this to the next node. extract and accudate are mine, the former is like "cut" and the latter is just used here to calculate an elapsed time. This is slightly apples and oranges because in the two node (reference) test the target node is only accepting packets, whereas when they are all running it is also sending packets, and those compete with the ack's going back to the first node. The D-Link switch held up quite well, I thought. One pair of nodes tested this way completed in 350 seconds (+/-), whereas it and the others took 370-380 seconds when they were all running at once (20 compute nodes, first only sends, last only receives). That is, 11.7 MB/sec for the pair, 10.8 MB/sec for all pairs. For GigE it should come out at 117 and 108 (or so), if the switch can keep up. I'm curious what the netgears and HP do in a test like this. If anybody would like to try this, all the pieces for this simple test (if you can run binaries for a 32 bit x86 environment) are here: http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/testswitch.tar.gz (For other platforms obtain source for accudate and extract from here http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/drm_tools.tar.gz ) Start the jobs simultaneously on all nodes using whichever queue system you have installed. Be sure to run it once first with a small count number to force anything coming over nfs into cache before doing the big test. (Or one could run netpipe on each pair of nodes, or anything else really that loads the network.) Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From mdidomenico4 at gmail.com Mon Apr 5 13:27:32 2010 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Mon, 5 Apr 2010 16:27:32 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? 
In-Reply-To: References: Message-ID: A couple small 10node clusters we have setup used to routinely drop off the network and the switch would have to be hard reset for it to return. Granted we didn't do any deep analysis (just replaced with cisco) and it could be attributed to some bad switches, but i've also seen this at home with some 1gb switches i bought. over the years i've been using netgear enterprise and home products, they are wonderful in light use 80-85% max throughput, but once you hit the 90+ areas they seem to start to degrade either through packet loss or over heating we still buy them for our management network, they're cheaper then hp and we just need it for kickstarts, snmp, etc.. as joe said, its just our opinion, your mileage may vary On Mon, Apr 5, 2010 at 3:40 PM, David Mathog wrote: > Michael Di Domenico >> I would have to agree. ?I have Netgears in my lab now and for light >> use they seem to be okay, but once you run a communications heavy MPI >> job over them they seem to fall down > > Please define "fall down". > > One test I have applied to a switch (only 100baseT) to see if it could > handle "full traffic" was running the script below on all nodes: > > #!/bin/bash > TINFO=`topology_info` > NEXT=`echo $TINFO | extract -mt -cols [3]` > if [ $NEXT != "none" ] > then > ?TIME=`accudate -t0` > ?dd if=/dev/zero bs=4096 count=1000000 | rsh $NEXT 'cat - >/dev/null' > ?accudate -ds $TIME >/tmp/elapsed_${HOSTNAME}.txt > fi > > Where topology_info defines a linear chain through all nodes, and what > ends up in the elapsed_HOSTNAME.txt files is transmission time from this > to the next node. ?extract and accudate are mine, the former is like > "cut" and the latter is just used here to calculate an elapsed time. > > This is slightly apples and oranges because in the two node (reference) > test the target node is only accepting packets, whereas when they are > all running it is also sending packets, and those compete with the ack's > going back to the first node. ?The D-Link switch held up quite well, I > thought. ?One pair of nodes tested this way completed in 350 seconds > (+/-), whereas it and the others took 370-380 seconds when they were all > running at once (20 compute nodes, first only sends, last only > receives). ?That is, 11.7 MB/sec for the pair, 10.8 MB/sec for all > pairs. ?For GigE it should come out at 117 and 108 (or so), if the > switch can keep up. > > I'm curious what the netgears and HP do in a test like this. ?If anybody > would like to try this, all the pieces for this simple test (if you can > run binaries for a 32 bit x86 environment) are here: > > ?http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/testswitch.tar.gz > > (For other platforms obtain source for accudate and extract from here > > http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/drm_tools.tar.gz > ) > > Start the jobs simultaneously on all nodes using whichever queue system > you have installed. ?Be sure to run it once first with a small count > number to force anything coming over nfs into cache before doing the big > test. ?(Or one could run netpipe on each pair of nodes, or anything else > really that loads the network.) > > Regards, > > David Mathog > mathog at caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > From mathog at caltech.edu Mon Apr 5 15:01:41 2010 From: mathog at caltech.edu (David Mathog) Date: Mon, 05 Apr 2010 15:01:41 -0700 Subject: [Beowulf] which 24 port unmanaged GigE switch? 
Message-ID: > over the years i've been using netgear enterprise and home products, > they are wonderful in light use 80-85% max throughput, but once you > hit the 90+ areas they seem to start to degrade either through packet > loss or over heating OK, Netgear is officially scratched off the list. Two votes for HP. Anybody have experience with a D-Link GigE switch? Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From beckerjes at mail.nih.gov Mon Apr 5 15:35:22 2010 From: beckerjes at mail.nih.gov (Jesse Becker) Date: Mon, 5 Apr 2010 18:35:22 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: Message-ID: <20100405223522.GE6153@mail.nih.gov> On Mon, Apr 05, 2010 at 06:01:41PM -0400, David Mathog wrote: >> over the years i've been using netgear enterprise and home products, >> they are wonderful in light use 80-85% max throughput, but once you >> hit the 90+ areas they seem to start to degrade either through packet >> loss or over heating > >OK, Netgear is officially scratched off the list. >Two votes for HP. +1 from me as well, at least for the 2800 line (which is managed). HP procurves have a lifetime warranty as well. Also, as Joe Landman mentioned, Dell PowerConnect switches are just rebranded SMC units. Decent for the price, but not powerhouses. -- Jesse Becker NHGRI Linux support (Digicon Contractor) From skylar at cs.earlham.edu Mon Apr 5 16:41:24 2010 From: skylar at cs.earlham.edu (Skylar Thompson) Date: Mon, 05 Apr 2010 16:41:24 -0700 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: Message-ID: <4BBA7524.7030804@cs.earlham.edu> On 4/5/2010 1:27 PM, Michael Di Domenico wrote: > A couple small 10node clusters we have setup used to routinely drop > off the network and the switch would have to be hard reset for it to > return. Granted we didn't do any deep analysis (just replaced with > cisco) and it could be attributed to some bad switches, but i've also > seen this at home with some 1gb switches i bought. > > over the years i've been using netgear enterprise and home products, > they are wonderful in light use 80-85% max throughput, but once you > hit the 90+ areas they seem to start to degrade either through packet > loss or over heating > > we still buy them for our management network, they're cheaper then hp > and we just need it for kickstarts, snmp, etc.. > > > This has been my experience too. We had a pair of managed Netgear gigabit switches at my last job with the two GBIC as uplinks bonded together with LACP. We probably burned out all four GBICs every year, and although Netgear was happy to continue replacing them it was certainly annoying. -- -- Skylar Thompson (skylar at cs.earlham.edu) -- http://www.cs.earlham.edu/~skylar/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 251 bytes Desc: OpenPGP digital signature URL: From coutinho at dcc.ufmg.br Mon Apr 5 17:01:10 2010 From: coutinho at dcc.ufmg.br (Bruno Coutinho) Date: Mon, 5 Apr 2010 21:01:10 -0300 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: Message-ID: 2010/4/5 David Mathog > > over the years i've been using netgear enterprise and home products, > > they are wonderful in light use 80-85% max throughput, but once you > > hit the 90+ areas they seem to start to degrade either through packet > > loss or over heating > > OK, Netgear is officially scratched off the list. 
> Two votes for HP. > Anybody have experience with a D-Link GigE switch? > The D-link DGS-1024D and DGS-1224T behave like Netgears. Once you have many links at near full speed they start to drop packets. D-link DGS-3100-24 handles large all to all traffic reasonably well, but has some fans that seem to be very weak. > > Thanks, > > David Mathog > mathog at caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathog at caltech.edu Tue Apr 6 10:01:17 2010 From: mathog at caltech.edu (David Mathog) Date: Tue, 06 Apr 2010 10:01:17 -0700 Subject: [Beowulf] which 24 port unmanaged GigE switch? Message-ID: Bruno Coutinho wrote: > 2010/4/5 David Mathog > > > > OK, Netgear is officially scratched off the list. > > Two votes for HP. > > Anybody have experience with a D-Link GigE switch? > > > > > The D-link DGS-1024D and DGS-1224T behave like Netgears. > Once you have many links at near full speed they start to drop packets. > > D-link DGS-3100-24 handles large all to all traffic reasonably well, but has > some fans that seem to be very weak. Dropping packets is one thing (D-Link), dropping dead and needing to be rebooted (Netgear) is another. In the case of the reported Netgear problems it sounds a lot like under heavy load these may be overheating and crashing. The D-link issue you describe sounds more like a real performance problem, as if it cannot actually deliver the claimed forwarding capacity. What's bothering me though is that both D-link and Netgear claim the same top forwarding rate as the more expensive switches, at 48 Gbps. Giving them the benefit of the doubt, that they are not intentionally misrepresenting the speed of these units, perhaps there is some other variable in play which degrades their performance more than the other switches? For instance, cable quality. Were you using cat 5, cat 5e, or cat 6 on the D-links that dropped packets under heavy load? Perhaps with lower speed patch cables the less expensive switches cannot meet their published specs? This is another one of those cases where having a benchmark would be enormously helpful. Since reviews for these devices are hardly ever published, it would at least be nice if end users had a simple way to compare the performance they observe with different devices. Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From h-bugge at online.no Tue Apr 6 12:01:11 2010 From: h-bugge at online.no (=?iso-8859-1?Q?H=E5kon_Bugge?=) Date: Tue, 6 Apr 2010 21:01:11 +0200 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: References: Message-ID: On Apr 6, 2010, at 19:01 , David Mathog wrote: > What's bothering me though is that both D-link and Netgear claim the > same top forwarding rate as the more expensive switches, at 48 Gbps. > Giving them the benefit of the doubt, that they are not intentionally > misrepresenting the speed of these units, perhaps there is some other > variable in play which degrades their performance more than the other > switches? You should check that XON/XOFF (PAUSE protocol) is enabled in _both_ directions by means of ethtool. DIfferent switches have different defaults. 
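For example (the interface name is a placeholder, and whether the setting sticks depends on the NIC driver):

ethtool -a eth0              # show current autoneg/RX/TX pause settings
ethtool -A eth0 rx on tx on  # request flow control in both directions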
Sometimes, enabling XON/XOFF has a dramatic positive effect on packet loss. > This is another one of those cases where having a benchmark would be > enormously helpful. I tend to use an MPI which can run over TCP/IP and run a naive all-to-all benchmark, changing payload sizes and number of active MPI processes per node. As a side note, Linux has in the order of ten different congestion protocols, but none of them seems useful in an HPC environment, when you're exposed to packet loss. H?kon From chekh at genomics.upenn.edu Tue Apr 6 12:45:48 2010 From: chekh at genomics.upenn.edu (Alex Chekholko) Date: Tue, 6 Apr 2010 15:45:48 -0400 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <201004010959.15152.cap@nsc.liu.se> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> Message-ID: <20100406154548.42509f05.chekh@genomics.upenn.edu> On Thu, 1 Apr 2010 08:59:10 +0100 Peter Kjellstrom wrote: > My experience is that in HPC it always boils down to price/performance and > that would in my eyes make apples out of Magnycour and Westmere. Here's an example system you can build with the new hardware: SuperMicro 1042G-TF + 4 x AMD 6134 + 32 x 4GB sticks = ~$9560 for 32 cores, 128GB RAM in 1U (1400W) That sure is compelling, especially since I'm still running racks of Opteron 270 4-core 4GB RAM nodes. "sweet spot" analysis: the 8GB sticks are much more than 2x expensive compared to 4GB sticks (which are less than 2x as expensive as 2GB sticks). The AMD 6172 is more than 2x the price of the AMD 6134 and it gets you 50% more cores, at lower clock. The new AMD 6100-series SuperMicro chassis are priced linearly: $700/$1100/$1600 for 1P, 2P, 4P. So getting the 4P chassis with 4 CPUs is cheaper than 4 1P chassis or 2 2P chassis. I think this is an unusual situation, historically speaking. Regards, -- Alex Chekholko chekh at genomics.upenn.edu From tegner at renget.se Wed Apr 7 04:08:06 2010 From: tegner at renget.se (Jon Tegner) Date: Wed, 07 Apr 2010 13:08:06 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure Message-ID: <20100407110806.3AF522C01002@bmail01.one.com> We (me and my brother) have been into silent computing, and clusters, for quite some time now. We just recently designed and built a unit equipped with 4 supermicro boards (H8DMT) and 8 cpus. In the actual unit the cpus are Opterons with 6 cores each, but it would be easy enough to switch to cpus with 12 cores. The box are of dimensions 40x42x58 cm, so it is reasonably small. Two large and silent fans are used to cool the system. For a picture of the unit check www.renget.se/bilder/redBox.jpg Comments on this? /jon -------------- next part -------------- An HTML attachment was scrubbed... URL: From eagles051387 at gmail.com Wed Apr 7 06:34:26 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 15:34:26 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100407110806.3AF522C01002@bmail01.one.com> References: <20100407110806.3AF522C01002@bmail01.one.com> Message-ID: is one of you an engineer? how well does the air flow in the case. what gets air into and out of the case if there are just only 2 fans on the box? you guys ever consider marketing this product? On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner wrote: > We (me and my brother) have been into silent computing, and clusters, for > quite some time now. We just recently designed and built a unit equipped > with 4 supermicro boards (H8DMT) and 8 cpus. 
In the actual unit the cpus are > Opterons with 6 cores each, but it would be easy enough to switch to cpus > with 12 cores. > > The box are of dimensions 40x42x58 cm, so it is reasonably small. Two large > and silent fans are used to cool the system. > > For a picture of the unit check > > www.renget.se/bilder/redBox.jpg > > Comments on this? > > /jon > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From tegner at renget.se Wed Apr 7 06:56:23 2010 From: tegner at renget.se (Jon Tegner) Date: Wed, 07 Apr 2010 15:56:23 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> Message-ID: <20100407135623.B5309304A01A@bmail00.one.com> In this case we have large "home made" heat sinks, I'll put up more pictures when I get the time. You can check other designs at www.renget.se/gallery.html In contrast to our "Second cluster" on that site, which is completely fan less, this has two large fans. By using the fans one can get adequate cooling in a smaller box (than what would be possible without fans). When I get the time (when?) I intend to do combined cfd/heat flow simulations of our designs (have a PhD in numerical analysis). Eventually we hope to market this. Regards, /jon On Apr 7, 2010 15:34 "Jonathan Aquilina" wrote: > is one of you an engineer? how well does the air flow in the case. > what > gets air into and out of the case if there are just only 2 fans on the > box? you guys ever consider marketing this product? > > > On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner <> wrote: > > > We (me and my brother) have been into silent computing, and > > clusters, > > for quite some time now. We just recently designed and built a unit > > equipped with 4 supermicro boards (H8DMT) and 8 cpus. In the actual > > unit the cpus are Opterons with 6 cores each, but it would be easy > > enough to switch to cpus with 12 cores. > > > > > > > > > > The box are of dimensions 40x42x58 cm, so it is reasonably small. > > Two > > large and silent fans are used to cool the system. > > > > > > > > > > For a picture of the unit check > > > > > > > > > > > > > > Comments on this? > > > > > > /jon > > > > _______________________________________________ > > Beowulf mailing list, sponsored by Penguin > > Computing > > To change your subscription (digest mode or unsubscribe) visit > > > > > > > > > > -- > Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From eagles051387 at gmail.com Wed Apr 7 06:57:27 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 15:57:27 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100407135623.B5309304A01A@bmail00.one.com> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> Message-ID: how much would something like this go for? On Wed, Apr 7, 2010 at 3:56 PM, Jon Tegner wrote: > In this case we have large "home made" heat sinks, I'll put up more > pictures when I get the time. You can check other designs at > > www.renget.se/gallery.html > > In contrast to our "Second cluster" on that site, which is completely fan > less, this has two large fans. 
By using the fans one can get adequate > cooling in a smaller box (than what would be possible without fans). > > When I get the time (when?) I intend to do combined cfd/heat flow > simulations of our designs (have a PhD in numerical analysis). > > Eventually we hope to market this. > > Regards, > > /jon > > > On Apr 7, 2010 15:34 "Jonathan Aquilina" wrote: > > is one of you an engineer? how well does the air flow in the case. what > gets air into and out of the case if there are just only 2 fans on the > box? you guys ever consider marketing this product? > > > On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner <> wrote: > > > We (me and my brother) have been into silent computing, and clusters, > > for quite some time now. We just recently designed and built a unit > > equipped with 4 supermicro boards (H8DMT) and 8 cpus. In the actual > > unit the cpus are Opterons with 6 cores each, but it would be easy > > enough to switch to cpus with 12 cores. > > > > > > > > > > The box are of dimensions 40x42x58 cm, so it is reasonably small. Two > > large and silent fans are used to cool the system. > > > > > > > > > > For a picture of the unit check > > > > > > > > http://www.renget.se/bilder/redBox.jpg > > > > > > > Comments on this? > > > > > > /jon > > > > _______________________________________________ > > Beowulf mailing list, sponsored by Penguin > > Computing > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > -- > Jonathan Aquilina > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From eagles051387 at gmail.com Wed Apr 7 06:58:59 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 15:58:59 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> Message-ID: sry for another post but i just got an idea. im not sure if you have seen that you tube video of a guy who put his whole setup in a fish tank and was using cooking oil i believe to cool everything. would be interesting to see the 2nd cluster put into a big enough tank and cooled with oil. On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina wrote: > how much would something like this go for? > > > On Wed, Apr 7, 2010 at 3:56 PM, Jon Tegner wrote: > >> In this case we have large "home made" heat sinks, I'll put up more >> pictures when I get the time. You can check other designs at >> >> www.renget.se/gallery.html >> >> In contrast to our "Second cluster" on that site, which is completely fan >> less, this has two large fans. By using the fans one can get adequate >> cooling in a smaller box (than what would be possible without fans). >> >> When I get the time (when?) I intend to do combined cfd/heat flow >> simulations of our designs (have a PhD in numerical analysis). >> >> Eventually we hope to market this. >> >> Regards, >> >> /jon >> >> >> On Apr 7, 2010 15:34 "Jonathan Aquilina" wrote: >> >> is one of you an engineer? how well does the air flow in the case. what >> gets air into and out of the case if there are just only 2 fans on the >> box? you guys ever consider marketing this product? >> >> >> On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner <> wrote: >> >> > We (me and my brother) have been into silent computing, and clusters, >> > for quite some time now. We just recently designed and built a unit >> > equipped with 4 supermicro boards (H8DMT) and 8 cpus. 
In the actual >> > unit the cpus are Opterons with 6 cores each, but it would be easy >> > enough to switch to cpus with 12 cores. >> > >> > >> > >> > >> > The box are of dimensions 40x42x58 cm, so it is reasonably small. Two >> > large and silent fans are used to cool the system. >> > >> > >> > >> > >> > For a picture of the unit check >> > >> > >> > >> > http://www.renget.se/bilder/redBox.jpg >> >> > >> > >> > Comments on this? >> > >> > >> > /jon >> > >> > _______________________________________________ >> > Beowulf mailing list, sponsored by Penguin >> > Computing >> > To change your subscription (digest mode or unsubscribe) visit >> > http://www.beowulf.org/mailman/listinfo/beowulf >> > >> > >> >> >> >> -- >> Jonathan Aquilina >> >> > > > -- > Jonathan Aquilina > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at mclaren.com Wed Apr 7 07:07:09 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Wed, 7 Apr 2010 15:07:09 +0100 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100407135623.B5309304A01A@bmail00.one.com> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B0FF7FB5D@milexchmb1.mil.tagmclarengroup.com> In this case we have large "home made" heat sinks, I'll put up more pictures when I get the time. You can check other designs at www.renget.se/gallery.html I had a look at your site - looks like these clusters mount the motherboards on a central heatsink, which is hollow uses convection to cool the system. Pretty smart. Let us know when you have pretty pictures from your CFD simulations! The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at mclaren.com Wed Apr 7 07:13:22 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Wed, 7 Apr 2010 15:13:22 +0100 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com><20100407135623.B5309304A01A@bmail00.one.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> sry for another post but i just got an idea. im not sure if you have seen that you tube video of a guy who put his whole setup in a fish tank and was using cooking oil i believe to cool everything. would be interesting to see the 2nd cluster put into a big enough tank and cooled with oil. These systems look to have rotating platter hard drives in them. substitute for solid state drives and you probably have a better chance of liquid cooling. I'd guess having vats of cooking oil in an office environment is a no-no - what oil do radio hams etc. use for cooling dummy loads for high power RF work? The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. 
From eagles051387 at gmail.com Wed Apr 7 07:39:16 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 16:39:16 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: john sry for the mis understanding in regards to the type of oil used i did some searchign and i turned up alot of videos of people doing something similar using mineral oil as the coolant. -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Wed Apr 7 07:41:38 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 7 Apr 2010 07:41:38 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: For electronics cooling.. Mineral oil, usually with a oxidation inhibitor. Shell Diala AX is what I use. For HV, dry and clean is important. If your hardware isn't oil compatible, then a variety of silicones (e.g. Fluorinert) are used (at substantially higher cost). For really high power density, fluorinert in an ebullient cooling (boiling) mode is used (either by reducing the pressure in the vessel or choosing a coolant with a suitable boiling point (e.g. 40-50C)). This is a tricky design, though, because though the bubbles rising helps circulation, you have to worry about film formation, etc. On 4/7/10 7:13 AM, "Hearns, John" wrote: sry for another post but i just got an idea. im not sure if you have seen that you tube video of a guy who put his whole setup in a fish tank and was using cooking oil i believe to cool everything. would be interesting to see the 2nd cluster put into a big enough tank and cooled with oil. These systems look to have rotating platter hard drives in them. substitute for solid state drives and you probably have a better chance of liquid cooling. I'd guess having vats of cooking oil in an office environment is a no-no - what oil do radio hams etc. use for cooling dummy loads for high power RF work? The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From eagles051387 at gmail.com Wed Apr 7 07:59:18 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 16:59:18 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: filming in what sense with the heat though i thought that the heat would prevent the oil from congealing? > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tegner at renget.se Wed Apr 7 10:14:01 2010 From: tegner at renget.se (Jon Tegner) Date: Wed, 07 Apr 2010 19:14:01 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> Message-ID: <4BBCBD59.40203@renget.se> If you used oil, you could build something smaller (at least if you don't count the tank). The 2nd cluster gets sufficient cooling from the natural convection which develops in the channel (no forced convection). The same would hold for water. It would be fun to try, but I think air is more convenient. But even with air you can build small systems. Check our "third HTPC", which would be hard to build much smaller even with water-cooling (or oil for that matter). /jon Jonathan Aquilina wrote: > sry for another post but i just got an idea. im not sure if you have > seen that you tube video of a guy who put his whole setup in a fish > tank and was using cooking oil i believe to cool everything. would be > interesting to see the 2nd cluster put into a big enough tank and > cooled with oil. > > On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina > > wrote: > > how much would something like this go for? > > > On Wed, Apr 7, 2010 at 3:56 PM, Jon Tegner > wrote: > > In this case we have large "home made" heat sinks, I'll put up > more pictures when I get the time. You can check other designs at > > www.renget.se/gallery.html > > In contrast to our "Second cluster" on that site, which is > completely fan less, this has two large fans. By using the > fans one can get adequate cooling in a smaller box (than what > would be possible without fans). > > When I get the time (when?) I intend to do combined cfd/heat > flow simulations of our designs (have a PhD in numerical > analysis). > > Eventually we hope to market this. > > Regards, > > /jon > > > On Apr 7, 2010 15:34 "Jonathan Aquilina" > wrote: > >> is one of you an engineer? how well does the air flow in the >> case. what >> gets air into and out of the case if there are just only 2 >> fans on the >> box? you guys ever consider marketing this product? >> >> >> On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner <> >> wrote: >> >> > We (me and my brother) have been into silent computing, and >> clusters, >> > for quite some time now. We just recently designed and >> built a unit >> > equipped with 4 supermicro boards (H8DMT) and 8 cpus. In >> the actual >> > unit the cpus are Opterons with 6 cores each, but it would >> be easy >> > enough to switch to cpus with 12 cores. >> > >> > >> > >> > >> > The box are of dimensions 40x42x58 cm, so it is reasonably >> small. Two >> > large and silent fans are used to cool the system. >> > >> > >> > >> > >> > For a picture of the unit check >> > >> > >> > >> > http://www.renget.se/bilder/redBox.jpg >> >> > >> > >> > Comments on this? 
>> > >> > >> > /jon >> > >> > _______________________________________________ >> > Beowulf mailing list, >> sponsored by Penguin >> > Computing >> > To change your subscription (digest mode or unsubscribe) visit >> > http://www.beowulf.org/mailman/listinfo/beowulf >> > >> > >> >> >> >> -- >> Jonathan Aquilina > > > > > -- > Jonathan Aquilina > > > > > -- > Jonathan Aquilina > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From eagles051387 at gmail.com Wed Apr 7 10:18:41 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 19:18:41 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBCBD59.40203@renget.se> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <4BBCBD59.40203@renget.se> Message-ID: keep up the good work would love to try one of these systems in a rendering environment. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gerry.creager at tamu.edu Wed Apr 7 10:52:01 2010 From: gerry.creager at tamu.edu (Gerald Creager) Date: Wed, 07 Apr 2010 12:52:01 -0500 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> References: <20100407110806.3AF522C01002@bmail01.one.com><20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBCC641.6010005@tamu.edu> Hearns, John wrote: > sry for another post but i just got an idea. im not sure if you have > seen that you tube video of a guy who put his whole setup in a fish tank > and was using cooking oil i believe to cool everything. would be > interesting to see the 2nd cluster put into a big enough tank and cooled > with oil. > > These systems look to have rotating platter hard drives in them. > substitute for solid state drives and you probably have a better chance > of liquid cooling. > > I'd guess having vats of cooking oil in an office environment is a no-no > - what oil do radio hams etc. use for cooling dummy loads for high power > RF work? I remember when I could get a bucket of transformer oil from the light company, but that was laced with PCB, and is no longer available, responsible or legal. Now, mineral oil. Clean, uncontaminated, mineral oil. Filtered mineral oil. > The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. 
> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager at tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From atp at piskorski.com Wed Apr 7 11:24:41 2010 From: atp at piskorski.com (Andrew Piskorski) Date: Wed, 7 Apr 2010 14:24:41 -0400 Subject: [Beowulf] which 24 port unmanaged GigE switch? In-Reply-To: <20100405223522.GE6153@mail.nih.gov> References: <20100405223522.GE6153@mail.nih.gov> Message-ID: <20100407182441.GB53946@piskorski.com> On Mon, Apr 05, 2010 at 06:35:22PM -0400, Jesse Becker wrote: > Also, as Joe Landman mentioned, Dell PowerConnect switches are just > rebranded SMC units. Decent for the price, but not powerhouses. Well, this place is selling ("refurbished") 48 port SMC 8648T gigabit switches for $300, which seem worth trying at that price, no? Back in January they had them on sale for $200 each: http://www.unityelectronics.com/products/5517/SMC_TigerSwitch_48_Port_10_100_1000_Gigabit_Ethernet_Managed_Layer_2_Switch_with_4_mini_GBIC_Ports_SMC8648T_Refurbished http://www.smc.com/index.cfm?event=viewProduct&cid=8&scid=44&localeCode=EN_USA&pid=1192 -- Andrew Piskorski http://www.piskorski.com/ From deadline at eadline.org Wed Apr 7 11:26:54 2010 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 7 Apr 2010 14:26:54 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100407135623.B5309304A01A@bmail00.one.com> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> Message-ID: <44060.192.168.1.213.1270664814.squirrel@mail.eadline.org> You can also read about some of Jon's clusters here: http://www.clustermonkey.net//content/view/273/33/ -- Doug > In this case we have large "home made" heat sinks, I'll put up more > pictures when I get the time. You can check other designs at > > > > > www.renget.se/gallery.html > > > > > In contrast to our "Second cluster" on that site, which is completely > fan less, this has two large fans. By using the fans one can get > adequate cooling in a smaller box (than what would be possible without > fans). > > > > > When I get the time (when?) I intend to do combined cfd/heat flow > simulations of our designs (have a PhD in numerical analysis). > > > > > Eventually we hope to market this. > > > > > Regards, > > > > > /jon > > > > > > On Apr 7, 2010 15:34 "Jonathan Aquilina" wrote: > >> is one of you an engineer? how well does the air flow in the case. >> what >> gets air into and out of the case if there are just only 2 fans on the >> box? you guys ever consider marketing this product? >> >> >> On Wed, Apr 7, 2010 at 1:08 PM, Jon Tegner <> wrote: >> >> > We (me and my brother) have been into silent computing, and >> > clusters, >> > for quite some time now. We just recently designed and built a unit >> > equipped with 4 supermicro boards (H8DMT) and 8 cpus. In the actual >> > unit the cpus are Opterons with 6 cores each, but it would be easy >> > enough to switch to cpus with 12 cores. >> > >> > >> > >> > >> > The box are of dimensions 40x42x58 cm, so it is reasonably small. >> > Two >> > large and silent fans are used to cool the system. 
>> > >> > >> > >> > >> > For a picture of the unit check >> > >> > >> > >> > >> > >> > >> > Comments on this? >> > >> > >> > /jon >> > >> > _______________________________________________ >> > Beowulf mailing list, sponsored by Penguin >> > Computing >> > To change your subscription (digest mode or unsubscribe) visit >> > >> > >> > >> >> >> >> -- >> Jonathan Aquilina_______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Doug From james.p.lux at jpl.nasa.gov Wed Apr 7 11:39:40 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 7 Apr 2010 11:39:40 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBCC641.6010005@tamu.edu> References: <20100407110806.3AF522C01002@bmail01.one.com><20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBCC641.6010005@tamu.edu> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Gerald Creager > Sent: Wednesday, April 07, 2010 10:52 AM > To: Hearns, John > Cc: beowulf at beowulf.org > Subject: Re: [Beowulf] 96 cores in silent and small enclosure > > Hearns, John wrote: > > > > I'd guess having vats of cooking oil in an office environment is a no-no > > - what oil do radio hams etc. use for cooling dummy loads for high power > > RF work? > > I remember when I could get a bucket of transformer oil from the light > company, but that was laced with PCB, and is no longer available, > responsible or legal. That wasn't oil, anyway.. askarels most likely. Or, used oil that was drained from something that had previously been filled with PCBs, and is now "contaminated". It's not the actual PCBs in the askarel that's the problem, of course, it's the inevitable dioxin (another PCB) trace contaminant that is the issue. Now, mineral oil. Clean, uncontaminated, mineral > oil. Filtered mineral oil. The tesla coil world spends an inordinate amount of time trying to find substitutes for the real thing, which is available for about $5-10/gallon in 5 gallon pails. (the price varies surprisingly, and doesn't always track crude oil/gasoline prices.. mostly because the jobber has some amount of stock, so they tend to charge you what they paid for it, which might have been some months back). If you have an agricultural supply (feed and farm type thing) they sell USP white mineral oil as an animal laxative in gallon bottles. It's not the driest in the world, so I wouldn't use it for HV, but it is clean and filtered. The important thing you want to avoid in non-electrical mineral oils is contamination with Poly Aromatic Hydrocarbons (PAH), which don't cause problems for things like using oil for machining, but do cause health issues for other uses. If the oil ever got hot, it "cracks", forming PAHs. USP oil has very low PAH. Hydraulic fluid is another possibility, but watch the additives that get put in. Something that helps keep your pumps, valves, and cylinders from rusting may not be the best thing for your PC board. 
From lindahl at pbm.com Wed Apr 7 11:47:18 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Wed, 7 Apr 2010 11:47:18 -0700 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: <20100406154548.42509f05.chekh@genomics.upenn.edu> References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> <20100406154548.42509f05.chekh@genomics.upenn.edu> Message-ID: <20100407184718.GB17256@bx9.net> On Tue, Apr 06, 2010 at 03:45:48PM -0400, Alex Chekholko wrote: > I think this is an unusual situation, historically speaking. The price/perf curve has always had a knee, and it is always advancing towards more cores/box. One nice thing about AMD's market share woes is that you can often get their higher-end processors at a discount -- which means you can get more cores for your $$ and choose bigger boxes. -- greg From james.p.lux at jpl.nasa.gov Wed Apr 7 12:22:35 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 7 Apr 2010 12:22:35 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: If you put something hot into a liquid, you have to worry about forming a film of vapor that keeps the liquid from touching the hot thing, and radically reduces the heat transfer. It's all tied up with the turbulence in the liquid, the surface tension of the liquid, etc. Boiling is a really good way to move heat: the heat of vaporization is huge, for a small temperature change, compared to just the liquid's specific heat. But, it's more complex to design. It's used in very high power solid state electronics and in high power vacuum tubes, as well. The key is that the boiling point of the liquid has to be close to the desired operating temperature of the parts being cooled. Various Freons work well. Look up Leidenfrost effect (why LN2 droplets skitter around, or water on a hot pancake griddle).. It's also related to why you can walk across burning coals in bare feet. (the true test of belief in Physics) James Lux, P.E. Task Manager, SOMD Software Defined Radios Flight Communications Systems Section Jet Propulsion Laboratory 4800 Oak Grove Drive, Mail Stop 161-213 Pasadena, CA, 91109 +1(818)354-2075 phone +1(818)393-6875 fax From: Jonathan Aquilina [mailto:eagles051387 at gmail.com] Sent: Wednesday, April 07, 2010 7:59 AM To: Lux, Jim (337C) Cc: Hearns, John; beowulf at beowulf.org Subject: Re: [Beowulf] 96 cores in silent and small enclosure filming in what sense with the heat though i thought that the heat would prevent the oil from congealing? -------------- next part -------------- An HTML attachment was scrubbed... URL: From eagles051387 at gmail.com Wed Apr 7 12:57:07 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 7 Apr 2010 21:57:07 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: then if that is a problem then how does water cooling work? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eagles051387 at gmail.com Wed Apr 7 21:57:23 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Thu, 8 Apr 2010 06:57:23 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: i know there is non conductive water which if it gets on something shouldnt conduct electricity but how safe is a water cooled system? On Thu, Apr 8, 2010 at 12:04 AM, Jack Carrozzo wrote: > Water cooling for computers just uses the water to suck away heat, not > the boiling business (which is, however, very smart). A block from the > processor has a lot of surface area through which the water flows, so > the temperature differential between the water and the block is small > compared to other applications of liquid cooling. Hence no issues. > > -Jack Carrozzo > > On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina > wrote: > > then if that is a problem then how does water cooling work? > > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From prentice at ias.edu Thu Apr 8 06:33:07 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 08 Apr 2010 09:33:07 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBDDB13.4050701@ias.edu> Lux, Jim (337C) wrote: > If you put something hot into a liquid, you have to worry about forming > a film of vapor that keeps the liquid from touching the hot thing, and > radically reduces the heat transfer. It?s all tied up with the > turbulence in the liquid, the surface tension of the liquid, etc. > I'm having flashbacks of my Transport Phenomena class from college. Thanks, Jim! > > Boiling is a really good way to move heat: the heat of vaporization is > huge, for a small temperature change, Technically, the heat of vaporization occurs at zero temperature change. ;) >compared to just the liquid?s > specific heat. But, it?s more complex to design. It?s used in very > high power solid state electronics and in high power vacuum tubes, as > well. The key is that the boiling point of the liquid has to be close > to the desired operating temperature of the parts being cooled. Various > Freons work well. > Look up Leidenfrost effect (why LN2 droplets skitter around, or water on > a hot pancake griddle).. > > It?s also related to why you can walk across burning coals in bare feet. > (the true test of belief in Physics) > Here's another party trick based on this: Fill a cup (preferably a Styrofoam cup for insulation purposes) with liquid nitrogen (LN2) . Then stick your finger in it and pull it out real quick. Even though LN2 is very cold, you won't fell a thing - the heat from your finger causes the LN2 vaporize before you even contact it, creating an insulating layer (film) of nitrogen gas. It's not stable, so if your keep your finger in it for longer than a split second, you WILL get freeze your finger! Of course, this requires you bringing our own tank of LN2 to the party in the first place. 
-- Prentice From prentice at ias.edu Thu Apr 8 06:37:25 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 08 Apr 2010 09:37:25 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBDDC15.8090500@ias.edu> It's only a problem if the temperatures are near the boiling temperature of the liquid. The liquid can vaporize, creating a film of gas between the coolant and object being cooled. Since gases have a low conductivity, the gas acts as an insulator, retarding heat transfer. Jonathan Aquilina wrote: > then if that is a problem then how does water cooling work? > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Prentice From prentice at ias.edu Thu Apr 8 06:48:01 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 08 Apr 2010 09:48:01 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBDDE91.3070004@ias.edu> Jonathan Aquilina wrote: > i know there is non conductive water which if it gets on something > shouldnt conduct electricity but how safe is a water cooled system? Pure water (distilled, deionized water) will not conduct electricity. We consider water to be electrically conductive because of the ions created by dissolved minerals in it, which are almost always present in water. Unless you handle the distilled, deionized water very carefully, odds are pretty good that it will come into contact with something that can dissolve in it to produce ions. This "something" can be metal pipes, or even gases. Polymers, like PEX tubing, should be safe. In a typical water-cooling system, where the water doesn't come into direct contact with the electronic components, its safety depends on the durability of the materials used to contain it. and the liquid-tightness of the connections. For a boiling system like Jim Lux brought up, it's very unsafe, since odds are good it will come into contact with something that will produce dissolved ions in it and make it electrically conductive. Prentice > > On Thu, Apr 8, 2010 at 12:04 AM, Jack Carrozzo > wrote: > > Water cooling for computers just uses the water to suck away heat, not > the boiling business (which is, however, very smart). A block from the > processor has a lot of surface area through which the water flows, so > the temperature differential between the water and the block is small > compared to other applications of liquid cooling. Hence no issues. > > -Jack Carrozzo > > On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina > > wrote: > > then if that is a problem then how does water cooling work? 
> > > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org > sponsored by Penguin Computing > > To change your subscription (digest mode or unsubscribe) visit > > http://www.beowulf.org/mailman/listinfo/beowulf > > > > > > > > > -- > Jonathan Aquilina > From james.p.lux at jpl.nasa.gov Thu Apr 8 06:54:02 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 8 Apr 2010 06:54:02 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: Message-ID: Very pure deionized (DI) water has low conductivity, and is used in some HV apparatus, mostly because of the spectacularly high dielectric constant (80). It's also used in some clever cooling systems where the anode of the tube is at HV, but the pump and radiator is at ground potential. The problem is that you have to keep the water pure, because it's always dissolving whatever it's contacting. The other problem is that liquid containing systems almost always leak. On 4/7/10 9:57 PM, "Jonathan Aquilina" wrote: i know there is non conductive water which if it gets on something shouldnt conduct electricity but how safe is a water cooled system? On Thu, Apr 8, 2010 at 12:04 AM, Jack Carrozzo wrote: Water cooling for computers just uses the water to suck away heat, not the boiling business (which is, however, very smart). A block from the processor has a lot of surface area through which the water flows, so the temperature differential between the water and the block is small compared to other applications of liquid cooling. Hence no issues. -Jack Carrozzo On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina wrote: > then if that is a problem then how does water cooling work? > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Thu Apr 8 07:01:14 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 8 Apr 2010 07:01:14 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBDDC5E.4080609@earlham.edu> Message-ID: The challenge with all liquid cooling schemes is leaking and spils. If you have pumps and tubes, inevitably, something leaks. If you have a tank of coolant, you wind up taking the equipment out of the tank, and it drips. The most annoying thing about oil is that it is really good at wicking up the inside of wire by capillary action. So you either have to arrange for sealed feed throughs or use only bare solid wire where it penetrates the surface. A remarkably small crack or hole will let the oil through. There's also the "thermal pumping" problem. As the oil changes temperature (either from turning the gear on and off, or because of temperature changes of the room) it expands and contracts. If you have a perfectly sealed (hah) container, that means the pressure of the oil will change, tending to push it into small crevices/cracks/along the gap between insulation and conductor in a wire. But, if you vent it, then you have to worry about atmospheric air bringing water into the system as it "breathes" And, if you have a vent, inevitably, the oil will find a way out. Don't get me wrong....liquid cooling is great.. Oil insulation is great. It's just a mess and you need to be ready for it. 
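To put a rough number on that expansion, here is a back-of-the-envelope sketch; the 0.0007 per degree C coefficient is a commonly quoted figure for mineral oil, and the tank volume and temperature swing are made-up illustration values rather than anything measured:

  #!/bin/sh
  # rough volume change of a sealed oil bath over a temperature swing
  awk 'BEGIN {
      beta = 0.0007   # volumetric expansion coefficient, 1/C (assumed for mineral oil)
      vol  = 20       # litres of oil in the tank (illustrative)
      dt   = 30       # temperature swing in C (illustrative)
      printf "expansion: about %.2f litres\n", beta * vol * dt
  }'

Even a modest swing displaces a few tenths of a litre, which has to go somewhere.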
On 4/8/10 6:38 AM, "Kevin Hunter" wrote: At 12:57am -0400 Thu, 08 Apr 2010, Jonathan Aquilina wrote: > i know there is non conductive water which if it gets on > something shouldnt conduct electricity but how safe is a > water cooled system? How safe is it? I can't answer empirically (no experience), but in theory it's just as safe as air. Water is never in contact with any electrically charged object, and never leaves it's tubing channels. Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Thu Apr 8 08:58:54 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 8 Apr 2010 08:58:54 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBDDE91.3070004@ias.edu> References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBDDE91.3070004@ias.edu> Message-ID: > For a boiling system like Jim Lux brought up, it's very unsafe, since > odds are good it will come into contact with something that will produce > dissolved ions in it and make it electrically conductive. > When they use boilers for cooling power vacuum tubes, they don't worry about the conductivity (as much), because they deal with the voltage issues in other ways. The old Eimac "Care and Feeding of Power Grid Tubes" book talks about this and has pictures as well. As I recall, it's online at CPI (what Eimac had become part of, long after they were part of Varian) For lower voltage gear, various halogenated hydrocarbons are used (because you typically want something that boils at a temperature well below 100C... 40-50C is nice). They're all insulators, so from that standpoint it's easier to use. And I suppose we should also talk about "heat pipes" which come in a variety of forms, some of which use evaporation/condensation for transport (others use density gradients). From richard.walsh at comcast.net Thu Apr 8 09:13:21 2010 From: richard.walsh at comcast.net (richard.walsh at comcast.net) Date: Thu, 8 Apr 2010 16:13:21 +0000 (UTC) Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... Message-ID: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> All, What are the approaches and experiences of people interconnecting clusters of more than128 compute nodes with QDR InfiniBand technology? Are people directly connecting to chassis-sized switches? Using multi-tiered approaches which combine 36-port leaf switches? What are your experiences? What products seem to be living up to expectations? I am looking for some real world feedback before making a decision on architecture and vendor. Thanks, rbw -------------- next part -------------- An HTML attachment was scrubbed... URL: From casey at grlug.org Wed Apr 7 07:25:09 2010 From: casey at grlug.org (Casey DuBois) Date: Wed, 7 Apr 2010 10:25:09 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: Hi All, These guys are using dielectric fluid coolant (GreenDEF? coolant). www.grcooling.com http://www.youtube.com/watch?v=-q0sTFX1DFM Is this an option for silent small enclosures? 
Casey DuBois 616-808-6942 casey at grlug.org From akshar.bhosale at gmail.com Wed Apr 7 10:55:50 2010 From: akshar.bhosale at gmail.com (akshar bhosale) Date: Wed, 7 Apr 2010 23:25:50 +0530 Subject: [Beowulf] in pbs submit script, setenv command is not working Message-ID: Hi, we have cluser of 8 nodes and it is rhel 5.2 (64 bit). We have torque and here is my submit script which is #!/bin/csh -f #PBS -l nodes=2:ppn=2 #PBS -r n #PBS -A ourproj #PBS -V #PBS -o output_pvd3.6.txt #PBS -e error_pvd3.6.txt echo PBS JOB id is $PBS_JOBID echo PBS_NODEFILE is $PBS_NODEFILE echo PBS_QUEUE is $PBS_QUEUE setenv NPROCS `cat $PBS_NODEFILE|wc -l` echo "NPOCS is $NPOCS" #/opt/intel/mpi/bin64/mpirun --totalnum=$NPROCS --file=$PBS_NODEFILE --rsh=/usr/bin/ssh -1 --ordered --verbose -l -machinefile $PBS_NODEFILE -np $NPROCS /home/aksharb/helloworld #sleep 100 /bin/hostname cat $PBS_NODEFILE in output file i get PBS JOB id is 1725.server1.gnps.tkl PBS_NODEFILE is /opt/PBS/aux//1725.server1.gnps.tkl PBS_QUEUE is batch NPOCS is y8.gnps.tkl y8.gnps.tkl y7.gnps.tkl y7.gnps.tkl and error file says : /opt/PBS/mom_priv/jobs/1725.server1.gnps.tkl.SC: line 11: setenv: command not found kindly guide me. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jack at crepinc.com Wed Apr 7 15:04:14 2010 From: jack at crepinc.com (Jack Carrozzo) Date: Wed, 7 Apr 2010 18:04:14 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: Water cooling for computers just uses the water to suck away heat, not the boiling business (which is, however, very smart). A block from the processor has a lot of surface area through which the water flows, so the temperature differential between the water and the block is small compared to other applications of liquid cooling. Hence no issues. -Jack Carrozzo On Wed, Apr 7, 2010 at 3:57 PM, Jonathan Aquilina wrote: > then if that is a problem then how does water cooling work? > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > From hunteke at earlham.edu Thu Apr 8 06:38:38 2010 From: hunteke at earlham.edu (Kevin Hunter) Date: Thu, 08 Apr 2010 09:38:38 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBDDC5E.4080609@earlham.edu> At 12:57am -0400 Thu, 08 Apr 2010, Jonathan Aquilina wrote: > i know there is non conductive water which if it gets on > something shouldnt conduct electricity but how safe is a > water cooled system? As regards non-conductive water, you're correct: *pure* water has a very high resistivity, something like 18 M?-cm. (Effectively, not conductive for home-uses.) However, pure water has to be manufactured, and water is also very good at dissolving and dispersing conductive ions. (Sugar with tea, anyone?) So, it's still not smart to play with a toaster in the tub. I have no experience with water cooled systems specifically, but I believe the point is to suck heat from the high-heat components, and not to just willy-nilly douse your entire box in water. For instance, you might replace the standard fan and heatsink on top of your CPU with a waterblock. 
The water would then be pumped through tubing of some kind to the waterblock (on top of the CPU), and back to a cooling radiator of some kind. The water never leaves it's circuit, but still disperses heat from the top of the chip in the socket. How safe is it? I can't answer empirically (no experience), but in theory it's just as safe as air. Water is never in contact with any electrically charged object, and never leaves it's tubing channels. Kevin From lindahl at pbm.com Thu Apr 8 11:14:11 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Thu, 8 Apr 2010 11:14:11 -0700 Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> References: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> Message-ID: <20100408181411.GA28585@bx9.net> On Thu, Apr 08, 2010 at 04:13:21PM +0000, richard.walsh at comcast.net wrote: > > What are the approaches and experiences of people interconnecting > clusters of more than128 compute nodes with QDR InfiniBand technology? > Are people directly connecting to chassis-sized switches? Using multi-tiered > approaches which combine 36-port leaf switches? I would expect everyone to use a chassis at that size, because it's cheaper than having more cables. That was true on day 1 with IB, the only question is "are the switch vendors charging too high of a price for big switches?" > I am looking for some real world feedback before making a decision on > architecture and vendor. Hopefully you're planning on benchmarking your own app -- both the HCAs and the switch silicon have considerably different application- dependent performance characteristics between QLogic and Mellanox silicon. -- greg From Craig.Tierney at noaa.gov Thu Apr 8 11:42:49 2010 From: Craig.Tierney at noaa.gov (Craig Tierney) Date: Thu, 08 Apr 2010 12:42:49 -0600 Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> References: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> Message-ID: <4BBE23A9.408@noaa.gov> richard.walsh at comcast.net wrote: > > All, > > > What are the approaches and experiences of people interconnecting > clusters of more than128 compute nodes with QDR InfiniBand technology? > Are people directly connecting to chassis-sized switches? Using multi-tiered > approaches which combine 36-port leaf switches? What are your experiences? > What products seem to be living up to expectations? > > > I am looking for some real world feedback before making a decision on > architecture and vendor. > > We have been telling our vendors to design a multi-level tree using 36 port switches that provides approximately 70% bisection bandwidth. On a 448 node Nehalem cluster, this has worked well (weather, hurricane, and some climate modeling). This design (15 up/21 down) allows us to scale the system to 714 nodes. 
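Spelling out the arithmetic behind those numbers as a sketch (it assumes 36-port switches at both levels, one uplink from every edge switch to each of the 15 spines, and two ports per spine held back for IO uplinks, which is where the 34-port figure mentioned further down comes from):

  #!/bin/sh
  # port arithmetic for a 21-down / 15-up two-level tree of 36-port switches
  down=21; up=15
  echo "ports used per edge switch : $((down + up))"                    # 36
  echo "bisection ratio            : $up:$down (~70%)"
  spine_ports_for_edges=$((36 - 2))                                     # 2 per spine reserved for IO
  echo "max edge switches          : $spine_ports_for_edges"
  echo "max nodes                  : $((spine_ports_for_edges * down))" # 34 * 21 = 714

With one cable from every edge switch to every spine the spine count is pinned at 15, and the edge-switch count is capped by the 34 usable ports on each spine, hence the 714-node ceiling.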
Craig > Thanks, > > > rbw > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From djholm at fnal.gov Thu Apr 8 11:44:19 2010 From: djholm at fnal.gov (Don Holmgren) Date: Thu, 08 Apr 2010 13:44:19 -0500 (CDT) Subject: [Beowulf] in pbs submit script, setenv command is not working In-Reply-To: References: Message-ID: You should add, as the second line, #PBS -S /bin/csh else PBS will (I think) use your default login shell, which I assume is not csh (explaining the "setenv" error message). Also, you have a typo. You should have echo "NPROCS is $NPROCS" instead of echo "NPOCS is $NPOCS" Don Holmgren Fermilab On Wed, 7 Apr 2010, akshar bhosale wrote: > Hi, > we have cluser of 8 nodes and it is rhel 5.2 (64 bit). We have torque and > here is my submit script which is > #!/bin/csh -f > #PBS -l nodes=2:ppn=2 > #PBS -r n > #PBS -A ourproj > #PBS -V > #PBS -o output_pvd3.6.txt > #PBS -e error_pvd3.6.txt > echo PBS JOB id is $PBS_JOBID > echo PBS_NODEFILE is $PBS_NODEFILE > echo PBS_QUEUE is $PBS_QUEUE > setenv NPROCS `cat $PBS_NODEFILE|wc -l` > echo "NPOCS is $NPOCS" > #/opt/intel/mpi/bin64/mpirun --totalnum=$NPROCS --file=$PBS_NODEFILE > --rsh=/usr/bin/ssh -1 --ordered --verbose -l -machinefile $PBS_NODEFILE > -np $NPROCS /home/aksharb/helloworld > #sleep 100 > /bin/hostname > cat $PBS_NODEFILE > > > in output file i get > > PBS JOB id is 1725.server1.gnps.tkl > PBS_NODEFILE is /opt/PBS/aux//1725.server1.gnps.tkl > PBS_QUEUE is batch > NPOCS is > y8.gnps.tkl > y8.gnps.tkl > y7.gnps.tkl > y7.gnps.tkl > > and error file says : > /opt/PBS/mom_priv/jobs/1725.server1.gnps.tkl.SC: line 11: setenv: command > not found > > kindly guide me. > From prentice at ias.edu Thu Apr 8 12:48:10 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 08 Apr 2010 15:48:10 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4BBE32FA.9090203@ias.edu> Casey DuBois wrote: > Hi All, > > These guys are using dielectric fluid coolant (GreenDEF? coolant). > www.grcooling.com > > http://www.youtube.com/watch?v=-q0sTFX1DFM > Those guys were at SC09. There is one reason above all else why oil cooling like this will never work it's just too messy. If you touched anything at that booth, you were almost guaranteed to get oil on your hands. There were plenty of paper towels being handed out at that booth. And imagine taking a system out of the oil to replace components. Can you imagine trying to replace a slippery hard drive or CPU? Think of all those little damn screws that you always drop into the case when trying to reattach something. Now add some oily slipperyness. 
-- Prentice From prentice at ias.edu Thu Apr 8 12:52:02 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 08 Apr 2010 15:52:02 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBDDE91.3070004@ias.edu> Message-ID: <4BBE33E2.3030803@ias.edu> Lux, Jim (337C) wrote: > And I suppose we should also talk about "heat pipes" which come in a variety of forms, some of which use evaporation/condensation for transport (others use density gradients). > density gradients = natural convection ? From jmdavis1 at vcu.edu Thu Apr 8 12:53:09 2010 From: jmdavis1 at vcu.edu (Mike Davis) Date: Thu, 08 Apr 2010 15:53:09 -0400 Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <4BBE23A9.408@noaa.gov> References: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> <4BBE23A9.408@noaa.gov> Message-ID: <4BBE3425.9010306@vcu.edu> We are DDR but we use a flat switching model for our Infiniband cluster. Thus far most work is MD and QC and scaling has been good. From richard.walsh at comcast.net Thu Apr 8 13:29:39 2010 From: richard.walsh at comcast.net (richard.walsh at comcast.net) Date: Thu, 8 Apr 2010 20:29:39 +0000 (UTC) Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <20100408181411.GA28585@bx9.net> Message-ID: <1399092641.9512091270758579118.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> On Thursday, April 8, 2010 2:14:11 PM Greg Lindahl wrote: >> What are the approaches and experiences of people interconnecting >> clusters of more than128 compute nodes with QDR InfiniBand technology? >> Are people directly connecting to chassis-sized switches? Using multi-tiered >> approaches which combine 36-port leaf switches? > >I would expect everyone to use a chassis at that size, because it's cheaper >than having more cables. That was true on day 1 with IB, the only question is >"are the switch vendors charging too high of a price for big switches?" Hey Greg, I think my target is around 192 compute nodes, with room for a head node(s), and ports to a Lustre file server. So, 216 ports looks like a reasonable number to me (6 x 36). The price for an integrated chassis model solution should not exceed the price for a multi-tiered solution using 36-port (or some other switch smaller than 216) plus the cabling costs. Reliability and labor would also have to factored in with an advantage going to the chassis I assume based also on fewer cables? Looks like the chassis options are between $375 and $400 a port, while the 36 port options are running at about $175 to $200 a port (but you need more ports and cables). >> I am looking for some real world feedback before making a decision on >> architecture and vendor. > >Hopefully you're planning on benchmarking your own app -- both the >HCAs and the switch silicon have considerably different application- >dependent performance characteristics between QLogic and Mellanox >silicon. Yes, I assume that people would also recommend matching NIC and switch hardware. Thanks for your input ... rbw _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From james.p.lux at jpl.nasa.gov Thu Apr 8 13:49:59 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 8 Apr 2010 13:49:59 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBE33E2.3030803@ias.edu> References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBDDE91.3070004@ias.edu> <4BBE33E2.3030803@ias.edu> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Prentice Bisbal > Sent: Thursday, April 08, 2010 12:52 PM > To: Beowulf Mailing List > Subject: Re: [Beowulf] 96 cores in silent and small enclosure > > > > Lux, Jim (337C) wrote: > > > And I suppose we should also talk about "heat pipes" which come in a variety of forms, some of which > use evaporation/condensation for transport (others use density gradients). > > > > density gradients = natural convection ? Thermosiphoning.. From james.p.lux at jpl.nasa.gov Thu Apr 8 13:57:31 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 8 Apr 2010 13:57:31 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBE32FA.9090203@ias.edu> References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBE32FA.9090203@ias.edu> Message-ID: > > > > Those guys were at SC09. There is one reason above all else why oil > cooling like this will never work it's just too messy. If you touched > anything at that booth, you were almost guaranteed to get oil on your > hands. There were plenty of paper towels being handed out at that booth. This is why the halogenated hydrocarbons (Freons) are nice.. they evaporate quickly. (at the expense of ozone layer depletion.. but hey, with the big cluster being cooled by the Freon, you'll be able to more accurately predict the effect, right?) > > And imagine taking a system out of the oil to replace components. Can > you imagine trying to replace a slippery hard drive or CPU? Think of all > those little damn screws that you always drop into the case when trying > to reattach something. Now add some oily slipperyness. With oil, you can hose it off with kerosene.. or better yet hexane or pentane, which is a lot less viscous and dissolves the oil nicely. Then the hydrocarbon evaporates. Still a mess, though, but better. (we won't get into other industrial issues that arise.. kids, don't try this at home) From richard.walsh at comcast.net Thu Apr 8 14:30:31 2010 From: richard.walsh at comcast.net (richard.walsh at comcast.net) Date: Thu, 8 Apr 2010 21:30:31 +0000 (UTC) Subject: Fwd: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <260035600.9535281270761554695.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> Message-ID: <828000640.9540691270762230984.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> ----- Forwarded Message ----- From: "richard walsh" To: "Craig Tierney" Sent: Thursday, April 8, 2010 5:19:14 PM GMT -05:00 US/Canada Eastern Subject: Re: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... On Thursday, April 8, 2010 2:42:49 PM Craig Tierney wrote: >We have been telling our vendors to design a multi-level tree using >36 port switches that provides approximately 70% bisection bandwidth. >On a 448 node Nehalem cluster, this has worked well (weather, hurricane, and >some climate modeling). 
This design (15 up/21 down) allows us to >scale the system to 714 nodes. Hey Craig, Thanks for the information. So are you driven mostly by the need for incremental expandability with this design, or do you disagree with Greg and think that the cost is as good or better than a chassis based approach? What about reliability (assuming the vendor is putting it together for you) and maintenance headaches? Not so bad? What kind of cabling are you using? Trying to do the math on the design ... for the 448 nodes you would need 22 switches for the first tier (22 * 21 = 462 down). That gives you (15 * 22 = 330 uplinks), so you need at least 10 switches in the second tier (10 * 36 = 360) which leaves you some spare ports for other things. Am I getting this right? Could you lay out the design in a bit more detail? Did you consider building things from medium size switches (say 108 port models)? Are you paying a premium for incremental expandability or not? How many ports are you using for your file server? Our system is likely to come in at 192 nodes with some additional ports for file server connection. I would like to compare the cost of a 216 port switch to your 15/21 design using 36 port switches. Thanks much, rbw > Thanks, > > > rbw > > > ------------------------------------------------------------------------ > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Craig.Tierney at noaa.gov Thu Apr 8 15:12:26 2010 From: Craig.Tierney at noaa.gov (Craig Tierney) Date: Thu, 08 Apr 2010 16:12:26 -0600 Subject: Fwd: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <828000640.9540691270762230984.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> References: <828000640.9540691270762230984.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> Message-ID: <4BBE54CA.9030208@noaa.gov> richard.walsh at comcast.net wrote: > > On Thursday, April 8, 2010 2:42:49 PM Craig Tierney wrote: > > > > >> >> We have been telling our vendors to design a multi-level tree using >> >> 36 port switches that provides approximately 70% bisection bandwidth. >> >> On a 448 node Nehalem cluster, this has worked well (weather, hurricane, and >> >> some climate modeling). This design (15 up/21 down) allows us to >> >> scale the system to 714 nodes. > > > > > > Hey Craig, > > > > > > Thanks for the information. So are you driven mostly by the need > > for incremental expandability with this design, or do you disagree > > with Greg and think that the cost is as good or better than a chassis > > based approach? What about reliability (assuming the vendor is > > putting it together for you) and maintenance headaches? Not so > > bad? What kind of cabling are you using? > > It was cheaper at the time by a lot. The vendor did not put it together for us. However, we have had the same team doing this stuff (building clusters) for 10 years. So we tell vendors what to do as we have all the value-add we need. That will end sometime, but it works for now. We have had no maintenance headaches. the reliability is fine. We are using copper QDR cables for the short runs and go to fibre for some of them. I can get specifics on the cable manufacturers if you need it. > > > > Trying to do the math on the design ... 
for the 448 nodes you would > > need 22 switches for the first tier (22 * 21 = 462 down). That gives > > you (15 * 22 = 330 uplinks), so you need at least 10 switches in the > > second tier (10 * 36 = 360) which leaves you some spare ports for > > other things. Am I getting this right? Could you lay out the design > > in a bit more detail? Did you consider building things from medium > > size switches (say 108 port models)? Are you paying a premium > > for incremental expandability or not? How many ports are you using > > for your file server? We have 7 racks of compute nodes, each with 64 nodes. Each rack has three 36 port switches. 21 nodes plug into each switch, with the last one plugging into to a switch in the main IB switch rack. We run 15 cables from each of the node switches to the spines. For a full tree that will lead us to having 34 ports used. The other two ports (fromt the 15 spine switches) have cables that run up to a higher tier for IO. As far as the IO goes, you need visualize that we have 4 clusters. Three of them (360 node, 252 node, 448 node) have two levels to the tree. The last is a GPU cluster with 16 nodes. All of these systems connect up to another level of IB switches (small-ones, not large ones). Our filesystems plug into this tree as well. We used to have Rapidscale (ugh), but now we have 3 DDN/Lustre solutions and 1 Panasas solution. The Rapidscale is repurposed for Lustre as well for testing, and so in aggregate we have about 30 GB/s of IO across all the systems. We consider every technical configuration that can save us money. There was no design that used larger switches as building blocks that would reduce the price. We paid extra for the expandability, but it still wasn't as much as buying the big switches. Yes, the 2 cables from each spine is overkill for performance. The designer is planning on not using 2 from each switch next time. > > > > > > Our system is likely to come in at 192 nodes with some additional > > ports for file server connection. I would like to compare the cost > > of a 216 port switch to your 15/21 design using 36 port switches. > > > > So if you did a design like ours, you would have 4 racks. Three would be for compute nodes (if you use the twin type supermicro solution or similar) each with 3 36-port switches. The fourth rack would be for the IB switches and other equipment. That system is small enough that you shouldn't need any fibre. So you would have 9 switches in the racks, and 5 spine switches (3 cables from each rack switch). Each spine would use 27 ports to the compute, and you would have 9 extra (overkill) for your IO system. Total parts: 14 36 port switches 36*15 cables, but all copper. Cost doesn't change much by length. Additional cables to connect IO system. If you find that is cheaper than a single switch, please let me know. Who sells a 216 port switch? Are you looking at the larger Voltaire where you install a number of line boards? 
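For anyone wanting to redo this leaf/spine arithmetic for another node count, a minimal sketch of the bookkeeping follows (Python). It only reproduces the 21-down/15-up split described above; real layouts (like the 9-leaf/5-spine one just described) get rounded to rack boundaries and IO ports, cable and labor costs are ignored, and any cost comparison against a chassis is very sensitive to the per-port quotes discussed earlier, so treat the output as a starting point rather than a design.

# Two-level fat tree built from 36-port switches, using the 21-down/15-up
# split described above (~70% bisection bandwidth).  To compare cost against
# a chassis, multiply total ports by the per-port quotes mentioned earlier
# in the thread and add cables -- the answer depends heavily on those quotes.

def two_level_tree(nodes, leaf_ports=36, down_per_leaf=21, up_per_leaf=15):
    leaves = (nodes + down_per_leaf - 1) // down_per_leaf   # ceiling division
    uplinks = leaves * up_per_leaf                          # leaf-to-spine cables
    spines = (uplinks + leaf_ports - 1) // leaf_ports
    oversub = float(down_per_leaf) / up_per_leaf            # ~1.4:1
    return leaves, spines, uplinks, oversub

if __name__ == "__main__":
    for nodes in (192, 448):
        leaves, spines, uplinks, oversub = two_level_tree(nodes)
        print("%d nodes: %d leaf + %d spine switches, %d uplink cables, "
              "%.2f:1 oversubscription"
              % (nodes, leaves, spines, uplinks, oversub))

For 448 nodes this reproduces the 22-leaf / 10-spine / 330-cable figures worked out above; for 192 nodes it gives 10 leaves and 5 spines, slightly more conservative than the 9-leaf packing described in this message.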
Craig From gerry.creager at tamu.edu Thu Apr 8 20:06:27 2010 From: gerry.creager at tamu.edu (Gerry Creager) Date: Thu, 08 Apr 2010 22:06:27 -0500 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBDDB13.4050701@ias.edu> References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBDDB13.4050701@ias.edu> Message-ID: <4BBE99B3.5090108@tamu.edu> Prentice Bisbal wrote: > Lux, Jim (337C) wrote: >> If you put something hot into a liquid, you have to worry about forming >> a film of vapor that keeps the liquid from touching the hot thing, and >> radically reduces the heat transfer. It?s all tied up with the >> turbulence in the liquid, the surface tension of the liquid, etc. >> > > I'm having flashbacks of my Transport Phenomena class from college. > Thanks, Jim! There is treatment available... >> Boiling is a really good way to move heat: the heat of vaporization is >> huge, for a small temperature change, > > Technically, the heat of vaporization occurs at zero temperature change. ;) > >> compared to just the liquid?s >> specific heat. But, it?s more complex to design. It?s used in very >> high power solid state electronics and in high power vacuum tubes, as >> well. The key is that the boiling point of the liquid has to be close >> to the desired operating temperature of the parts being cooled. Various >> Freons work well. > > >> Look up Leidenfrost effect (why LN2 droplets skitter around, or water on >> a hot pancake griddle).. >> >> It?s also related to why you can walk across burning coals in bare feet. >> (the true test of belief in Physics) >> > > Here's another party trick based on this: Fill a cup (preferably a > Styrofoam cup for insulation purposes) with liquid nitrogen (LN2) . Then > stick your finger in it and pull it out real quick. Even though LN2 is > very cold, you won't fell a thing - the heat from your finger causes the > LN2 vaporize before you even contact it, creating an insulating layer > (film) of nitrogen gas. It's not stable, so if your keep your finger in > it for longer than a split second, you WILL get freeze your finger! > > Of course, this requires you bringing our own tank of LN2 to the party > in the first place. > From eagles051387 at gmail.com Thu Apr 8 22:31:07 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Fri, 9 Apr 2010 07:31:07 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BBE99B3.5090108@tamu.edu> References: <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBDDB13.4050701@ias.edu> <4BBE99B3.5090108@tamu.edu> Message-ID: thats one thing that really puts me off of water cooling as was mentioned is springing a leak. also would distilled water collected from dehumidifiers be non conductive as well? -------------- next part -------------- An HTML attachment was scrubbed... URL: From cap at nsc.liu.se Fri Apr 9 02:16:39 2010 From: cap at nsc.liu.se (Peter Kjellstrom) Date: Fri, 9 Apr 2010 10:16:39 +0100 Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... 
In-Reply-To: <20100408181411.GA28585@bx9.net> References: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> <20100408181411.GA28585@bx9.net> Message-ID: <201004091116.44184.cap@nsc.liu.se> On Thursday 08 April 2010, Greg Lindahl wrote: > On Thu, Apr 08, 2010 at 04:13:21PM +0000, richard.walsh at comcast.net wrote: > > What are the approaches and experiences of people interconnecting > > clusters of more than128 compute nodes with QDR InfiniBand technology? > > Are people directly connecting to chassis-sized switches? Using > > multi-tiered approaches which combine 36-port leaf switches? > > I would expect everyone to use a chassis at that size, because it's cheaper > than having more cables. That was true on day 1 with IB, the only question > is "are the switch vendors charging too high of a price for big switches?" Recently we've (swedish academic centre) got offers using 1U 36-port switches not chassis from both Voltaire and Qlogic reason given: lower cost. So from our point of view, yes, "switch vendors [are] charging too high of a price for big switches" :-) One "pro" for many 1U switches compared to a chassi is that it gives you more topological flexibility. For example, you can build a 4:1 over subscribed fat-tree and that will obviously be cheaper than a chassi (even if they were more reasonably priced). /Peter -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From tom.ammon at utah.edu Fri Apr 9 11:04:47 2010 From: tom.ammon at utah.edu (Tom Ammon) Date: Fri, 09 Apr 2010 12:04:47 -0600 Subject: [Beowulf] QDR InfiniBand interconnect architectures ... approaches ... In-Reply-To: <201004091116.44184.cap@nsc.liu.se> References: <2134167717.9394181270743201960.JavaMail.root@sz0135a.emeryville.ca.mail.comcast.net> <20100408181411.GA28585@bx9.net> <201004091116.44184.cap@nsc.liu.se> Message-ID: <4BBF6C3F.7020602@utah.edu> Another thing to remember with chassis switches is that you can also build them in an oversubscribed model by removing spine cards. Most chassis' have at least 3 spine modules so you lose some granularity in oversubscription, but you can still cut costs. You don't have to go with fully nonblocking in a chassis if you want to save money. Tom On 04/09/2010 03:16 AM, Peter Kjellstrom wrote: > On Thursday 08 April 2010, Greg Lindahl wrote: > >> On Thu, Apr 08, 2010 at 04:13:21PM +0000, richard.walsh at comcast.net wrote: >> >>> What are the approaches and experiences of people interconnecting >>> clusters of more than128 compute nodes with QDR InfiniBand technology? >>> Are people directly connecting to chassis-sized switches? Using >>> multi-tiered approaches which combine 36-port leaf switches? >>> >> I would expect everyone to use a chassis at that size, because it's cheaper >> than having more cables. That was true on day 1 with IB, the only question >> is "are the switch vendors charging too high of a price for big switches?" >> > Recently we've (swedish academic centre) got offers using 1U 36-port switches > not chassis from both Voltaire and Qlogic reason given: lower cost. So from > our point of view, yes, "switch vendors [are] charging too high of a price > for big switches" :-) > > One "pro" for many 1U switches compared to a chassi is that it gives you more > topological flexibility. 
For example, you can build a 4:1 over subscribed > fat-tree and that will obviously be cheaper than a chassi (even if they were > more reasonably priced). > > /Peter > -- -------------------------------------------------------------------- Tom Ammon Network Engineer Office: 801.587.0976 Mobile: 801.674.9273 Center for High Performance Computing University of Utah http://www.chpc.utah.edu From mathog at caltech.edu Fri Apr 9 12:43:03 2010 From: mathog at caltech.edu (David Mathog) Date: Fri, 09 Apr 2010 12:43:03 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure Message-ID: Jonathan Aquilina wrote: >also would distilled water collected from dehumidifiers be > non conductive as well? Well "non-conductive" is a relative term, it won't be as conductive as ocean water, nor as non-conductive as you want for this application. Any type of ions in the fluid will drive the conductivity up, and if all you are doing is precipitating water out of the air with a typical commercial A/C component it will not be very clean. Labs that need really clean water usually distill it multiple times or put it through a chain of filters. Some of these systems have a resistivity meter at the end, as that's an easy way to check that the water is as pure as is needed. We had a system like that when I was in grad school, and the water just sitting in it would increase in conductivity over time as ions leached out of the internal parts. So when it was turned on the first part of the flow would be quite conductive, and it would be directed into the sink. Only when the meter showed the water was clean enough would water be collected. Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From tegner at renget.se Sun Apr 11 12:19:41 2010 From: tegner at renget.se (Jon Tegner) Date: Sun, 11 Apr 2010 21:19:41 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> Message-ID: <4BC220CD.1040709@renget.se> Have done some preliminary tests on the system. Indicates a CPU temperature of 60-65 C after half an hour (will do longer test soon). Have a few questions: * How high cpu temperatures are acceptable (our cluster is built on 6 core AMD opterons)? I know life span is reduced if temperature is high, but due to performance reasons life span of a CPU is pretty short anyway. * I used lm-sensors to check the temp, how accurate is that? * There are two fans in the system, noctua NF-P14 FLX www.noctua.at/main.php?show=productview&products_id=33&lng=en&set=1 according to the specifications their acoustical noise is 19.6 dB. Are there some simple guidelines of how the noise "adds up", i.e. what would be the sound level of two fans, or three? * Would there be a market potential for a system like this? I naturally tend to think this is a very "cool" computer, but maybe that's just me? After some optimization (and a bit of CFD) I'm sure I could make it a bit smaller. Regards, /jon From james.p.lux at jpl.nasa.gov Sun Apr 11 17:13:40 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Sun, 11 Apr 2010 17:13:40 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC220CD.1040709@renget.se> Message-ID: * There are two fans in the system, noctua NF-P14 FLX www.noctua.at/main.php?show=productview&products_id=33&lng=en&set=1 according to the specifications their acoustical noise is 19.6 dB. Are there some simple guidelines of how the noise "adds up", i.e. 
what would be the sound level of two fans, or three? There's no real way to figure it out. Sure, in theory, twice the noise power means 3dB increase, but fan noise is a funny thing. Some of the noise might be just from air rushing through your equipment. There's also the difference between "blade noise" which is tied to the rotation rate (and harmonics), general mechanical noise, which is broadband, and not so "noticeable". The "rating" is in a special test jig, and not real representative of what a particular fan will do in your particular installation. All it really tells you is that a fan rated at 19dB is noticeably quieter than a fan rated at 29dB, and a whole lot quieter than a fan rated at 40dB. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hahn at mcmaster.ca Sun Apr 11 19:59:04 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Sun, 11 Apr 2010 22:59:04 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC220CD.1040709@renget.se> References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> Message-ID: > Have done some preliminary tests on the system. Indicates a CPU temperature > of 60-65 C after half an hour (will do longer test soon). Have a few that's pretty hot. some servers will shutdown at 65 (our DL145g2's, for instance). of course, the metric is poorly defined: is that a thermister under the CPU, or a sensor on the die itself? > * How high cpu temperatures are acceptable (our cluster is built on 6 core > AMD opterons)? well, you can look up the max operating spec for your particular chips. for instance, http://products.amd.com/en-us/OpteronCPUResult.aspx shows that OS8439YDS6DGN includes chips rated 55-71. (there must be some further package marking to determine which temp spec...) > I know life span is reduced if temperature is high, but due to > performance reasons life span of a CPU is pretty short anyway. if you operate the chip within spec, you should expect the lifespan to be plenty long (basically indefinite, but let's say 10 years...) > * I used lm-sensors to check the temp, how accurate is that? it's just reporting registers; that is not to say that lm-sensors is necessarily interpreting them correctly. otoh, lm_sensors appears to be willing to offer some metadata, as well (critical temp settings.) > * Would there be a market potential for a system like this? I naturally tend the more specialized the product, the smaller the market. there are lots of mainstream workstations which are fairly quiet. I've even seen some small deskside clusters that claimed to be quiet. personally I don't think it makes much sense - I'd rather use an arbitrarily-noisy cluster from a quiet and wimpy desktop. From james.p.lux at jpl.nasa.gov Sun Apr 11 20:58:38 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Sun, 11 Apr 2010 20:58:38 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: Message-ID: On 4/11/10 7:59 PM, "Mark Hahn" wrote: > >> * How high cpu temperatures are acceptable (our cluster is built on 6 core >> AMD opterons)? > > well, you can look up the max operating spec for your particular chips. > for instance, http://products.amd.com/en-us/OpteronCPUResult.aspx > shows that OS8439YDS6DGN includes chips rated 55-71. (there must be some > further package marking to determine which temp spec...) > I couldn't find the datasheet in a few seconds of casual clicking, BUT... The temperature might be related to the clock rate you're running at... 
A faster clock rate or higher dissipation power might have a lower temperature limit (or might not).. For instance, if the limit is the junction temperature, there's some thermal resistance between the reference junction and the measurement point, so if the chip is dissipating more, the delta T between limiting point and measurement point is greater. The limits might also have to do with timing constraints. The timing margins of most semiconductor circuits change pretty substantially with temperature, and what works at a given speed at one temperature might not work hotter or cooler. (and a lot of times, those limits might be determined empirically... They test a bunch of cases, and that's what gets published in the data sheet) There's also the whole "instruction stream" effect on the thermal properties. An instantaneous dissipation change of 10:1 isn't unusual, especially if you have onchip cache and pipelining. > > >> I know life span is reduced if temperature is high, but due to >> performance reasons life span of a CPU is pretty short anyway. > > if you operate the chip within spec, you should expect the lifespan > to be plenty long (basically indefinite, but let's say 10 years...) Maybe, maybe not. The chip life generally follows Arrhenius rule (roughly halving life for 10C rise), but it's hard to know what the "rated" life is, and whether the exponent is the same. And, of course, you're probably not running the thing at max junction temp all the time. When they test chips for life, they do accelerated aging testing.. They do some examples (based on the packaging and fab process and experience) to figure out a scaling law, then run them really hot, to get "effective" rates of aging that are very high (so you can get years of "life" in a month or so). But it's really an art, and sort of a crap shoot anyway. There have been lots of cases where things didn't follow the rules, and unexpected things happened. >> * Would there be a market potential for a system like this? I naturally tend > > the more specialized the product, the smaller the market. there are lots > of mainstream workstations which are fairly quiet. I've even seen some > small deskside clusters that claimed to be quiet. personally I don't > think it makes much sense - I'd rather use an arbitrarily-noisy cluster > from a quiet and wimpy desktop. > The usual argument for deskside clusters is that they are under your personal control, and you don't have to justify your use (or non-use) of them at any given time. They're "personal" as opposed to "corporate resource". From hahn at mcmaster.ca Mon Apr 12 06:48:28 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Mon, 12 Apr 2010 09:48:28 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: Message-ID: > For instance, if the limit is the junction temperature, there's some thermal > resistance between the reference junction and the measurement point, so if > the chip is dissipating more, the delta T between limiting point and > measurement point is greater. and therefore the published max operating temp would be lower. > work hotter or cooler. (and a lot of times, those limits might be > determined empirically... They test a bunch of cases, and that's what gets > published in the data sheet) no, I believe it's standard practice to characterize individual chips as they're produced ("binning"), with specific markings (clock, temp range) to communicate the results. 
http://support.amd.com/us/Embedded_TechDocs/43374.pdf has a number of tables that, for instance, define a thermal profile ("V") which requires a Tcase max of 64C at 165W (! though there aren't actually any models listed in the doc rated to dissipate that much). all the profiles listed have a tcase-max of between 64 and 86C. >> if you operate the chip within spec, you should expect the lifespan >> to be plenty long (basically indefinite, but let's say 10 years...) > > Maybe, maybe not. The chip life generally follows Arrhenius rule (roughly > halving life for 10C rise), but it's hard to know what the "rated" life is, hmm, I was assuming a lifetime warranty, but indeed, the terms are 3 years. it would be surprising if you couldn't operate the chip within its thermal spec, continuously, for 3 years, with low failure rates... From rigved.sharma123 at gmail.com Mon Apr 12 08:19:26 2010 From: rigved.sharma123 at gmail.com (rigved sharma) Date: Mon, 12 Apr 2010 20:49:26 +0530 Subject: [Beowulf] in pbs submit script, setenv command is not working In-Reply-To: References: Message-ID: thanksa lot its working On Fri, Apr 9, 2010 at 12:14 AM, Don Holmgren wrote: > > You should add, as the second line, > > #PBS -S /bin/csh > > else PBS will (I think) use your default login shell, which I assume is > not csh (explaining the "setenv" error message). > > Also, you have a typo. You should have > > echo "NPROCS is $NPROCS" > instead of > echo "NPOCS is $NPOCS" > > Don Holmgren > Fermilab > > > > > > On Wed, 7 Apr 2010, akshar bhosale wrote: > > Hi, >> we have cluser of 8 nodes and it is rhel 5.2 (64 bit). We have torque and >> here is my submit script which is >> #!/bin/csh -f >> #PBS -l nodes=2:ppn=2 >> #PBS -r n >> #PBS -A ourproj >> #PBS -V >> #PBS -o output_pvd3.6.txt >> #PBS -e error_pvd3.6.txt >> echo PBS JOB id is $PBS_JOBID >> echo PBS_NODEFILE is $PBS_NODEFILE >> echo PBS_QUEUE is $PBS_QUEUE >> setenv NPROCS `cat $PBS_NODEFILE|wc -l` >> echo "NPOCS is $NPOCS" >> #/opt/intel/mpi/bin64/mpirun --totalnum=$NPROCS --file=$PBS_NODEFILE >> --rsh=/usr/bin/ssh -1 --ordered --verbose -l -machinefile $PBS_NODEFILE >> -np $NPROCS /home/aksharb/helloworld >> #sleep 100 >> /bin/hostname >> cat $PBS_NODEFILE >> >> >> in output file i get >> >> PBS JOB id is 1725.server1.gnps.tkl >> PBS_NODEFILE is /opt/PBS/aux//1725.server1.gnps.tkl >> PBS_QUEUE is batch >> NPOCS is >> y8.gnps.tkl >> y8.gnps.tkl >> y7.gnps.tkl >> y7.gnps.tkl >> >> and error file says : >> /opt/PBS/mom_priv/jobs/1725.server1.gnps.tkl.SC: line 11: setenv: command >> not found >> >> kindly guide me. >> >> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From prentice at ias.edu Mon Apr 12 10:39:15 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Mon, 12 Apr 2010 13:39:15 -0400 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <20100407135623.B5309304A01A@bmail00.one.com> <68A57CCFD4005646957BD2D18E60667B0FF7FB7B@milexchmb1.mil.tagmclarengroup.com> <4BBE32FA.9090203@ias.edu> Message-ID: <4BC35AC3.2040605@ias.edu> Lux, Jim (337C) wrote: >> Those guys were at SC09. There is one reason above all else why oil >> cooling like this will never work it's just too messy. 
If you touched >> anything at that booth, you were almost guaranteed to get oil on your >> hands. There were plenty of paper towels being handed out at that booth. > > This is why the halogenated hydrocarbons (Freons) are nice.. they evaporate quickly. (at the expense of ozone layer depletion.. but hey, with the big cluster being cooled by the Freon, you'll be able to more accurately predict the effect, right?) > > > >> And imagine taking a system out of the oil to replace components. Can >> you imagine trying to replace a slippery hard drive or CPU? Think of all >> those little damn screws that you always drop into the case when trying >> to reattach something. Now add some oily slipperyness. > > With oil, you can hose it off with kerosene.. or better yet hexane or pentane, which is a lot less viscous and dissolves the oil nicely. Then the hydrocarbon evaporates. Still a mess, though, but better. (we won't get into other industrial issues that arise.. kids, don't try this at home) > In both cases, we replace one environmental problem with another. That's what I call progress. Let's go all the way and use Benzene or Carbon Tetrachloride to remove the oil so we can some health risks, too. -- Prentice From tegner at renget.se Mon Apr 12 11:01:54 2010 From: tegner at renget.se (Jon Tegner) Date: Mon, 12 Apr 2010 20:01:54 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> Message-ID: <4BC36012.80402@renget.se> > well, you can look up the max operating spec for your particular chips. > for instance, http://products.amd.com/en-us/OpteronCPUResult.aspx > shows that OS8439YDS6DGN includes chips rated 55-71. (there must be > some further package marking to determine which temp spec...) > I find it strange with this rather large temp range, and 55 seems very low to my experience. Could they possibly stand for something else? Did not find any description of the numbers anywhere on that address. /jon "Whimpy" ?! From james.p.lux at jpl.nasa.gov Mon Apr 12 15:24:34 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 12 Apr 2010 15:24:34 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC36012.80402@renget.se> References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> <4BC36012.80402@renget.se> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jon Tegner > Sent: Monday, April 12, 2010 11:02 AM > To: Mark Hahn > Cc: beowulf at beowulf.org > Subject: Re: [Beowulf] 96 cores in silent and small enclosure > > > > well, you can look up the max operating spec for your particular chips. > > for instance, http://products.amd.com/en-us/OpteronCPUResult.aspx > > shows that OS8439YDS6DGN includes chips rated 55-71. (there must be > > some further package marking to determine which temp spec...) > > > I find it strange with this rather large temp range, and 55 seems very > low to my experience. Could they possibly stand for something else? Did > not find any description of the numbers anywhere on that address. > The document Mark posted a link to this morning explains all. 
That temperature is the max case temperature given a certain power dissipation (TDP), heat sink, and ambient, and also rolls in some other assumptions (such as the thermal resistance from some junction to case) The actual "max temp" limit you're designing to is Tctl Max, which looks like it's 70C for the most part. The problem is that "Tctl Max (maximum control temperature) is a non-physical temperature on an arbitrary scale that can be used for system thermal management policies. Refer to the BIOS and Kernel Developer's Guide (BKDG) For AMD Family 10h Processors, order #31116" I think a fair amount of study is needed to really understand the thermal management of these devices. In many ways, doing it for a modern processor is like doing it for a whole PC board with lots of parts. You've got different functional blocks, all running at different speeds, some enabled, some disabled, so you can't just have a single "keep the case at point X below temp Y". From tegner at renget.se Mon Apr 12 23:18:16 2010 From: tegner at renget.se (Jon Tegner) Date: Tue, 13 Apr 2010 08:18:16 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <4BC36012.80402@renget.se> Message-ID: <20100413061816.5952AE947F182@bmail02.one.com> On Apr 13, 2010 00:24 "Lux, Jim (337C)" wrote: > > -----Original Message----- > > From: > > [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jon Tegner > > Sent: Monday, April 12, 2010 11:02 AM > > To: Mark Hahn > > Cc: > > Subject: Re: [Beowulf] 96 cores in silent and small enclosure > > > > > > > well, you can look up the max operating spec for your particular > > > chips. > > > for instance, > > > > > > shows that OS8439YDS6DGN includes chips rated 55-71. (there must > > > be > > > some further package marking to determine which temp spec...) > > > > > I find it strange with this rather large temp range, and 55 seems > > very > > low to my experience. Could they possibly stand for something else? > > Did > > not find any description of the numbers anywhere on that address. > > > > > The document Mark posted a link to this morning explains all. > > That temperature is the max case temperature given a certain power > dissipation (TDP), heat sink, and ambient, and also rolls in some > other assumptions (such as the thermal resistance from some junction > to case) > > The actual "max temp" limit you're designing to is Tctl Max, which > looks like it's 70C for the most part. The problem is that?"Tctl Max > (maximum control temperature) is a non-physical temperature on an > arbitrary scale that?can be used for system thermal management > policies. Refer to the BIOS and Kernel Developer's?Guide (BKDG) For > AMD Family 10h Processors, order #31116" > > I think a fair amount of study is needed to really understand the > thermal management of these devices. In many ways, doing it for a > modern processor is like doing it for a whole PC board with lots of > parts. You've got different functional blocks, all running at > different speeds, some enabled, some disabled, so you can't just have > a single "keep the case at point X below temp Y". > > > > > > ********************************************************************** > ***** > > > Thanks for the information! Lets see if I understand this correctly: > > > * The temperature reported to bios is the Tctl-temperature? > * This "temperature" is non-physical, but the number is designed to be > relevant to the cooling requirements of the CPU. 
That is, if this > number is larger than Tctl Max, the cpu take corrective actions, e.g. > throttling down? > * If this number (Tctl) is below Tctl Max the chances are high that > the cpu will live a happy life for many years? It would be stupid of > AMD to not have designed this number with some margin to account for > different cooling situations. -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Tue Apr 13 06:40:12 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 13 Apr 2010 06:40:12 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100413061816.5952AE947F182@bmail02.one.com> Message-ID: On 4/12/10 11:18 PM, "Jon Tegner" wrote: >> >> >> I think a fair amount of study is needed to really understand the thermal >> management of these devices. In many ways, doing it for a modern processor >> is like doing it for a whole PC board with lots of parts. You've got >> different functional blocks, all running at different speeds, some enabled, >> some disabled, so you can't just have a single "keep the case at point X >> below temp Y". >> >> >> *************************************************************************** >> >> Thanks for the information! Lets see if I understand this correctly: >> >> * The temperature reported to bios is the Tctl-temperature? I don't think so. Tctl is a "design reference" of some sort. The BIOS reports the temperature of some sensor at some point on the chip. The relationship between that temperature and the limits is defined in that document (and it's not a fixed relationship, apparently). >> * This "temperature" is non-physical, but the number is designed to be >> relevant to the cooling requirements of the CPU. That is, if this number is >> larger than Tctl Max, the cpu take corrective actions, e.g. throttling down? >> * If this number (Tctl) is below Tctl Max the chances are high that the cpu >> will live a happy life for many years? It would be stupid of AMD to not have >> designed this number with some margin to account for different cooling >> situations. Nope.. The mfr comes up with some strategy for setting temperature limits, based on what they think will be acceptable life in some acceptable installation running some typical instruction stream. The "design reference" installation and instruction stream probably is different for processors targeted to different markets (e.g. Consumer laptop vs consumer set-top-box vs rack mounted server farm). The PC business is horribly cost sensitive, and a few pennies for a different fan or an extra piece of sheet metal to make the processor 2 degrees cooler makes a difference. Just because the horde of small PC manufacturers, in general, don't have good thermal designers, Intel and AMD actually provide a "reference thermal design" including fan size/speed, duct design, etc.; just like they provide reference mobo designs for the electrical aspects. A big company like HP or Dell has enough volume to design their own cases, etc.; they also sell desktops and laptops into big corporate accounts, which are slightly more sensitive to issues like life than consumers. So, in reality, it's smart of AMD to run right to the ragged edge, at least for consumer oriented parts. Most consumers will NOT run 100% duty cycle and will NOT run their computers in 40C air and will NOT be sensitive to a few months shorter life (at least in the aggregate). 
Given that the warranty term on most computers is no greater than a year, a 2 year design life might be reasonable. The design and use model is very very different from an infrastructure, industrial (or space) application where long life is an important design concern. In those kinds of applications, you'll see a lot more attention to derating, conservative temperatures, and actually understanding the failure mechanisms. There's a fairly easily ascertained economic value to having to deal with a failed network switch or server. If that switch is handling millions of dollars of financial transactions, the downtime cost is pretty high. The downtime cost for a consumer PC, after the warranty has run out, is pretty darn low. > From hahn at mcmaster.ca Tue Apr 13 08:38:09 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 13 Apr 2010 11:38:09 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: Message-ID: > warranty term on most computers is no greater than a year, a 2 year design > life might be reasonable. hard to tell exactly what actuarial voodoo they do. I noticed AMD's cpus are waranteed for 3 years. whether they shave the tolerances by recognizing that lots of cpus are used for shorter lives, or at low duty cycle, we can only guess. I suspect companies are pretty conservative when evaluating this kind of risk, as loss of reputation is considered a very high cost. (or rather the gaining of a bad rep.) From james.p.lux at jpl.nasa.gov Tue Apr 13 09:47:34 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 13 Apr 2010 09:47:34 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn > Sent: Tuesday, April 13, 2010 8:38 AM > To: beowulf at beowulf.org > Subject: Re: [Beowulf] 96 cores in silent and small enclosure > > > warranty term on most computers is no greater than a year, a 2 year design > > life might be reasonable. > > hard to tell exactly what actuarial voodoo they do. I noticed AMD's > cpus are waranteed for 3 years. whether they shave the tolerances > by recognizing that lots of cpus are used for shorter lives, or at > low duty cycle, we can only guess. I suspect companies are pretty > conservative when evaluating this kind of risk, as loss of reputation > is considered a very high cost. (or rather the gaining of a bad rep.) Ahh.. but the market is partitioned.. Folks like the ones on this list tend to be buying "server class" processors installed in "server class" hardware, so even if the bottom of the line consumer ones have high failure rates, we'd ignore it, and concentrate on the ones you're buying. A mass market seller doing $500 notebooks will very carefully analyze the return rates, etc. and target appropriately. It would not surprise me at all that AMD and Intel recognize this, and it's manifested in the various thermal design documents, albeit not with a explanatory comment like "models A,B,and C are intended for cost sensitive consumer products while models X,Y,and Z are intended for performance and reliability sensitive applications, and the thermal models reflect this". 
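Since much of the back-and-forth above turns on what the reported number actually is, one practical step is simply to log it under a real workload. The sketch below (Python) dumps whatever temperature attributes the kernel hwmon drivers expose -- the same registers lm-sensors reads -- together with any driver-supplied critical threshold. The attribute names follow the standard hwmon sysfs convention (millidegrees C in temp*_input); which attributes exist, and whether the value is a Tcase estimate, Tctl, or a diode reading, depends entirely on the driver and chip, exactly as discussed above.

#!/usr/bin/env python
# Periodically dump the temperature attributes exposed by the kernel hwmon
# drivers (the same registers lm-sensors reads).  The kernel reports values
# in millidegrees Celsius; *_label and *_crit are optional and driver-
# dependent, and older kernels put the files under hwmon*/device/ instead.
import glob, os, time

def read(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except IOError:
        return None

def find_sensors():
    sensors = []
    for pattern in ("/sys/class/hwmon/hwmon*/temp*_input",
                    "/sys/class/hwmon/hwmon*/device/temp*_input"):
        for inp in sorted(glob.glob(pattern)):
            base = inp[:-len("_input")]
            label = read(base + "_label") or os.path.basename(base)
            sensors.append((label, inp, base + "_crit"))
    return sensors

if __name__ == "__main__":
    while True:
        fields = []
        for label, inp, crit_path in find_sensors():
            raw = read(inp)
            if raw is None:
                continue
            field = "%s=%.1fC" % (label, int(raw) / 1000.0)
            crit = read(crit_path)
            if crit is not None:
                field += " (crit %.0fC)" % (int(crit) / 1000.0)
            fields.append(field)
        print(time.strftime("%H:%M:%S") + "  " + "   ".join(fields))
        time.sleep(5)

Running it while the cluster is under full load, and again idle, at least shows how far the reported value sits from whatever critical threshold the driver advertises.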
From hahn at mcmaster.ca Tue Apr 13 10:37:22 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue, 13 Apr 2010 13:37:22 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC36012.80402@renget.se> References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> <4BC36012.80402@renget.se> Message-ID: > I find it strange with this rather large temp range, and 55 seems very low to > my experience. Could they possibly stand for something else? Did not find any > description of the numbers anywhere on that address. I think you should always worry about any temperature measured on a system that's in the >= 65C range. as Jim mentioned, the temps that matter are actually on-chip and not really accessible - and it's unknown to us what they should be anyway, or how long they can tolerate particular temps. and whether over-temp failure modes would be transient (conductivity in semiconductors changes rapidly as a function of temperature) or gradual (electromigration or perhaps the solder-ball problems nvidia had)... the original question was about wheter 60-65C is a safe operating temperature. I think it's pretty clearly high - whether it's critical depends on how it's measured, the specific chip's specs, etc. but it's not the sort of operating range I'd be aiming for. From tegner at renget.se Tue Apr 13 11:40:11 2010 From: tegner at renget.se (Jon Tegner) Date: Tue, 13 Apr 2010 20:40:11 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> <4BC36012.80402@renget.se> Message-ID: <4BC4BA8B.3050809@renget.se> Mark Hahn wrote: >> I find it strange with this rather large temp range, and 55 seems >> very low to my experience. Could they possibly stand for something >> else? Did not find any description of the numbers anywhere on that >> address. > > I think you should always worry about any temperature measured on a > system that's in the >= 65C range. as Jim mentioned, the temps > that matter are actually on-chip and not really accessible - and it's > unknown to us what they should be anyway, or how long they can > tolerate particular temps. and whether over-temp failure > modes would be transient (conductivity in semiconductors changes > rapidly as a function of temperature) or gradual (electromigration > or perhaps the solder-ball problems nvidia had)... > > the original question was about wheter 60-65C is a safe operating > temperature. I think it's pretty clearly high - whether it's critical > depends on how it's measured, the specific chip's specs, etc. > but it's not the sort of operating range I'd be aiming for. But there should be possible to save money by running hotter. Suppose you could accept 10 degrees higher temp, then you would not have to run the AC in the room as hard (and AC represents a significant part of the operating cost). If the price you pay is that your CPUS will only last for 4 years (I'm just speculating here, and for the moment only consider the cpu) instead of 10 years it would probably be an economically much better option. 
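To put a rough number on that trade-off, and on the "roughly 2x per 10 C" rule that comes up in the replies below, here is a back-of-the-envelope Arrhenius sketch. It is not a lifetime prediction for any real CPU: the activation energy, which dominates the result, differs per failure mechanism (very roughly 0.3 to 1.0 eV), and the 55 C / 65 C temperatures and 10-year baseline are simply the figures being kicked around in this thread.

# Arrhenius acceleration factor between two operating temperatures:
#   AF = exp( (Ea / k) * (1/T_cool - 1/T_hot) ),  temperatures in kelvin.
# Ea is a per-failure-mechanism activation energy; the values below are
# generic placeholders, not figures for any particular processor.
import math

BOLTZMANN_EV = 8.617e-5   # eV per kelvin

def acceleration_factor(t_cool_c, t_hot_c, ea_ev):
    t_cool = t_cool_c + 273.15
    t_hot = t_hot_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_cool - 1.0 / t_hot))

if __name__ == "__main__":
    for ea in (0.3, 0.7, 1.0):
        af = acceleration_factor(55.0, 65.0, ea)
        print("Ea = %.1f eV: 65C instead of 55C ages the part %.1fx faster "
              "(a 10-year life becomes %.1f years)" % (ea, af, 10.0 / af))

At Ea = 0.7 eV the factor comes out very close to the halving-per-10-C rule quoted below, which is why that rule is such a common starting point.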
From hahn at mcmaster.ca Tue Apr 13 22:14:58 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Wed, 14 Apr 2010 01:14:58 -0400 (EDT) Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC4BA8B.3050809@renget.se> References: <20100407110806.3AF522C01002@bmail01.one.com> <4BC220CD.1040709@renget.se> <4BC36012.80402@renget.se> <4BC4BA8B.3050809@renget.se> Message-ID: >> the original question was about wheter 60-65C is a safe operating >> temperature. I think it's pretty clearly high - whether it's critical >> depends on how it's measured, the specific chip's specs, etc. >> but it's not the sort of operating range I'd be aiming for. > But there should be possible to save money by running hotter. Suppose you sure: move more air and/or provide a lower thermal-resistance heatsink. > could accept 10 degrees higher temp, then you would not have to run the AC in > the room as hard (and AC represents a significant part of the operating the max temp spec is not some arbitrary knob that the chip vendors choose out of spiteful anti-green-ness. I wouldn't be surprised to see some upward change in coming years, but issues here are nontrivial. do you still want the chip to operate correctly at 20C as well as 90C? we're talking fairly big deals like lower doping or non-silicon materials. > cost). If the price you pay is that your CPUS will only last for 4 years (I'm > just speculating here, and for the moment only consider the cpu) instead of > 10 years it would probably be an economically much better option. the problem is that failure rates are pretty nonlinear. my guess is that undercooling (or overvolt/clocking) will increase your early failure rate as well as putting you in a pretty steep zone by year three. I expect a low failure rate for a server well past 5 years (cold room, server-level cooling, 100% duty cycle.) but the fact that chip vendors do 3-year warranties makes me think that going to 4 years would cost them significantly more... From tegner at renget.se Wed Apr 14 01:12:27 2010 From: tegner at renget.se (Jon Tegner) Date: Wed, 14 Apr 2010 10:12:27 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <4BC4BA8B.3050809@renget.se> Message-ID: <20100414081227.B68FCFA44ACFA@bmail02.one.com> > > the max temp spec is not some arbitrary knob that the chip vendors > choose out of spiteful anti-green-ness. I wouldn't be surprised to see > some > > > **************************************************************** > > > Issue is not the temp spec of current cpus, problem is that it is hard > to get relevant information. I haven't found any that states that the > failure rate in year 5 should be significantly higher if you operate > the cpu at 65 C instead of 55 C. I'm just saying this kind of > information would be valuable (and I would be glad to find it). > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Wed Apr 14 06:49:57 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 14 Apr 2010 06:49:57 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100414081227.B68FCFA44ACFA@bmail02.one.com> Message-ID: Start with Arrhenius.. 10degree rise halves the life. Actually, there's a huge amount of information out there on semiconductor failure and life effects. It's just not distilled down to a "for part #, here's what happens", because it depends on a lot of things. 
On 4/14/10 1:12 AM, "Jon Tegner" wrote: the max temp spec is not some arbitrary knob that the chip vendors choose out of spiteful anti-green-ness. I wouldn't be surprised to see some **************************************************************** Issue is not the temp spec of current cpus, problem is that it is hard to get relevant information. I haven't found any that states that the failure rate in year 5 should be significantly higher if you operate the cpu at 65 C instead of 55 C. I'm just saying this kind of information would be valuable (and I would be glad to find it). -------------- next part -------------- An HTML attachment was scrubbed... URL: From tegner at renget.se Wed Apr 14 08:41:26 2010 From: tegner at renget.se (Jon Tegner) Date: Wed, 14 Apr 2010 17:41:26 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <20100414081227.B68FCFA44ACFA@bmail02.one.com> Message-ID: <20100414154126.771C2304A019@bmail00.one.com> I have no clue of how to do this "distillation process" - it is not my field. How would you do this? Do you have the numbers for any cpu? And Arrhenius - again, semiconductors is not my field - would a 10 degree rise halve the life span irrespective of activation energy and temperature range? On Apr 14, 2010 15:49 "Lux, Jim (337C)" wrote: > Start with Arrhenius.. 10degree rise halves the life. > > Actually, there?s a huge amount of information out there on > semiconductor failure and life effects. It?s just not distilled down > to > a ?for part #, here?s what happens?, because it depends on a lot of > things. > > > On 4/14/10 1:12 AM, "Jon Tegner" <> wrote: > > > the max temp spec is not some arbitrary knob that the chip vendors > > > choose out of spiteful anti-green-ness. I wouldn't be surprised to > > > see some > > > > > > **************************************************************** > > > > > > Issue is not the temp spec of current cpus, problem is that it is > > > hard to get relevant information. I haven't found any that states > > > that the failure rate in year 5 should be significantly higher if > > > you operate the cpu at 65 C instead of 55 C. I'm just saying this > > > kind of information would be valuable (and I would be glad to find > > > it). > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.p.lux at jpl.nasa.gov Wed Apr 14 09:53:41 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 14 Apr 2010 09:53:41 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100414154126.771C2304A019@bmail00.one.com> Message-ID: It's a huge field of research. Yes, there are differences from the materials involved and the actual temperature, but the 10 degrees=factor of 2 is a decent starting point. It's one of those things where you can start simple, but for any non-trivial case, it gets real complex. Don't forget that the temperature across the die varies pretty substantially too, so the "aging" will be different in different parts of the die. For people who really care (e.g. My colleagues and I building spacecraft), you have a team of people who understand device physics study the device, and then you might do testing. On 4/14/10 8:41 AM, "Jon Tegner" wrote: > I have no clue of how to do this "distillation process" - it is not my field. > How would you do this? Do you have the numbers for any cpu? 
> > And Arrhenius - again, semiconductors is not my field - would a 10 degree rise > halve the life span irrespective of activation energy and temperature range? > > > On Apr 14, 2010 15:49 "Lux, Jim (337C)" > wrote: >> Start with Arrhenius.. 10degree rise halves the life. >> >> Actually, there?s a huge amount of information out there on >> semiconductor failure and life effects. It?s just not distilled down to >> a ?for part #, here?s what happens?, because it depends on a lot of >> things. >> >> >> On 4/14/10 1:12 AM, "Jon Tegner" <> wrote: >>>> the max temp spec is not some arbitrary knob that the chip vendors >>>> choose out of spiteful anti-green-ness. I wouldn't be surprised to >>>> see some >>>> >>>> **************************************************************** >>>> >>>> Issue is not the temp spec of current cpus, problem is that it is >>>> hard to get relevant information. I haven't found any that states >>>> that the failure rate in year 5 should be significantly higher if >>>> you operate the cpu at 65 C instead of 55 C. I'm just saying this >>>> kind of information would be valuable (and I would be glad to find >>>> it). >>> > From mathog at caltech.edu Wed Apr 14 14:49:57 2010 From: mathog at caltech.edu (David Mathog) Date: Wed, 14 Apr 2010 14:49:57 -0700 Subject: [Beowulf] Sil 3124 controller, boot from disk, UUID issues Message-ID: The Arima HDAMAI's onboard Sil 3114 controller wasn't able to deliver maximum disk performance even after much tweaking. So a Syba SY-PCX40009 (Sil 3124) PCI-X controller was installed (firmware 6.3.18). When the system was booted from a small disk still on the 3114 controller linux had no problem seeing a large disk on the 3124 controller, and it could mount the large disks partitions and use them normally. Performance testing for the disk on the 3124 was much better than for the 3114, with a sustained write of 4GB at 108.4MB/s, and bonnie++ results of: Version 1.03 ------Sequential Output------ --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP newsaf.bio.calte 8G 49331 94 116716 45 51925 18 53114 92 136625 21 315.3 1 ------Sequential Create------ --------Random Create-------- -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ newsaf.bio.caltech.edu,8G,49331,94,116716,45,51925,18,53114,92,136625,21,315.3,1,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++ Here's the problem - in order to boot from a Sil 3124 controller the disk must be configured as a single disk concatenation set. If it isn't, the disk does not show up in the BIOS boot menu. This is different from the 3114 controller, where any disks plugged in, but not part of a raid set, show up on the boot list. After doing that, the concatenation set is listed as a disk in the boot menu in the bios (not the disk itself). Booting from this "concatenation set" works, mostly, but /dev/sda7 was not mounting. 
However, it could be mounted later with mount -t ext4 /dev/sda7 /home This shows why the original mount fails: # blkid /dev/sda6: UUID="80d7830d-53f5-4b69-8b4b-e67ee1d47c9c" TYPE="ext4" /dev/sda1: UUID="ce8b6ea5-8bca-45d3-b9f0-a265e611b929" TYPE="ext4" /dev/sdb1: UUID="caf1e9ea-4eae-41f8-98d1-c4e4cbd6b102" TYPE="ext4" /dev/sdb6: UUID="2b0ef732-9526-49c1-9cb9-f0b73c86874c" TYPE="ext4" /dev/sdb5: UUID="485b0c52-e133-4359-88fa-c01d7d4ae3d1" TYPE="swap" /dev/sda5: UUID="eb79e4d0-66a2-473d-a8fa-5c444e38bb87" TYPE="swap" /dev/sda7: TYPE="silicon_medley_raid_member" and /etc/fstab had this to get /dev/sda7 mounted: UUID=2c87348d-a79e-46ff-8693-1df40409a805 /home ext4 relatime 1 2 where that was the UUID seen by the small disk before the disk was made into a concatenation set. Earlier I had found that plugging disks into the 3114 caused them to have a higher priority than those on the 3124, even when the boot device was on the 3124, so using "/dev/sda7" type entries wasn't reliable if the system was being reconfigured - it could come up as sdb or sdc. What I'm seeking is a way to configure this system so that it will boot properly from the disk on the 3124 controller whether or not there are disk(s) also plugged into the 3114. Possible? The only thing I could come up with that was sure to work was to NOT make the boot disk into a concatenation set, and boot the kernel (with sata_sil24) from a USB key, then have it pick up its partitions. Is there a way to do this without resorting to arcane boot mechanisms??? I foresee problems at reboot too, because: # fsck -t ext4 /dev/sda7 fsck from util-linux-ng 2.16.1 fsck: fsck.silicon_medley_raid_member: not found fsck: Error 2 while executing fsck.silicon_medley_raid_member for /dev/sda7 Besides this oddness with the UUID on the 7th partition, what else might the controller have done when it set this disk up as single disk concatenation set? Would it be prudent to boot from another OS (network), remake the filesystems, and restore them from backups? Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From james.p.lux at jpl.nasa.gov Wed Apr 14 16:39:53 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 14 Apr 2010 16:39:53 -0700 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <20100414081227.B68FCFA44ACFA@bmail02.one.com> References: <4BC4BA8B.3050809@renget.se> <20100414081227.B68FCFA44ACFA@bmail02.one.com> Message-ID: Try this http://rel.intersil.com/docs/rel/calculation_of_semiconductor_failure_rates.pdf You might also look for MIL-HDBK-217 Of course, a paper by H.S. Blanks makes the following statement: Although the temperature dependence of failure rate can be very high, in most situations it is much less than that of the Arrhenius acceleration factor. It is very improbable that the temperature dependence of component failure rate can be meaningfully modelled for reliability prediction purposes or for the purpose of optimizing thermal design component layout. 
(from abstract for "Arrhenius and the temperature dependence of non-constant failure rate" Quality and Reliability Engineering International, Vol 6, #4, pp259-265, 20 Mar 2007) You might also browse around http://www.weibull.com/ or http://www.klabs.org/ Jim From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jon Tegner Sent: Wednesday, April 14, 2010 1:12 AM To: Mark Hahn Cc: beowulf at beowulf.org Subject: Re: Re: [Beowulf] 96 cores in silent and small enclosure the max temp spec is not some arbitrary knob that the chip vendors choose out of spiteful anti-green-ness. I wouldn't be surprised to see some **************************************************************** Issue is not the temp spec of current cpus, problem is that it is hard to get relevant information. I haven't found any that states that the failure rate in year 5 should be significantly higher if you operate the cpu at 65 C instead of 55 C. I'm just saying this kind of information would be valuable (and I would be glad to find it). From bcostescu at gmail.com Thu Apr 15 02:01:15 2010 From: bcostescu at gmail.com (Bogdan Costescu) Date: Thu, 15 Apr 2010 11:01:15 +0200 Subject: [Beowulf] Sil 3124 controller, boot from disk, UUID issues In-Reply-To: References: Message-ID: On Wed, Apr 14, 2010 at 11:49 PM, David Mathog wrote: > Here's the problem - in order to boot from a Sil 3124 controller the > disk must be configured as a single disk concatenation set. Can't you find some other controller which allows better control of the boot related options ? > What I'm seeking is a way to configure this system so that it will boot > properly from the disk on the 3124 controller whether or not there are > disk(s) also plugged into the 3114. Have you tried to completely disable the 3114 controller ? Not only unplugging all disks, but disabling it in BIOS. Cheers, Bogdan From mathog at caltech.edu Thu Apr 15 09:34:43 2010 From: mathog at caltech.edu (David Mathog) Date: Thu, 15 Apr 2010 09:34:43 -0700 Subject: [Beowulf] Sil 3124 controller, boot from disk, UUID issues Message-ID: Here is a work around for this issue. For whatever reason the Sil 3124 controller reported sda7 as TYPE="silicon_medley_raid_member", but NOT any of the other partitions. Taking a wild guess that the controller was reporting the last partition this way, and not really doing anything to the partition, I changed the partition table from: /dev/sda7 109433088 1953523055 to /dev/sda7 109433088 1953523055 922044984 83 Linux /dev/sda8 1953523120 1953525167 1024 83 Linux wrote that to disk, then partprobe /dev/sda blkid and now we find /dev/sda8: TYPE="silicon_medley_raid_member" with sda7 back to its original UUID! This fits the hypothesis that the UUID wasn't overwritten by the controller, the controller just hides information about the last partition. My evil plan seems to be working, the 8th partition now takes the hit, and there is nothing in the small 8th partition, so that is fine. Of course the file system on sda7 is now toast, so... mkfs -t ext4 /dev/sda7 mount -t ext4 /dev/sda7 /home (restore contents of sda7 == /home from remote storage) nedit /etc/fstab (put in new UUID for sda7, it changed at mkfs) reboot and the system mounts all of the partitions fine, including sda7. 
% blkid | sort /dev/sda1: UUID="ce8b6ea5-8bca-45d3-b9f0-a265e611b929" TYPE="ext4" /dev/sda5: UUID="eb79e4d0-66a2-473d-a8fa-5c444e38bb87" TYPE="swap" /dev/sda6: UUID="80d7830d-53f5-4b69-8b4b-e67ee1d47c9c" TYPE="ext4" /dev/sda7: UUID="d3213c8e-3682-4168-b11e-d0b949aee9c9" TYPE="ext4" /dev/sda8: TYPE="silicon_medley_raid_member" /dev/sdb1: UUID="caf1e9ea-4eae-41f8-98d1-c4e4cbd6b102" TYPE="ext4" /dev/sdb5: UUID="485b0c52-e133-4359-88fa-c01d7d4ae3d1" TYPE="swap" /dev/sdb6: UUID="2b0ef732-9526-49c1-9cb9-f0b73c86874c" TYPE="ext4" where sdb is just a disk, not part of any raid or concatenation set, and sda is the single member of a concatenation set so that it can boot. In summary, linux can work through the 3124 controller without the controller having to be told about the disks, in which case the 3124 doesn't play games with the partition information. Unfortunately the only way to make a disk on the controller bootable is to tell the controller about it, in which case the controller reports odd things for the last partition. So make sure the last partition is small and not used by the linux system, and this will not get in the way. (I don't know how small it can be for this to work, probably smaller than 1MB, but I didn't experiment further.) It would have been nice if some of this was documented somewhere on either Silicon Image or Syba's sites, but if it was, I could not find it. Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From gerry.creager at tamu.edu Fri Apr 16 18:06:38 2010 From: gerry.creager at tamu.edu (Gerry Creager) Date: Fri, 16 Apr 2010 20:06:38 -0500 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: References: <4BC4BA8B.3050809@renget.se> <20100414081227.B68FCFA44ACFA@bmail02.one.com> Message-ID: <4BC9099E.9000802@tamu.edu> I hadn't looked at -217 since, well, I was designing spaceflight hardware... This is a very nice set of references; I'm especially fond of perusing the Weibull data. I'd not looked at klabs before. I'll echo the 2x factoring with 10 deg temperature rise. And, I hear, al the time, from bean counters and room monitors, how we should run our machine rooms hotter. I've got 2 with ambient setpoints at 80F right now, and we see, in our 300 node cluster, an average of one DIMM and one hard drive/week. It's a real good thing the hardware's all still under maintenance, else we'd be out of systems already. Over the winter, when building thermal sink was lower, we also saw fewer failures. gerry Lux, Jim (337C) wrote: > Try this > http://rel.intersil.com/docs/rel/calculation_of_semiconductor_failure_rates.pdf > > You might also look for MIL-HDBK-217 > > Of course, a paper by H.S. Blanks makes the following statement: > Although the temperature dependence of failure rate can be very high, in most situations it is much less than that of the Arrhenius acceleration factor. It is very improbable that the temperature dependence of component failure rate can be meaningfully modelled for reliability prediction purposes or for the purpose of optimizing thermal design component layout. 
> (from abstract for "Arrhenius and the temperature dependence of non-constant failure rate" Quality and Reliability Engineering International, Vol 6, #4, pp259-265, 20 Mar 2007) > > You might also browse around http://www.weibull.com/ or http://www.klabs.org/ > > > Jim > > > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jon Tegner > Sent: Wednesday, April 14, 2010 1:12 AM > To: Mark Hahn > Cc: beowulf at beowulf.org > Subject: Re: Re: [Beowulf] 96 cores in silent and small enclosure > > > the max temp spec is not some arbitrary knob that the chip vendors > choose out of spiteful anti-green-ness. I wouldn't be surprised to see some > > **************************************************************** > > Issue is not the temp spec of current cpus, problem is that it is hard to get relevant information. I haven't found any that states that the failure rate in year 5 should be significantly higher if you operate the cpu at 65 C instead of 55 C. I'm just saying this kind of information would be valuable (and I would be glad to find it). > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tomislav.maric at gmx.com Sat Apr 17 06:19:06 2010 From: tomislav.maric at gmx.com (tomislav_maric@gmx.com) Date: Sat, 17 Apr 2010 15:19:06 +0200 Subject: [Beowulf] compute node hardware selection Message-ID: <20100417140015.134300@gmx.com> Hi everyone, I have built and played with my home beowulf cluster running rocks for a while and now I would like to construct a bigger one. I have bought a book called Building Clustered Linux Systems (being a noob at HPC). The book is most excellent. The cluster is to be used for CFD and FEM computations. A side note: I have a MSc in computational science so I know my applications in detail, and the numerics behind the calculations, but on the HPC side, I'm kind of a noob (learning it for only one year of the required 25 years ;) ). I have some questions regarding the choice of hardware architecture for the compute nodes. Since I am on a low budget I would like to implement a 16 compute nodes cluster of COTS electronics (for my budget, Xeons and Opterons are not COTS, definitely) and I have trouble deciding whether to Dual Core or Quad Core, Intel vs AMD processors for the cluster. gcc is used for the compilation and the AMD hardware is cheaper, so I'm inclined on AMD, but I would appreciate any advice on this: I have been given advice before that icc can get me up to 20% of speed increase on Intel processors. Another reason is the fact that since I'm running coarse grained CFD/FEM simulations, there is not much use of the bigger cache that is held by the server type processors like Opteron and Xeon, or am I wrong? The data is really huge so not much can happen in the cache that can stay there for a while and make it be useful that way. I have read that for the multiple core processors the system bus can get saturated, so I am running benchmarks on two single machines: one with 2 core processor and the other one with 4 cores. 
The idea is to run a benchmarking case of my choice (transient multiphase incompressible fluid flow) and increase the case size and the number of processors to see when the parallelization is impacted by the traffic on the system bus and to estimate the biggest size of the simulation case for the single compute node. How can I avoid I/O speeds to impact my IPC estimation for a single slice? I will up a RAID 0 on the machine and with no networking involved, I am not sure that there is anything else I could do to take the I/O impact out of the benchmarking for the single slice. I am describing this in details because I really want to avoid spending the budget in the wrong way. I will appreciate any advice that you can spare. Tomislav -------------- next part -------------- An HTML attachment was scrubbed... URL: From a28427 at ua.pt Sat Apr 17 19:33:08 2010 From: a28427 at ua.pt (Tiago Marques) Date: Sun, 18 Apr 2010 03:33:08 +0100 Subject: [Beowulf] AMD 6100 vs Intel 5600 In-Reply-To: References: <4BB369E1.60906@cora.nwra.com> <4BB3B5C6.8060001@cse.ucdavis.edu> <201004010959.15152.cap@nsc.liu.se> Message-ID: On Thu, Apr 1, 2010 at 1:34 PM, Jerker Nyberg wrote: > On Thu, 1 Apr 2010, Peter Kjellstrom wrote: > > My experience is that in HPC it always boils down to price/performance and >> that would in my eyes make apples out of Magnycour and Westmere. >> > > I just ordered two desktop systems with Intel i7-860 2.8 GHz QC and 16 GB > RAM for evaluation, to run as computation nodes for our CPU-bound batchlike > application. I'll figure out the performance/price later but it seems to be > significantly better than Xeons from the same vendor. It feels like 15 years > ago all over again. I guess the major drawback is the lack of ECC RAM, so > maybe they get their second life as ordinary desktops sooner rather than > later... > Actually, if you look hard enough ( http://www.asrock.com/mb/overview.asp?Model=X58%20SuperComputer - or better yet, ASUS P6T6-WS ), and a Xeon i7-860 ( X3460 - http://www.siliconmadness.com/2009/09/intel-releases-lynnfield-updates-xeon.html), which is priced slightly higher and BOTH support ECC RAM on commodity grade hardware. Best regards, Tiago > > Regards, > Jerker Nyberg. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rchang.lists at gmail.com Sat Apr 17 21:15:30 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Sun, 18 Apr 2010 09:45:30 +0530 Subject: [Beowulf] Building a Beowulf - Noob Message-ID: <4BCA8762.70806@gmail.com> I am a bit new. Trying to build a cluster from ground up. I need to build a new 32 node cluster from ground up. Hardware has been decided, Nehalam based, but OS is still a debate. I am in a dilemma whether to keep things simple or to cut costs. In order to keep things simple, I can order Platform Cluster Manager with RHEL HPC license per node. Or I can use Rocks with CentOS. I want to know what the other members of this list do. 
Thanks, Richard From jlforrest at berkeley.edu Sat Apr 17 21:26:34 2010 From: jlforrest at berkeley.edu (Jon Forrest) Date: Sat, 17 Apr 2010 21:26:34 -0700 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCA8762.70806@gmail.com> References: <4BCA8762.70806@gmail.com> Message-ID: <4BCA89FA.5050402@berkeley.edu> On 4/17/2010 9:15 PM, Richard Chang wrote: > I am a bit new. Trying to build a cluster from ground up. I need to > build a new 32 node cluster from ground up. > > Hardware has been decided, Nehalam based, but OS is still a debate. I am > in a dilemma whether to keep things simple or to cut costs. > > In order to keep things simple, I can order Platform Cluster Manager > with RHEL HPC license per node. Or I can use Rocks with CentOS. > > I want to know what the other members of this list do. What I do is use the free Rocks clustering package (http://www.rocksclusters.org/wordpress/). It's free and you can be up and running in no time. Also free is Perceus (http://www.perceus.org/portal/) which takes a different approach than Rocks. It's more work to get started but what you end up with is exactly what you want - no more and no less. There's no reason to spend any money on software to create a cluster. Maybe later you'll find some commercial products that solve specific problems, but one of the nice things about modern clusters is that there's so much good quality free software available. Cordially, -- Jon Forrest Research Computing Support College of Chemistry 173 Tan Hall University of California Berkeley Berkeley, CA 94720-1460 510-643-1032 jlforrest at berkeley.edu From tegner at renget.se Sat Apr 17 23:32:19 2010 From: tegner at renget.se (Jon Tegner) Date: Sun, 18 Apr 2010 08:32:19 +0200 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCA8762.70806@gmail.com> References: <4BCA8762.70806@gmail.com> Message-ID: <4BCAA773.2090108@renget.se> Take a look at perceus, can be really easy, check www.infiscale.com/html/perceus_on_enterprise_linux_qu.html Richard Chang wrote: > I am a bit new. Trying to build a cluster from ground up. I need to > build a new 32 node cluster from ground up. > > Hardware has been decided, Nehalam based, but OS is still a debate. I am > in a dilemma whether to keep things simple or to cut costs. > > In order to keep things simple, I can order Platform Cluster Manager > with RHEL HPC license per node. Or I can use Rocks with CentOS. > > I want to know what the other members of this list do. > > Thanks, > Richard > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From rchang.lists at gmail.com Sun Apr 18 05:02:16 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Sun, 18 Apr 2010 17:32:16 +0530 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCA89FA.5050402@berkeley.edu> References: <4BCA8762.70806@gmail.com> <4BCA89FA.5050402@berkeley.edu> Message-ID: <4BCAF4C8.8090408@gmail.com> Jon Forrest wrote: > What I do is use the free Rocks clustering package > (http://www.rocksclusters.org/wordpress/). It's free > and you can be up and running in no time. > > Also free is Perceus (http://www.perceus.org/portal/) > which takes a different approach than Rocks. > It's more work to get started but what you end > up with is exactly what you want - no more and no less. > > There's no reason to spend any money on software to > create a cluster. 
Maybe later you'll find some commercial > products that solve specific problems, but one of the nice > things about modern clusters is that there's so much > good quality free software available. > > Cordially, Thanks Jon, I will try both and see which one is better suited for us. regards, Richard. From rchang.lists at gmail.com Sun Apr 18 05:04:11 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Sun, 18 Apr 2010 17:34:11 +0530 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCAA773.2090108@renget.se> References: <4BCA8762.70806@gmail.com> <4BCAA773.2090108@renget.se> Message-ID: <4BCAF53B.6020202@gmail.com> Jon Tegner wrote: > Take a look at perceus, can be really easy, check > > www.infiscale.com/html/perceus_on_enterprise_linux_qu.html > Thanks for leading me to Perceus. It looks very easy as per the above URL. I will try it out. regards, Richard. From john.hearns at mclaren.com Mon Apr 19 01:30:14 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Mon, 19 Apr 2010 09:30:14 +0100 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCA8762.70806@gmail.com> References: <4BCA8762.70806@gmail.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B101A6FBB@milexchmb1.mil.tagmclarengroup.com> > I am a bit new. Trying to build a cluster from ground up. I need to > build a new 32 node cluster from ground up. > > Hardware has been decided, Nehalam based, but OS is still a debate. I > am > in a dilemma whether to keep things simple or to cut costs. My advice? Talk to the cluster vendors. There are several on this list, or just look at clustermonkey.net or other sites. You then get a tried and tested software stack, support on the software side and support on the hardware - ie something goes 'pop' you should get on the phone and have a replacement sent out. Cluster vendors will also happily help you get your applications integrated. John Hearns The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From eugen at leitl.org Mon Apr 19 07:27:32 2010 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 19 Apr 2010 16:27:32 +0200 Subject: [Beowulf] 96 cores in silent and small enclosure In-Reply-To: <4BC9099E.9000802@tamu.edu> References: <4BC4BA8B.3050809@renget.se> <20100414081227.B68FCFA44ACFA@bmail02.one.com> <4BC9099E.9000802@tamu.edu> Message-ID: <20100419142732.GB1964@leitl.org> On Fri, Apr 16, 2010 at 08:06:38PM -0500, Gerry Creager wrote: > I'll echo the 2x factoring with 10 deg temperature rise. And, I hear, al > the time, from bean counters and room monitors, how we should run our > machine rooms hotter. I've got 2 with ambient setpoints at 80F right Does anyone here have nonanecdotal data on (Intel X-25M G2) SSD populations, especially at slightly elevated temperatures? > now, and we see, in our 300 node cluster, an average of one DIMM and one > hard drive/week. It's a real good thing the hardware's all still under > maintenance, else we'd be out of systems already. Over the winter, when > building thermal sink was lower, we also saw fewer failures. 
-- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From hahn at mcmaster.ca Mon Apr 19 11:11:40 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Mon, 19 Apr 2010 14:11:40 -0400 (EDT) Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: <4BCA8762.70806@gmail.com> References: <4BCA8762.70806@gmail.com> Message-ID: > I want to know what the other members of this list do. the answer depends almost entirely on you: do you want to pay for support? how much DIY are you comfortable with? bear in mind that most of us have had "very mixed" experience with support, though I suppose that also varies with your endogenous expertise level... From rchang.lists at gmail.com Mon Apr 19 20:34:34 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Tue, 20 Apr 2010 09:04:34 +0530 Subject: [Beowulf] Building a Beowulf - Noob In-Reply-To: References: <4BCA8762.70806@gmail.com> Message-ID: <4BCD20CA.4070605@gmail.com> Mark Hahn wrote: > the answer depends almost entirely on you: do you want to pay for > support? how much DIY are you comfortable with? bear in mind that > most of us have had "very mixed" experience with support, though > I suppose that also varies with your endogenous expertise level... Hello Mark, I would like to do it myself but not to the limit of pulling off my hair. We are a small group with limited resources. I think I know what you mean by "very mixed" experience. Infact, I have also had my own bitter experience trying to get support for a particular brand of hardware from the so-called experts in the field. Uptime and availability is not a big issue. So, I think I can live with DIY software so long as I can manage. Thanks, Richard. From brice.goglin at gmail.com Mon Apr 19 22:55:57 2010 From: brice.goglin at gmail.com (Brice Goglin) Date: Tue, 20 Apr 2010 07:55:57 +0200 Subject: [Beowulf] Hardware locality (hwloc) v1.0rc1 released Message-ID: <4BCD41ED.8070808@gmail.com> The Hardware Locality (hwloc) team is pleased to announce the first release candidate for v1.0: http://www.open-mpi.org/projects/hwloc/ (mirrors will update shortly) hwloc provides command line tools and a C API to obtain the hierarchical map of key computing elements, such as: NUMA memory nodes, shared caches, processor sockets, processor cores, and processor "threads". hwloc also gathers various attributes such as cache and memory information, and is portable across a variety of different operating systems and platforms. v1.0rc1 is the first milestone of a major feature release. Many features and changes have been added since the v0.9 series. Although v1.0rc1 is only a prerelease, we felt it important to announce the first in the series in order to gain feedback and widespread testing before v1.0 goes final. Please try hwloc out on your system, read its improved documentation, and send us your feedback. The following is a summary of the changes since the v0.9 series (this list may change before v1.0 goes final): * The ABI of the library has changed. * Backend updates + Add FreeBSD support. + Add x86 cpuid based backend. + Add Linux cgroup support to the Linux cpuset code. + Support binding of entire multithreaded process on Linux. + Cleanup XML export/import. * Objects + HWLOC_OBJ_PROC is renamed into HWLOC_OBJ_PU for "Processing Unit", its stringified type name is now "PU". 
+ Use new HWLOC_OBJ_GROUP objects instead of MISC when grouping objects according to NUMA distances or arbitrary OS aggregation. + Rework memory attributes. + Add different cpusets in each object to specify processors that are offline, unavailable, ... + Cleanup the storage of object names and DMI infos. * Features + Add support for looking up specific PID topology information. + Add hwloc_topology_export_xml() to export the topology in a XML file. + Add hwloc_topology_get_support() to retrieve the supported features for the current topology context. + Support non-SYSTEM object as the root of the tree, use MACHINE in most common cases. + Add hwloc_get_*cpubind() routines to retrieve the current binding of processes and threads. * API + Add HWLOC_API_VERSION to help detect the currently used API version. + Add missing ending "e" to *compare* functions. + Add several routines to emulate PLPA functions. + Rename and rework the cpuset and/or/xor/not/clear operators to output their result in a dedicated argument instead of modifying one input. + Deprecate hwloc_obj_snprintf() in favor of hwloc_obj_type/attr_snprintf(). + Clarify the use of parent and ancestor in the API, do not use father. + Replace hwloc_get_system_obj() with hwloc_get_root_obj(). + Return -1 instead of HWLOC_OBJ_TYPE_MAX in the API since the latter isn't public. + Relax constraints in hwloc_obj_type_of_string(). + Improve displaying of memory sizes. + Add 0x prefix to cpuset strings. * Tools + lstopo now displays logical indexes by default, use --physical to revert back to OS/physical indexes. + Add colors in the lstopo graphical outputs to distinguish between online, offline, reserved, ... objects. + Extend lstopo to show cpusets, filter objects by type, ... + Renamed hwloc-mask into hwloc-calc which supports many new options. * Documentation + Add a hwloc(7) manpage containing general information. + Add documentation about how to switch from PLPA to hwloc. + Cleanup the distributed documentation files. * Miscellaneous + Many compilers warning fixes. + Cleanup the ABI by using the visibility attribute. + Add project embedding support. -- Brice Goglin From henning.fehrmann at aei.mpg.de Tue Apr 20 12:36:10 2010 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Tue, 20 Apr 2010 21:36:10 +0200 Subject: [Beowulf] NFS share - IO rate Message-ID: <20100420193610.GA22507@gretchen.aei.mpg.de> Hello, We purchased a disk based cache system. The content of the cache system is NFS exported into our cluster. Currently we'd like to tweak the cache setup to increase the IO performance. There are tools like bonny or iozone which tests locally mounted file systems. First I tried to start iozone on many NFS clients hammering the cache file server. I got results which are actually meaningless if they are not the same on all clients. One scenario is: Client A says I got the IO-rate Ra which is twice as big as the IO-rate of B: Ra = 2 Rb. The test on B took twice as long as on A. The simplest idea is to accumulate all the IO results to get the overall IO capability of the cache server. In this case the total IO rate is Rt = Ra + Rb. This would be the accurate result if the IO rate on B was constant during the entire test. But one can also interpret the result in a different way. Client A was doing its IO test and Client B got no bandwidth left at all. Only after A finished the test, B has been served. This results in a twice as small average rate on B. The total IO-rate in this case is only Rt = Ra. 
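To put made-up numbers on it: suppose each client moves 100 GB. If A reports Ra = 100 MB/s over 1000 s and B reports Rb = 50 MB/s over 2000 s, the first reading credits the server with 150 MB/s, while under the second reading the server never did more than about 100 MB/s - the same 200 GB simply took the full 2000 s.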
These extreme case illustrate the difficulty to interpret the results reliably. It would make more sense to have a tool at hand which starts a test on many clients simultaneously and do a count of the IO operation within a particular time interval. The simultaneity is not a problem here - this can be achieved by scripts. These results can be safely added together to obtain a overall IO-rate. Do you know such a tool or are there other ways to get a picture of a IO capability of cache-file server? Thank you and cheers, Henning From jlb17 at duke.edu Tue Apr 20 12:54:50 2010 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Tue, 20 Apr 2010 15:54:50 -0400 (EDT) Subject: [Beowulf] NFS share - IO rate In-Reply-To: <20100420193610.GA22507@gretchen.aei.mpg.de> References: <20100420193610.GA22507@gretchen.aei.mpg.de> Message-ID: On Tue, 20 Apr 2010 at 9:36pm, Henning Fehrmann wrote > These extreme case illustrate the difficulty to interpret the results > reliably. It would make more sense to have a tool at hand which starts a > test on many clients simultaneously and do a count of the IO operation > within a particular time interval. The simultaneity is not a problem > here - this can be achieved by scripts. These results can be safely > added together to obtain a overall IO-rate. > > Do you know such a tool or are there other ways to get a picture of a IO > capability of cache-file server? Have a look at IOR, which uses MPI to coordinate the clients. http://sourceforge.net/projects/ior-sio/ -- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF From bcostescu at gmail.com Wed Apr 21 02:07:26 2010 From: bcostescu at gmail.com (Bogdan Costescu) Date: Wed, 21 Apr 2010 11:07:26 +0200 Subject: [Beowulf] NFS share - IO rate In-Reply-To: <20100420193610.GA22507@gretchen.aei.mpg.de> References: <20100420193610.GA22507@gretchen.aei.mpg.de> Message-ID: On Tue, Apr 20, 2010 at 9:36 PM, Henning Fehrmann wrote: > Client A says I got the IO-rate Ra which is twice as big as the IO-rate of B: > Ra = 2 Rb. ?The test on B took twice as long as on A. I look at this differently: the overall rate that the server has dealt with is given by the total amount of data transferred in the time taken by the slowest node. So if Ra=Da/Ta and Rb=Db/Tb then I consider Rt=(Da+Db)/max(Ta, Tb) It's a similar view to the one I have about a parallel program: the real time (wallclock) of giving me the solution is what matters, not whatever built-in counters report. And this real time is the time taken by the slowest node (=the one which finished last, I'm not referring to the CPU speed...) > But one can also interpret the result in a different way. > Client A was doing its IO test and Client B got no bandwidth left at all. > Only after A finished the test, B has been served. This results in a twice as small > average rate on B. This shows a different point of view: you mention the average rate on B, I talk about what the server sees. So what are you actually interested in ? Do you have some rate specified by the manufacturer for the server that you want to compare with ? Or do you have some requirement of rate per node ? 
Cheers, Bogdan From henning.fehrmann at aei.mpg.de Wed Apr 21 06:16:04 2010 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Wed, 21 Apr 2010 15:16:04 +0200 Subject: [Beowulf] NFS share - IO rate In-Reply-To: References: <20100420193610.GA22507@gretchen.aei.mpg.de> Message-ID: <20100421131604.GA4458@gretchen.aei.mpg.de> Hi Bogdan, On Wed, Apr 21, 2010 at 11:07:26AM +0200, Bogdan Costescu wrote: > On Tue, Apr 20, 2010 at 9:36 PM, Henning Fehrmann > wrote: > > Client A says I got the IO-rate Ra which is twice as big as the IO-rate of B: > > Ra = 2 Rb. ?The test on B took twice as long as on A. > > I look at this differently: the overall rate that the server has dealt > with is given by the total amount of data transferred in the time > taken by the slowest node. So if > > Ra=Da/Ta and Rb=Db/Tb then I consider Rt=(Da+Db)/max(Ta, Tb) Yes, the rate of the slowest node times the number of nodes would give a lower bound for the IO rate. > > It's a similar view to the one I have about a parallel program: the > real time (wallclock) of giving me the solution is what matters, not > whatever built-in counters report. And this real time is the time > taken by the slowest node (=the one which finished last, I'm not > referring to the CPU speed...) > > > But one can also interpret the result in a different way. > > Client A was doing its IO test and Client B got no bandwidth left at all. > > Only after A finished the test, B has been served. This results in a twice as small > > average rate on B. > > This shows a different point of view: you mention the average rate on > B, I talk about what the server sees. So what are you actually > interested in ? Do you have some rate specified by the manufacturer > for the server that you want to compare with ? Or do you have some > requirement of rate per node ? The server has a Solaris 10 and a SAM/QFS on it and there are tools to measure the IO rate. Currently, I can't say how reliable these tools are. Some of them measure the IO rate on the discs which makes no sense in a RAID set. Doing these tests on clients might give a better picture of the usability of the cache system. Measuring the performance on the server wouldn't also take into account the buffering of the VFS or NFS on the client side. Additionally, more important than the streaming is the IO rate doing random seeks. In the bidding process we specified the read and write rate doing random seeks on the server, induced and seen by many clients in parallel. Thank you and cheers, Henning From jeff.johnson at aeoncomputing.com Tue Apr 20 12:57:29 2010 From: jeff.johnson at aeoncomputing.com (Jeff Johnson) Date: Tue, 20 Apr 2010 12:57:29 -0700 Subject: [Beowulf] NFS share - IO rate In-Reply-To: <20100420193610.GA22507@gretchen.aei.mpg.de> References: <20100420193610.GA22507@gretchen.aei.mpg.de> Message-ID: <4BCE0729.3050108@aeoncomputing.com> On 4/20/10 12:36 PM, Henning Fehrmann wrote: > Hello, > > We purchased a disk based cache system. The content of the cache system > is NFS exported into our cluster. Currently we'd like to tweak the cache setup to > increase the IO performance. There are tools like bonny or iozone which tests > locally mounted file systems. > > First I tried to start iozone on many NFS clients hammering the cache file server. > I got results which are actually meaningless if they are not the same on all clients. > > [...snip...] > > Do you know such a tool or are there other ways to get a picture of a IO capability > of cache-file server? 
> > Thank you and cheers, > Henning > Henning, Did you run your iozone tests using the multi-client throughput mode? (-+m and -+t options?) If not, you can set a shell env variable 'RSH=ssh', create a text file listing your desired clients and then run iozone in clustered mode. The master iozone process will collect the results for each operation and provide performance as seen by the clients. --Jeff -- ------------------------------ Jeff Johnson Manager Aeon Computing jeff.johnson at aeoncomputing.com www.aeoncomputing.com t: 858-412-3810 f: 858-412-3845 m: 619-204-9061 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117 From jigarhalani at gmail.com Mon Apr 19 09:15:25 2010 From: jigarhalani at gmail.com (jigar halani) Date: Mon, 19 Apr 2010 21:45:25 +0530 Subject: [Beowulf] HiPC 2010 Call for Workshop Proposals Message-ID: 17th IEEE International Conference on High Performance Computing (HiPC2010) December 19-22, 2010 Goa,INDIA The goal of the workshops held in conjunction with the HiPC conference is to broaden the technical scope of the conference in emerging areas of high performance computing and communication, and their applications. The HiPC workshops serves as an extended forum to present and discuss work-in-progress as well as mature research among researchers from around the world, and also highlight research activities in India. Since its inception in 2002, HiPC workshops has been very successful and has grown significantly. This year, the workshops will be held on Dec. 19, 2010, the day before the start of the main HiPC conference. Workshop proposals are solicited from interested researchers (both from academia and industry) in areas relating to high performance computing and networking, or any relevant emerging themes. Please send your proposals to the HiPC Workshops Chair, Manimaran Govindarasu (gmaini AT iastate DOT edu). In your proposal, provide the following information: workshop title, organizers information, workshop theme and topical areas, potential TPC, indicate half-day or a full-day event. Due date for workshop proposals: May 15, 2010 Notification of acceptance/reject decision: June 1, 2010 Workshops: Dec. 19, 2010 More info: http://www.hipc.org/hipc2010/workshops.php -- Jigar Halani Publicity co-chair - HiPC www.hipc.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From trainor at divination.biz Tue Apr 20 08:34:29 2010 From: trainor at divination.biz (Douglas J. Trainor) Date: Tue, 20 Apr 2010 11:34:29 -0400 Subject: [Beowulf] Forward: [SIAM-CSE] 2010 Guest Student Programme on Scientific Computing Message-ID: <890BE099-4FE2-4004-A652-E034A80B8DE5@divination.biz> From: Johannes Grotendorst Subject: [SIAM-CSE] 2010 Guest Student Programme on Scientific Computing Date: April 20, 2010 4:33:48 AM EDT To: SIAM-CSE at siam.org 2010 Guest Student Programme on Scientific Computing Where: Juelich Supercomputing Centre (JSC), Forschungszentrum Juelich When: 2 August to 8 October 2010 In order to give students the opportunity to familiarize themselves with various aspects of scientific computing as early as possible, the Juelich Supercomputing Centre (JSC) is once again organizing a programme for guest students in the 2010 summer vacation. The programme targets students of science and engineering, informatics and mathematics who have already completed their first degree but have not yet finished their master?s course. The students will work together with scientists from JSC on topics of current interest in research and development. 
Depending on their previous experience and interests, they will be involved in various fields of work, for example: Computational Science, Applied Mathematics Modelling and simulation in physics, chemistry and biophysics Techniques of parallel molecular dynamics simulation Efficient methods for long-range interactions Parallel computational procedures in quantum chemistry and structural mechanics Performance evaluation of parallel algorithms in linear algebra Mathematical modelling and statistics, data mining High-Performance Computing, Visualisation Performance analysis and optimization of parallel programs Programming of hierarchical parallel computer systems Distributed applications, interactive control and visualisation Virtual reality techniques for visualising scientific data Computer Architectures, Grid Computing Grid computing, uniform and secure access to IT resources Cluster operating systems Interconnection networks in clusters Data management High-speed data networks Network management The programme will run for ten weeks from 2 August to 8 October 2010. The students will be able to use the supercomputers at JSC, including JUGENE ? currently the fastest computer in Europe. They should naturally be familiar with computer-oriented areas of their subjects. In addition, they should also have practical computer experience including at least a good knowledge of programming with C, C++ or Fortran on Unix systems. More information:http://www.fz-juelich.de/jsc/gsp Deadline: 30 April 2010 Contact: Robert Speck Juelich Supercomputing Centre Forschungszentrum Juelich 52425 Juelich (Germany) Tel.: +49-2461-61-8715, Fax: +49-2461-61-6656 E-Mail:jsc-gsp at fz-juelich.de ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ _______________________________________________ SIAM-CSE mailing list To post messages to the list please send them to: SIAM-CSE at siam.org http://lists.siam.org/mailman/listinfo/siam-cse -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcostescu at gmail.com Thu Apr 22 06:32:28 2010 From: bcostescu at gmail.com (Bogdan Costescu) Date: Thu, 22 Apr 2010 15:32:28 +0200 Subject: [Beowulf] NFS share - IO rate In-Reply-To: <20100421131604.GA4458@gretchen.aei.mpg.de> References: <20100420193610.GA22507@gretchen.aei.mpg.de> <20100421131604.GA4458@gretchen.aei.mpg.de> Message-ID: On Wed, Apr 21, 2010 at 3:16 PM, Henning Fehrmann wrote: > Doing these tests on clients might give a better picture of the > usability of the cache system. > Measuring the performance on the server wouldn't also take into account the > buffering of the VFS or NFS on the client side. >From what I understand, allowing clients to cache would actually reduce the rates that the server has to deal with. 
If you just want to measure the server's response, you could mount the NFS shares with the "sync" flag, so that no caching occurs on the clients. Another idea: you could measure the network rates on the server; this would still include the client caching effects - however this would not be very easy to translate into IOP/s. > In the bidding process we specified the read and write rate doing random seeks on the server, > induced and seen by many clients in parallel. OK :-) This still doesn't mention where there measurement takes place... Cheers, Bogdan From mathog at caltech.edu Thu Apr 22 12:11:06 2010 From: mathog at caltech.edu (David Mathog) Date: Thu, 22 Apr 2010 12:11:06 -0700 Subject: [Beowulf] Choosing pxelinux.cfg DEFAULT via dhcpd.conf? Message-ID: Is there a way to set dhcpd.conf so that it changes which pxelinux.cfg entry (LABEL) starts on a network boot? I think something like this can be done with the option pxelinux.magic combined with pxelinux.config file or pxelinux.pathprefix to specify a 2nd (or 3rd...) pxelinux.cfg file, each of which has a different DEFAULT, but I don't see how to set the dhcpd.conf entry for a machine to get pxelinux to effectively change the default value for a single pxelinux.cfg file. The idea being, in a heterogenous cluster, to have dhcpd.conf set up so that type A nodes boot one thing, and type B nodes boot another, without any manual intervention required. The pxelinux.cfg has several different LABEL entries one of which is DEFAULT. If dhcpd.conf has filename "/pxelinux.0" set then when the remote node boots it network boots the DEFAULT entry from pxelinux.cfg. From the keyboard one can choose an entry other than DEFAULT to start. I know how to pass variables to whatever comes up, with parameters like this on the relevant line in dhcpd.conf: option option-200 "information passed" but I don't see how to make the node boot, for instance, the 3rd option on the pxelinux.cfg list, using just the one file, other than by editing that file changing DEFAULT, and reloading the dhcpd daemon. Thanks, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From prentice at ias.edu Thu Apr 22 12:40:08 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 22 Apr 2010 15:40:08 -0400 Subject: [Beowulf] Choosing pxelinux.cfg DEFAULT via dhcpd.conf? In-Reply-To: References: Message-ID: <4BD0A618.9010505@ias.edu> In your pxelinux.cfg directory, instead of default, you would need to create a file whose name is the MAC address of the client, or it's IP address in hexadecimal. See http://syslinux.zytor.com/wiki/index.php/PXELINUX * First, it will search for the config file using the hardware type (using its ARP type code) and address, all in lower case hexadecimal with dash separators; for example, for an Ethernet (ARP type 1) with address 88:99:AA:BB:CC:DD it would search for the filename 01-88-99-aa-bb-cc-dd. * Next, it will search for the config file using its own IP address in upper case hexadecimal, e.g. 192.0.2.91 -> C000025B (you can use the included progam gethostip to compute the hexadecimal IP address for any host). If that file is not found, it will remove one hex digit and try again. Ultimately, it will try looking for a file named default (in lower case). 
As an example, if the boot file name is /mybootdir/pxelinux.0, the Ethernet MAC address is `88:99:AA:BB:CC:DD` and the IP address 192.0.2.91, it will try following files (in that order): " These files don't have to be real files, they could symlinks so you could create /tftp/pxelinux.cfg/newservers /tftp/pxelinux.cfg/oldservers and then create these symlinks /tftp/pxelinux.cfg/88-99-AA-BB-CC-D1 -> newservers /tftp/pxelinux.cfg/88-99-AA-BB-CC-D2 -> newservers /tftp/pxelinux.cfg/88-99-AA-BB-CC-D3 -> oldservers /tftp/pxelinux.cfg/88-99-AA-BB-CC-D4 -> oldservers Using symlinks in this way will reduce how many different config files you'll actually need. I'm pretty sure this will accomplish exactly what you desire. If you were looking for a conditional you could use inside your PXE config file, I don't think such a feature exists. -- Prentice David Mathog wrote: > Is there a way to set dhcpd.conf so that it changes which pxelinux.cfg > entry (LABEL) starts on a network boot? I think something like this can > be done with the option pxelinux.magic combined with pxelinux.config > file or pxelinux.pathprefix to specify a 2nd (or 3rd...) pxelinux.cfg > file, each of which has a different DEFAULT, but I don't see how to set > the dhcpd.conf entry for a machine to get pxelinux to effectively change > the default value for a single pxelinux.cfg file. The idea being, in a > heterogenous cluster, to have dhcpd.conf set up so that type A nodes > boot one thing, and type B nodes boot another, without any manual > intervention required. > > The pxelinux.cfg has several different LABEL entries > one of which is DEFAULT. If dhcpd.conf has filename "/pxelinux.0" set > then when the remote node boots it network boots the DEFAULT entry from > pxelinux.cfg. From the keyboard one can choose an entry other than > DEFAULT to start. I know how to pass variables to whatever comes up, > with parameters like this on the relevant line in dhcpd.conf: > > option option-200 "information passed" > > but I don't see how to make the node boot, for instance, > the 3rd option on the pxelinux.cfg list, using just the one file, > other than by editing that file changing DEFAULT, and reloading the > dhcpd daemon. > > Thanks, > > David Mathog > mathog at caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Prentice Bisbal Linux Software Support Specialist/System Administrator School of Natural Sciences Institute for Advanced Study Princeton, NJ From henning.fehrmann at aei.mpg.de Fri Apr 23 00:08:10 2010 From: henning.fehrmann at aei.mpg.de (Henning Fehrmann) Date: Fri, 23 Apr 2010 09:08:10 +0200 Subject: [Beowulf] NFS share - IO rate In-Reply-To: References: <20100420193610.GA22507@gretchen.aei.mpg.de> <20100421131604.GA4458@gretchen.aei.mpg.de> Message-ID: <20100423070810.GA7821@gretchen.aei.mpg.de> On Thu, Apr 22, 2010 at 03:32:28PM +0200, Bogdan Costescu wrote: > On Wed, Apr 21, 2010 at 3:16 PM, Henning Fehrmann > wrote: > > Doing these tests on clients might give a better picture of the > > usability of the cache system. > > Measuring the performance on the server wouldn't also take into account the > > buffering of the VFS or NFS on the client side. 
> > From what I understand, allowing clients to cache would actually > reduce the rates that the server has to deal with. If you just want to > measure the server's response, you could mount the NFS shares with the > "sync" flag, so that no caching occurs on the clients. This is correct. Actually, with the async option one gets also a good overview about buffer sizes. The IO performance drops with increasing file size. > Another idea: you could measure the network rates on the server; this > would still include the client caching effects - however this would > not be very easy to translate into IOP/s. > > > In the bidding process we specified the read and write rate doing random seeks on the server, > > induced and seen by many clients in parallel. > > OK :-) This still doesn't mention where there measurement takes place... > We really want to measure the IO on the client side, since this is what matters for the applications. We assume that many clients do unpredictable seeks in files on the NFS-share for an unknown long run time. This is what we try to simulate. Of course we try to measure consistent results also on the server side. I wrote a little program which is doing random seeks, writes or reads one byte and does a fsync. It starts at a particular system time, runs for a well defined time and counts the IO. In this way I hope to synchronize the tests on the clients, similar as it would be done using MPI synchronized IO tests. The fsync in fact should prevent the usage of the buffer on the client side. We'll see. Cheers, Henning From cbergstrom at pathscale.com Mon Apr 26 10:58:12 2010 From: cbergstrom at pathscale.com (=?ISO-8859-1?Q?=22C=2E_Bergstr=F6m=22?=) Date: Tue, 27 Apr 2010 00:58:12 +0700 Subject: [Beowulf] Free Fermi cards to interested developers and researchers Message-ID: <4BD5D434.3040808@pathscale.com> Hi all PathScale is giving away a limited amount of Nvidia Fermi cards to qualified open source developers and researchers. We are mainly focused on our optimized gpu compiler and the HPC market, but also open to sponsoring creative ideas or projects surrounding the gpu. Here's a short list of some of the areas most interesting to us * Nouveau/kernel drivers * CUDA * OpenCL * HMPP * Parallel programming * MPI * Shader compilers If you're interested please contact me offlist with a brief description of your background, the contributions you made to open source and what you'd intend to do with the card. Thanks Christopher #pathscale - irc.freenode.net From eagles051387 at gmail.com Mon Apr 26 22:09:30 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Tue, 27 Apr 2010 07:09:30 +0200 Subject: [Beowulf] Free Fermi cards to interested developers and researchers In-Reply-To: <4BD5D434.3040808@pathscale.com> References: <4BD5D434.3040808@pathscale.com> Message-ID: would someone who uses boinc and is part of alot of these shared computing projects that take advantage of gpu process qualify for one? 2010/4/26 "C. Bergstr?m" > Hi all > > PathScale is giving away a limited amount of Nvidia Fermi cards to > qualified open source developers and researchers. We are mainly focused > on our optimized gpu compiler and the HPC market, but also open to > sponsoring creative ideas or projects surrounding the gpu. 
> > Here's a short list of some of the areas most interesting to us > * Nouveau/kernel drivers > * CUDA > * OpenCL > * HMPP > * Parallel programming > * MPI > * Shader compilers > > If you're interested please contact me offlist with a brief description > of your background, the contributions you made to open source and what > you'd intend to do with the card. > > > Thanks > > Christopher > > #pathscale - irc.freenode.net > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From schnorr at gmail.com Fri Apr 30 01:55:38 2010 From: schnorr at gmail.com (Lucas Schnorr) Date: Fri, 30 Apr 2010 10:55:38 +0200 Subject: [Beowulf] CLCAR 2010 (Gramado, Brazil): Call For Papers Message-ID: Our apologies if you receive multiple copies of this message. Please distribute this CFP to those who might be interested. *==========================================================* * CALL FOR PAPERS ** CLCAR 2010* Conferencia Latino Americana de Computaci?n de Alto Rendimiento Confer?ncia Latino-Americana de Computa??oo de Alto Desempenho Latin-American Conference on High Performance Computing * August 25-28, 2010** Hotel Serra Azul, Gramado, RS, Brazil http://gppd.inf.ufrgs.br/clcar2010* *==========================================================* The Latin?American Conference on High Performance Computing (CLCAR) is an event for students, teachers and researchers in the areas of High Performance Computing, High Throughput Computing, Parallel and Distributed Systems, E?science and applications. Its first edition has taken place in Colombia in 2007, and since then CLCAR has gathered scientists from the whole world, with special attention to Latin American research. The community is welcome to submit research papers, written in English. The papers should be submitted in PDF following the IEEE guidelines (see the styleguides on the Web site for more de? tails). Two kinds of submission are accepted: Full papers, which must not exceed 8 double column pages, and short papers, not ex? ceeding 3 pages. The selected full papers will be presented at the conference in one of the three official languages (English, Portugues and Span? ish), with slides in English. The selected short papers will be presented at the conference as posters. Selected full papers will be considered for publication in a special issue of an indexed journal. *Important Dates ===============* * Submissions: * - Deadline for Full Paper Submission: June 1 2010* - Deadline for Short Papers: June 15 2010 - Notification of Acceptance: July 1 2010 * * Conference:* * - August 25-28 2010* *CLCAR 2010 Co-chairs ====================* - Philippe Olivier Alexandre Navaux, UFRGS, Brazil - Luis Nu?ez, Universidad de Los Andes, Venezuela - Michel Riveill, Universit? de Nice Sophia Antipolis, France * Program Committee =================* - Adenauer Correa Yamin, Universidade Federal de Pelotas, Brazil - Afonso Sales, Pontif?cia Univerisadade Cat?lica, Porto Alegre, Brazil - Alberto Ferreira De Souza, Universidade Federal do Esp?rito Santo, Brazil - Ben Segal, CERN, Swiss - Bruno Schulze, National Laboratory for Scientific Computing ? 
LNCC, Brazil - Carlos Jaime Barrios Hern?ndez, LIG-Mescal, France - Carlos Prada, LIG-Mescal, France - Carlos Varela, RPI, USA - Claudia Roncancio, INPG, France - Claudio L. de Amorim, Universidade Federal do Rio de Janeiro, Brazil - Claudio Mendoza, Instituto Venezolano de Investigaciones Cientificas, Venezuela - Dino Lopez, Universit? de Nice-Sophia Antipolis, France - Edson Norberto C?ceres, Universidade Federal de Mato Grosso do Sul, Brazil - Eduardo Carrillo, Universidad Autonoma de Bucaramaga, Colombia - Fadi Khalil, LAAS, France - Gabriel Pedraza, LIG-Adele, France - Gerson Geraldo Homrich Cavalheiro, Universidade Federal de Pelotas, Brazil - Gilberto Diaz, Universidad de Los Andes, Venezuela - Harold Castro, Universidad de Los Andes, Colombia - Jairo Panetta, Instituto Nacional de Pesquisas Espaciais, Brazil - Jesus Verduzco, Instituto Tecnologico de Colima, Mexico - Jonathan Pecero, IPL, Luxembourg - Jorge Chacon, Universidad Industrial de Santander, Colombia - Leonardo Brenner, Universit? de Reims Champagne-Ardenne, France - Liria Matsumoto Sato, Universidade de S?o Paulo, Brazil - Lucas Schnorr, LIG-Mescal, France - Luis Angelo Steffenel, Universit? de Reims Champagne-Ardenne, France - Luis Nunez, Universidad de Los Andes, Venezuela - Marcelo Ciappina, Supercomputing Center, Singapur - Maria del Pilar Villamil, Universidad de Los Andes, Colombia. - Michel Riveill, Universit? de Nice Sophia Antipolis, France - Nicolas Maillard, Universidade Federal do Rio Grande do Sul, Brazil - Oscar Gualdron Gonzalez, Universidad Industrial de Santander, Colombia - Pablo Guillen, Universidad de Los Andes, Venezuela - Pedro Velho, INRIA Rh?ne Alpes, France - Philippe Olivier Alexandre Navaux, UFRGS, Brazil - Siang Wun Song, Universidade de S?o Paulo, Brazil - Steffano Cozzini, Democritos-ICTP, Italy - Wagner Meira Junior, Universidade Federal de Minas Gerais, Brazil - Walfredo Cirne, Universidade Federal de Campina Grande, Brazil; Google, USA - Yiannis Georgieu, Bull, France - Yves Denneulin, LIG-Mescal, France *Organizing Committee ====================* - Adenauer Correa Yamin, Universidade Federal de Pelotas, Brazil - Carlos Jaime Barrios Hern?ndez, LIG-Mescal, France - Gerson G. H. Cavalheiro, Universidade Federal de Pelotas, Brazil - Juan Carlos Escobar, Universidad Industrial de Santander, Colombia - Lourdes Tassinari, Universidade Federal do Rio Grande do Sul, Brazil - Nicolas Maillard, Universidade Federal do Rio Grande do Sul, Brazil - Rafael Keller Tesser, Universidade Federal do Rio Grande do Sul, Brazil - Philippe O. A. Navaux, Universidade Federal do Rio Grande do Sul, Brazil *Topics of Interest ==================* The topics of interest for CLCAR include, but are not limited to: 1. Platforms and Infrastructures - Scalable Architectures - Parallel and Distributed Architectures - Cluster Computing - Grid Computing - Testbed Grid - Lightweight Grids - Desktop Grid Computing - Production Grids - High Performance Networks to Science and Technology - Cloud Computing - P2P - Embedded High Performance Computing and Systems - Virtualization in Scalable and Pervasive Computing - HPC Green Computing - Collaborative Computing 2. e-Science and Applications - Applications for science and technology - Industrial and business applications - Health Applications and Solutions - Learning and Educational applications - Grid Services - Program Languages and Paradigms - Simulation - Modelling 3. 
Special Topics - Performance Evaluation - Fault Tolerance - Semantic Grid - Game Theory - Optimization - Security - Economy Grid - Green Computing *More information in Spanish, Portuguese or English: http://gppd.inf.ufrgs.br/clcar2010* cjbarrioshernandez at acm.org nicolas at inf.ufrgs.br navaux at inf.ufrgs.br -------------- next part -------------- An HTML attachment was scrubbed... URL: