From kus at free.net Mon Sep 1 10:34:53 2008
From: kus at free.net (Mikhail Kuzminsky)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] gpgpu
In-Reply-To:
Message-ID:

I made a very rough estimate of the possible performance improvement for "dgemm on FireStream 9250". It is an extremely favourable example for GPGPU.

The source data for the 9250: peak DP performance 200 GFLOPS, GDDR3 RAM 1 Gbyte.

1 Gbyte can hold 3 DP (64-bit) matrices (n x n) for n=6000 - they require 864 Mbytes.
Let me suppose that the real performance of the FireStream will be 90% of the peak value (I'm afraid that reality will be worse), i.e. 180 GFLOPS.

dgemm requires 2*n^3 FP operations (I neglect the n^2 operations for matrix addition and scaling), i.e. 432 GFLOP.
The calculation time will be 432/180 = 2.4 sec.

The dgemm calculation also needs 4 matrix transfers: 3 to the GPGPU, 1 from the GPGPU to the main memory of the server.
That is 1152 Mbytes of data.

For PCI-e x16 v.2 the peak throughput is 8 GB/s, therefore the transfer time will be about 0.144 sec (I don't know what the real throughput of PCIe may be).

The total calculation time is therefore about 2.54 sec.

On a dual-socket quad-core Xeon server w/3 GHz E5472 (8 cores) the peak performance is 96 GFLOPS. Parallelized dgemm will give, I believe, about 80% of peak, i.e. 77 GFLOPS; therefore the calculation time is 432/77 = 5.6 sec.

Speedup is 2.2 times. Price increase - I don't know; for example from $4500 to $6500 (if the FireStream costs $2000, but it may be $1000 as Igor Kozin wrote here), it's about 1.4 times.

But I think there will not be many jobs which require matrix multiplication for *dense* matrices of such large (6000 x 6000) sizes; for sparse matrices the dimensions, I believe, will be lower.

Mikhail

From libo at buaa.edu.cn Mon Sep 1 17:43:16 2008
From: libo at buaa.edu.cn (Li, Bo)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] gpgpu
References:
Message-ID: <000f01c90c94$e84e2ba0$6300a8c0@LIBO>

Hello,

It seems that you have found a very good example for GPGPU.
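
[Editorial aside] The arithmetic in Mikhail Kuzminsky's estimate above can be re-checked with a few lines of Python. This is only a sketch of the same back-of-envelope model, not a benchmark: the function name `dgemm_estimate` and its parameter names are made up for illustration, and the efficiency factors (90% of peak on the GPU, 80% on the CPU) are the guesses from the post, not measured values.

```python
def dgemm_estimate(n=6000,
                   gpu_peak_gflops=200.0, gpu_efficiency=0.90,
                   pcie_gb_per_s=8.0,
                   cpu_peak_gflops=96.0, cpu_efficiency=0.80):
    """Back-of-envelope dgemm timing: GPU (plus PCIe transfers) vs. CPU.

    All defaults are the assumed figures from the post, not measurements.
    """
    bytes_per_matrix = 8 * n * n            # double precision: 8 bytes/element
    gflop = 2 * n**3 / 1e9                  # 2*n^3 flops; n^2 terms neglected

    # Time to compute on the GPU at the assumed fraction of peak.
    gpu_compute_s = gflop / (gpu_peak_gflops * gpu_efficiency)
    # Four matrix transfers over PCIe: 3 to the card, 1 back to the host.
    transfer_bytes = 4 * bytes_per_matrix
    transfer_s = transfer_bytes / (pcie_gb_per_s * 1e9)
    gpu_total_s = gpu_compute_s + transfer_s

    # Time on the host CPUs at the assumed fraction of peak.
    cpu_total_s = gflop / (cpu_peak_gflops * cpu_efficiency)

    return {
        "resident_mbytes": 3 * bytes_per_matrix / 1e6,   # A, B, C on the card
        "transfer_mbytes": transfer_bytes / 1e6,
        "gflop": gflop,
        "gpu_compute_s": gpu_compute_s,
        "transfer_s": transfer_s,
        "gpu_total_s": gpu_total_s,
        "cpu_total_s": cpu_total_s,
        "speedup": cpu_total_s / gpu_total_s,
    }


if __name__ == "__main__":
    for key, value in dgemm_estimate().items():
        print(f"{key:>16}: {value:.3f}")
```

With the default arguments this gives the figures quoted in the post: 864 Mbytes resident on the card, 1152 Mbytes moved over PCIe, about 2.54 s on the GPU against about 5.6 s on the CPUs, a speedup near 2.2. Lowering `pcie_gb_per_s` or `gpu_efficiency` shows how quickly that advantage shrinks if the real numbers fall short of peak.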
As I said before, this is not yet the time for GPGPU to do DP calculations. If you can live with SP computation, you will get more out of it. NVidia just sent me a special offer for their Tesla platforms, which said that a workstation equipped with two GTX280-level professional cards costs about $5000, not bad. But my intention is still to lower the core frequency of a gaming card and use it for computation.

Regards,
Li, Bo

----- Original Message -----
From: "Mikhail Kuzminsky"
To: "Kozin, I (Igor)"
Cc:
Sent: Tuesday, September 02, 2008 1:34 AM
Subject: Re: [Beowulf] gpgpu

> I made a very rough estimate of the possible performance
> improvement for "dgemm on FireStream 9250".
> It is an extremely favourable example for GPGPU.
>
> The source data for the 9250: peak DP performance 200 GFLOPS, GDDR3 RAM 1
> Gbyte.
>
> 1 Gbyte can hold 3 DP (64-bit) matrices (n x n) for n=6000 - they
> require 864 Mbytes.
> Let me suppose that the real performance of the FireStream will be 90% of
> the peak value (I'm afraid that reality will be worse), i.e. 180 GFLOPS.
>
> dgemm requires 2*n^3 FP operations (I neglect the n^2 operations for
> matrix addition and scaling), i.e. 432 GFLOP.
> The calculation time will be 432/180 = 2.4 sec.
>
> The dgemm calculation also needs 4 matrix transfers: 3 to the
> GPGPU, 1 from the GPGPU to the main memory of the server.
> That is 1152 Mbytes of data.
>
> For PCI-e x16 v.2 the peak throughput is 8 GB/s, therefore the
> transfer time will be about 0.144 sec (I don't know what the
> real throughput of PCIe may be).
>
> The total calculation time is therefore about 2.54 sec.
>
> On a dual-socket quad-core Xeon server w/3 GHz E5472 (8 cores) the peak
> performance is 96 GFLOPS. Parallelized dgemm will give, I believe,
> about 80% of peak, i.e. 77 GFLOPS; therefore the calculation time is
> 432/77 = 5.6 sec.
>
> Speedup is 2.2 times.
> Price increase - I don't know; for example from
> $4500 to $6500 (if the FireStream costs $2000, but it may be $1000 as Igor
> Kozin wrote here), it's about 1.4 times.
>
> But I think there will not be many jobs which require matrix
> multiplication for *dense* matrices of such large (6000 x 6000) sizes;
> for sparse matrices the dimensions, I believe, will be lower.
>
> Mikhail
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From matt at technoronin.com Mon Sep 1 18:44:53 2008
From: matt at technoronin.com (Matt Lawrence)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] gpgpu
In-Reply-To: <000f01c90c94$e84e2ba0$6300a8c0@LIBO>
References: <000f01c90c94$e84e2ba0$6300a8c0@LIBO>
Message-ID:

On Tue, 2 Sep 2008, Li, Bo wrote:

> It seems that you have found a very good example for GPGPU. As I said
> before, this is not yet the time for GPGPU to do DP calculations.
> If you can live with SP computation, you will get more out of it.
> NVidia just sent me a special offer for their Tesla platforms,
> which said that a workstation equipped with two GTX280-level
> professional cards costs about $5000, not bad. But my intention is still
> to lower the core frequency of a gaming card and use it for
> computation.

Are those the chips that overheat and pull loose from the carrier?

-- Matt
It's not what I know that counts.
It's what I can remember in time to use.

From libo at buaa.edu.cn Mon Sep 1 19:14:08 2008
From: libo at buaa.edu.cn (Li, Bo)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] gpgpu
References: <000f01c90c94$e84e2ba0$6300a8c0@LIBO>
Message-ID: <001c01c90ca1$99966c90$6300a8c0@LIBO>

Hello,

Not at all. I lowered the frequency for stability; it actually works fine at the default frequency, but I don't want to take any risks.
Regards,
Li, Bo

----- Original Message -----
From: "Matt Lawrence"
To: "who's afraid of"
Sent: Tuesday, September 02, 2008 9:44 AM
Subject: Re: [Beowulf] gpgpu

> On Tue, 2 Sep 2008, Li, Bo wrote:
>
>> It seems that you have found a very good example for GPGPU. As I said
>> before, this is not yet the time for GPGPU to do DP calculations.
>> If you can live with SP computation, you will get more out of it.
>> NVidia just sent me a special offer for their Tesla platforms,
>> which said that a workstation equipped with two GTX280-level
>> professional cards costs about $5000, not bad. But my intention is still
>> to lower the core frequency of a gaming card and use it for
>> computation.
>
> Are those the chips that overheat and pull loose from the carrier?
>
> -- Matt
> It's not what I know that counts.
> It's what I can remember in time to use.

From libo at buaa.edu.cn Mon Sep 1 19:19:52 2008
From: libo at buaa.edu.cn (Li, Bo)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] gpgpu
References: <000f01c90c94$e84e2ba0$6300a8c0@LIBO>
Message-ID: <001f01c90ca2$66d9e100$6300a8c0@LIBO>

A gaming card is not supposed to have the same stability at the default frequency, but with a 10x price difference it is still a very good choice. A two-card system cost us only $1,000 and provides about 1.6 TFlops of SP capability.

Regards,
Li, Bo

----- Original Message -----
From: "Matt Lawrence"
To: "who's afraid of"
Sent: Tuesday, September 02, 2008 9:44 AM
Subject: Re: [Beowulf] gpgpu

> On Tue, 2 Sep 2008, Li, Bo wrote:
>
>> It seems that you have found a very good example for GPGPU. As I said
>> before, this is not yet the time for GPGPU to do DP calculations.
>> If you can live with SP computation, you will get more out of it.
>> NVidia just sent me a special offer for their Tesla platforms,
>> which said that a workstation equipped with two GTX280-level
>> professional cards costs about $5000, not bad. But my intention is still
>> to lower the core frequency of a gaming card and use it for
>> computation.
>
> Are those the chips that overheat and pull loose from the carrier?
>
> -- Matt
> It's not what I know that counts.
> It's what I can remember in time to use.

From gerry.creager at tamu.edu Mon Sep 1 21:38:52 2008
From: gerry.creager at tamu.edu (Gerry Creager)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] Can one Infiniband net support MPI and a parallel filesystem?
In-Reply-To: <48A2F5D7.9080406@noaa.gov>
References: <1278943052.86771218524490230.JavaMail.root@mail.vpac.org> <48A1D1D8.6040405@noaa.gov> <48A2C50F.5010305@scalableinformatics.com> <48A2F5D7.9080406@noaa.gov>
Message-ID: <48BCC35C.6010204@tamu.edu>

Craig Tierney wrote:
> Joe Landman wrote:
>> Craig Tierney wrote:
>>> Chris Samuel wrote:
>>>> ----- "I Kozin (Igor)" wrote:
>>>>
>>>>>> Generally speaking, MPI programs will not be fetching/writing data
>>>>>> from/to storage at the same time they are doing MPI calls so there
>>>>>> tends to not be very much contention to worry about at the node
>>>>>> level.
>>>>> I tend to agree with this.
>>>>
>>>> But that assumes you're not sharing a node with other
>>>> jobs that may well be doing I/O.
>>>>
>>>> cheers,
>>>> Chris
>>>
>>> I am wondering, who shares nodes in cluster systems with
>>> MPI codes? We never have shared nodes for codes that need
>>
>> The vast majority of our customers/users do. Limited resources, they
>> have to balance performance against cost and opportunity cost.
>>
>> Sadly not every user has an infinite budget to invest in
>> contention-free hardware (nodes, fabrics, or disks). So they have to maximize
>> the utilization of what they have, while (hopefully) not trashing the
>> efficiency too badly.
>>
>>> multiple cores since we built our first SMP cluster
>>> in 2001. The contention for shared resources (like memory
>>> bandwidth and disk IO) would lead to unpredictable code performance.
>>
>> Yes it does. As does OS jitter and other issues.
>>
>>> Also, a poorly behaved program can cause the other codes on
>>> that node to crash (which we don't want).
>>
>> Yes this happens as well, but some users simply have no choice.
>>
>>>
>>> Even at TACC (62000+ cores) with 16 cores per node, nodes
>>> are dedicated to jobs.
>>
>> I think every user would love to run on a TACC-like system. I think
>> most users have a budget for something less than 1/100th the size.
>> It's easy to forget how much resource (un)availability constrains
>> actions when you have very large resources to work with.
>>
>
> TACC probably wasn't a good example for the "rest of us". It hasn't been
> difficult to dedicate nodes to jobs when the number of cores was 2 or 4.
> We now have some 8-core nodes, and we are wondering if the policy of
> not sharing nodes is going to continue, or at least be modified to minimize
> waste.

Last time I asked (recently...) TACC intended to continue scheduling per-node, even with 16 cores/node.

Sorry to be late with this but the hurricane season is getting interesting and e-mail's taken a bit of a hit.
--
Gerry Creager -- gerry.creager@tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843

From lindahl at pbm.com Wed Sep 3 02:04:17 2008
From: lindahl at pbm.com (Greg Lindahl)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To:
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com>
Message-ID: <20080903090417.GA15987@bx9.net>

On Thu, Aug 28, 2008 at 11:54:05AM -0400, Peter St. John wrote:

> I think a physicist programming is like an astronomer grinding lenses (maybe
> nobody does that anymore). Some astronomers (in the old days) ground their
> own lenses and ended up contributing to optics; others never looked through
> telescopes, they do math on the measurements taken by others.

This is the 2nd funniest posting in this thread. Did you notice that ground-based telescopes recently started being much, much bigger? These new lenses were invented and made in Arizona by an astronomer, who figured out how to spin molten glass into roughly the right shape, instead of taking a huge, flat, thick piece of glass and grinding it into the shape of a mirror.

http://www.npr.org/templates/story/story.php?storyId=4773461

Our community does this kind of stuff because it wouldn't happen otherwise.

The funniest posting in this thread was when rgb failed to notice that Perry had compared the difficulty of directing physics research to the difficulty of writing a program. Some computer programs are hard. Most aren't. So it's a dumb comparison.

I don't know what to make of Vincent saying that I sound like an average guy who watches TV. I haven't watched TV much since 1983, but I have spent a lot of time as an astronomy graduate student doing supercomputing, and then working with scientific programmers.

This isn't meant to encourage anyone to continue discussing any of this. I did want to point out how misinformed most of the "discussion" was. That's in addition to being pointless.

Yeah, I'm probably a bit grouchy because my car's parking lights don't turn off anymore after the final dust storm at Burning Man. The owner's manual says it can't happen. Must have been written by a computer scientist :-)

-- greg

From rgb at phy.duke.edu Wed Sep 3 04:20:16 2008
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <20080903090417.GA15987@bx9.net>
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net>
Message-ID:

On Wed, 3 Sep 2008, Greg Lindahl wrote:

> The funniest posting in this thread was when rgb failed to notice that
> Perry had compared the difficulty of directing physics research to the
> difficulty of writing a program. Some computer programs are hard. Most
> aren't. So it's a dumb comparison.

I didn't quite fail to notice;-) I just offered to explain my own research if anybody was interested.
No takers, of course -- which is good as it would take me a LONG time as the science is nontrivial:-)

I was also getting a bit tired of the thread as this particular thesis (that scientists make poor computer programmers and/or must hire programmers in order to do good science using computers) was so absurd that -- after writing out a longish response and just throwing up my hands in disgust and deleting it instead of posting -- I tried to gently bow out.

> I don't know what to make of Vincent saying that I sound like an
> average guy who watches TV. I haven't watched TV much since 1983, but

It just means that Vincent is a narrowly brilliant wacko. Narrowly possibly brilliant -- I never know quite what to make of chess or go masters who never do anything constructive. Clearly requires some serious neurons, but isn't there ANYTHING in the world that they can turn all that grey matter to, to the benefit of humankind?

But you know that.

> I have spent a lot of time as an astronomy graduate student doing
> supercomputing, and then working with scientific programmers.
>
> This isn't meant to encourage anyone to continue discussing any of
> this. I did want to point out how misinformed most of the "discussion"
> was. That's in addition to being pointless.

I still don't think it was originally pointless. People read the list and then go write proposals. Twenty proposals budgeting one grad student and a computer programmer are twenty proposals that won't get funded. So who knows, MAYBE it saved some poor soul's research program.

But probably not -- people aren't that stupid.

> Yeah, I'm probably a bit grouchy because my car's parking lights don't
> turn off anymore after the final dust storm at Burning Man. The
> owner's manual says it can't happen. Must have been written by a
> computer scientist :-)

Or Murphy.
I just like to think of matter as being, y'know, this collection of spinning clouds of "stuff" that is all really soft, ultimately, and fails to hold its shape, structure, form, and purpose a whole lot faster than people realize. The key cylinder in my son's junker jaguar ('92) decided yesterday to ignore the jag's bizarre key for the same reason. I'm sure it will cost me a bunch of money, sigh.

rgb

>
> -- greg
>

--
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977

From rgb at phy.duke.edu Wed Sep 3 05:01:08 2008
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu Nov 20 01:07:41 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To:
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net>
Message-ID:

On Wed, 3 Sep 2008, Robert G. Brown wrote:

>> I don't know what to make of Vincent saying that I sound like an
>> average guy who watches TV. I haven't watched TV much since 1983, but
>
> It just means that Vincent is a narrowly brilliant wacko. Narrowly

Jesus, I shouldn't be allowed near a keyboard before I have my coffee.

Vincent, I apologize. This isn't funny (although somehow, at the time...)
This is clearly uncalled-for ad hominem crap and a product of a mix of pre-coffee crankiness and a profound lack of sleep.

I love chess. I love go. I suck at both of them, and they are very, very hard problems and humans learn a lot from trying to solve them.

If I offended you (and I don't see how I could miss, sorry) I apologize. If I offended Greg, Peter, or anyone else, I apologize again. I think I'll go crawl back under a rock for a while with the rest of the exoskeletal mindless creatures.

rgb

--
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977

From james.p.lux at jpl.nasa.gov Wed Sep 3 06:44:44 2008
From: james.p.lux at jpl.nasa.gov (Lux, James P)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <20080903090417.GA15987@bx9.net>
Message-ID:

On 9/3/08 2:04 AM, "Greg Lindahl" wrote:

> On Thu, Aug 28, 2008 at 11:54:05AM -0400, Peter St. John wrote:
>
>> I think a physicist programming is like an astronomer grinding lenses (maybe
>> nobody does that anymore). Some astronomers (in the old days) ground their
>> own lenses and ended up contributing to optics; others never looked through
>> telescopes, they do math on the measurements taken by others.
>
> This is the 2nd funniest posting in this thread. Did you notice that
> ground-based telescopes recently started being much, much bigger?
> These new lenses were invented and made in Arizona by an astronomer,
> who figured out how to spin molten glass into roughly the right shape,
> instead of taking a huge, flat, thick piece of glass and grinding it
> into the shape of a mirror.
>
> http://www.npr.org/templates/story/story.php?storyId=4773461
>

---- Ahem..
Reflectors, not lenses.

And, actually, the fact that a spinning body of liquid assumes a parabolic shape has been known for centuries (Kepler?), and, in fact, as early as 1850, an astronomer (Ernesto Capocci) proposed and built a telescope using liquid metal (e.g. mercury) for a reflector. He probably wasn't unique, as there are mentions of a Mr. Buchan in notes by Brewster (as in Brewster angle) about the same time. There's a fascinating thesis by Brad Gibson from Univ of Vancouver that gives a dozen or so pages of all the problems faced with liquid metal telescopes (ripples, etc.)

What Dr Angel and the folks in Arizona have done is build an enormous spinning oven and work out the process controls (more of an engineering task than a science one, I might add. Being an engineer, I think these distinctions are important, not that new science isn't being done here). They also still have to do a conventional polishing step, but at least the general figure of the mirror's surface is already close to what it needs to be.

(Interestingly, there's apparently an article about this in Science News back in Feb 1985, which is when the latest work in LMTs got going at Laval)

Jim Lux

From perry at piermont.com Wed Sep 3 07:12:04 2008
From: perry at piermont.com (Perry E.
Metzger)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <20080903090417.GA15987@bx9.net> (Greg Lindahl's message of "Wed\, 3 Sep 2008 02\:04\:17 -0700")
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net>
Message-ID: <878wu9xqrv.fsf@snark.cb.piermont.com>

Greg Lindahl writes:
> The funniest posting in this thread was when rgb failed to notice that
> Perry had compared the difficulty of directing physics research to the
> difficulty of writing a program. Some computer programs are hard. Most
> aren't. So it's a dumb comparison.

If you say so.

Most of the programmers I know go through three stages.

When they're starting out, as they're writing their very first programs, they think writing software is complicated and that they don't know nearly enough.

Then, when they've gotten to the point where they have been doing it a while and are reasonably familiar with a language or two, they think writing software is straightforward. As with the time where new pilots know enough to fly reasonably but don't have a lot of hours, this is when the programmer is the most dangerous to himself and to others.

Finally, if they're really good programmers, after a few years they begin to think writing good software is monstrously difficult, as hard as the hardest human endeavors, and that they only understand enough to muddle through it. This is when they can finally be trusted.

You can tell the incompetent people by the fact that they never get past stage 2.

Perry

From perry at piermont.com Wed Sep 3 07:14:43 2008
From: perry at piermont.com (Perry E.
Metzger)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: (Robert G. Brown's message of "Wed\, 3 Sep 2008 07\:20\:16 -0400 \(EDT\)")
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net>
Message-ID: <874p4xxqng.fsf@snark.cb.piermont.com>

"Robert G. Brown" writes:
> I was also getting a bit tired of the thread as this particular
> thesis (that scientists make poor computer programmer and/or must
> hire programmers in order to do good science using computers)

You totally got my point wrong. I said exactly the opposite. I believe that scientists must spend enough time to become good computer programmers -- they must neither leave the task to others nor can they underestimate the amount of difficulty involved in the software.

How it is possible that people managed to read that much and hear exactly the inverse of my central thesis, I don't understand at all. Perhaps everyone just hears what they want to.

Perry

From smulcahy at aplpi.com Wed Sep 3 07:31:31 2008
From: smulcahy at aplpi.com (stephen mulcahy)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <874p4xxqng.fsf@snark.cb.piermont.com>
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <874p4xxqng.fsf@snark.cb.piermont.com>
Message-ID: <48BE9FC3.4080306@aplpi.com>

Perry E. Metzger wrote:
> How it is possible that people managed to read that much and hear
> exactly the inverse of my central thesis, I don't understand at
> all. Perhaps everyone just hears what they want to.

Sheesh, I resisted for a long time but ....

The scenario above pretty much sums up the situation I see with one of the softer sides of software engineering - the requirements gathering, which I'd see as fundamental to a successful (software, or indeed general IT) project.

IMHO, the most important part of most projects is figuring out what the heck the "stakeholder"[1] wants in the first place.

No matter how good your programming is, if your requirements are wrong - you're heading in the wrong direction entirely (a bit like building a really neat spacecraft and then launching it towards Pluto instead of Mars[2]).

This is SoftwareAnalysisAndDesign@beowulf.org right?

-stephen

[1] Am I the only one that can't help using that word and visualizing a Van Helsing type waving a wooden stake around? Whether the typical project stakeholder is trying to drive the stake through the heart of the project or the heart of nasties trying to drag the project down is an exercise for the reader.

[2] Some of those with a background in directing spacecraft lurking on this list may poke holes in my analogy by noting that a trajectory to Pluto would take you right by Mars, which would really take away from my point.

--
Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. +353.91.751262 http://www.aplpi.com
Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway)

From larry.stewart at sicortex.com Wed Sep 3 08:20:02 2008
From: larry.stewart at sicortex.com (Lawrence Stewart)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <874p4xxqng.fsf@snark.cb.piermont.com>
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <874p4xxqng.fsf@snark.cb.piermont.com>
Message-ID: <48BEAB22.9080406@sicortex.com>

This discussion of letting scientists program reminds me of something that really impressed me about an earlier generation of folks at, I think, CERN.

They had, for those days, a big real-time processing problem to process detector data, and they couldn't afford commercial computers to do it, so they built their own racks full of limited 360 clones to do the job.

The programming AND the iron were completely incidental to their true goals. They regarded computers and programming as means, rather than ends in themselves, yet were not afraid to step outside their box any more than a woodworker is afraid to build a jig or grind a chisel to achieve her ends.

--
-Larry / Sector IX

From james.p.lux at jpl.nasa.gov Wed Sep 3 08:58:21 2008
From: james.p.lux at jpl.nasa.gov (Lux, James P)
Date: Thu Nov 20 01:07:42 2008
Subject: Software engineering Re: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <48BE9FC3.4080306@aplpi.com>
Message-ID:

On 9/3/08 7:31 AM, "stephen mulcahy" wrote:

> Perry E. Metzger wrote:
>> How it is possible that people managed to read that much and hear
>> exactly the inverse of my central thesis, I don't understand at
>> all. Perhaps everyone just hears what they want to.
>
> Sheesh, I resisted for a long time but ....
>
> The scenario above pretty much sums up the situation I see with one of
> the softer sides of software engineering - the requirements gathering,
> which I'd see as fundamental to a successful (software, or indeed
> general IT) project.
>
> IMHO, the most important part of most projects is figuring out what the
> heck the "stakeholder"[1] wants in the first place.

--- And that's assuming the stakeholder really understands what they want.. Often it evolves as understanding improves (this is one of the arguments for RAD and XP).

> No matter how good your programming is, if your requirements are wrong -
> you're heading in the wrong direction entirely (a bit like building a
> really neat spacecraft and then launching it towards Pluto instead of
> Mars[2]).

----- All depends on the alignments of planets and stars.. I wouldn't go so far as to say things are planned using astrology, but we (JPL) are probably one of the few businesses around that can use the motions of heavenly bodies to predict our business base and workforce requirements. Every 26 months, as Earth comes into trine with Mars, is an auspicious time for launch (you want to launch at a time that is roughly half the trip length before closest approach).

> This is SoftwareAnalysisAndDesign@beowulf.org right?

--- you betcha.. When it's not HardwareAnalysisAndDesign...

Jim Lux

> -stephen
>
> [1] Am I the only one that can't help using that word and visualizing a
> Van Helsing type waving a wooden stake around?

--- Cecil Adams of "The Straight Dope" says that wooden stakes only work on some kinds of beasts. It's apparently a geographic thing.. Other places you need silver bullets, garlic, or something else.

From prentice at ias.edu Wed Sep 3 09:10:52 2008
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Stroustrup regarding multicore
In-Reply-To: <878wu9xqrv.fsf@snark.cb.piermont.com>
References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <878wu9xqrv.fsf@snark.cb.piermont.com>
Message-ID: <48BEB70C.40306@ias.edu>

This discussion is still completely off-topic. This is a list about computing issues relating to beowulf clusters, not software engineering at large, sociology or psychology.
-- Prentice From kyron at neuralbs.com Wed Sep 3 09:27:10 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEB70C.40306@ias.edu> References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <878wu9xqrv.fsf@snark.cb.piermont.com> <48BEB70C.40306@ias.edu> Message-ID: <48BEBADE.3090202@neuralbs.com> Prentice Bisbal wrote: > This discussion is still completely off-topic. This is a list about > computing issues relating to beowulf clusters, not software engineering > at large, sociology or psychology. > Well, it seems the Beowulf mailing list is more vivid on those issues, my recent posts _were_ about HPC technicality and my Google Summer of Code project, the Gentoo Beowulf Clusering LiveCD and both got completely ignored (well...almost) Eric From gerry.creager at tamu.edu Wed Sep 3 09:39:25 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEB70C.40306@ias.edu> References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <878wu9xqrv.fsf@snark.cb.piermont.com> <48BEB70C.40306@ias.edu> Message-ID: <48BEBDBD.5030903@tamu.edu> Prentice Bisbal wrote: > This discussion is still completely off-topic. 
This is a list about > computing issues relating to beowulf clusters, not software engineering > at large, sociology or psychology. Actually, it's approached software engineering on a socio-pathological level by now... -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From kyron at neuralbs.com Wed Sep 3 10:06:19 2008 From: kyron at neuralbs.com (Eric Thibodeau) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEBADE.3090202@neuralbs.com> References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <878wu9xqrv.fsf@snark.cb.piermont.com> <48BEB70C.40306@ias.edu> <48BEBADE.3090202@neuralbs.com> Message-ID: <48BEC40B.3010208@neuralbs.com> Eric Thibodeau wrote: > Prentice Bisbal wrote: >> This discussion is still completely off-topic. This is a list about >> computing issues relating to beowulf clusters, not software engineering >> at large, sociology or psychology. >> > Well, it seems the Beowulf mailing list is more vivid on those issues, > my recent posts _were_ about HPC technicality and my Google Summer of > Code project, the Gentoo Beowulf Clusering LiveCD and both got > completely ignored (well...almost) > > Eric Hehe, I pulled an RGB and CTRL-ENTERed too quickly... 
I forgot to mention that, nonetheless, I do enjoy reading these sometimes soliloquy-ish responses and, as a "student being sucked dry by getting him to do all the HPC clustering stuff" for the department, I find many of the comments pertinent and, at the very least, encouraging in the "hey, I'm not alone" sense of it. Cheers, Eric From james.p.lux at jpl.nasa.gov Wed Sep 3 10:22:12 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEB70C.40306@ias.edu> Message-ID: I would say that the single biggest problem in HPC today is not getting sufficient hardware horsepower, but in effectively using that power. 10 years ago, just getting a cluster going was a bit of a challenge, in terms of knowing what hardware to get, how to interconnect it, etc, but now, a lot of that is cookbook (or available turnkey from a variety of vendors... A very different matter from when Sterling et al. wrote their book back in 98/99). Sure, there are still hardware issues that are worthy of discussion on this list (details of interconnects, etc.), but one doesn't see the discussions about topologies that one saw back then. The hardware is now to the point where you rack up the computers, hook them all to a very fast switch with huge bisection bandwidth, and you're done. However, the topic of taking a simple problem and effectively parallelizing it (either at an EP level as can be done with some Monte Carlo or systematic simulations, or at a fine grained level, as with matrix numerical modeling) is very much grist for the mill. After all, what are all those folks building parallelizing/vectorizing compilers trying to do but reduce the substantial software engineering/design problem, so that a scientist or engineer can just write their problem out in simple form, and have "the backend" figure out how to do it efficiently (or at all).
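To make the EP (embarrassingly parallel) case above concrete, here is a minimal sketch, not taken from the thread: a Monte Carlo estimate of pi split into chunks that never communicate until the final sum. The names count_hits and estimate_pi, the chunk sizes and the per-chunk seeding are all illustrative assumptions, and Python's multiprocessing pool merely stands in for whatever launcher or MPI layer a real cluster would use.

```python
import random
from multiprocessing import Pool


def count_hits(task):
    """Count random points in the unit square that land inside the quarter circle."""
    seed, n_samples = task
    rng = random.Random(seed)  # separately seeded generator for each chunk
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_chunks=8, samples_per_chunk=100_000, workers=None):
    """Estimate pi from independent chunks; workers=1 runs them serially."""
    tasks = [(seed, samples_per_chunk) for seed in range(n_chunks)]
    if workers == 1:
        hits = list(map(count_hits, tasks))
    else:
        # Each chunk is independent, so the only communication is the final sum.
        with Pool(processes=workers) as pool:
            hits = pool.map(count_hits, tasks)
    return 4.0 * sum(hits) / (n_chunks * samples_per_chunk)
```

Calling estimate_pi() from a script's main guard uses one process per available core; the fine-grained case (for example, blocked matrix work) is harder exactly because the chunks have to exchange data while they run.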
There are many problems which are, by their nature, software design complex enough that it is not reasonable to have the person "asking the question" also be knowledgeable enough to manage the substantial software development project. This would be true, if for no other reason than managing a software development effort takes a different skill set than asking good science or engineering questions. So, the real challenge facing builders (in the larger sense) of Beowulfs is in developing methods to get the work actually done, and if that requires developing skills in "eliciting requirements" or, more probably, "communicating between software speak and science speak", then this is an appropriate place to do it (if not here, then where *would* be a place where it's more germane.. I can't think of one off hand) It's sort of like our discussions about communicating with the facilities folks about power requirements or HVAC. Someone building a cluster needs to know something about this to be an intelligent consumer, but nobody expects the scientist to be down there sweating copper pipes for the chiller or cabling up the EPO button for the UPS. The list is valuable because there *are* folks here who do know how to sweat pipes, manage software projects, and interpret the electrical code, and you can ask a question about such things and get a host of responses, some more useful than others. Jim On 9/3/08 9:10 AM, "Prentice Bisbal" wrote: > This discussion is still completely off-topic. This is a list about > computing issues relating to beowulf clusters, not software engineering > at large, sociology or psychology. > > > From peter.st.john at gmail.com Wed Sep 3 10:34:49 2008 From: peter.st.john at gmail.com (Peter St.
John) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: References: <48BEB70C.40306@ias.edu> Message-ID: I'm thinking that multicore will make topology interesting again, because of the difference between intercore on a common chip vs going through a nic to even the fastest fabric. Peter On 9/3/08, Lux, James P wrote: > > I would say that the single biggest problem in HPC today is not getting > sufficient hardware horsepower, but in effectively using that power. 10 > years ago, just getting a cluster going was a bit of a challenge, in terms > of knowing what hardware to get, how to interconnect it, etc, but now, a > lot > of that is cookbook (or available turnkey from a variety of vendors... A > very different matter from when Sterling, et al wrote their book back in > 98/99). Sure, there are still hardware issues that are worthy of discussion > on this list (details of interconnects, etc.), but one doesn't see the > discussions about topologies that one saw back then. The hardware is now > to > the point where you rack up the computers, hook them all to a very fast > switch with huge bisection bandwidth, and you're done. > > However, the topic of taking a simple problem and effectively parallelizing > it (either at a EP level as can be done with some Monte Carlo or systematic > simulations, or at a fine grained level, as with matrix numerical modeling) > is very much grist for the mill. > > After all, what are all those folks building parallelizing/vectorizing > compilers trying to do but reduce the substantial software > engineering/design problem, so that a scientist or engineer can just write > their problem out in simple form, and have "the backend" figure out how to > do it efficiently (or at all). > > There are many problems which are, by their nature, software design complex > enough that it is not reasonable to have the person "asking the question"
> also be knowledgeable enough to manage the substantial software development > project. This would be true, if for no other reason than managing a > software > development effort takes a different skill set than asking good science or > engineering questions. > > So, the real challenge facing builders (in the larger sense) of Beowulfs is > in developing methods to get the work actually done, and if that requires > developing skills in "eliciting requirements" or, more probably, > "communicating between software speak and science speak", then this is an > appropriate place to do it (if not here, then where *would* be a place > where > it's more germane.. I can't think of one off hand) > > It's sort of like our discussions about communicating with the facilities > folks about power requirements or HVAC. Someone building a cluster needs > to > know something about this to be an intelligent consumer, but nobody expects > the scientist to be down there sweating copper pipes for the chiller or > cabling up the EPO button for the UPS. > > The list is valuable because there *are* folks here who do know how to > sweat > pipes, manage software projects, and interpret the electrical code, and you > can ask a question about such things and get a host of responses, some more > useful than others. > > Jim > > > > On 9/3/08 9:10 AM, "Prentice Bisbal" wrote: > > > This discussion is still completely off-topic. This is a list about > > computing issues relating to beowulf clusters, not software engineering > > at large, sociology or psychology. > > > > > > > > > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080903/56bf3379/attachment.html From ispmarin at gmail.com Wed Sep 3 10:35:21 2008 From: ispmarin at gmail.com (Ivan Marin) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEC40B.3010208@neuralbs.com> References: <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> <20080903090417.GA15987@bx9.net> <878wu9xqrv.fsf@snark.cb.piermont.com> <48BEB70C.40306@ias.edu> <48BEBADE.3090202@neuralbs.com> <48BEC40B.3010208@neuralbs.com> Message-ID: <48BECAD9.6030502@gmail.com> I second Eric. I've been following this discussion, and recognized myself in several different ways... RGB's definition of a physicist was just hilarious (and sadly true). Learned a lot on this thread, and the "not alone" feeling is at least comforting. Ivan Eric Thibodeau wrote: > Eric Thibodeau wrote: >> Prentice Bisbal wrote: >>> This discussion is still completely off-topic. This is a list about >>> computing issues relating to beowulf clusters, not software engineering >>> at large, sociology or psychology. >>> >> Well, it seems the Beowulf mailing list is more vivid on those >> issues, my recent posts _were_ about HPC technicality and my Google >> Summer of Code project, the Gentoo Beowulf Clusering LiveCD and both >> got completely ignored (well...almost) >> >> Eric > Hehe, I pulled an RGB and CTRL-ENTERed too quickly...
I forgot to > mention that, nonetheless, I do enjoy reading these sometimes > soliloquy-ish responses and, as a "student being sucked dry by getting > him to do all the HPC clustering stuff" for the department, I find > many of the comments pertinent and, at the very least, encouraging in > the "hey, I'm not alone" sense of it. > > Cheers, > > Eric > From james.p.lux at jpl.nasa.gov Wed Sep 3 10:39:27 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: Message-ID: On 9/3/08 10:34 AM, "Peter St. John" wrote: I'm thinking that multicore will make topology interesting again, because of the difference between intercore on a common chip vs going through a nic to even the fastest fabric. Peter Yes, indeed.. Actually, it may be more like deja vu, because the core has its own little address space, and then the space available through the fabric (which looks a lot, conceptually, like the pile of 386 PCs with 10BaseT). Even more interesting will be that to effectively use them, some conceptual thought will have to be put into effectively using the techniques for communicating among processes, which don't necessarily run in lockstep systolic array fashion (or SIMD). Jim From larry.stewart at sicortex.com Wed Sep 3 11:42:26 2008 From: larry.stewart at sicortex.com (Lawrence Stewart) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: References: Message-ID: <48BEDA92.6070504@sicortex.com> Lux, James P wrote: > > > > On 9/3/08 10:34 AM, "Peter St.
John" wrote: > > I'm thinking that multicore will make topology interesting again, > because of the difference between intercore on a common chip vs > going through a nic to even the fastest fabric. > Peter > It is probably worth putting numbers on statements like this. For example, a main memory reference on a fast processor these days is around 80 nanoseconds. Sending a message to a process on another node on a fast IB network is getting to 1.2 microseconds. Communicating to another thread on the same socket is probably not much faster than a memory reference since you have to thrash a cache-line or two back and forth between cores. The numbers for SiCortex stuff are similar: 80 ns for memory, 1 microsecond for MPI nearest-neighbor, 1.3 microseconds for max-diameter. Core to core via shared memory is about 300 ns, IIRC. We think of messaging to other nodes as taking a long time, but it isn't really so. It is perfectly reasonable to think of programs that communicate every 1000 flops or so, in the same way we think of 15-50 flops per cache miss as "reasonable". So I am deeply skeptical of the current furor about how we need new programming models for "multicore chips". We have models that work perfectly well for 100-1000 core clusters, let's use them. -- -Larry / Sector IX From perry at piermont.com Wed Sep 3 12:59:46 2008 From: perry at piermont.com (Perry E. Metzger) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BEDA92.6070504@sicortex.com> (Lawrence Stewart's message of "Wed\, 03 Sep 2008 14\:42\:26 -0400") References: <48BEDA92.6070504@sicortex.com> Message-ID: <878wu9t2z1.fsf@snark.cb.piermont.com> Lawrence Stewart writes: >> On 9/3/08 10:34 AM, "Peter St.
John" wrote: >> >> I'm thinking that multicore will make topology interesting again, >> because of the difference between intercore on a common chip vs >> going through a nic to even the fastest fabric. >> Peter > > It is probably worth putting numbers on statements like this. For > example, a main memory reference on a fast processor these days is > around 80 nanoseconds. Sending a message to a process on another > node on a fast IB network is getting to 1.2 microseconds. > Communicating to another thread on the same socket is probably not > much faster than a memory reference since you have to thrash a > cache-line or two back and forth between cores. Quite. It is possible that future generations of multi-core architectures will do differently, but right now, a multi-core chip looks a lot (to software) like a normal SMP setup. (I do wonder a lot whether a return to vector architectures might make more sense than multi-core -- there is at least a lot of precedent for making use of vector silicon with good compilers.) > So I am deeply skeptical of the current furor about how we need new > programming models for "multicore chips". We have models that work > perfectly well for 100-1000 core clusters, lets use them. Well, not quite. The HPC community is very good at using such things, so it isn't going to have trouble. The issue is not for people doing scientific computing, but for people doing "normal" applications. Beyond the scope of this mailing list of course. -- Perry E. 
Metzger perry@piermont.com From herborn at usna.edu Wed Sep 3 13:19:56 2008 From: herborn at usna.edu (Steve Herborn) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Stroustrup regarding multicore In-Reply-To: References: <6.2.5.6.2.20080826081513.04f14298@swcp.com> <87prnv20ky.fsf@snark.cb.piermont.com> <87d4jv1zwo.fsf@snark.cb.piermont.com> <20080826140446.0479b7b3@localhost.localdomain> <87zlmzy6iw.fsf@snark.cb.piermont.com> <49D2DE14-1F0A-44B9-B05F-BA49D3766C7E@sanger.ac.uk> <9E8A6C2B-DB97-4BCC-97E2-A97EE5A11FA7@xs4all.nl> <0974FCFCC905405B8278F2D805667336@dynamic.usna.edu> <87prntyydv.fsf@snark.cb.piermont.com> Message-ID: <2196A5AE3175497A9788E9523217FB75@dynamic.usna.edu> I guess that is why I have always preferred a RAD/JAD environment to a strict Waterfall one. One can spend eons creating the world's most perfect spec, but then the problem changes. From my perspective ideally the Subject Matter Expert & the Programmer become tied at the hip with each taking away a little of the other's knowledge by the time the "project" is over & done with. When I first started programming in support of Electronic Warfare I didn't know doodle about RADAR parametric data, now I at least know how to spell it. :-) _____ From: Peter St. John [mailto:peter.st.john@gmail.com] Sent: Thursday, August 28, 2008 11:54 AM To: Perry E. Metzger Cc: Steve Herborn; Beowulf@beowulf.org Subject: Re: [Beowulf] Stroustrup regarding multicore I agree entirely with Perry here. I'd take it further: even in the case of giving the machinist instructions, "12x12 with holes here and here", it would help if the machinist has some sense of what you are building. Will the product be hot enough so that metal expands and contracts? Humidity? Should the finish be gloss or matte? Copper, aluminum...? The machinist will help you build a better gizmo if he has some feel for what you are building, what you need the part for. He will have relevant experience from his machinist perspective.
Mixed unit tactics are the path to victory, but the mixed units perform better together if they have some sense of each other's jobs. Some of us specialize in a very focused way, some of us generalize, but to work as a team we need to learn some of each other's jobs. I think a physicist programming is like an astronomer grinding lenses (maybe nobody does that anymore). Some astronomers (in the old days) ground their own lenses and ended up contributing to optics; others never looked through telescopes, they do math on the measurements taken by others. Some computer scientists don't program; some mathematicians can hardly use email. But most of us learn bits and pieces of each other's jobs, in varying degrees; it's necessary to communicate effectively, and what's the subject for one guy is a tool for another guy. Peter On 8/28/08, Perry E. Metzger wrote: "Steve Herborn" writes: > However, that being said I would think that it is usually easier to teach a > Scientist to code, than a coder the PhD level of the science. I think either is fine -- you wind up with someone who knows both. The problem is when you try to segregate the two skills. I think I finally have the right analogy. A physicist is interested in advancing physics, not in advancing mathematics, but as the tools of physics are all made of math, he cannot ignore the math or hope to turn to a specialized hired mathematician who knows no physics to do his math thinking for him. The math and the physics are integrated -- you need one mind to see both in order to get anywhere. Writing good software for physics problems is no different. The physics and the software are one. You can complain that you want to do physics, not computing, but that's exactly like complaining you want to do physics and not math. Indeed, software pretty much *IS* math. The attempt seems to be to somehow treat the computer science as though it were the software equivalent of machine shop work.
You're building a new instrument, so you draw up the parts, and then you ask a machinist to make them. "I need a sheet of metal 12cm by 12cm with holes here and here." It is somehow imagined that you can do that with the software -- you make some vague guesses about what you might need and write a spec (which is imagined to be like a blueprint for a part) and ask a "software machinist" to make it for you. Unfortunately, this misses the point -- the computer programming is not like machining the parts for the instrument, it is like *designing* the instrument. That requires knowledge of both fields, not just of one. It is not at all like machining. This is of course a serious problem. It takes at least several years of effort to become facile with computer software just as it takes several years of effort to become facile with calculus, differential equations, etc., etc., and fundamentally one wants to be doing science, not math or computer programming, but I can't see any real way around it in the long term if progress is to be made. Incidentally, THIS IS NOT A NEW ARGUMENT. It was only a little over a century ago that people scoffed at the idea that engineers needed to learn higher mathematics. "I'm trying to build a bridge, not to do math!" was the general sort of attitude that was common. Eventually, people realized that there was no way around it, you just had to spend the time to learn the math or you couldn't be productive. I expect something similar is going to happen here. Perry -- Perry E. Metzger perry@piermont.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080903/647fe9ed/attachment.html From gerry.creager at tamu.edu Wed Sep 3 15:46:14 2008 From: gerry.creager at tamu.edu (Gerry Creager) Date: Thu Nov 20 01:07:42 2008 Subject: Software engineering Re: [Beowulf] Stroustrup regarding multicore In-Reply-To: References: Message-ID: <48BF13B6.5060101@tamu.edu> Lux, James P wrote: > > > > On 9/3/08 7:31 AM, "stephen mulcahy" wrote: > > > > Perry E. Metzger wrote: > > How it is possible that people managed to read that much and hear > > exactly the inverse of my central thesis, I don't understand at > > all. Perhaps everyone just hears what they want to. > > Sheesh, I resisted for a long time but .... > > The scenario above pretty much sums up the situation I see with one of > the softer sides of software engineering - the requirements gathering, > which I'd see as fundamental to a successful (software, or indeed > general IT project). IMHO, the most important part of most projects is > figuring out what the heck the "stakeholder"[1] wants in the first > place. > > --- And that's assuming the stakeholder really understands what they > want.. Often it evolves as understanding improves (this is one of > the arguments for RAD and XP). Rule 1: Never let an oceanographer with 2 FORTRAN courses design or maintain any software project with more than 16 lines of code checked into the repository. Rule 2: Oceanographers with less than 2 formal FORTRAN classes have decided they're really nascent software engineers because they've mastered most of the buzzwords, and thus will give you all sorts of "requirements" which are typically orthogonal to any software design or engineering training you've had. Rule 3: If you finally convince the denizens of Rule 2 that their application cannot be written as spec'd, they are suddenly experts in "Service Oriented Architecture" and Software as a Service. And you're not. So there. Just believe me. Really. Cause I said so.
Don't ask me how I learned these truths. Oh, and the bar is slightly higher for meteorologists, by a couple of additional formal software classes. But in the grand scheme of things, it's not much higher... > No matter how good your programming is, if your requirements are > wrong - you're heading in the wrong direction entirely (a bit like > building a really neat spacecraft and then launching it towards Pluto > instead of Mars[2]). ibid. Been there, done that. But not for spacecraft. Or, I could talk about computer scientists just now discovering what discipline experts do for the Data-Net NSF call that's out now, but that's another thread... > ----- All depends on the alignments of planets and stars.. I > wouldn't go so far as to say things are planned using astrology, but > we (JPL) are probably one of the few businesses around that can use > the motions of heavenly bodies to predict our business base and > workforce requirements. Every 26 months as Earth comes into trine > with Mars is an auspicious time for launch (you want to launch at a > time that is roughly half the trip length before closest approach) Show-off :-) > This is SoftwareAnalysisAndDesign@beowulf.org right? > > > --- you betcha.. When it's not HardwareAnalysisAndDesign... > > Jim Lux > > > -stephen > > [1] Am I the only one that can't help using that word and visualizing a > Van Helsing type waving a wooden stake around? They're grasping it to keep me from driving it through their hearts... Oh, yeah. That meeting is already over. > --- Cecil Adams of "The Straight Dope" says that wooden stakes only > work on some kinds of beasts. It's apparently a geographic thing.. > Other places you need silver bullets, garlic, or something else. My personal preference is a garlic-flavored wooden stake. I keep the silver bullets for backup when I missed with the stake.
-- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From james.p.lux at jpl.nasa.gov Wed Sep 3 15:50:39 2008 From: james.p.lux at jpl.nasa.gov (Lux, James P) Date: Thu Nov 20 01:07:42 2008 Subject: Software engineering Re: [Beowulf] Stroustrup regarding multicore In-Reply-To: <48BF13B6.5060101@tamu.edu> References: <48BF13B6.5060101@tamu.edu> Message-ID: > --- Cecil Adams of "The Straight Dope" says that wooden stakes only > work on some kinds of beasts. It's apparently a geographic thing.. > Other places you need silver bullets, garlic, or something else. My personal preference is a garlic-flavored wooden stake. I keep the silver bullets for backup when I missed with the stake. Because it's really, really important... http://www.straightdope.com/columns/read/37/whats-the-best-way-to-kill-a-vampire From libo at buaa.edu.cn Wed Sep 3 23:34:00 2008 From: libo at buaa.edu.cn (Li, Bo) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2 References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> Message-ID: <002001c90e58$3c20c020$6300a8c0@LIBO> Hello, Is it too expensive for the platform? The easy solution is: An X48-level motherboard with CF support, about $150; Q6600 processor, about $170; two 4870X2, $1,100; two Seagate 500G SATA hard disks for RAID1, about $140; 4*2G DDR2 RAM, about $150; PSU 1000W, about $200; a big box, about $100. That's all, in total, $2,010. Regards, Li, Bo ----- Original Message ----- From: Maurice Hilarius To: beowulf@beowulf.org Cc: kus@free.net ; libo@buaa.edu.cn ; i.kozin@dl.ac.uk Sent: Thursday, September 04, 2008 6:51 AM Subject: Re: Beowulf Digest, Vol 55, Issue 2 Li, bo wrote: .. From: "Li, Bo" Subject: Re: [Beowulf] gpgpu Hello, It seemed that you had got a very good example for GPGPU.
As I said before, it's not the time for GPGPU to do the DP calculation at the moment. If you can bear SP computation, you will find more about it. NVidia just sent me some special offer about their Tesla platforms, which said that the workstation equipped with two GTX280 level professional cards costs about $5000, not bad. But my intention is still to lower the core frequency of a gaming card, and use it for computation. Regards, Li, Bo Looking at AMD/ATI Firestream and 4850 pricing, it is not too bad: AMD FIRESTREAM 9250 STREAM PROCESSOR (P/N: 100-505563) $880 VISIONTEK RADEON HD4870X2 2GB PCI-E (P/N: 900250) $575 VISIONTEK RADEON HD 4870 512MB PCI-E (P/N: 900244) $355 The 4870 and X2 also run the AMD code. So, given a decent machine, with 4 cores and a pair of the 4870X2, one can achieve some pretty amazing GPU performance levels for a system well under $4,000. With dualX2s ( 4 GPU engines) around $4700 ( extra PSU capacity and cooling is needed for that level). I hear that AMD have a new Firestream coming, with the 48x0 family chips on it, but that will likely be a bit on the pricier side.. Anyway, the Firestream has GPUs with Double-Precision Floating Point. Something the nVidia offerings do not. Worth considering. http://ati.amd.com/technology/streamcomputing/product_firestream_9250.html SDK: http://ati.amd.com/technology/streamcomputing/sdkdwnld.html -- With our best regards, Maurice W. Hilarius Telephone: 01-780-456-9771 Hard Data Ltd. FAX: 01-780-456-9772 11060 - 166 Avenue email:maurice@harddata.com Edmonton, AB, Canada http://www.harddata.com/ T5X 1Y3 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20080904/08880a09/attachment.html From maurice at harddata.com Wed Sep 3 15:51:52 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2 In-Reply-To: <200809021901.m82J0IaS029547@bluewest.scyld.com> References: <200809021901.m82J0IaS029547@bluewest.scyld.com> Message-ID: <48BF1508.7000406@harddata.com> Li, bo wrote: > .. > From: "Li, Bo" > Subject: Re: [Beowulf] gpgpu > > Hello, > It seemed that you had got a very good example for GPGPU. As I said before, it's not the time for GPGPU to do the DP calculation at the moment. If you can bear SP computation, you will find more about it. > NVidia just sent me some special offer about their Tesla platforms, which said that the workstation equipped with two GTX280 level professional cards costs about $5000, not bad. But my intention is still to lower the core frequency of a gaming card, and use it for computation. > Regards, > Li, Bo > Looking at AMD/ATI Firestream and 4850 pricing, it is not too bad: AMD FIRESTREAM 9250 STREAM PROCESSOR (P/N: 100-505563) $880 VISIONTEK RADEON HD4870X2 2GB PCI-E (P/N: 900250) $575 VISIONTEK RADEON HD 4870 512MB PCI-E (P/N: 900244) $355 The 4870 and X2 also run the AMD code. So, given a decent machine, with 4 cores and a pair of the 4870X2, one can achieve some pretty amazing GPU performance levels for a system well under $4,000. With dualX2s ( 4 GPU engines) around $4700 ( extra PSU capacity and cooling is needed for that level). I hear that AMD have a new Firestream coming, with the 48x0 family chips on it, but that will likely be a bit on the pricier side.. Anyway, the Firestream has GPUs with Double-Precision Floating Point. Something the nVidia offerings do not. Worth considering. 
http://ati.amd.com/technology/streamcomputing/product_firestream_9250.html SDK: http://ati.amd.com/technology/streamcomputing/sdkdwnld.html -- With our best regards, //Maurice W. Hilarius Telephone: 01-780-456-9771/ /Hard Data Ltd. FAX: 01-780-456-9772/ /11060 - 166 Avenue email:maurice@harddata.com/ /Edmonton, AB, Canada http://www.harddata.com// / T5X 1Y3/ / -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080903/90d3f915/attachment.html From Craig.Tierney at noaa.gov Thu Sep 4 08:56:13 2008 From: Craig.Tierney at noaa.gov (Craig Tierney) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2 In-Reply-To: <48BF1508.7000406@harddata.com> References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> Message-ID: <48C0051D.2020700@noaa.gov> ... stuff deleted > > AMD FIRESTREAM 9250 STREAM PROCESSOR (P/N: 100-505563) $880 > VISIONTEK RADEON HD4870X2 2GB PCI-E (P/N: 900250) $575 > VISIONTEK RADEON HD 4870 512MB PCI-E (P/N: 900244) $355 > > The 4870 and X2 also run the AMD code. > > So, given a decent machine, with 4 cores and a pair of the 4870X2, one > can achieve some pretty amazing GPU > performance levels for a system well under $4,000. > > With dualX2s ( 4 GPU engines) around $4700 ( extra PSU capacity and > cooling is needed for that level). > > I hear that AMD have a new Firestream coming, with the 48x0 family chips > on it, but that will likely be a bit on the pricier side.. > > Anyway, the Firestream has GPUs with Double-Precision Floating Point. > Something the nVidia offerings do not. > > Worth considering. > This is not correct. The NVIDIA GT200 series supports IEEE DP FP in hardware. NVIDIA only has 1 DP FP unit per streaming processor (24 on the GTX280) which is 1/8 the number of units of single-precision floating point (each thread has its own unit). So the max DP FP rate on a GTX280 is about 90 Gflops. 
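As a rough sketch of the arithmetic behind that figure: peak rate is the number of units times the flops each unit completes per cycle times the clock. Only the 24 DP units, the 1/8 ratio and the roughly 90 Gflops come from the message above. The 2 flops per cycle (one multiply-add per unit) and the helper names are assumptions for illustration, and the clock printed is simply what those numbers imply, not a published specification.

```python
# Back-of-envelope peak-rate arithmetic. Helper names are hypothetical;
# counting one multiply-add as 2 flops per cycle is an assumption.

def peak_gflops(units, flops_per_cycle, clock_ghz):
    """Peak rate in GFLOPS = units * flops per unit per cycle * clock in GHz."""
    return units * flops_per_cycle * clock_ghz

def implied_clock_ghz(peak, units, flops_per_cycle):
    """Clock (GHz) that a quoted peak figure implies for a given unit count."""
    return peak / (units * flops_per_cycle)

DP_UNITS = 24          # DP units quoted in the message above
SP_PER_DP = 8          # "1/8 the number of units": 8 SP units per DP unit
QUOTED_DP_PEAK = 90.0  # "about 90 Gflops"
FMA_FLOPS = 2          # assumption: one multiply-add counts as 2 flops

sp_units = DP_UNITS * SP_PER_DP
clock = implied_clock_ghz(QUOTED_DP_PEAK, DP_UNITS, FMA_FLOPS)

print(f"SP units implied by the 1/8 ratio: {sp_units}")
print(f"Clock implied by the quoted DP peak: {clock:.3f} GHz")
print(f"Round trip: {peak_gflops(DP_UNITS, FMA_FLOPS, clock):.1f} GFLOPS")
```

Plugging a card's published unit count and shader clock into `peak_gflops` gives the number to compare against; if the implied clock looks off, the quoted unit count or peak is only approximate.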
Does anyone know the peak bandwidth of the new Firestream cards? I looked around and all I could find is that it uses GDDR3. The Wikipedia entry says the max bandwidth of the 9250 is 63.5 GB/s. This is less than half that of the GTX280 (max at 140 GB/s, measured at 115 GB/s with a STREAM-like app). If that is true, the GTX280 may be better for memory-bound codes. That is, if we can write efficient code for them and leave the whole problem on the GPU to avoid memory bandwidth issues across the bus.

Craig

>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
Craig Tierney (craig.tierney@noaa.gov)

From kus at free.net Thu Sep 4 09:51:51 2008
From: kus at free.net (Mikhail Kuzminsky)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2
In-Reply-To: <002001c90e58$3c20c020$6300a8c0@LIBO>
Message-ID:

In message from "Li, Bo" (Thu, 4 Sep 2008 14:34:00 +0800):
>Hello,
>Is it too expensive for the platform?
>The easy solution is:
>And X48 level motherboard with CF support, about $150
>Q6600 Processor, about $170
>Two 4870X2 $1,100

Does anybody know whether the ACML routines are parallelized to use several GPGPUs?

Mikhail

>Two Seagate SATA Harddisk 500G for Raid1, about $140
>4*2G DDR2 RAM, about $150
>PSU 1000W, about $200
>A big box, about $100
>
>That's all, in total, $2,010.
>Regards,
>Li, Bo
> ----- Original Message -----
> From: Maurice Hilarius
> To: beowulf@beowulf.org
> Cc: kus@free.net ; libo@buaa.edu.cn ; i.kozin@dl.ac.uk
> Sent: Thursday, September 04, 2008 6:51 AM
> Subject: Re: Beowulf Digest, Vol 55, Issue 2
>
>
> Li, bo wrote:
>..
>From: "Li, Bo"
>Subject: Re: [Beowulf] gpgpu
>
>Hello,
>It seemed that you had got a very good example for GPGPU.
As I said >before, it's not the time for GPGPU to do the DP calculation at the >moment. If you can bear SP computation, you will find more about it. >NVidia just sent me some special offer about their Tesla platforms, >which said that the workstation equipped with two GTX280 level >professional cards costs about $5000, not bad. But my intention is >still to lower the core frequency of a gaming card, and use it for >computation. >Regards, >Li, Bo > Looking at AMD/ATI Firestream and 4850 pricing, it is not too bad: > > AMD FIRESTREAM 9250 STREAM PROCESSOR (P/N: 100-505563) $880 > VISIONTEK RADEON HD4870X2 2GB PCI-E (P/N: 900250) > $575 > VISIONTEK RADEON HD 4870 512MB PCI-E (P/N: 900244) > $355 > > The 4870 and X2 also run the AMD code. > > So, given a decent machine, with 4 cores and a pair of the 4870X2, >one can achieve some pretty amazing GPU > performance levels for a system well under $4,000. > > With dualX2s ( 4 GPU engines) around $4700 ( extra PSU capacity and >cooling is needed for that level). > > I hear that AMD have a new Firestream coming, with the 48x0 family >chips on it, but that will likely be a bit on the pricier side.. > > Anyway, the Firestream has GPUs with Double-Precision Floating >Point. > Something the nVidia offerings do not. > > Worth considering. > > http://ati.amd.com/technology/streamcomputing/product_firestream_9250.html > > SDK: > http://ati.amd.com/technology/streamcomputing/sdkdwnld.html > > > > > -- > With our best regards, > > Maurice W. Hilarius Telephone: 01-780-456-9771 > Hard Data Ltd. 
FAX: 01-780-456-9772
> 11060 - 166 Avenue email:maurice@harddata.com
> Edmonton, AB, Canada http://www.harddata.com/
> T5X 1Y3

From prentice at ias.edu Thu Sep 4 11:37:23 2008
From: prentice at ias.edu (Prentice Bisbal)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Infiniband Subnet Manager
In-Reply-To: <48B69CEE.3040802@ias.edu>
References: <48B69CEE.3040802@ias.edu>
Message-ID: <48C02AE3.9040806@ias.edu>

Prentice Bisbal wrote:
> Since an infiniband fabric needs a subnet mananger, should the master
> node have an IB HCA and be connected to the IB network in order to run
> the subnet manager?
>
> My logic behind this is that the master node will be full
> enterprise-level hardware (redundant every thing), and should never go
> down or be rebooted during normal use. I expect the nodes to go down
> more frequently (not fully redundant hardware, higher operating loads,
> etc.).
>
> Exactly what functions does the subnet manager perform, and what happens
> if it disappears from the IB fabric?
>
> I've been doing research into IB all day yesterday, and I'm continuing
> today, so please no RTFM answers.
>

I've gotten a lot of responses to the IB questions that I posed to the list. Thanks for all your help. All of my questions have been answered. It turns out, as some of you pointed out, that my switch will have a built-in subnet manager, so I won't need to run one on a node.

--
Prentice

From atp at piskorski.com Thu Sep 4 12:54:20 2008
From: atp at piskorski.com (Andrew Piskorski)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Nvidia GT200, double precision vs. native pair
In-Reply-To: <48C0051D.2020700@noaa.gov>
References: <48C0051D.2020700@noaa.gov>
Message-ID: <20080904195420.GC34060@piskorski.com>

On Thu, Sep 04, 2008 at 09:56:13AM -0600, Craig Tierney wrote:
> Subject: Re: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2
> This is not correct. The NVIDIA GT200 series supports IEEE DP FP in
> hardware.
NVIDIA only has 1 DP FP unit per streaming processor (24 > on the GTX280) which is 1/8 the number of units of single-precision > floating point (each thread has its own unit). So the max DP FP > rate on a GTX280 is about 90 Gflops. So has anyone taken those 8 single-precision floating point units and tried using them to get double-precision or better accuracy? Perhaps using the "native-pair" and "speculative precision" approaches discussed here: http://aggregate.org/NPAR/ The 2006 paper there talks about doing so on a Nvidia GeForce 6800 Ultra, on which a (c. 64 bit) native-pair calculation took about 10x the clock cycles of a single 32 bit flop (better for sqrt). -- Andrew Piskorski http://www.piskorski.com/ From niftyompi at niftyegg.com Thu Sep 4 15:16:36 2008 From: niftyompi at niftyegg.com (Nifty niftyompi Mitch) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Infiniband Subnet Manager In-Reply-To: <48C02AE3.9040806@ias.edu> References: <48B69CEE.3040802@ias.edu> <48C02AE3.9040806@ias.edu> Message-ID: <20080904221636.GA4234@hpegg.wr.niftyegg.com> On Thu, Sep 04, 2008 at 02:37:23PM -0400, Prentice Bisbal wrote: > Prentice Bisbal wrote: > > Since an infiniband fabric needs a subnet mananger, should the master > > node have an IB HCA and be connected to the IB network in order to run > > the subnet manager? > > > > My logic behind this is that the master node will be full > > enterprise-level hardware (redundant every thing), and should never go > > down or be rebooted during normal use. I expect the nodes to go down > > more frequently (not fully redundant hardware, higher operating loads, > > etc.). > > > > Exactly what functions does the subnet manager perform, and what happens > > if it disappears from the IB fabric? > > > > I've been doing research into IB all day yesterday, and I'm continuing > > today, so please no RTFM answers. > > > > I've gotten a lot of response to my IB questions that I posed to the > list. Thanks for all your help. 
All of my questions have been answered.
> It turns out, as some as you pointed out, that my switch will have a
> built-in subnet manager, so I won't need to run one on a node.
>

I should add that a built-in subnet manager costs extra $$. Also, they tend to run on a modest dedicated processor card. These modest dedicated-card solutions have limited RAM and will not support a gonzo big fabric. As a rule of thumb, depending on the subnet manager on the card, the trip point at which it runs out of memory resources is about 144 ports.

The richer the statistics gathered and retained, the larger the footprint.

Large fabrics will need a host-based subnet manager.

--
T o m M i t c h e l l
Got a great hat... now what.

From libo at buaa.edu.cn Thu Sep 4 22:06:49 2008
From: libo at buaa.edu.cn (Li, Bo)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Re: GPU boards and cluster servers.
References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com>
Message-ID: <002401c90f15$38e48500$6300a8c0@LIBO>

Hello,
It seems your platform is more suitable for a cluster. Great, and when are the products available? And is there any software support from you?
Regards,
Li, Bo

----- Original Message -----
From: Maurice Hilarius
To: Li, Bo
Cc: kus@free.net ; i.kozin@dl.ac.uk ; Beowulf Mailing List
Sent: Friday, September 05, 2008 9:36 AM
Subject: GPU boards and cluster servers.

Li, Bo wrote:

Hello,
Is it too expensive for the platform?
The easy solution is:
And X48 level motherboard with CF support, about $150
Q6600 Processor, about $170
Two 4870X2 $1,100
Two Seagate SATA Harddisk 500G for Raid1, about $140
4*2G DDR2 RAM, about $150
PSU 1000W, about $200
A big box, about $100

That's all, in total, $2,010.
Regards,
Li, Bo

True, to a point.
Most people will not use a desktop board for a cluster.
Too I/O bound.
Finally the memory capacity of these desktop boards is pretty limiting.
Typically 8GB maximum.
Generally a XEON or Opteron chipset and CPUs will be the choice. Also, for most GPU/FPU performance work, the memory bandwidth bottleneck on the Intel product is too much of a negative factor. Lastly, for clusters, most want a rackmount chassis. We developed a 2U designed for a server board and 2 GPU boards. The big challenge there is power. We use dual 600W PSUs. One for motherboard, and one for dual GPU boards. -- With our best regards, Maurice W. Hilarius Telephone: 01-780-456-9771 Hard Data Ltd. FAX: 01-780-456-9772 11060 - 166 Avenue email:maurice@harddata.com Edmonton, AB, Canada http://www.harddata.com/ T5X 1Y3 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20080905/397edc3b/attachment.html From andrew at moonet.co.uk Fri Sep 5 00:51:55 2008 From: andrew at moonet.co.uk (andrew holway) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <002401c90f15$38e48500$6300a8c0@LIBO> References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> Message-ID: The new Dell R5400 Rackmount workstation is ideal for this. You can slip two Xeons, 16GB ram and two chunky graphics cards in there. ta Andy On Fri, Sep 5, 2008 at 6:06 AM, Li, Bo wrote: > Hello, > It seems your platform is more suitable for a cluster. Great, and when are > the products available? And is there any software support from you? > Regards, > Li, Bo > > ----- Original Message ----- > From: Maurice Hilarius > To: Li, Bo > Cc: kus@free.net ; i.kozin@dl.ac.uk ; Beowulf Mailing List > Sent: Friday, September 05, 2008 9:36 AM > Subject: GPU boards and cluster servers. > Li, Bo wrote: > > Hello, > Is it too expensive for the platform? 
> The easy solution is: > And X48 level motherboard with CF support, about $150 > Q6600 Processor, about $170 > Two 4870X2 $1,100 > Two Seagate SATA Harddisk 500G for Raid1, about $140 > 4*2G DDR2 RAM, about $150 > PSU 1000W, about $200 > A big box, about $100 > > That's all, in total, $2,010. > Regards, > Li, Bo > > True, to a point. > Most people will not use a desktop board for a cluster. > Too I/O bound. > Finally the memory capacity of these desktop boards is pretty limiting. > Typically 8GB maximum. > > > Generally a XEON or Opteron chipset and CPUs will be the choice. > > Also, for most GPU/FPU performance work, the memory bandwidth bottleneck on > the Intel product is too much of a negative factor. > > Lastly, for clusters, most want a rackmount chassis. > We developed a 2U designed for a server board and 2 GPU boards. > The big challenge there is power. > > We use dual 600W PSUs. One for motherboard, and one for dual GPU boards. > > > -- > With our best regards, > > Maurice W. Hilarius Telephone: 01-780-456-9771 > Hard Data Ltd. FAX: 01-780-456-9772 > 11060 - 166 Avenue email:maurice@harddata.com > Edmonton, AB, Canada http://www.harddata.com/ > T5X 1Y3 > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > From i.kozin at dl.ac.uk Fri Sep 5 04:36:53 2008 From: i.kozin at dl.ac.uk (Kozin, I (Igor)) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: Message-ID: > The new Dell R5400 Rackmount workstation is ideal for this. You can > slip two Xeons, 16GB ram and two chunky graphics cards in there. The slots in R5400 are PCIe gen1 and 300W total for the graphics might be a bit too low. The best I've seen so far are 1U HP DL160G5 servers which offer two PCIe 16x Gen2 slots. 
Granted, you will not be able to fit a powerful graphics card in there, but a Tesla setup works quite well.

There is a very interesting recent report published by HP:
http://www.hp.com/techservers/hpccn/hpccollaboration/ADCatalyst/downloads/accelerating-HPCUsing-GPUs.pdf
They benchmarked a DL160G5 (with a single processor, hence a pretty low cost for the host server) with an S870 attached to it. The observed peak performance on SGEMM was about 200 GFLOPS, which is much lower than the theoretical peak of 512 GFLOPS (and even much less than the 350 GFLOPS sustained claimed by Nvidia). When they factor in I/O, the performance rapidly approaches that of an Intel quad-core. That's not to say GPUs are useless even at single precision; some results are pretty good. The team promised to benchmark FireStream next.

> Generally a XEON or Opteron chipset and CPUs will be the choice.
>
> Also, for most GPU/FPU performance work, the memory bandwidth bottleneck
> on the Intel product is too much of a negative factor.

Yes, memory bandwidth can be a problem for Intel servers. Now. But we all know this is going to change soon.
More surprisingly, Opteron-based servers do not offer PCIe Gen2 just yet. Perhaps it has been a long time since I last checked. The paper cited above indicates a very significant impact of PCIe Gen2 on the bandwidth.

From gerry.creager at tamu.edu Fri Sep 5 05:51:58 2008
From: gerry.creager at tamu.edu (Gerry Creager)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Re: GPU boards and cluster servers.
In-Reply-To:
References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO>
Message-ID: <48C12B6E.4090903@tamu.edu>

At $6k US, and requiring me to get Vista, I'd rather build a system starting with, e.g., an Asus motherboard. I save one-third the price and I don't have to file the environmental impact statement on the flawed OS.
I also get NICs I can easily set to accommodate Jumbo Frames. gerry andrew holway wrote: > The new Dell R5400 Rackmount workstation is ideal for this. You can > slip two Xeons, 16GB ram and two chunky graphics cards in there. > > ta > > Andy > > On Fri, Sep 5, 2008 at 6:06 AM, Li, Bo wrote: >> Hello, >> It seems your platform is more suitable for a cluster. Great, and when are >> the products available? And is there any software support from you? >> Regards, >> Li, Bo >> >> ----- Original Message ----- >> From: Maurice Hilarius >> To: Li, Bo >> Cc: kus@free.net ; i.kozin@dl.ac.uk ; Beowulf Mailing List >> Sent: Friday, September 05, 2008 9:36 AM >> Subject: GPU boards and cluster servers. >> Li, Bo wrote: >> >> Hello, >> Is it too expensive for the platform? >> The easy solution is: >> And X48 level motherboard with CF support, about $150 >> Q6600 Processor, about $170 >> Two 4870X2 $1,100 >> Two Seagate SATA Harddisk 500G for Raid1, about $140 >> 4*2G DDR2 RAM, about $150 >> PSU 1000W, about $200 >> A big box, about $100 >> >> That's all, in total, $2,010. >> Regards, >> Li, Bo >> >> True, to a point. >> Most people will not use a desktop board for a cluster. >> Too I/O bound. >> Finally the memory capacity of these desktop boards is pretty limiting. >> Typically 8GB maximum. >> >> >> Generally a XEON or Opteron chipset and CPUs will be the choice. >> >> Also, for most GPU/FPU performance work, the memory bandwidth bottleneck on >> the Intel product is too much of a negative factor. >> >> Lastly, for clusters, most want a rackmount chassis. >> We developed a 2U designed for a server board and 2 GPU boards. >> The big challenge there is power. >> >> We use dual 600W PSUs. One for motherboard, and one for dual GPU boards. >> >> >> -- >> With our best regards, >> >> Maurice W. Hilarius Telephone: 01-780-456-9771 >> Hard Data Ltd. 
FAX: 01-780-456-9772 >> 11060 - 166 Avenue email:maurice@harddata.com >> Edmonton, AB, Canada http://www.harddata.com/ >> T5X 1Y3 >> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From prentice at ias.edu Fri Sep 5 07:48:16 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Infiniband Subnet Manager In-Reply-To: <20080904221636.GA4234@hpegg.wr.niftyegg.com> References: <48B69CEE.3040802@ias.edu> <48C02AE3.9040806@ias.edu> <20080904221636.GA4234@hpegg.wr.niftyegg.com> Message-ID: <48C146B0.1020102@ias.edu> Nifty niftyompi Mitch wrote: >> I've gotten a lot of response to my IB questions that I posed to the >> list. Thanks for all your help. All of my questions have been answered. >> It turns out, as some as you pointed out, that my switch will have a >> built-in subnet manager, so I won't need to run one on a node. >> > > I should add that a built in subnet manager is extra $$. Also they tend > to run on a modest dedicated processor card. The modest dedicated card > solutions have limited RAM and will not support a gonzo big fabric. > A rule of thumb, depending on the subnet manager on the card the run out of memory > recources trip point is about 144 ports. > > The richer the statistics gathered and retained the larger the footprint is. > > Large fabrics will need a host based subnet manager. 
Thanks. My cluster is only 64 nodes, so that shouldn't be a problem.

--
Prentice

From rgb at phy.duke.edu Fri Sep 5 08:29:02 2008
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Re: GPU boards and cluster servers.
In-Reply-To: <48C12B6E.4090903@tamu.edu>
References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu>
Message-ID:

On Fri, 5 Sep 2008, Gerry Creager wrote:

> At $6k US, and requiring me to get Vista, I'd rather build a system starting
> with, e.g., an Asus motherboard. I save one-third the price and I don't have
> to file the environmental impact statement on the flawed OS. I also get NICs
> I can easily set to accommodate Jumbo Frames.

If you talk to a Dell rep, you can ALMOST invariably get any server-class system they sell without an operating system or with Linux installed, especially if you are ordering in quantity.

Just FYI -- otherwise I don't disagree with anything you said, especially Vista of Evil. Although hey, it runs great on 4 GB and up systems, at least if you don't run large applications on it... or so I'm told.

rgb

>
> gerry
>
> andrew holway wrote:
>> The new Dell R5400 Rackmount workstation is ideal for this. You can
>> slip two Xeons, 16GB ram and two chunky graphics cards in there.
>>
>> ta
>>
>> Andy
>>
>> On Fri, Sep 5, 2008 at 6:06 AM, Li, Bo wrote:
>>> Hello,
>>> It seems your platform is more suitable for a cluster. Great, and when are
>>> the products available? And is there any software support from you?
>>> Regards,
>>> Li, Bo
>>>
>>> ----- Original Message -----
>>> From: Maurice Hilarius
>>> To: Li, Bo
>>> Cc: kus@free.net ; i.kozin@dl.ac.uk ; Beowulf Mailing List
>>> Sent: Friday, September 05, 2008 9:36 AM
>>> Subject: GPU boards and cluster servers.
>>> Li, Bo wrote: >>> >>> Hello, >>> Is it too expensive for the platform? >>> The easy solution is: >>> And X48 level motherboard with CF support, about $150 >>> Q6600 Processor, about $170 >>> Two 4870X2 $1,100 >>> Two Seagate SATA Harddisk 500G for Raid1, about $140 >>> 4*2G DDR2 RAM, about $150 >>> PSU 1000W, about $200 >>> A big box, about $100 >>> >>> That's all, in total, $2,010. >>> Regards, >>> Li, Bo >>> >>> True, to a point. >>> Most people will not use a desktop board for a cluster. >>> Too I/O bound. >>> Finally the memory capacity of these desktop boards is pretty limiting. >>> Typically 8GB maximum. >>> >>> >>> Generally a XEON or Opteron chipset and CPUs will be the choice. >>> >>> Also, for most GPU/FPU performance work, the memory bandwidth bottleneck >>> on >>> the Intel product is too much of a negative factor. >>> >>> Lastly, for clusters, most want a rackmount chassis. >>> We developed a 2U designed for a server board and 2 GPU boards. >>> The big challenge there is power. >>> >>> We use dual 600W PSUs. One for motherboard, and one for dual GPU boards. >>> >>> >>> -- >>> With our best regards, >>> >>> Maurice W. Hilarius Telephone: 01-780-456-9771 >>> Hard Data Ltd. FAX: 01-780-456-9772 >>> 11060 - 166 Avenue email:maurice@harddata.com >>> Edmonton, AB, Canada http://www.harddata.com/ >>> T5X 1Y3 >>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf@beowulf.org >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >>> >>> >> _______________________________________________ >> Beowulf mailing list, Beowulf@beowulf.org >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > -- Robert G. Brown Phone(cell): 1-919-280-8443 Duke University Physics Dept, Box 90305 Durham, N.C. 
27708-0305 Web: http://www.phy.duke.edu/~rgb Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977 From diep at xs4all.nl Fri Sep 5 11:23:58 2008 From: diep at xs4all.nl (Vincent Diepeveen) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu> Message-ID: AFAIK only rich government departments do business with companies such as DELL. If you're real big you HAVE to sign some sort of deal with a big store anyway. DELL delivers very old junk for the price you can get newer junk usual. Big companies have big overhead, maybe sometimes simply because they decided they WANT x% profit on all deals. I remember at some company i worked for a few months, that i got from DELL hardware that wasn't getting sold even anymore. Obviously that means the service contract in question is just wasting government money. The big trick all salesman understand and civil servant type managers/ directors do not so well understand is that something that's new this year, that if you deliver that 2 years from now, that it's total outdated. Not long ago i heard that a specific company bought off a service contract delivering XT machines. For those here who want to run artificial intelligent software they all have a big need for crunching power at a low price, much in contradiction to the rest here which just wants double precision AND big bandwidth AND big ram usually AND 100% reliability. At 100% reliability and BIG ram AND big bandwidth to huge RAM, there is a big price. Optimizations in that category tree searching software and such (encryption is just a subsection of it) happen at a level which most guys here at this list will never understand. 
It is much better optimized than other software. Sometimes it involves the kind of optimizations that make hardware engineers, not exactly laymen, say "oh dear, is that the case?" when you talk to them.

Suppose you search for the holy grail on a GPU and one GPU's RAM is bad, so all your calculations there fail. Heh, you won't even notice it soon, as you have no 'result' that is deterministically verifiable.

An example is a parameter optimization I want to perform for my chess program. I would need to write a new program for it, which is huge for a GPU; basically that program would be only the evaluation function of my program.

Such optimization runs are embarrassingly parallel. What matters is simply how many instructions per cycle I can push through. Having a few GPUs at home do that crunching work is very attractive.

The biggest problem with GPUs is that I have no money to buy a machine to put a GPU in, let alone buy GPUs just to toy with. So that's why a friend of mine is hopefully going to run it on 160 or so Xeon cores at TU-Delft, when the machines are idle and not being used by others.

Probably the best project name for this would be Ikarus, were it not already the name of an existing chess program, as the final goal is to explore possibilities after automatically recognizing new patterns in a later phase. It will run for years.

The big difference with this type of crunching is that if something goes wrong, that's not a problem; if a bit flips or whatever, it doesn't matter. I just need the best parameter set it can find for me when searching for the holy grail.

I understand Bo Li there very well. He wants the maximum amount of crunching power and can make do with 32 bits.

A good 1000 watt PSU here is around 100 euro. For under 500 euro you can assemble a great box; then you only need to add the GPUs.
Getting 40% performance out of a videocard is very impressive by the way, especially if i consider that no one around me with different types of software (from statistical software to monte carlo to multimedia encoding) doesn't get anywhere near that performance out of it. Yet the difference is, is that all persons here who are cheering for the GPU crunching power, are the same type of guys. Though on paper the software is doing something total different, they all search for some sort of holy grail in an embarrassingly parallel manner. The failed attempts are usually game tree searches that need to combine somehow results using hashtables and/or FFT type tries. Vincent On Sep 5, 2008, at 5:29 PM, Robert G. Brown wrote: > On Fri, 5 Sep 2008, Gerry Creager wrote: > >> At $6k US, and requiring me to get Vista, I'd rather build a >> system starting with, e.g., an Asus motherboard. I save one-third >> the price and I don't have to file the environmental impact >> statement on the flawed OS. I also get NICs I can easily set to >> accommodate Jumbo Frames. > > If you talk to a Dell rep, you can ALMOST invariably get any > server-class system they sell without an operating system or with > Linux > installed, especially if you are ordering in quantity. > > Just FYI -- otherwise I don't disagree with anything you said, > especially Vista of Evil. Although hey, it runs great on 4 GB and up > systems, at least if you don't run large applications on it... or > so I'm > told. > > rgb > >> >> gerry >> >> andrew holway wrote: >>> The new Dell R5400 Rackmount workstation is ideal for this. You can >>> slip two Xeons, 16GB ram and two chunky graphics cards in there. >>> ta >>> Andy >>> On Fri, Sep 5, 2008 at 6:06 AM, Li, Bo wrote: >>>> Hello, >>>> It seems your platform is more suitable for a cluster. Great, >>>> and when are >>>> the products available? And is there any software support from you? 
>>>> Regards, >>>> Li, Bo >>>> ----- Original Message ----- >>>> From: Maurice Hilarius >>>> To: Li, Bo >>>> Cc: kus@free.net ; i.kozin@dl.ac.uk ; Beowulf Mailing List >>>> Sent: Friday, September 05, 2008 9:36 AM >>>> Subject: GPU boards and cluster servers. >>>> Li, Bo wrote: >>>> Hello, >>>> Is it too expensive for the platform? >>>> The easy solution is: >>>> And X48 level motherboard with CF support, about $150 >>>> Q6600 Processor, about $170 >>>> Two 4870X2 $1,100 >>>> Two Seagate SATA Harddisk 500G for Raid1, about $140 >>>> 4*2G DDR2 RAM, about $150 >>>> PSU 1000W, about $200 >>>> A big box, about $100 >>>> That's all, in total, $2,010. >>>> Regards, >>>> Li, Bo >>>> True, to a point. >>>> Most people will not use a desktop board for a cluster. >>>> Too I/O bound. >>>> Finally the memory capacity of these desktop boards is pretty >>>> limiting. >>>> Typically 8GB maximum. >>>> Generally a XEON or Opteron chipset and CPUs will be the choice. >>>> Also, for most GPU/FPU performance work, the memory bandwidth >>>> bottleneck on >>>> the Intel product is too much of a negative factor. >>>> Lastly, for clusters, most want a rackmount chassis. >>>> We developed a 2U designed for a server board and 2 GPU boards. >>>> The big challenge there is power. >>>> We use dual 600W PSUs. One for motherboard, and one for dual GPU >>>> boards. >>>> -- >>>> With our best regards, >>>> Maurice W. Hilarius Telephone: 01-780-456-9771 >>>> Hard Data Ltd. 
FAX: 01-780-456-9772
>>>> 11060 - 166 Avenue email:maurice@harddata.com
>>>> Edmonton, AB, Canada http://www.harddata.com/
>>>> T5X 1Y3
>>>> _______________________________________________
>>>> Beowulf mailing list, Beowulf@beowulf.org
>>>> To change your subscription (digest mode or unsubscribe) visit
>>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf@beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>>
>
> --
> Robert G. Brown Phone(cell): 1-919-280-8443
> Duke University Physics Dept, Box 90305
> Durham, N.C. 27708-0305
> Web: http://www.phy.duke.edu/~rgb
> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>

From perry at piermont.com Fri Sep 5 13:19:16 2008
From: perry at piermont.com (Perry E. Metzger)
Date: Thu Nov 20 01:07:42 2008
Subject: [Beowulf] Re: GPU boards and cluster servers.
In-Reply-To: (Vincent Diepeveen's message of "Fri\, 5 Sep 2008 20\:23\:58 +0200")
References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu>
Message-ID: <871vzys5vf.fsf@snark.cb.piermont.com>

Vincent Diepeveen writes:
> AFAIK only rich government departments do business with companies
> such as DELL.

I often buy their equipment when I'm just looking for one ordinary 1U rackmount and such -- they're often the lowest-price vendor or nearly so.
If I have to do something unusual, I don't talk to them, but that's not surprising as they specialize in providing standard stuff cheap, not in providing unusual things. Perry From csamuel at vpac.org Sun Sep 7 01:58:32 2008 From: csamuel at vpac.org (Chris Samuel) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] hang-up of HPC Challenge In-Reply-To: <1152250.21220687590853.JavaMail.csamuel@ubuntu> Message-ID: <13080772.41220687850031.JavaMail.csamuel@ubuntu> ----- "Mikhail Kuzminsky" wrote: Hi Mikhail, Sorry for the delay in getting back to you, work has been keeping me very occupied! > In message from Chris Samuel (Wed, 20 Aug 2008 > 11:12:52 +1000 (EST)): > > >Does the code crash, does it just stop & idle, does it > >busy loop, does the node oops, does it lockup, etc ? > > I believe that a program crash is not a hang-up. When I wrote > about a Linux hang-up, I meant that Linux doesn't respond to > any interrupts - from the keyboard, from ssh client requests, etc. That really sounds like you're hitting either a kernel or a hardware issue - it might be worth trying out the BreakIn tool that Jason posted about elsewhere on the list: http://www.advancedclustering.com/software/breakin.html > I use 2.6.22.5-31 kernel from SuSE 10.3 distribution. That's pretty old now; I'd strongly suggest trying out the current mainline kernel on there, which works pretty well on our SuperMicro based Barcelona cluster. cheers! Chris -- Christopher Samuel - (03) 9925 4751 - Systems Manager The Victorian Partnership for Advanced Computing P.O. Box 201, Carlton South, VIC 3053, Australia VPAC is a not-for-profit Registered Research Agency From maurice at harddata.com Fri Sep 5 07:32:50 2008 From: maurice at harddata.com (Maurice Hilarius) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: References: Message-ID: <48C14312.4010502@harddata.com> Kozin, I (Igor) wrote: > .. > Yes, memory bandwidth can be a problem for Intel servers. Now.
But we > all know this is going to change soon. > "Soon"? We are hearing about a year. In the meantime AMD "Shanghai" with 4 cores on a 45nm process ships this year, also with 6MB of L3 cache. This bump basically puts the AMD line even with Intel on clock speeds, meaning Intel will have to drop the prices on the higher clocks to be competitive, a good thing for all of us. Come mid 2009 AMD releases "Istanbul" with 6 cores, 6MB L3 cache. HT3 means the memory bandwidth interconnects double in speed, pulling them well ahead of the new Intel designs in terms of memory bandwidth. And, remember these still use simple DDR2. No FB-DIMMs, DDR3, or other expensive tricks. If we are forward looking, expect 8 and 12 core Opterons by the end of 2009. The Fiorano chipset comes then too, with 4 x PCI-E @ 2.0 > More surprisingly Opteron based servers do not offer PCIe Gen2 just yet. > No, but it is not really needed (yet). Barcelona already offers 2 links to the CPUs at 8GB/sec, so supporting enough PCI-E lanes for any board designs is easy. When you can support 4 x PCI-E 16 lanes RIGHT NOW (as opposed to 4 x PCI-E 8 lanes on Intel chipsets), why do you need PCI-E 2.0? Especially when you have the memory bandwidth to support it. Until Intel chipsets get better bandwidth to RAM all the slots in the world are irrelevant. > Perhaps it was long time ago when I checked it last time. The paper > cited above indicates very significant impact of PCIe Gen2 on the > bandwidth. > > They do, on the newer chipsets. -- With our best regards, Maurice W. Hilarius Telephone: 01-780-456-9771 Hard Data Ltd. FAX: 01-780-456-9772 11060 - 166 Avenue email:maurice@harddata.com Edmonton, AB, Canada http://www.harddata.com/ T5X 1Y3 From timchipman at myrealbox.com Fri Sep 5 11:13:30 2008 From: timchipman at myrealbox.com (Tim Chipman) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Q: AMD Opteron (Barcelona) 2356 vs Intel Xeon 5460 Message-ID: <1220638410.4385019ctimchipman@myrealbox.com> Very likely a hopeless question, with this little information, but just in case: Does anyone have any 'real world' experience with 'both' of these CPUs, in terms of relative performance for 'whatever work you do' ? I realize the Xeon is a higher-clocked part (3.16 GHz Xeon vs 2.3 GHz Opteron), so I'm more concerned with "relative performance per MHz". I'm involved with a cluster project, and we have 2 options from our vendor, - 25 compute nodes, dual-quadcore intel 5460, or - 32 compute nodes, dual-quadcore amd 2356 From what I've been able to glean (Spec.org / SpecCPU 2006), - the intel chips have better integer performance - the amd chips have better FPU performance so the likely anticipated real-world performance result .. will depend on how a given application blends / balances things. Any comments / thoughts are certainly appreciated. --Tim Chipman From thpierce at gmail.com Sat Sep 6 08:26:35 2008 From: thpierce at gmail.com (Tom Pierce) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu> Message-ID: <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> On Fri, Sep 5, 2008 at 2:23 PM, Vincent Diepeveen wrote: > AFAIK only rich government departments do business with companies such as > DELL. I buy DELL servers for a cluster at a commercial chemical company. They are a good price for standard systems.
They also have a great Linux support organization in Austin, Texas. Good equipment and high quality support for issues that arise over time. It is a cost effective solution, and Dell clusters keep popping up at US Universities as well. Tom From carsten.aulbert at aei.mpg.de Mon Sep 8 11:16:31 2008 From: carsten.aulbert at aei.mpg.de (Carsten Aulbert) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Q: AMD Opteron (Barcelona) 2356 vs Intel Xeon 5460 In-Reply-To: <1220638410.4385019ctimchipman@myrealbox.com> References: <1220638410.4385019ctimchipman@myrealbox.com> Message-ID: <48C56BFF.6000304@aei.mpg.de> Hi Tim, Tim Chipman wrote: > Does anyone have any 'real world' experience with 'both' of these CPUs, in terms of relative performance for 'whatever work you do' ? > > I realize the xeon is a faster mhz part (3.16ghz xeon vs 2.3ghz opteron) so I'm more concerned with "relative performance per mhz" > > I'm involved with a cluster project, and we have 2 options from our vendor, > > - 25 compute nodes, dual-quadcore intel 5460, or > - 32 compute nodes, dual-quadcore amd 2356 > > From what I've been able to glean (Spec.org / SpecCPU 2006), > > - the intel chips have better integer performance > - the amd chips have better FPU performance > > so the likely anticipated real-world performance result .. will depend on how a given application blends / balances things. We have "only" the Quad-Xeon boxes with E5435 and these are quite fast; indeed, FFTW seems to run faster on Xeons, but I have not run any benchmarks for the past ~ 9 months, so I don't know about the latest Opterons. I think you need to come up with a real world scenario of what will be run on the cluster and maybe compile a little benchmark yourself and ask the vendor to run both (or get hands-on access to both boxes).
I think that's the only "fair" comparison that's possible. HTH Carsten From tom.elken at qlogic.com Mon Sep 8 11:23:48 2008 From: tom.elken at qlogic.com (Tom Elken) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> Message-ID: <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> From: beowulf-bounces@beowulf.org [mailto:beowulf-bounces@beowulf.org] On Behalf Of Tom Pierce ... It is a cost effective solution, and Dell clusters keep popping up at US Universities as well. Tom The same is true at UK Universities. -Tom From prentice at ias.edu Mon Sep 8 11:58:36 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> Message-ID: <48C575DC.2000608@ias.edu> Tom Elken wrote: > From: beowulf-bounces@beowulf.org > [mailto:beowulf-bounces@beowulf.org] On Behalf Of Tom Pierce > > > ... It is a cost effective solution, and Dell clusters keep > popping up at US Universities as well. > > Tom > > The same is true at UK Universities. 
> > -Tom I think these trends have more to do with the low cost of Dell hardware and Dell's sales force and marketing to upper management than they do with any technical advantages Dell has over the competition. I have no problem with Dell hardware. There's nothing wrong with it, and Beowulf clusters are *supposed* to be based on affordable commodity hardware. I didn't see much basis for the earlier post disparaging Dell hardware. However, if you're buying a "turn-key" cluster solution based on advertised "clustering services", I'd be cautious with *any* of the big vendors, where you're known mostly as a customer ID in their CRM database, and you're dealing with salespeople and not technical people. Especially if they offer any kind of "customization" -- more than likely, any customization will be at additional cost, and you'll still have to do some reconfiguration to get it set up the way you want (and it is therefore no longer a turn-key system). I've got a few stories, but the guilty shall remain nameless. Getting back to hardware, I've always been impressed with the robustness of HP Proliant hardware -- Prentice From landman at scalableinformatics.com Mon Sep 8 12:01:59 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> Message-ID: <48C576A7.3020900@scalableinformatics.com> Tom Elken wrote: > From: beowulf-bounces@beowulf.org > [mailto:beowulf-bounces@beowulf.org] On Behalf Of Tom Pierce > > > ...
It is a cost effective solution, and Dell clusters keep > popping up at US Universities as well. > > Tom > > The same is true at UK Universities. > > -Tom Don't know about the UK universities, but more than a couple US ones have signed single source agreements with Dell (or HP or [insert the large tier 1 vendor of your choice]). Makes selling things to these folks often a little more of a challenge ... not for pricing reasons, but purely for paperwork reasons. I find it amusing on good days when we are asked to write a sole-source memo for our customers. I won't comment on whether or not this is the right thing to do, or even a good thing to do for the universities. Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From apittman at concurrent-thinking.com Mon Sep 8 12:05:37 2008 From: apittman at concurrent-thinking.com (Ashley Pittman) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <48C575DC.2000608@ias.edu> References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> Message-ID: <1220900737.4641.31.camel@bruce.priv.wark.uk.streamline-computing.com> On Mon, 2008-09-08 at 14:58 -0400, Prentice Bisbal wrote: > I think these trends have more to do with the cheap cost of Dell > Hardware and Dell's sales force and marketing to upper management than > they do with any technical advantages Dell has over the competition. > > I have no problem with Dell Hardware. 
There's nothing wrong with it, and > Beowulf clusters are *supposed* to be based on affordable commodity > hardware. I didn't see much basis for the earlier post disparaging Dell > hardware. > > However, If you're buying a "turn-key" cluster solution based on > advertised "clustering services", I'd be cautious with *any* of the big > vendors, where you're known mostly as a customer ID in their CRM > database, and you're dealing with salespeople and not technical people. > Especially if they offer any kind of "customization" -- more than > likely, any customization will be at additional costs, and you'll still > have to do some reconfiguration to get it set up the way you want (and is > therefore no longer a turn-key system). I've got a few stories, but the > guilty shall remain nameless. You don't have to buy Dell hardware direct from Dell; there are plenty of people who will sell you Dell nodes with value-add hardware and software. Ashley, From landman at scalableinformatics.com Mon Sep 8 12:26:40 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <48C575DC.2000608@ias.edu> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> Message-ID: <48C57C70.3010704@scalableinformatics.com> Prentice Bisbal wrote: [...] > Getting back to hardware, I've always been impressed with the robustness > of HP Proliant hardware Of course, the dirty little (not so) secret of tier 1 systems is that they are all built by the same 2-3 contract manufacturers, from the same parts troughs ...... .... there is an economic reason for this.
-- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From prentice at ias.edu Mon Sep 8 13:18:43 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <48C57C70.3010704@scalableinformatics.com> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> <48C57C70.3010704@scalableinformatics.com> Message-ID: <48C588A3.5090606@ias.edu> Joe Landman wrote: > Prentice Bisbal wrote: > > [...] > >> Getting back to hardware, I've always been impressed with the robustness >> of HP Proliant hardware > > Of course, the dirty little (not so) secret of tier 1 systems is that > they are all built by the same 2-3 contract manufacturers, from the same > parts troughs ...... > > .... there is an economic reason for this. I'm sure. The bike business is the same way. For the most part (there are exceptions), a handful of bike factories make all the bikes for the different bike companies, and the country of manufacture determines the cost of manufacturing and price: Japan: highest quality and price; Taiwan: middle quality and price; China: lowest quality and price. Usually the highest-end models are made in-house, so this doesn't apply to them. Even at the same factory, each vendor can specify the quality of manufacture (quality of materials/components, tolerances, and thoroughness of Q/A testing), and pay accordingly. I'm sure even in the computer world a similar rule applies.
$ = cheap components, $$= better components, etc. -- Prentice From landman at scalableinformatics.com Mon Sep 8 14:07:22 2008 From: landman at scalableinformatics.com (Joe Landman) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. In-Reply-To: <48C588A3.5090606@ias.edu> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> <48C57C70.3010704@scalableinformatics.com> <48C588A3.5090606@ias.edu> Message-ID: <48C5940A.606@scalableinformatics.com> Prentice Bisbal wrote: > I'm sure even in the computer world a similar rule applies. $ = cheap > components, $$= better components, etc. A Xeon is a Xeon is a Xeon. Some RAM DIMM builders use ... ah ... less than spectacular ... parts. But peel off some of the carefully applied labels on the tier-1 units and you find some ... interesting ... things beneath (usually the labels that say you void your warranty if you remove them). -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com http://jackrabbit.scalableinformatics.com phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From prentice at ias.edu Mon Sep 8 14:13:07 2008 From: prentice at ias.edu (Prentice Bisbal) Date: Thu Nov 20 01:07:42 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. 
In-Reply-To: <48C5940A.606@scalableinformatics.com> References: <200809021901.m82J0IaS029547@bluewest.scyld.com><48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO><48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO><48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> <48C57C70.3010704@scalableinformatics.com> <48C588A3.5090606@ias.edu> <48C5940A.606@scalableinformatics.com> Message-ID: <48C59563.2080404@ias.edu> Joe Landman wrote: > Prentice Bisbal wrote: > >> I'm sure even in the computer world a similar rule applies. $ = cheap >> components, $$= better components, etc. > > A Xeon is a Xeon is a Xeon. > > Some RAM DIMM builders use ... ah ... less than spectacular ... parts. > > But peel off some of the carefully applied labels on the tier-1 units > and you find some ... interesting ... things beneath (usually the labels > that say you void your warranty if you remove them). > Been there, done that. I love that SGI wanted $4k for a fibre-channel HBA for an Origin 350, which was made by QLogic, and bought the exact same thing for only $800 from CDW. Of course, SGI would refuse to support it if I ever had a tech support issue, but that never happened. I know Sun and the other big Unix Co's did the same thing, which is why we all use Linux on commodity hardware these days. -- Prentice From perry at piermont.com Mon Sep 8 14:15:27 2008 From: perry at piermont.com (Perry E. Metzger) Date: Thu Nov 20 01:07:43 2008 Subject: [Beowulf] Re: GPU boards and cluster servers. 
In-Reply-To: <48C5940A.606@scalableinformatics.com> (Joe Landman's message of "Mon\, 08 Sep 2008 17\:07\:22 -0400") References: <200809021901.m82J0IaS029547@bluewest.scyld.com> <48BF1508.7000406@harddata.com> <002001c90e58$3c20c020$6300a8c0@LIBO> <48C08D34.1060704@harddata.com> <002401c90f15$38e48500$6300a8c0@LIBO> <48C12B6E.4090903@tamu.edu> <25e9e5ad0809060826n33904e52pb9f3325fa254cc20@mail.gmail.com> <6DB5B58A8E5AB846A7B3B3BFF1B4315A02504889@AVEXCH1.qlogic.org> <48C575DC.2000608@ias.edu> <48C57C70.3010704@scalableinformatics.com> <48C588A3.5090606@ias.edu> <48C5940A.606@scalableinformatics.com> Message-ID: <877i9mfifk.fsf@snark.cb.piermont.com> Joe Landman writes: > Prentice Bisbal wrote: > >> I'm sure even in the computer world a similar rule applies. $ = cheap >> components, $$= better components, etc. > > A Xeon is a Xeon is a Xeon. > > Some RAM DIMM builders use ... ah ... less than spectacular ... parts. > > But peel off some of the carefully applied labels on the tier-1 units > and you find some ... interesting ... things beneath (usually the > labels that say you void your warranty if you remove them). There is considerable difference in quality between different motherboards, even if all the Xeons you put in them are the same. Another bi