From john.hearns at mclaren.com Mon Oct 4 08:04:45 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Mon, 4 Oct 2010 16:04:45 +0100 Subject: [Beowulf] DIY CAVE - Liquid Galaxy Message-ID: <68A57CCFD4005646957BD2D18E60667B11FCE25F@milexchmb1.mil.tagmclarengroup.com> Yet again another Register article... http://code.google.com/p/liquid-galaxy/ Might be interesting, as this is a cluster of computers used to make an immersive visualisation setup. The 'secret sauce' is a feature in Google Earth which makes it really easy to slave displays together: If send == true, sets the IP where the datagrams are sent ; Can be a broadcast address ViewSync/hostname = SLAVE_IP_GOES_HERE ViewSync/port = 21567 John Hearns McLaren Racing The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From bug at sas.upenn.edu Mon Oct 4 09:40:45 2010 From: bug at sas.upenn.edu (Gavin W. Burris) Date: Mon, 04 Oct 2010 12:40:45 -0400 Subject: [Beowulf] DIY CAVE - Liquid Galaxy In-Reply-To: <68A57CCFD4005646957BD2D18E60667B11FCE25F@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B11FCE25F@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CAA038D.4000405@sas.upenn.edu> Neat! I have built tiled display walls in the past. You can also do a similar thing for any video that VLC will play. Or you can use the DMX project to link all Xorg displays into one giant interactive desktop. http://viz.aset.psu.edu/ga5in/DisplayWall.html On 10/04/2010 11:04 AM, Hearns, John wrote: > Yet again another Register article... > > http://code.google.com/p/liquid-galaxy/ > > > Might be interesting, as this is a cluster of computers used to make an > immersive visualisation setup. > The 'secret sauce' is a feature in Google Earth which makes it really > easy to slave displays together: > > > If send == true, sets the IP where the datagrams are sent > ; Can be a broadcast address > ViewSync/hostname = SLAVE_IP_GOES_HERE > ViewSync/port = 21567 > > > > John Hearns > McLaren Racing > > The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Gavin W. Burris Senior Systems Programmer Information Security and Unix Systems School of Arts and Sciences University of Pennsylvania From mdidomenico4 at gmail.com Mon Oct 4 13:12:20 2010 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Mon, 4 Oct 2010 16:12:20 -0400 Subject: [Beowulf] Begginers question # 1 In-Reply-To: <864070.13807.qm@web51103.mail.re2.yahoo.com> References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: To answer your question directly, the answer is, no the performance is not the same. BUT, you've asked a very workload dependent question, but have not told us anything about what you're trying to do, so deciding which is the right choice is pretty hard. It would be unwise to make this decision solely on price, without understanding the trade offs in productivity. 
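One concrete way to see that gap for yourself, once you have any two boxes (or even two cores) to play with, is a quick ping-pong timing. The sketch below is illustrative only, plain C plus MPI and nothing tuned, and the mpirun options vary a little between MPI implementations; run it once with both ranks on the same board and once with one rank per board and compare:

/* pingpong.c - rough MPI ping-pong latency sketch (illustrative, not tuned).
 * Build: mpicc pingpong.c -o pingpong
 * Run:   mpirun -np 2 ./pingpong             (both ranks on one board)
 *        mpirun -np 2 -host a,b ./pingpong   ("a" and "b" are placeholder
 *                                              hostnames for two boards)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[8];                    /* tiny message, so we measure latency */
    int rank, i, reps = 1000;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, 8, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, 8, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average one-way latency: %.2f microseconds\n",
               (t1 - t0) / reps / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}

Shared-memory ranks typically come back well under a microsecond while gigabit Ethernet is in the tens of microseconds, and that two-orders-of-magnitude difference is exactly the trade off whose importance depends on your workload.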
On Sat, Sep 25, 2010 at 1:07 PM, gabriel lorenzo wrote: > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > If I build a cluster with 8 motherboards with 1 single core each would it be the same as using just one motherboard but with two quad core processors? I wanna build one of these but wanna save money and space and if what counts is the amount of cores to process info I think fewer motherboards with dual six-core processors is definitely cheaper just because I wont be needing that many mothers power supplies etc. thanks > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From hahn at mcmaster.ca Mon Oct 4 18:44:49 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Mon, 4 Oct 2010 21:44:49 -0400 (EDT) Subject: [Beowulf] Begginers question # 1 In-Reply-To: <864070.13807.qm@web51103.mail.re2.yahoo.com> References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? no. it's the application that counts. > If I build a cluster with 8 motherboards with 1 single core each would it > be the same as using just one motherboard but with two quad core > processors? of course not. communication among cores on a single board will certainly be faster than inter-board communication. it's the application that matters: how frequently do threads/ranks of the application communicate? are messages small or large? can the app's communication be formulated as mostly-read sharing of data? these are all very much properties of the application, and they determine how suitable any particular hardware will be. > I wanna build one of these but wanna save money and space and > if what counts is the amount of cores to process info I think fewer > motherboards with dual six-core processors is definitely cheaper just > because I wont be needing that many mothers power supplies etc. thanks power supplies aren't your main concern, since good ones are about 93% efficient. but going with more-core systems is, in general, a good idea. mainly for amortization reasons: probably fewer disks, extraneous sutff like video interfaces, fewer parts to fail, fewer systems to administer, etc. there can be disadvantages to more-core systems too, since some of the parts being shared (amortized) may be performance bottlenecks. the sweet spots depends on what systems are in volume production - right now, 2-socket systems are the right building block in most cases. 4-socket systems would be attractive, but they tend to ship in so much lower volume that their price is nonlinearly high. 1-socket servers tend to cost more than half a 2-socket (where "server" means at least "has ECC memory" - that is, not a desktop.) From joshua_mora at usa.net Mon Oct 4 21:27:17 2010 From: joshua_mora at usa.net (Joshua mora acosta) Date: Mon, 04 Oct 2010 23:27:17 -0500 Subject: [Beowulf] Begginers question # 1 Message-ID: <583oJeeAR3744S04.1286252837@web04.cms.usa.net> Hello Gabriel Beginner's questions are usually the harder ones ;) Without any personal interest, here you have an easy reading that should help you break the ice http://www.sun.com/x64/ebooks/hpc_for_dummies.pdf In my opinion this (ie. HPC) is a very experimental field on both HW and SW so your best way to learn is by getting on something very affordable, and trying to use it as much as you can. 
For that it will be good to get familiarized with profilers ( performance counter tools ) so you gain confidence in what you do. That will force you to learn what is capable the whole thing (specially your application). Then once the app can use it all and you find what part of the HW or/and SW is limiting the performance, start being demanding in that direction (there are many directions), but one thing at a time or very few at a time if you know how each thing contributes. That process is lengthy and the newsgroup could answer much better specific questions rather than generic ones while you go through it. At the end, well there is no end, just a continuous refactoring process of "your own solution" that you impose to yourself while you try to keep up with the technologies that will allow you to get to the next computational/science challenge. Best regards, Joshua Mora. ------ Original Message ------ Received: 08:53 PM CDT, 10/04/2010 From: Mark Hahn To: gabriel lorenzo Cc: beowulf at beowulf.org Subject: Re: [Beowulf] Begginers question # 1 > > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > > no. it's the application that counts. > > > If I build a cluster with 8 motherboards with 1 single core each would it > > be the same as using just one motherboard but with two quad core > > processors? > > of course not. communication among cores on a single board > will certainly be faster than inter-board communication. > it's the application that matters: how frequently do threads/ranks > of the application communicate? are messages small or large? > can the app's communication be formulated as mostly-read sharing of data? > these are all very much properties of the application, > and they determine how suitable any particular hardware will be. > > > I wanna build one of these but wanna save money and space and > > if what counts is the amount of cores to process info I think fewer > > motherboards with dual six-core processors is definitely cheaper just > > because I wont be needing that many mothers power supplies etc. thanks > > power supplies aren't your main concern, since good ones are about 93% > efficient. but going with more-core systems is, in general, a good idea. > mainly for amortization reasons: probably fewer disks, extraneous sutff > like video interfaces, fewer parts to fail, fewer systems to administer, etc. > there can be disadvantages to more-core systems too, since some of the parts > being shared (amortized) may be performance bottlenecks. > > the sweet spots depends on what systems are in volume production - > right now, 2-socket systems are the right building block in most cases. > 4-socket systems would be attractive, but they tend to ship in so much > lower volume that their price is nonlinearly high. 1-socket servers > tend to cost more than half a 2-socket (where "server" means at least > "has ECC memory" - that is, not a desktop.) 
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From hearnsj at googlemail.com Mon Oct 4 23:05:06 2010 From: hearnsj at googlemail.com (John Hearns) Date: Tue, 5 Oct 2010 07:05:06 +0100 Subject: [Beowulf] Begginers question # 1 In-Reply-To: <864070.13807.qm@web51103.mail.re2.yahoo.com> References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: On 25 September 2010 18:07, gabriel lorenzo wrote: > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > If I build a cluster with 8 motherboards with 1 single core each would it be the same as using just one motherboard but with two quad core processors? My response is that this is a list about building clusters from commonly used PCs - a 'Beowulf'. Gabriel, if you asked this question two years ago this list would be very clear on giving you advice on building a cluster from 8 motherboards. However, the era of multicore processors is now upon us, and as Gabriel has found if you look at a simple metric - price per core - you hit a sweet spot of 8 cores on one motherboard. As I have said before - the day of the SMP system has returned. OK - but now for my answer. The one motherboard system will be excellent for you to learn parallel programming. Go out and buy it. Now for the second part of my answer - the one motherboard system will be inevitably limited in RAM, unless you are very, very rich. So the 8 motherboard cluster is still useful for those problems which need more memory. It also scales better - you can add more systems. It also suits applications which need to perform a lot of input/output - take movie rendering for instance. The 8 motherboard system will help you learn about cluster install techniques - how to install the smae image on many systems, or how to run systems with no disks, and also will teach a lot about networking - as you fundamentally have to have a network to get it running, and will have to o network troubleshooting. So I guess you need to look at parallel programming versus learning about cluster configuration and management. From deadline at eadline.org Tue Oct 5 05:40:34 2010 From: deadline at eadline.org (Douglas Eadline) Date: Tue, 5 Oct 2010 08:40:34 -0400 (EDT) Subject: [Beowulf] Begginers question # 1 In-Reply-To: <864070.13807.qm@web51103.mail.re2.yahoo.com> References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: <50912.192.168.93.213.1286282434.squirrel@mail.eadline.org> > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > If I build a cluster with 8 motherboards with 1 single core each would it > be the same as using just one motherboard but with two quad core > processors? I wanna build one of these but wanna save money and space and > if what counts is the amount of cores to process info I think fewer > motherboards with dual six-core processors is definitely cheaper just > because I wont be needing that many mothers power supplies etc. thanks First the short and easy answer: "It all depends" Now the longer answer. A single 8-way system has plenty of advantages and four 2-way or eight 1-way systems certainly have a more overhead, cables, space etc. If you want to play with parallel computing and MPI the 8-way system will work just fine. (And yes, MPI works just fine on SMP systems.) 
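If it helps to have something concrete to type in first, the usual smoke test is only a few lines. This is just a generic sketch; the exact mpirun/mpiexec invocation differs a little between MPI implementations:

/* hello_mpi.c - minimal MPI smoke test.
 * Build: mpicc hello_mpi.c -o hello_mpi
 * Run:   mpirun -np 8 ./hello_mpi    (on one 8-way box, or across eight
 *                                     single-core boards via a hostfile)
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("rank %d of %d running on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}

On the single fat node every rank reports the same hostname; on a small cluster they report different ones. The source does not change either way, which is rather the point.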
OpenMP is also an option in this case, but remember, OpenMP will not "scale beyond the motherboard" (or at least there are no guarantees) while MPI can. I recently did a whole bunch of tests using both MPI and OpenMP a 12-way (dual 6-core) SMP box I will be posting soon. As I see it, one of the issues with the higher core counts is memory contention. An 8-way parallel program that hits hard on the memory may not scale as well as eight 1-way cores. This is where "it all depends" comes into play because it is very application dependent. I have a small script that I run on multi-core systems that uses the NAS parallel suite (single process) to give a hint at memory performance. I call it "effective cores." Check these two articles for some recent results: http://www.linux-mag.com/id/7855 http://www.linux-mag.com/id/7860 (you have to register, rather painless) Note, the Limulus Project is an attempt to lower the overhead for small personal clusters. I'll have some news "real soon" about an 18-core design (one 6-way and three 4-way) that fits in one case and uses one power supply. More here: http://limulus.basement-supercomputing.com/ (there are links to pics and video) -- Doug > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From bcostescu at gmail.com Tue Oct 5 06:23:30 2010 From: bcostescu at gmail.com (Bogdan Costescu) Date: Tue, 5 Oct 2010 15:23:30 +0200 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: On Fri, Sep 24, 2010 at 12:21 PM, Matt Hurd wrote: > I'm associated with a somewhat stealthy start-up. ?Only teaser product > with some details out so far is a type of packet replicator. >From your description as well as from a quick look at the website, it looks and smells like a hub - I mean a dumb hub, like those which existed in the '90s before switching hubs (now called switches) took over. If so, then HPC might not be a good target for you, as it has long ago adopted switches for good reasons. > Primarily focused on low-latency > distribution of market data to multiple users as the port to port HPC usage is a mixture of point-to-point and collective communications; most (all?) MPI library use low level point-to-point communications to achieve collective ones over Ethernet.. Another important point is that the collective communications can be started by any of the nodes - it's not one particular node which generates data and then spreads it to the others; it's also relatively common that 2 or more nodes reach the point of collective communication at the same time, leading to a higher load on the interconnect, maybe congestion. What might be worth a try is a mixed network config where point-to-point communications go through one NIC connected to a switch and the collective communications that can use a broadcast go through another NIC connected to your packet replicator. However, IMHO it would only make sense if the packet replicator makes some guarantees about delivery: f.e. 
that it would accept a packet from node B even if a packet from node A is being broadcasted at that time; this packet from node B would be broadcasted immediately after the previous transmission has finished. This of course means that each link NIC-packet replicator needs to be duplex and some buffering should be present - this was not the case of the dumb hubs mentioned earlier. I think that such a setup would be enough for MPI_Barrier and MPI_Bcast. One other HPC related application that comes to my mind is distributed storage. One of the main problems is keeping redundant metadata to prevent the whole storage going down if one of the metadata servers goes down. With such a packet replicator, the active metadata server can broadcast it to the others; this would be just one operation - with a switched architecture, this would require N-1 operations (N being the total nr. of metadata servers) and would loose any pretence of atomicity and speed. > They suggested interest in bigger port counts and mentioned >1000 ports. Hmmm, if it's only like a dumb hub (no duplex, no buffering), then I have a hard time imagining how it would work at these port counts - the number of collisions would be huge... Cheers, Bogdan From hearnsj at googlemail.com Tue Oct 5 06:40:55 2010 From: hearnsj at googlemail.com (John Hearns) Date: Tue, 5 Oct 2010 14:40:55 +0100 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: On 5 October 2010 14:23, Bogdan Costescu wrote: > > HPC usage is a mixture of point-to-point and collective > communications; most (all?) MPI library use low level point-to-point > communications to achieve collective ones over Ethernet.. Another > important point is that the collective communications can be started > by any of the nodes - it's not one particular node which generates > data and then spreads it to the others; it's also relatively common > that 2 or more nodes reach the point of collective communication at > the same time, leading to a higher load on the interconnect, maybe > congestion. True indeed. However this device might be very interesting if you redefine your parallel processing paradigm. How about problems where you send out identical datasets to (say) a farm of GPUs. From Glen.Beane at jax.org Tue Oct 5 06:54:47 2010 From: Glen.Beane at jax.org (Glen Beane) Date: Tue, 5 Oct 2010 09:54:47 -0400 Subject: [Beowulf] Begginers question # 1 In-Reply-To: References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: <5EE29C82-0DA7-42E3-B70E-B572DD2B879E@jax.org> On Oct 4, 2010, at 9:44 PM, Mark Hahn wrote: >> IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > > no. it's the application that counts. > >> If I build a cluster with 8 motherboards with 1 single core each would it >> be the same as using just one motherboard but with two quad core >> processors? > > of course not. communication among cores on a single board > will certainly be faster than inter-board communication. > it's the application that matters: how frequently do threads/ranks > of the application communicate? are messages small or large? > can the app's communication be formulated as mostly-read sharing of data? > these are all very much properties of the application, > and they determine how suitable any particular hardware will be. 
> >> I wanna build one of these but wanna save money and space and >> if what counts is the amount of cores to process info I think fewer >> motherboards with dual six-core processors is definitely cheaper just >> because I wont be needing that many mothers power supplies etc. thanks > > power supplies aren't your main concern, since good ones are about 93% > efficient. but going with more-core systems is, in general, a good idea. > mainly for amortization reasons: probably fewer disks, extraneous sutff > like video interfaces, fewer parts to fail, fewer systems to administer, etc. > there can be disadvantages to more-core systems too, since some of the parts > being shared (amortized) may be performance bottlenecks. > > the sweet spots depends on what systems are in volume production - > right now, 2-socket systems are the right building block in most cases. > 4-socket systems would be attractive, but they tend to ship in so much > lower volume that their price is nonlinearly high. 1-socket servers > tend to cost more than half a 2-socket (where "server" means at least > "has ECC memory" - that is, not a desktop.) the price point of the 4-socket Magny Cours systems are pretty attractive. Now that AMD did away with having to pay a premium for CPUs that were compatible with quad socket systems I think you can get more cores for the same amount of money by going quad socket Magny Cours. I purchased a small cluster mid summer, and went with 4-socket 32 core nodes. From bcostescu at gmail.com Tue Oct 5 08:38:40 2010 From: bcostescu at gmail.com (Bogdan Costescu) Date: Tue, 5 Oct 2010 17:38:40 +0200 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: On Tue, Oct 5, 2010 at 3:40 PM, John Hearns wrote: > However this device might be very interesting if you redefine your > parallel processing paradigm. I would say that's always true :-) If you can change the way you do things, then adapting to the most appropriate hardware/software combination makes sense to get the maximum of performance. Wait, this has already happened... and then there was chaos... and then came MPI ;-) > How about problems where you send out identical datasets to (say) a > farm of GPUs. You still need to send point-to-point data forth (what to do with the bulk data) and back (the results). Then you have the big problem of synchronizing the 2 channels: the broadcast one and the point-to-point one. This can come from either GPUs taking different time to finish or from point-to-point communication delays/congestion. And then the efficiency brought by broadcast might just go away... Cheers, Bogdan From jlforrest at berkeley.edu Tue Oct 5 09:31:16 2010 From: jlforrest at berkeley.edu (Jon Forrest) Date: Tue, 05 Oct 2010 09:31:16 -0700 Subject: [Beowulf] Begginers question # 1 In-Reply-To: References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: <4CAB52D4.4080800@berkeley.edu> On 10/4/2010 11:05 PM, John Hearns wrote: > The 8 motherboard system will help you learn about cluster install > techniques - how to install the smae image on many systems, or how to > run systems with no disks, and also will teach a lot about networking > - as you fundamentally have to have a network to get it running, and > will have to o network troubleshooting. For people who want to learn about cluster install techniques and other non-production cluster-related issues, you might want to look at the Rocks cluster email list. 
I posted a description of how to create a whole Rocks cluster on one physical machine using VirtualBox. I call this Rocks-in-the-Box. You'd never want to use this method for production work but doing things this way makes it very easy to create test clusters. You could also use these test clusters to dabble in parallel programming. Cordially, -- Jon Forrest Research Computing Support College of Chemistry 173 Tan Hall University of California Berkeley Berkeley, CA 94720-1460 510-643-1032 jlforrest at berkeley.edu From lindahl at pbm.com Tue Oct 5 13:34:42 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Tue, 5 Oct 2010 13:34:42 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: <20101005203442.GF16174@bx9.net> On Fri, Sep 24, 2010 at 08:21:55PM +1000, Matt Hurd wrote: > This was not designed for HPC but for low-latency trading as it beats > a switch in terms of speed. Primarily focused on low-latency > distribution of market data to multiple users as the port to port > latency is in the range of 5-7 nanoseconds as it is pretty passive > device with optical foo at the core. No rocket science here, just > convenient opto-electrical foo. If you go read up about the Blue Gene series of machines' networks, one of them is a "Eureka" network for global broadcasts. It's only a minor aspect of most scientific computations, though. There was even a very low cost, low-latency broadcast network out of Purdue called PAPERS that used the unused parallel port that used to be available in most servers. It was pretty amazing what they could do for so little $$, but I don't think they found that many applications. Presumably your customers are mostly using this for their stock tickers. Another application might be ad network; the broadcast stream would be ad opportunities... The mention of distributing lots of identical data to nodes would probably work better with bittorrent than this sort of gizmo. -- greg From james.p.lux at jpl.nasa.gov Tue Oct 5 14:58:28 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Tue, 5 Oct 2010 14:58:28 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: <20101005203442.GF16174@bx9.net> References: <20101005203442.GF16174@bx9.net> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Greg Lindahl > Sent: Tuesday, October 05, 2010 1:35 PM > To: beowulf at beowulf.org > Subject: Re: [Beowulf] Broadcast - not for HPC - or is it? > > On Fri, Sep 24, 2010 at 08:21:55PM +1000, Matt Hurd wrote: > > > This was not designed for HPC but for low-latency trading as it beats > > a switch in terms of speed. Primarily focused on low-latency > > distribution of market data to multiple users as the port to port > > latency is in the range of 5-7 nanoseconds as it is pretty passive > > device with optical foo at the core. No rocket science here, just > > convenient opto-electrical foo. > > If you go read up about the Blue Gene series of machines' networks, > one of them is a "Eureka" network for global broadcasts. It's only a > minor aspect of most scientific computations, though. There was even a > very low cost, low-latency broadcast network out of Purdue called > PAPERS that used the unused parallel port that used to be available in > most servers. It was pretty amazing what they could do for so little > $$, but I don't think they found that many applications. 
PAPERS was pretty neat, but these days, there are fewer motherboards with a parallel port, and even fewer with a "well behaved" parallel port suitable for PAPERing.. You'd also have a tough time getting latencies down in the sub microsecond range, since the parallel port is fundamentally intended to talk to a "Centronics" printer interface, with 1 microsecond setup, 5 microsecond strobe, and 1 microsecond hold time, as I recall. (Plenty fast running to that line printer at 400 characters/second, eh?) The EPP and/or ECP found in more modern equipment runs at maybe a megatransfer/second. You're still limited by the equivalent of LS244 and LS374 kinds of speeds and loads. From jack at crepinc.com Mon Oct 4 10:52:30 2010 From: jack at crepinc.com (Jack Carrozzo) Date: Mon, 4 Oct 2010 13:52:30 -0400 Subject: [Beowulf] Begginers question # 1 In-Reply-To: <864070.13807.qm@web51103.mail.re2.yahoo.com> References: <864070.13807.qm@web51103.mail.re2.yahoo.com> Message-ID: It's all about the interconnects and how your application communicates. If you want to calculate a problem you can split N ways without each process having to exchange boundary data, then it doesn't matter the speed of your interconnects or how many cores are on how many boards. However, most problems involved exchange of data between processes. If your app is written to take advantage of multi-core machines, then it will put processes on the same machine which talk to each other a lot, and split low-talking processes onto other machines between which the network is slow (communication between processes on the same machine is of course fast). With this in mind, most clusters are task-built - what are you trying to solve? That will define what hardware you need. -Jack Carrozzo On Sat, Sep 25, 2010 at 1:07 PM, gabriel lorenzo wrote: > IN CLUSTER COMPUTING, IS THE AMOUNT OF CORE THAT COUNTS? > If I build a cluster with 8 motherboards with 1 single core each would it > be the same as using just one motherboard but with two quad core processors? > I wanna build one of these but wanna save money and space and if what counts > is the amount of cores to process info I think fewer motherboards with dual > six-core processors is definitely cheaper just because I wont be needing > that many mothers power supplies etc. thanks > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanw+beowulf at sabalcore.com Mon Oct 4 11:27:34 2010 From: vanw+beowulf at sabalcore.com (Kevin Van Workum) Date: Mon, 4 Oct 2010 14:27:34 -0400 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: On Fri, Sep 24, 2010 at 6:21 AM, Matt Hurd wrote: > I'm associated with a somewhat stealthy start-up. ?Only teaser product > with some details out so far is a type of packet replicator. > > Designed 24 port ones, but settled on 16 and 48 port 1RU designs as > this seemed to reflect the users needs better. > > This was not designed for HPC but for low-latency trading as it beats > a switch in terms of speed. ?Primarily focused on low-latency > distribution of market data to multiple users as the port to port > latency is in the range of 5-7 nanoseconds as it is pretty passive > device with optical foo at the core. 
?No rocket science here, just > convenient opto-electrical foo. > > One user has suggested using them for their cluster but, as they are > secretive about what they do, I don't understand their use case. ?They > suggested interest in bigger port counts and mentioned >1000 ports. > > Hmmm, we could build such a thing at about 8-9 ns latency but I don't > quite get the point just being used to embarrassingly parallel stuff > myself. ?Would have thought this opticast thing doesn't replace an > existing switch framework and would just be an additional cost rather > than helping too much. ?If it has a use, may we should build one with > a lot of ports though 1024 ports seems a bit too big. > > Any ideas on the list about use of low latency broadcast for specific > applications in HPC? ?Are there codes that would benefit? > > Regards, > > Matt. Maybe they're doing a Monte Carlo forecast based on real-time market data; broadcasting the data to 1000+ processes where each process is using a different random seed to generate independent points in phase-space. Of course they would then have to send the updated phase-space somewhere in order to update their likelihoods and issue a reaction. I suppose if communication was the primary bottleneck, doubling of the performance would be an upper limit. -Kevin > _________________ > www.zeptonics.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Kevin Van Workum, PhD Sabalcore Computing Inc. Run your code on 500 processors. Sign up for a free trial account. www.sabalcore.com 877-492-8027 ext. 11 From souamisarah at gmail.com Mon Oct 4 11:53:03 2010 From: souamisarah at gmail.com (Sarah souami) Date: Mon, 4 Oct 2010 19:53:03 +0100 Subject: [Beowulf] problem with mpdboot Message-ID: *Good morning, I have a problem to run mpd in more than one site, mpd can be run independently in each site, and I can establish a ssh connection between sites without a password, I disabled the Firewal but,* ** *When I do mpdboot -n x -f mpd.hosts. I get a message like* ** *mpdboot_Fedora1 (handle_mpd_output 420): from mpd on Fedora2, invalid port info: * *Please help me to solve this problem thank you,* -------------- next part -------------- An HTML attachment was scrubbed... URL: From oper.ml at gmail.com Mon Oct 4 12:06:48 2010 From: oper.ml at gmail.com (mlsops) Date: Mon, 04 Oct 2010 16:06:48 -0300 Subject: [Beowulf] Build a Beowulf Cluster from zero. Message-ID: <4CAA25C8.9060208@gmail.com> Hi guys, first of all, congrats to all of you for this mailing list. I would like to ask if any of you have a URL or a tutorial of How to Build (from zero) a Beowulf Cluster and with what OS to Use (Debian or CentOS). I need any material that would help me to build this cluster from zero, because I already tryied once and had problems compiling the kernel, so I stopped and now i'm trying again. Thanks to all of you, guys. Bug hugs and regards, Tony Miranda. -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthurd at acm.org Mon Oct 4 23:34:07 2010 From: matthurd at acm.org (Matt Hurd) Date: Tue, 5 Oct 2010 17:34:07 +1100 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: Kevin, >> Any ideas on the list about use of low latency broadcast for specific >> applications in HPC? Are there codes that would benefit? 
>> > > Maybe they're doing a Monte Carlo forecast based on real-time market > data; broadcasting the data to 1000+ processes where each process is > using a different random seed to generate independent points in > phase-space. Of course they would then have to send the updated > phase-space somewhere in order to update their likelihoods and issue a > reaction. I suppose if communication was the primary bottleneck, > doubling of the performance would be an upper limit. Thanks, that makes sense. I guess what you're saying is a special case of a broadcast oriented MAP then local refinement followed by REDUCE. Can't think of anything much in that problem space else myself. Thought maybe a distributed shared memory (DSM) might make sense but I'm not clever enough to know about that. As opposed to SIMD and MIMD, not sure MISD (==broadcast?) is really a valid thing which low latency broadcast could help with. Is MISD a useful thing in the 21st century? Not sure it is and shaving such small amounts of nanos at extra cost doesn't quite seem to fit a beowult style budget anyhow. Just for HFTs and financial exchanges I guess... Regards, --Matt. > > -Kevin > > >> _________________ >> www.zeptonics.com >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> > > > > -- > Kevin Van Workum, PhD > Sabalcore Computing Inc. > Run your code on 500 processors. > Sign up for a free trial account. > www.sabalcore.com > 877-492-8027 ext. 11 > From matthurd at acm.org Tue Oct 5 17:23:36 2010 From: matthurd at acm.org (Matt Hurd) Date: Wed, 6 Oct 2010 11:23:36 +1100 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: Message-ID: > From your description as well as from a quick look at the website, it > looks and smells like a hub - I mean a dumb hub, like those which > existed in the '90s before switching hubs (now called switches) took > over. If so, then HPC might not be a good target for you, as it has > long ago adopted switches for good reasons. Not as clever as a hub, as a hub goes from any one of N to any one or all of N with collision sense/detect relying on back off. This thing just goes from port A to port B1 ... port Bn using a simple optical coupler in the core. No contention as the paths are direct. I can't see it being too useful for HPC myself but I guess as Kevin pointed out perhaps there is a corner case or two. It does allow one of the B ports to be bi-directional so that a trader could set up a subsciption to a multicast group to be used by all ports. However, allowing no client to server is a security benefit and I guess if an exchange used such a thing they should just broadcast or some such and disable the bi-direction. - Hide quoted text - >> Primarily focused on low-latency >> distribution of market data to multiple users as the port to port > > HPC usage is a mixture of point-to-point and collective > communications; most (all?) MPI library use low level point-to-point > communications to achieve collective ones over Ethernet.. 
Another > important point is that the collective communications can be started > by any of the nodes - it's not one particular node which generates > data and then spreads it to the others; it's also relatively common > that 2 or more nodes reach the point of collective communication at > the same time, leading to a higher load on the interconnect, maybe > congestion. > > What might be worth a try is a mixed network config where > point-to-point communications go through one NIC connected to a switch > and the collective communications that can use a broadcast go through > another NIC connected to your packet replicator. However, IMHO it > would only make sense if the packet replicator makes some guarantees > about delivery: f.e. that it would accept a packet from node B even if > a packet from node A is being broadcasted at that time; this packet > from node B would be broadcasted immediately after the previous > transmission has finished. This of course means that each link > NIC-packet replicator needs to be duplex and some buffering should be > present - this was not the case of the dumb hubs mentioned earlier. I > think that such a setup would be enough for MPI_Barrier and MPI_Bcast. > > One other HPC related application that comes to my mind is distributed > storage. One of the main problems is keeping redundant metadata to > prevent the whole storage going down if one of the metadata servers > goes down. With such a packet replicator, the active metadata server > can broadcast it to the others; this would be just one operation - > with a switched architecture, this would require N-1 operations (N > being the total nr. of metadata servers) and would loose any pretence > of atomicity and speed. Not a bad thought the storage thought, but again I reckon that a sub micro switch would be a winner there on the functionality front. Switches, like the Fulcrum based ones, are pretty impressive and not too expensive. Along those lines, it's not a HPC app, at least in my head, but replication has uses for being able to do small fault tolerant quorums with microsecond oriented failover. >> They suggested interest in bigger port counts and mentioned >1000 ports. > > Hmmm, if it's only like a dumb hub (no duplex, no buffering), then I > have a hard time imagining how it would work at these port counts - > the number of collisions would be huge... Nope, not a dumb hub, even dumber ;-) No collisions just a tree of optical couplers frantically splitting the photon streams. The only real trick, albeit pretty minor, is ensuring the signal integrity is within budget and suitable for non-thinking plug and play. Regards, --Matt. From macglobalus at yahoo.com Wed Oct 6 10:43:26 2010 From: macglobalus at yahoo.com (gabriel lorenzo) Date: Wed, 6 Oct 2010 10:43:26 -0700 (PDT) Subject: [Beowulf] Begginers question # 1 Message-ID: <795432.69974.qm@web51106.mail.re2.yahoo.com> First of all thanks to everyone who have answered my question. The goal is a rendering farm. An 8 node prototype would be the star point but eventually a 40 node unit # 1 ( rack ) is the actual project. thanks d.g.i gabriel o lorenzo Message: 1 Date: Mon, 4 Oct 2010 16:12:20 -0400 From: Michael Di Domenico Subject: Re: [Beowulf] Begginers question # 1 To: beowulf at beowulf.org Message-ID: Content-Type: text/plain; charset=ISO-8859-1 To answer your question directly, the answer is, no the performance is not the same. 
BUT, you've asked a very workload dependent question, but have not told us anything about what you're trying to do, so deciding which is the right choice is pretty hard. It would be unwise to make this decision solely on price, without understanding the trade offs in productivit From peter.st.john at gmail.com Wed Oct 6 13:53:39 2010 From: peter.st.john at gmail.com (Peter St. John) Date: Wed, 6 Oct 2010 16:53:39 -0400 Subject: [Beowulf] Build a Beowulf Cluster from zero. In-Reply-To: <4CAA25C8.9060208@gmail.com> References: <4CAA25C8.9060208@gmail.com> Message-ID: Tony, Usual, short advice is to start with RGB's Beowulf web page, which links to other things but particularly his own online docs: http://www.phy.duke.edu/~rgb/Beowulf/beowulf.php For starting out thinking about this subject, the wiki article: http://en.wikipedia.org/wiki/Beowulf_(computing) Then you can write back to us with more specifics, e.g. what kind of application you have in mind (weather forecasting? Controlling a huge grid of flat screens? minimizing FPS for WoW?) and get more specific, directed advice. Also building one, budget matters alot (4096 dual socket quad core 4GHz liquid cooled? or half a dozen $150 ARM miniboards cobbled together in a suitcase?) Good luck, Peter On Mon, Oct 4, 2010 at 3:06 PM, mlsops wrote: > Hi guys, > > first of all, congrats to all of you for this mailing list. > I would like to ask if any of you have a URL or a tutorial of How to > Build (from zero) a Beowulf Cluster and with what OS to Use (Debian or > CentOS). > I need any material that would help me to build this cluster from zero, > because I already tryied once and had problems compiling the kernel, so I > stopped and now i'm trying again. > Thanks to all of you, guys. > > Bug hugs and regards, > Tony Miranda. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joshua_mora at usa.net Wed Oct 6 18:06:35 2010 From: joshua_mora at usa.net (Joshua mora acosta) Date: Wed, 06 Oct 2010 20:06:35 -0500 Subject: [Beowulf] Begginers question # 1 Message-ID: <368oJgBFj6144S02.1286413595@web02.cms.usa.net> Hi Gabriel. If your app is something single threaded (ie. runs on single core) that works on a per frame basis and it is fairly cache friendly, then the more cores the better from ecconomical point of view without hurting necessarily on performance. A fat node would do as well as a bunch of tiny nodes but it will save you for sure a good chunk of money on interconnect and maintenance headaches. Power consumption wise will be also better (more efficient usage of electricity). A fat node will agglomerate linearly the computing power of tiny nodes. Again under the assumption of single threaded apps running concurrently and _being_cache_friendly_. Also you do not need in that case a high speed interconnect with low latency since processes are not "talking" to each other. If you got a lot of data to transfer in and out of the nodes , then 10Gigabit may increase your productivity from the point of view of getting data in and out of the nodes faster, but it better be that part about 20% of the whole thing to really compensate the cost and shrinking that portion only significantly. 
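To make the "embarrassingly parallel" point concrete, the coordination code for a per-frame farm can be tiny. Below is a rough MPI master/worker sketch; render_frame() is only a placeholder for whatever renderer you would actually invoke, and the frame count is arbitrary:

/* framefarm.c - rough MPI master/worker frame dispatcher (a sketch, not a
 * production queue).  Needs at least 2 ranks; render_frame() is a stand-in
 * for the real renderer invocation. */
#include <mpi.h>
#include <stdio.h>

#define NFRAMES  240        /* arbitrary job size for the example */
#define TAG_WORK 1
#define TAG_DONE 2
#define TAG_STOP 3

static void render_frame(int f)
{
    printf("rendering frame %d\n", f);   /* placeholder for the real work */
}

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                     /* master: hand out frame numbers */
        int next = 0, active = 0, frame, w;
        MPI_Status st;

        for (w = 1; w < size; w++) {     /* prime every worker once */
            if (next < NFRAMES) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                next++; active++;
            } else {
                MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            }
        }
        while (active > 0) {             /* refill workers as they finish */
            MPI_Recv(&frame, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                     MPI_COMM_WORLD, &st);
            if (next < NFRAMES) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
                active--;
            }
        }
    } else {                             /* worker: render until told to stop */
        int frame;
        MPI_Status st;

        while (1) {
            MPI_Recv(&frame, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            render_frame(frame);
            MPI_Send(&frame, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}

Because each frame is independent the only traffic is a frame number out and an acknowledgement back, which is why the interconnect hardly matters for this kind of workload.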
You may also want to have fast local file system since all cores are dumping(reading/writing) the processed frames more or less concurrently.
Number of cores per memory controller is something you want to look at when running your app in "rate" or embarrasingly parallel mode in order to figure out if you are running out of bw. Do not confuse it with memory capacity, which you will need proportionally to the number of cores plus a bunch of it for file system buffers to accelerate the File I/O.
Hope it helps to start defining your rendering farm.

Regards,
Joshua

------ Original Message ------
Received: 01:03 PM CDT, 10/06/2010
From: gabriel lorenzo
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Begginers question # 1

> First of all thanks to everyone who have answered my question. The goal is a rendering farm. An 8 node prototype would be the star point but eventually a 40 node unit # 1 ( rack ) is the actual project.
> thanks
> d.g.i gabriel o lorenzo
>
>
>
>
>
> Message: 1
> Date: Mon, 4 Oct 2010 16:12:20 -0400
> From: Michael Di Domenico
> Subject: Re: [Beowulf] Begginers question # 1
> To: beowulf at beowulf.org
> Message-ID:
>
> Content-Type: text/plain; charset=ISO-8859-1
>
> To answer your question directly, the answer is, no the performance is
> not the same. BUT, you've asked a very workload dependent question,
> but have not told us anything about what you're trying to do, so
> deciding which is the right choice is pretty hard. It would be unwise
> to make this decision solely on price, without understanding the trade
> offs in productivit
>
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

From eagles051387 at gmail.com Wed Oct 6 22:22:27 2010
From: eagles051387 at gmail.com (Jonathan Aquilina)
Date: Thu, 7 Oct 2010 07:22:27 +0200
Subject: [Beowulf] Begginers question # 1
In-Reply-To: <368oJgBFj6144S02.1286413595@web02.cms.usa.net>
References: <368oJgBFj6144S02.1286413595@web02.cms.usa.net>
Message-ID:

in regards to the application you can try use the cluster with i noticed the latest 2.5 version of blender has a feature that you can set it up to use slave nodes etc. the problem would be needing to learn blender and make a decent sized animation to test.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.kidger at bull.co.uk Thu Oct 7 08:10:20 2010
From: daniel.kidger at bull.co.uk (Daniel Kidger)
Date: Thu, 07 Oct 2010 16:10:20 +0100
Subject: [Beowulf] Broadcast - not for HPC - or is it?
In-Reply-To:
References:
Message-ID: <4CADE2DC.2070204@bull.co.uk>

Matt,

Are you really claiming <9ns port to port ?
(Quadrics used to think they were leading edge with 40ns latency port to port latency on their switches) At <9ns for the 'switch' then surely the speed of light in copper (a massive 1ns per foot) will dominate over the switch itself? Plus as others say it is not the broadcast that is the hard bit - it is getting the consolidated acks back. Daniel > I'm associated with a somewhat stealthy start-up. Only teaser product > with some details out so far is a type of packet replicator. > > Designed 24 port ones, but settled on 16 and 48 port 1RU designs as > this seemed to reflect the users needs better. > > This was not designed for HPC but for low-latency trading as it beats > a switch in terms of speed. Primarily focused on low-latency > distribution of market data to multiple users as the port to port > latency is in the range of 5-7 nanoseconds as it is pretty passive > device with optical foo at the core. No rocket science here, just > convenient opto-electrical foo. > > One user has suggested using them for their cluster but, as they are > secretive about what they do, I don't understand their use case. They > suggested interest in bigger port counts and mentioned>1000 ports. > > Hmmm, we could build such a thing at about 8-9 ns latency but I don't > quite get the point just being used to embarrassingly parallel stuff > myself. Would have thought this opticast thing doesn't replace an > existing switch framework and would just be an additional cost rather > than helping too much. If it has a use, may we should build one with > a lot of ports though 1024 ports seems a bit too big. > > Any ideas on the list about use of low latency broadcast for specific > applications in HPC? Are there codes that would benefit? > > Regards, > > Matt. > _________________ > www.zeptonics.com > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > -- Bull, Architect of an Open World TM Dr. Daniel Kidger, HPC Technical Consultant daniel.kidger at bull.co.uk +44 (0) 7966822177 From hahn at mcmaster.ca Thu Oct 7 08:33:03 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 7 Oct 2010 11:33:03 -0400 (EDT) Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: <4CADE2DC.2070204@bull.co.uk> References: <4CADE2DC.2070204@bull.co.uk> Message-ID: > Are you really claiming <9ns port to port ? it sounds like they are, but it's nothing like a switch: http://www.zeptonics.com/Home/opticast-faq > Plus as others say it is not the broadcast that is the hard bit - it is > getting the consolidated acks back. the device is really just a hub afaikt. From james.p.lux at jpl.nasa.gov Thu Oct 7 09:38:53 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 7 Oct 2010 09:38:53 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: <4CADE2DC.2070204@bull.co.uk> References: <4CADE2DC.2070204@bull.co.uk> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Daniel Kidger > Sent: Thursday, October 07, 2010 8:10 AM > To: Matt Hurd > Cc: beowulf at beowulf.org > Subject: Re: [Beowulf] Broadcast - not for HPC - or is it? > > Matt, > > Are you really claiming <9ns port to port ? > (Quadrics used to think they were leading edge with 40ns latency port to > port latency on their switches) That's a fairly impressive spec in itself. 
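For a sense of scale, time of flight alone is already in that ballpark. A back-of-the-envelope sketch (the 0.66 velocity factor is a typical assumed figure for fibre and twisted pair, not a measurement of any particular cable):

/* tof.c - back-of-the-envelope one-way time of flight over a cable run.
 * Assumes a velocity factor of about 0.66c; real cables vary.
 * Build: cc tof.c -o tof */
#include <stdio.h>

int main(void)
{
    const double c_m_per_ns = 0.299792458;  /* speed of light in vacuum, m/ns */
    const double vf = 0.66;                 /* assumed velocity factor */
    const double len_m[] = { 0.6, 2.0, 10.0, 100.0, 1000.0 };
    int i;

    for (i = 0; i < 5; i++)
        printf("%7.1f m  ->  %9.1f ns one-way\n",
               len_m[i], len_m[i] / (vf * c_m_per_ns));
    return 0;
}

That works out to roughly 5 ns per metre, about half a microsecond per 100 m and 5 microseconds per km, so a box claiming 5-7 ns is comparable to a metre or so of its own fibre tails.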
> > At <9ns for the 'switch' then surely the speed of light in copper (a > massive 1ns per foot) will dominate over the switch itself? 1 ns/foot in copper would be a mighty fine accomplishment. More realistic is something along the lines of .65-.70c for unshielded twisted pair (e.g. the ubiquitous Cat 5/6 wiring has a delay spec of 570-536ns/100m depending on frequency) Even in optical fiber, you're looking at somewhat lower than 3E8 m/s... the refractive index is around 1.5... call it .65 to .7c (there's a reason it's the same general magnitude as copper) On the PCB, you're probably looking at propagation speeds of 0.5c. And, in a *real* system, there are delays in every transition from one mode of propagation or widget to another. Charging up the lead capacitance takes non-zero time. Filling the junction in a VCSEL transmitter takes non-zero time. The devices and designs may have huge bandwidths (GHz), but they're like a pipeline, and it takes some time for stuff to get from one place to another. (a colleague has a copy of a book called "High-Speed Signal Propagation, Advanced Black Magic", by Johnson & Graham that is a great compendium of all sorts of handy rules of thumb and design principles. His version is about 10 years old. I wonder if there's a newer one out.) > > > Plus as others say it is not the broadcast that is the hard bit - it is > getting the consolidated acks back. > > Daniel > > > > distribution of market data to multiple users as the port to port > > latency is in the range of 5-7 nanoseconds as it is pretty passive > > device with optical foo at the core. No rocket science here, just > > convenient opto-electrical foo. > > > > > > Hmmm, we could build such a thing at about 8-9 ns latency but I don't > > quite get the point just being used to embarrassingly parallel stuff I can believe a "signal at pin on chip to signal on another pin on same chip" latencies of a few ns: that's pretty standard high speed MSI logic. On a well designed ASIC, you can probably get few ns kind of delay for pins that are close to each other, if the package isn't too big. But "connector on box to connector on box" < 10ns... that's an accomplishment. From mwheeler at startext.co.uk Thu Oct 7 15:27:24 2010 From: mwheeler at startext.co.uk (Martin Wheeler) Date: Thu, 7 Oct 2010 23:27:24 +0100 (BST) Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: <4CADE2DC.2070204@bull.co.uk> Message-ID: On Thu, 7 Oct 2010, Lux, Jim (337C) wrote: > (a colleague has a copy of a book called "High-Speed Signal Propagation, > Advanced Black Magic", by Johnson & Graham that is a great compendium of > all sorts of handy rules of thumb and design principles. His version > is about 10 years old. I wonder if there's a newer one out.) # Hardcover: 800 pages # Publisher: Prentice Hall; 1 edition (24 Feb 2003) # Language English # ISBN-10: 013084408X # ISBN-13: 978-0130844088 # Product Dimensions: 23.9 x 17.8 x 4.6 cm Looks like it's still in its first edition. -- Martin Wheeler - G5FM - Glastonbury - BA6 9PH - England mwheeler at martinwheeler.co.uk From pcgrid2011 at gmail.com Tue Oct 5 03:43:52 2010 From: pcgrid2011 at gmail.com (Eric Heien) Date: Tue, 5 Oct 2010 03:43:52 -0700 Subject: [Beowulf] PCGrid 2011 Call for Papers Message-ID: <872f23b4b514cca6985285bbed89231e@heien.org> Apologies if you receive this multiple times. 
CALL FOR PAPERS Fifth Workshop on Desktop Grids and Volunteer Computing Systems (PCGrid 2011) held in conjunction with the IEEE International Parallel & Distributed Processing Symposium (IPDPS) May 16-20, 2011 Submission deadline: November 1, 2010 Anchorage, Alaska, USA web site: http://pcgrid.imag.fr/ Keynote speaker Prof. Henri Casanova University of Hawaii at Manoa, USA ###################################################################### *********************** CALL FOR PAPERS *********************** OVERVIEW/SCOPE: Desktop grids and volunteer computing systems (DGVCS's) utilize the free resources available in Intranet or Internet environments for supporting large-scale computation and storage. For over a decade, DGVCS's have been one of the largest and most powerful distributed computing systems in the world, offering a high return on investment for applications from a wide range of scientific domains (including computational biology, climate prediction, and high-energy physics). While DGVCS's sustain up to PetaFLOPS of computing power from hundreds of thousands to millions of resources, fully leveraging the platform's computational power is still a major challenge because of the immense scale, high volatility, and extreme heterogeneity of such systems. The purpose of the workshop is to provide a forum for discussing recent advances and identifying open issues for the development of scalable, fault-tolerant, and secure DGVCS's. The workshop seeks to bring desktop grid researchers together from theoretical, system, and application areas to identify plausible approaches for supporting applications with a range of complexity and requirements on desktop environments. This year's workshop will have special emphasis on DGCVS's relationship and integration with Clouds. We invite submissions on DGVCS topics including the following: - cloud computing over unreliable enterprise or Internet resources - DGVCS middleware and software infrastructure (including management), with emphasis on virtual machines - incorporation of DGVCS's with Grid infrastructures - DGVCS programming environments and models - modeling, simulation, and emulation of large-scale, volatile environments - resource management and scheduling - resource measurement and characterization - novel DGVCS applications - data management (strategies, protocols, storage) - security on DGVCS's (reputation systems, result verification) - fault-tolerance on shared, volatile resources - peer-to-peer (P2P) algorithms or systems applied to DGVCS's With regard to the last topic, we strongly encourage authors of P2P-related paper submissions to emphasize the applicability to DGVCS's in order to be within the scope of the workshop. The workshop proceedings will be published through the IEEE Computer Society Press as part of the IPDPS CD-ROM. ###################################################################### IMPORTANT DATES Manuscript submission deadline: November 1, 2010 Acceptance Notification: December 28, 2010 Camera-ready paper deadline: February 1, 2011 Workshop: May 20, 2011 ###################################################################### SUBMISSIONS Manuscripts will be evaluated based on their originality, technical strength, quality of presentation, and relevance to the workshop scope. Only manuscripts that have neither appeared nor been submitted previously for publication are allowed. Authors are invited to submit a manuscript of up to 8 pages in IEEE format (10pt font, two-columns, single-spaced). 
The procedure for electronic submissions will be posted at: http://pcgrid.imag.fr/submission.html ##################################################################### ORGANIZATION General Chairs Derrick Kondo, INRIA, France Gilles Fedak, INRIA, France Program Chair Eric Heien, University of California, Davis, USA Program Committee David Abramson, Monash University, Australia David Anderson, University of California at Berkeley, USA Artur Andrzejak, Zuse Institute of Berlin, Germany Filipe Araujo, University of Coimbra, Portugal Henri Bal, Vrije Universiteit, The Netherlands Zoltan Balaton, SZTAKI, Hungary Adam Beberg, Stanford University, USA Francisco Brasileiro, Federal University of Campina Grande, Brazil Massimo Canonico, University of Piemonte Orientale, Italy Henri Casanova, University of Hawaii at Manoa, USA Abhishek Chandra, University of Minnesota, USA Edgar Gabriel, University of Houston, USA Haiwu He, INRIA, France Bahman Javadi, University of Melbourne, Australia Yang-Suk Kee, University of Southern California, USA Arnaud Legrand, CNRS, France Grzegorz Malewicz, University of Alabama, USA Alan Sussman, University of Maryland, USA Michela Taufer, University of Delaware, USA David Toth, Merrimack College, USA Bernard Traversat, Oracle Corporation, USA Carlos Varela, Rensselaer Polytechnic Institute, USA Sebastien Varrette, University of Luxembourg, Luxembourg Jon Weissman, University of Minnesota, USA Zhiyuan Zhan, Microsoft, USA -- If you do not want to receive any more PCGrid related news, http://heien.org/lists/?p=unsubscribe&uid=70f91e9c5cd0e58c4b0109095ea2760d -- Powered by PHPlist, www.phplist.com -- From matthurd at acm.org Thu Oct 7 17:55:12 2010 From: matthurd at acm.org (Matt Hurd) Date: Fri, 8 Oct 2010 11:55:12 +1100 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: <4CADE2DC.2070204@bull.co.uk> References: <4CADE2DC.2070204@bull.co.uk> Message-ID: Daniel, > Are you really claiming <9ns port to port ? Yep, hard to measure without using an oscilloscope. 10G Endace timers give about 20ns of accuracy on packets, which is not quite enough. As an aside, I would love to know about any more convenient methods for measuring sub-10ns latency. > (Quadrics used to think they were leading edge with 40ns latency port to > port latency on their switches) > > At <9ns for the 'switch' then surely the speed of light in copper (a massive > 1ns per foot) will dominate over the switch itself? Quadrics products did much more useful things than Opticast, which is not even a hub, just port A to port B1...Bn replication, one way. It would indeed be difficult to do any two-way serdes at all in sub 10ns if you needed to look inside the packets or deal with contention. It is just an n-way coupler at the core, splitting the photon stream. The optical fibre path internally is just a box with 0.3m tails, for 0.6m = 3ns of fibre path. My experience corresponds with Jim's comments: roughly 5 microseconds per km for copper and fibre; we measure 0.65 to 0.69 c for a variety of twisted pair and optic fibre media. Have a handy 200m fibre as a 1 microsecond timing sanity checker ;-) The opto-electronics modules are a bit under a nanosecond in propagation time to polish up the signal, and you end up going through four of those from the cable in to the cable out on the box. The thing that makes it work is the fact that the signal integrity on optics is much more flexible than electrical, as a 64-way split via optics gives you a bit under a 21dB loss in a link budget of 39dB or so.
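To put rough numbers on that split (a minimal sketch; the ideal 1:64 loss is just 10*log10(64), and the ~2.5dB excess loss is an assumption chosen to land near the 21dB figure above):

import math

def split_loss_db(ways, excess_db=2.5):
    # ideal 1:N power-split loss plus an assumed excess/connector loss
    return 10 * math.log10(ways) + excess_db

budget_db = 39.0                  # link budget quoted above
loss_db = split_loss_db(64)       # ~18.1dB ideal + ~2.5dB assumed excess = ~20.6dB
print("64-way split: %.1f dB loss, %.1f dB headroom" % (loss_db, budget_db - loss_db))

That leaves on the order of 18dB of headroom to play with.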
If you control all aspects of the link, such as putting a small link in a box, it leaves enough head room to clean things up and re-present the signal. Something like that is a lot harder in the electrical domain, as a 21dB loss is a bit nasty. It is fun to put it all together into a box of convenience with a single digit nanosecond time against it, even if it is only moderately useful. Certainly makes sense for a stock exchange to take the load off their network infrastructure and also speed things up. > > Plus as others say it is not the broadcast that is the hard bit - it is > getting the consolidated acks back. Indeed. It's been an interesting thread, but I think I've come to the conclusion that, except for a few financial market uses, such a device is not really useful for beowulf or HPC as the MISD model doesn't seem to be of much practical use unless you can get something cute for virtually free to suit occasional use like those mentioned earlier such as the PAPERS or integrated Blue Gene. --Matt. From daniel.kidger at bull.co.uk Fri Oct 8 04:36:16 2010 From: daniel.kidger at bull.co.uk (Daniel Kidger) Date: Fri, 08 Oct 2010 12:36:16 +0100 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: <4CADE2DC.2070204@bull.co.uk> Message-ID: <4CAF0230.2070302@bull.co.uk> > It's been an interesting thread, but I think I've come to the > conclusion that, except for a few financial market uses, such a device > is not really useful for beowulf or HPC as the MISD model doesn't seem > to be of much practical use unless you can get something cute for > virtually free to suit occasional use like those mentioned earlier > such as the PAPERS or integrated Blue Gene. > > --Matt. > So if you want another offbeat idea on how to do broadcast on a large cluster - what about wireless ? (although again, you can't get realistically reliable transport with checking acks ) Daniel -- Bull, Architect of an Open World TM Dr. Daniel Kidger, HPC Technical Consultant daniel.kidger at bull.co.uk +44 (0) 7966822177 From reuti at staff.uni-marburg.de Fri Oct 8 04:48:15 2010 From: reuti at staff.uni-marburg.de (Reuti) Date: Fri, 8 Oct 2010 13:48:15 +0200 Subject: [Beowulf] problem with mpdboot In-Reply-To: References: Message-ID: <52397CA2-89EA-4976-96EE-7E9B26B40864@staff.uni-marburg.de> Hi, Am 04.10.2010 um 20:53 schrieb Sarah souami: > Good morning, I have a problem to run mpd in more than one site, mpd can be run independently in each site, and I can establish a ssh connection between sites without a password, I disabled the Firewall but, in principle it should work, but I wouldn't be surprised if it's slow between two sites. All machines have a public TCP/IP address and are not in a private subnet? All machines have the same version of MPICH2? -- Reuti > When I do mpdboot -n x -f mpd.hosts. I get a message like > > mpdboot_Fedora1 (handle_mpd_output 420): from mpd on Fedora2, invalid port info: > > Please help me to solve this problem > > thank you, > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From matthurd at acm.org Fri Oct 8 05:40:28 2010 From: matthurd at acm.org (Matt Hurd) Date: Fri, 8 Oct 2010 23:40:28 +1100 Subject: [Beowulf] Broadcast - not for HPC - or is it?
In-Reply-To: <4CAF0230.2070302@bull.co.uk> References: <4CADE2DC.2070204@bull.co.uk> <4CAF0230.2070302@bull.co.uk> Message-ID: >> > So if you want another offbeat idea on how to do broadcast on a large > cluster ?- what about wireless ? > (although again, you can't get realistically reliable transport with > checking acks ) > > Daniel > Nice idea. Perhaps not so offbeat. Though bandwidth and radio don't go too well together normally. Know some guys here in Australia that are doing extremely accurate timing with wireless, not for timing's sake but to measure movement in dam walls and other infrastructure. Kind of like a localised GPS. They have some very neat and accurate stuff. Had a play with some radio foo and managed to get about 880ns + air time on bit to bit from tx to rx but haven't quite figured out how to get super low latency yet. I think there may be a product in there for wireless PPS dissemination for accurate timing to a cluster like the guys do with the dam walls but I'm not sure if people really need much more than what ptp can already do. --Matt. From james.p.lux at jpl.nasa.gov Fri Oct 8 07:34:15 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 8 Oct 2010 07:34:15 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: Message-ID: Wireless for broadcast is generically a good idea (heck, I make my living doing such stuff) but tricky.. Latency is a challenge if you want off-the-shelf hardware. The problem is that readily available wireless protocols (e.g. 802.11, 802.14, 802.16) are packet oriented, and more to the point, optimized for things other than precision timing and low latency across a small area. 802.11, for instance, is basically half duplex (e.g. Node doesn't transmit and receive at the same time) but is bidirectional, so that means you have to synchronize in each packet (just like ethernet.. There's a sync pattern at the beginning of each packet), and packets are fairly long, so that the sync is a small overall fraction: optimized for overall throughput. Most wireless protocols choose a packet length that is convenient for the data rate, and the expected propagation delay, and the bandwidth of the transmitter/receivers. If you want low latency, precision timing, you need a more classical "broadcast" scheme.. Keep one transmitter on all the time, so that the receivers have plenty of time to synchronize to it, and then transmit your data at whatever rate you want (1 Gbps is easy). If you're interested in low latency, you're going to want a high symbol rate. Contrast to high bit rate.. Most high speed wireless links encode more than one bit per symbol: 64QAM, for instance, encodes 6 bits in every symbol, so the symbol rate is 1/6th that of the bit rate. 1 Gsymbol/second will, to a first order, need a GHz or two of bandwidth, which isn't a challenge if you're operating at some reasonable microwave frequency (5GHz, for instance). You'll also need to deal with the decidedly bizarre propagation (multipath, etc.), and unfortunately, you can't rely on adaptive equalization and such, because you're interested in low latency, so you want to reject "late arriving echos". Various UltraWideBand (UWB) might be a viable way to do this.. Think of them as a sort of "radar" pulse, and every receiver just listens for the pulse. You could transmit 1 of N different kinds of pulses (say, different frequencies) and transmit log2(N) bits for each pulse. You're stuck with the "nanosecond/foot" free space propagation, still.. 
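To put rough numbers on the symbol-rate trade and that free-space delay (a minimal sketch; the 1 Gb/s target and the 10 m span are assumptions for illustration, not measurements):

import math

C = 3.0e8                        # free-space propagation, m/s (~1 ns per foot)

def bits_per_symbol(m_qam):
    # 64-QAM carries log2(64) = 6 bits in each symbol
    return int(math.log2(m_qam))

bit_rate = 1e9                                # assumed 1 Gb/s target
sym_rate = bit_rate / bits_per_symbol(64)     # ~167 Msym/s at 64-QAM
span_m = 10.0                                 # assumed span of the machine room
delay_ns = span_m / C * 1e9                   # ~33 ns one way, before any radio overhead
print("%d bits/symbol, %.0f Msym/s, %.0f ns across %.0f m"
      % (bits_per_symbol(64), sym_rate / 1e6, delay_ns, span_m))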
So if you're radiating over your cluster of 1000 processors, physical size becomes an issue. (this is a fundamental physics limit on computational speed, if the problem can't be pipelined or done in a systolic array.. Hence ideas of highly integrated multiple processors immersed in liquid coolant... Get lots of computation in a small volume).. The people doing free space optical interconnects have a lot of clever ideas here. Imagine a sphere with processing nodes on the surface, and an optical terminal sticking into the sphere. The terminal has many transmitters and receivers, with microlenses integrated, so you essentially have a point to point link between each possible pair of nodes. On 10/8/10 5:40 AM, "Matt Hurd" wrote: >> > So if you want another offbeat idea on how to do broadcast on a large > cluster - what about wireless ? > (although again, you can't get realistically reliable transport with > checking acks ) > > Daniel > Nice idea. Perhaps not so offbeat. Though bandwidth and radio don't go too well together normally. Know some guys here in Australia that are doing extremely accurate timing with wireless, not for timing's sake but to measure movement in dam walls and other infrastructure. Kind of like a localised GPS. They have some very neat and accurate stuff. Had a play with some radio foo and managed to get about 880ns + air time on bit to bit from tx to rx but haven't quite figured out how to get super low latency yet. I think there may be a product in there for wireless PPS dissemination for accurate timing to a cluster like the guys do with the dam walls but I'm not sure if people really need much more than what ptp can already do. --Matt. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From dnlombar at ichips.intel.com Fri Oct 8 13:23:14 2010 From: dnlombar at ichips.intel.com (David N. Lombard) Date: Fri, 8 Oct 2010 13:23:14 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? In-Reply-To: References: <4CADE2DC.2070204@bull.co.uk> <4CAF0230.2070302@bull.co.uk> Message-ID: <20101008202313.GA12110@nlxcldnl2.cl.intel.com> On Fri, Oct 08, 2010 at 05:40:28AM -0700, Matt Hurd wrote: > > Know some guys here in Australia that are doing extremely accurate > timing with wireless, not for timing's sake but to measure movement in > dam walls and other infrastructure. Kind of like a localised GPS. > They have some very neat and accurate stuff. > > Had a play with some radio foo and managed to get about 880ns + air > time on bit to bit from tx to rx but haven't quite figured out how to > get super low latency yet. I think there may be a product in there > for wireless PPS dissemination for accurate timing to a cluster like > the guys do with the dam walls but I'm not sure if people really need > much more than what ptp can already do. Yes, PTP could enable fairly tight time sync across a cluster, if it were used. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. From james.p.lux at jpl.nasa.gov Fri Oct 8 14:07:37 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 8 Oct 2010 14:07:37 -0700 Subject: [Beowulf] Broadcast - not for HPC - or is it? 
In-Reply-To: <20101008202313.GA12110@nlxcldnl2.cl.intel.com> Message-ID: PTP (IEEE 1588) needs custom interface hardware, though. The Ethernet interface has added hardware to time stamp when the packet is sent/received, and that's what the software uses to do the precision measurements. I suppose one could (try to) use PTP over a wireless link. That would take an even more exotic wireless interface that had the PTP measurement hardware built into the interface. I don't know that anyone is actually making such hardware. There are some papers that turn up in a search, but that's about it. I've tried to do something similar, using PC-ethernet-wireless AP :over the air: wireless AP-ethernet-PC, about 5 years ago, and it wasn't particularly great. But there, I'm pretty sure the limiting issue was that there wasn't any deterministic timing between the ethernet wired interface and the 802.11a wireless interface (that is, the boxes I was using were really wireless bridges and had internal storage) On 10/8/10 1:23 PM, "David N. Lombard" wrote: On Fri, Oct 08, 2010 at 05:40:28AM -0700, Matt Hurd wrote: > > Know some guys here in Australia that are doing extremely accurate > timing with wireless, not for timing's sake but to measure movement in > dam walls and other infrastructure. Kind of like a localised GPS. > They have some very neat and accurate stuff. > > Had a play with some radio foo and managed to get about 880ns + air > time on bit to bit from tx to rx but haven't quite figured out how to > get super low latency yet. I think there may be a product in there > for wireless PPS dissemination for accurate timing to a cluster like > the guys do with the dam walls but I'm not sure if people really need > much more than what ptp can already do. Yes, PTP could enable fairly tight time sync across a cluster, if it were used. -- David N. Lombard, Intel, Irvine, CA I do not speak for Intel Corporation; all comments are strictly my own. _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuel at unimelb.edu.au Thu Oct 14 19:37:07 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Fri, 15 Oct 2010 13:37:07 +1100 Subject: [Beowulf] 10Gb/s iperf test point (TCP) available ? Message-ID: <4CB7BE53.2020803@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi there, Apologies if this is off topic, but I'm trying to check what speeds the login nodes to our cluster and BlueGene can talk at and the only 10Gb/s iperf server I've been given access to so far (run by AARNET) showed me just under 1Gb/s. I've already demonstrated I can talk between 2 boxes on the same Force10 switch at 9.3Gb/s with iperf but the ITS network people don't have any test systems to help me pin down where the bottleneck is. So if anyone on the list had a system that they could run an iperf server on at a mutually convenient time I'd be most grateful! The code is on SourceForge and is easy to compile (and is packaged already in Debian and Ubuntu). 
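For reference, this is roughly how I'm driving the client side (a minimal sketch wrapping iperf 2.x from Python, using only its standard -c/-t/-P/-w/-f options; the far end just runs "iperf -s", and the hostname below is a placeholder):

import subprocess

def run_iperf(server, seconds=30, streams=4, window="1M"):
    # -c client, -t duration, -P parallel streams, -w TCP window, -f g => report Gbit/s
    cmd = ["iperf", "-c", server, "-t", str(seconds),
           "-P", str(streams), "-w", window, "-f", "g"]
    return subprocess.check_output(cmd).decode()

print(run_iperf("testpoint.example.org"))     # placeholder for whatever host you can offer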
cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAky3vlIACgkQO2KABBYQAh8mEwCdEGxWYgkVDmENN2GfqxL61MrC INkAn36fmmEyNTIS+tZa/Hz+WykJETQc =nOYR -----END PGP SIGNATURE----- From atchley at myri.com Fri Oct 15 04:01:50 2010 From: atchley at myri.com (Scott Atchley) Date: Fri, 15 Oct 2010 07:01:50 -0400 Subject: [Beowulf] 10Gb/s iperf test point (TCP) available ? In-Reply-To: <4CB7BE53.2020803@unimelb.edu.au> References: <4CB7BE53.2020803@unimelb.edu.au> Message-ID: <23C5266A-50DD-4D4A-8FFC-0E07CC8E3580@myri.com> On Oct 14, 2010, at 10:37 PM, Christopher Samuel wrote: > Apologies if this is off topic, but I'm trying to check > what speeds the login nodes to our cluster and BlueGene > can talk at and the only 10Gb/s iperf server I've been > given access to so far (run by AARNET) showed me just > under 1Gb/s. Have you tried netperf? I have read that iperf calls gettimeofday() before and after each read/write which might mean you are measuring the BG syscall time more than network throughput time. Scott From samuel at unimelb.edu.au Sun Oct 17 19:20:23 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Mon, 18 Oct 2010 13:20:23 +1100 Subject: [Beowulf] 10Gb/s iperf test point (TCP) available ? In-Reply-To: <23C5266A-50DD-4D4A-8FFC-0E07CC8E3580@myri.com> References: <4CB7BE53.2020803@unimelb.edu.au> <23C5266A-50DD-4D4A-8FFC-0E07CC8E3580@myri.com> Message-ID: <4CBBAEE7.8000109@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 15/10/10 22:01, Scott Atchley wrote: > Have you tried netperf? Nope, AARNET suggested I use iperf (and that was what the test point they provided was running) so I went with that. I'm not having any luck with tracking down someone who can help, either within Australia or outside. :-( cheers, Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAky7rucACgkQO2KABBYQAh+iMACfZwS/J8S/LU/WchGTEsKlyna/ kfoAmQHify5Yv7cDzuSpibdkRl6EAk9c =BffF -----END PGP SIGNATURE----- From robl at mcs.anl.gov Tue Oct 19 07:25:37 2010 From: robl at mcs.anl.gov (Rob Latham) Date: Tue, 19 Oct 2010 09:25:37 -0500 Subject: [Beowulf] MPI-IO + nfs - alternatives? In-Reply-To: <1285777453.1665.170.camel@moelwyn> References: <1285777453.1665.170.camel@moelwyn> Message-ID: <20101019142537.GC20063@mcs.anl.gov> On Wed, Sep 29, 2010 at 05:24:13PM +0100, Robert Horton wrote: > 1) Does anyone have any hints for improving the nfs performance under > these circumstances? I've tried using jumbo frames, different > filesystems, having the log device on an SSD and increasing the nfs > block size to 1MB, none of which have any significant effect. The challenge with MPI-IO and NFS is achieving correct performance. NFS consistency semantics make it quite difficult, so we turn off the attribute cache. the MPI-IO library also locks around every I/O operation in an attempt to flush the client cache. Even those steps do not always work. > 2) Are there any reasonable alternatives to nfs in this situation? 
The > main possibilities seem to be: > > - PVFS or similar with a single IO server. Not sure what performance I > should expect from this though, and it's a lot more complex than nfs. You should expect pretty solid performance: First, your NFS server can go back to enabling caches for your other workloads Second, the MPI-IO library is pretty well tuned to PVFS. No extraneous locks. Third, if you decide one day you need a bit more performance, set up a PVFS volume with more servers. Presto-chango, clients will go faster without any changes. Fourth, is it really that much more complex? I'm myopic on this point, having worked with PVFS for a decade, but it's 90% userspace with a small kernel module. There's also a bunch of helpful people on the mailing lists, which would be where we should take any further PVFS discussions. > - Sharing a block device via iSCSI and using GFS, although this is also > going to be somewhat complex and I can't find any evidence that MPI-IO > will even work with GFS. I haven't used GFS in a decade. Back then, it only supported entire-file locks, making parallel access difficult. GFS has had more fine-grained locking, so it might work well. If you thought standing up PVFS was complicated, wait until you check out GFS (set up quorum, set up lock manager, set up kernel modules, etc). ==rob -- Rob Latham Mathematics and Computer Science Division Argonne National Lab, IL USA From michf at post.tau.ac.il Tue Oct 19 15:56:47 2010 From: michf at post.tau.ac.il (Micha) Date: Wed, 20 Oct 2010 00:56:47 +0200 Subject: [Beowulf] Looking for references for parallelization and optimization Message-ID: <20101020005647.44e6e2db@vivalunalitshi.luna.local> A bit off topic, so sorry, but it looks like a place where people who learned these things at some point hand out ... I've been asked to write a course on the subject of optimizing code. As it's hard to translate knowledge into an actual course, I was wondering if anyone here has references to either books, online tutorials or course syllabuses on the subjects of parallelization (OpenMP, MPI, also matlabs parallel computing toolbox) and optimization (sse, caches, memory access patterns, etc.) . It's less on the subject of this list, but I also need references regarding testing (unit and project), design and profiling. I trying to build a coherent syllabus, and having some reference texts really helps the process, and all my uni course materials are long dead. Thanks From jcownie at cantab.net Wed Oct 20 09:55:51 2010 From: jcownie at cantab.net (James Cownie) Date: Wed, 20 Oct 2010 17:55:51 +0100 Subject: [Beowulf] Looking for references for parallelization and optimization References: Message-ID: <755D3D60-ED86-4EF2-985E-40127B9F7A0C@cantab.net> On 19 Oct 2010, at 23:56, Micha wrote: > A bit off topic, so sorry, but it looks like a place where people who learned > these things at some point hand out ... > > I've been asked to write a course on the subject of optimizing code. As it's > hard to translate knowledge into an actual course, I was wondering if anyone > here has references to either books, online tutorials or course syllabuses on > the subjects of parallelization (OpenMP, MPI, also matlabs parallel computing > toolbox) and optimization (sse, caches, memory access patterns, etc.) . It's less on > the subject of this list, but I also need references regarding testing (unit > and project), design and profiling. 
> > I trying to build a coherent syllabus, and having some reference texts really > helps the process, and all my uni course materials are long dead. > Intel has a bunch of stuff at their "Academic Community" site http://software.intel.com/en-us/academic/ There are certainly also discussions on code optimization and tools on the Intel SW site. (Disclaimer: I work for Intel.) -- -- Jim -- James Cownie -------------- next part -------------- An HTML attachment was scrubbed... URL: From eugen at leitl.org Thu Oct 21 03:43:23 2010 From: eugen at leitl.org (Eugen Leitl) Date: Thu, 21 Oct 2010 12:43:23 +0200 Subject: [Beowulf] how Google warps your brain Message-ID: <20101021104323.GV28998@leitl.org> http://matt-welsh.blogspot.com/2010/10/computing-at-scale-or-how-google-has.html Tuesday, October 19, 2010 Computing at scale, or, how Google has warped my brain A number of people at Google have stickers on their laptops that read "my other computer is a data center." Having been at Google for almost four months, I realize now that my whole concept of computing has radically changed since I started working here. I now take it for granted that I'll be able to run jobs on thousands of machines, with reliable job control and sophisticated distributed storage readily available. Most of the code I'm writing is in Python, but makes heavy use of Google technologies such as MapReduce, BigTable, GFS, Sawzall, and a bunch of other things that I'm not at liberty to discuss in public. Within about a week of starting at Google, I had code running on thousands of machines all over the planet, with surprisingly little overhead. As an academic, I have spent a lot of time thinking about and designing "large scale systems", though before coming to Google I rarely had a chance to actually work on them. At Berkeley, I worked on the 200-odd node NOW and Millennium clusters, which were great projects, but pale in comparison to the scale of the systems I use at Google every day. A few lessons and takeaways from my experience so far... The cloud is real. The idea that you need a physical machine close by to get any work done is completely out the window at this point. My only machine at Google is a Mac laptop (with a big honking monitor and wireless keyboard and trackpad when I am at my desk). I do all of my development work on a virtual Linux machine running in a datacenter somewhere -- I am not sure exactly where, not that it matters. I ssh into the virtual machine to do pretty much everything: edit code, fire off builds, run tests, etc. The systems I build are running in various datacenters and I rarely notice or care where they are physically located. Wide-area network latencies are low enough that this works fine for interactive use, even when I'm at home on my cable modem. In contrast, back at Harvard, there are discussions going on about building up new resources for scientific computing, and talk of converting precious office and lab space on campus (where space is extremely scarce) into machine rooms. I find this idea fairly misdirected, given that we should be able to either leverage a third-party cloud infrastructure for most of this, or at least host the machines somewhere off-campus (where it would be cheaper to get space anyway). There is rarely a need for the users of the machines to be anywhere physically close to them anymore. 
Unless you really don't believe in remote management tools, the idea that we're going to displace students or faculty lab space to host machines that don't need to be on campus makes no sense to me. The tools are surprisingly good. It is amazing how easy it is to run large parallel jobs on massive datasets when you have a simple interface like MapReduce at your disposal. Forget about complex shared-memory or message passing architectures: that stuff doesn't scale, and is so incredibly brittle anyway (think about what happens to an MPI program if one core goes offline). The other Google technologies, like GFS and BigTable, make large-scale storage essentially a non-issue for the developer. Yes, there are tradeoffs: you don't get the same guarantees as a traditional database, but on the other hand you can get something up and running in a matter of hours, rather than weeks. Log first, ask questions later. It should come as no surprise that debugging a large parallel job running on thousands of remote processors is not easy. So, printf() is your friend. Log everything your program does, and if something seems to go wrong, scour the logs to figure it out. Disk is cheap, so better to just log everything and sort it out later if something seems to be broken. There's little hope of doing real interactive debugging in this kind of environment, and most developers don't get shell access to the machines they are running on anyway. For the same reason I am now a huge believer in unit tests -- before launching that job all over the planet, it's really nice to see all of the test lights go green. From james.p.lux at jpl.nasa.gov Thu Oct 21 07:13:48 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 21 Oct 2010 07:13:48 -0700 Subject: [Beowulf] how Google warps your brain In-Reply-To: <20101021104323.GV28998@leitl.org> Message-ID: Comment inserted below On 10/21/10 3:43 AM, "Eugen Leitl" wrote: > > In contrast, back at Harvard, there are discussions going on about building > up new resources for scientific computing, and talk of converting precious > office and lab space on campus (where space is extremely scarce) into machine > rooms. I find this idea fairly misdirected, given that we should be able to > either leverage a third-party cloud infrastructure for most of this, or at > least host the machines somewhere off-campus (where it would be cheaper to > get space anyway). There is rarely a need for the users of the machines to be > anywhere physically close to them anymore. There *is* a political reason and a funding stream reason. When you use a remote resource, then someone is measuring the use of that resource, and typically one has a budget allocated for that resource. Perhaps at google, computing resources are free, but that's not the case at most places. So, someone who has been given X amount of resources to do task Y can't on the spur of the moment use some fraction of that to do task Z (and, in fact, if you're consuming government funds, using resources allocated for Y to do Z is illegal). However, if you've used the dollars to buy a local computer, typically, the "accounting for use" stops at that point, and nobody much cares what you use that computer for, as long as Y gets done. In the long term, yes, there will be an evaluation of whether you bought too much or too little for the X amount of resources, but in the short run, you've got some potential "free" excess resources. This is a bigger deal than you might think. Let's take a real life example. 
You have a small project, funded at, say, $150k for a year (enough to support a person working maybe 1/3 time, plus resources) for a couple years. You decide to use an institutionally provided desktop computer and store all your terabytes of data on an institutional server and pay the nominal $500/month (which pays for backups, etc. and all the admin stuff you shouldn't really be fooling with anyway). You toil happily for the year (spending around $6k of your budget on computing resources), and then the funding runs out, a little earlier than you had hoped (oops, the institution decided to retroactively change the chargeback rates, so now that monthly charge is $550). And someone comes to you and says: Hey, you are out of money, we're deleting the data you have stored in the cloud, and by the way, give back that computer on your desk. You're going to need to restart your work next year, when next year's money arrives (depending on the funding agency's grant cycle, there is a random delay in this.. Maybe they're waiting for Congress to pass a continuing resolution or the California Legislature to pass the budget, or whatever..), but in the mean time, you're out of luck. And yes, a wise project manager (even for this $300k task) would have set aside some reserves, etc. But that doesn't always happen. At least if you OWN the computing resources, you have the option of mothballing, deferring maintenance, etc. to ride through a funding stream hiccup. Unless you really don't believe in > remote management tools, the idea that we're going to displace students or > faculty lab space to host machines that don't need to be on campus makes no > sense to me. > > The tools are surprisingly good. > Log first, ask questions later. It should come as no surprise that debugging > a large parallel job running on thousands of remote processors is not easy. > So, printf() is your friend. This works in a "resources are free" environment. But what if you are paying for every byte of storage for all those log messages? What if you're paying for compute cycles to scan those logs? Remote computing on a large scale works *great* if the only cost is a "connectivity endpoint" Look at the change in phone costs over the past few decades. Back in the 70s, phone call and data line cost was (roughly) proportional to distance, because you were essentially paying for a share of the physical copper (or equivalent) along the way. As soon, though, as there was substantial fiber available, there was a huge oversupply of capacity, so the pricing model changed to "pay for the endpoint" (or "point of presence/POP"), leading to "5c/min long distance anywhere in the world". I was at a talk by a guy from AT&T in 1993 and he mentioned that the new fiber link across the Atlantic cost about $3/phone line (in terms of equivalent capacity, e.g. 64kbps), and that was the total lifecycle cost.. The financial model was: if you paid $3, you'd have 64kbbps across the atlantic in perpetuity, or close to it. Once you'd paid your $3, nobody cared if it was busy or idle, etc. So the "incremental cost" to use the circuit was essentially zero. Compare this to the incredibly expensive physical copper wires with limited bandwidth, where they could charge dollars/minute, which is pretty close to the actual cost to provide the service. If you go back through the archives of this list, this kind of "step function in costs" has been discussed a lot. 
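To make the step function concrete (a toy sketch; every number here is invented for illustration, not from any real chargeback schedule):

def metered_cost(core_hours, rate=0.10):
    # chargeback model: every hour consumed shows up on someone's budget
    return core_hours * rate

def owned_cost(core_hours, capex=3000.0, capacity=8760.0):
    # owned box: the purchase is sunk, extra use is "free" until you exceed
    # capacity, at which point the next increment is another whole box
    boxes = 1 + int(core_hours // capacity)
    return boxes * capex

for h in (1000, 8000, 10000):
    print(h, metered_cost(h), owned_cost(h))

Metered cost climbs smoothly with use; owned cost is flat, then jumps.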
You've already got someone sysadmin'ing a cluster with 32 nodes, and they're not fully busy, so adding another cluster only increases your costs by the hardware purchase (since the electrical and HVAC costs are covered by overheads). But the approach of "low incremental cost to consume excess capacity" only lasts so long: when you get to sufficiently large scales, there is *no* excess capacity, because you're able to spend your money in sufficiently small granules (compared to overall size). Or, returning to my original point, the person giving you your money is able to account for your usage in sufficiently small granules that you have no "hidden excess" to "play with". Rare is the cost sensitive organization that voluntarily allocates resources to unconstrained "fooling around". Basically, it's the province of patronage. Log everything your program does, and if > something seems to go wrong, scour the logs to figure it out. Disk is cheap, > so better to just log everything and sort it out later if something seems to > be broken. There's little hope of doing real interactive debugging in this > kind of environment, and most developers don't get shell access to the > machines they are running on anyway. For the same reason I am now a huge > believer in unit tests -- before launching that job all over the planet, it's > really nice to see all of the test lights go green. From Bill.Rankin at sas.com Thu Oct 21 08:19:06 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Thu, 21 Oct 2010 15:19:06 +0000 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> Good points by Jim, and while I generally try and avoid "me too" posts, I just wanted to add my two cents. In my previous life I worked on building a central HPC cluster facility at Duke. The single biggest impediment to creating this resource was actually trying to justify its expense and put a actual number on the cost savings of having a centrally managed system. This was extremely difficult to do given the way the university tracked its infrastructure and IT costs. If a research group bought a rack or two of nodes then they were usually hosted in the local school/department facilities and supported by local IT staff. The cost of power/cooling and staff time became part of a larger departmental budget and effectively disappeared from the financial radar. They were not tracked at that level of granularity. They were effectively invisible. Put all those systems together into a shared facility and all of a sudden those costs become very visible. You can track the power and cooling costs. You now have salaries for dedicated IT/HPC staff. And ultimately you have one person having to cut some very large checks. And because of the university funding model and the associated politics it is extremely difficult, if not impossible, to actually recoup funds from the departments or even the research groups who would be saving money. In order to make it work, you really need the senior leadership of the university to commit to making central HPC infrastructure an absolute requirement, and sticking to that commitment when it comes budget time and the politics are running hot and heavy over who gets how much. Now to most of us this is a rehash of a conversation that we have had often before. 
And with clusters and HPC pretty much established as a necessity for any major research university, the development of central facilities would seem to be the obvious solution. I find it somewhat concerning that institutions like Harvard are apparently still dealing with this issue. -bill From landman at scalableinformatics.com Thu Oct 21 09:10:35 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Thu, 21 Oct 2010 12:10:35 -0400 Subject: [Beowulf] how Google warps your brain In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> Message-ID: <4CC065FB.1060905@scalableinformatics.com> On 10/21/2010 11:19 AM, Bill Rankin wrote: [...] > In order to make it work, you really need the senior leadership of > the university to commit to making central HPC infrastructure an > absolute requirement, and sticking to that commitment when it comes > budget time and the politics are running hot and heavy over who gets > how much. Agreed. These battles can be fierce. And if you add in a decidedly non-HPC IT organization as a player, HPC can take some different directions. Especially on some of the baseline infrastructure decisions. > Now to most of us this is a rehash of a conversation that we have had > often before. And with clusters and HPC pretty much established as a > necessity for any major research university, the development of > central facilities would seem to be the obvious solution. I find it > somewhat concerning that institutions like Harvard are apparently > still dealing with this issue. Its a long list of major research institutions, which have a less powerful admin and more powerful departments. These don't even tend to be a federated set of clusters (or clouds). More like clusters/HPC in isolation. Its a good idea to share the need/costs/benefits among many departments ... lowers costs, increases utility, etc. The issue is, as Bill noted, come budget time, when the long knives are out, whether or not HPC survives intact, or is carved up and allocated in a less than (nearly) optimal manner. Maybe HPC needs to be its own department, then it can fight for its own budgets. Have a supercomputing institute (ala MSI at U Minnesota) provide leadership, infrastructure, space, resources. Its a model that could work at different scales for other universities. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From rgb at phy.duke.edu Thu Oct 21 09:37:32 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 21 Oct 2010 12:37:32 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <4CC065FB.1060905@scalableinformatics.com> References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: On Thu, 21 Oct 2010, Joe Landman wrote: > Maybe HPC needs to be its own department, then it can fight for its own > budgets. Have a supercomputing institute (ala MSI at U Minnesota) provide > leadership, infrastructure, space, resources. Its a model that could work at > different scales for other universities. 
Of course (he chimes in from the bleachers where he is quietly sitting and wishing he were drinking a beer while watching students -- also wishing they were drinking a beer -- take a physics exam:-) this simply puts the world back where it was so long ago before the beowulf concept was invented... The real beauty of clusters (to me) has always been at least partly the fact that you could build YOURSELF a cluster, just for your own research, without having to have major leadership, infrastructure, space, or other resources. Just something to think about...;-) Now, back to the bleachers. No, I'm not dead, only teaching an enormous and time consuming physics course and not doing any computing at all at the moment, not even the computing I would LIKE to be doing.... rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From hahn at mcmaster.ca Thu Oct 21 11:30:54 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 21 Oct 2010 14:30:54 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: > The real beauty of clusters (to me) has always been at least partly the > fact that you could build YOURSELF a cluster, just for your own > research, without having to have major leadership, infrastructure, > space, or other resources. sure, but the question is: under what circumstances would you want to? doing beowulf is indeed easy, and for certain scales and cost structures, cheap. if you have a fairly constant personal demand for resources (always have 80-100 cores busy), then doing it yourself may still make sense. but the impetus towards centralization is sharing: if your usage is bursty, having your own cluster would result in low utilization. and if the same funding went towards a large, shared facility, your bursts could be higher. of course, there's still the issue of autonomy - you control your own cluster. but in a sense, that's really just reflecting (un)responsiveness on the part of whoever manages the shared resource... I'm pretty convinced that, ignoring granularity or political issues, shared resources save a lot in leadership, infrastructure, space, etc. From rgb at phy.duke.edu Thu Oct 21 11:38:21 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 21 Oct 2010 14:38:21 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: On Thu, 21 Oct 2010, Mark Hahn wrote: >> The real beauty of clusters (to me) has always been at least partly the >> fact that you could build YOURSELF a cluster, just for your own >> research, without having to have major leadership, infrastructure, >> space, or other resources. > > sure, but the question is: under what circumstances would you want to? > doing beowulf is indeed easy, and for certain scales and cost structures, > cheap. if you have a fairly constant personal demand for resources > (always have 80-100 cores busy), then doing it yourself may still make sense. > but the impetus towards centralization is sharing: if your usage is bursty, > having your own cluster would result in low utilization. 
and if the same > funding went towards a large, shared facility, your bursts could be higher. > > of course, there's still the issue of autonomy - you control your own > cluster. but in a sense, that's really just reflecting (un)responsiveness > on the part of whoever manages the shared resource... > > I'm pretty convinced that, ignoring granularity or political issues, shared > resources save a lot in leadership, infrastructure, space, etc. No real argument -- I just was pointing out the irony...;-) rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From landman at scalableinformatics.com Thu Oct 21 11:58:20 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Thu, 21 Oct 2010 14:58:20 -0400 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: <4CC08D4C.7020901@scalableinformatics.com> On 10/21/2010 02:38 PM, Robert G. Brown wrote: >> I'm pretty convinced that, ignoring granularity or political issues, >> shared resources save a lot in leadership, infrastructure, space, etc. > > No real argument -- I just was pointing out the irony...;-) Didn't have a chance to respond before ... I think the real major issue compared to the "olden days" is that if things work out correctly a) the actual incremental cost of adding HPC computing/storage capability is *far* less than it was before, and as importantly, b) you are not locked into a vendor or a technology the way you used to be in the "bad old days." The second is said somewhat tongue-in-cheek as the rise of more powerful IT groups have sometimes thwarted nascent HPC groups from buying what they want/need. If your university has a purchase agreement with a big tier-1 vendor, you fall squarely in this mix. You can't easily buy what you need, but you can easily buy from a particular vendor. Which might serve an administration deploying lots of web servers well, but might not be the right approach to building an HPC infrastructure. It gets ... erm ... interesting ... when you need to explain the utility of infinband, or why stacked switches are a *very bad idea* or why SANs aren't what most HPC shops need for fast storage ... list goes on. But thats a conversation for a different thread. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From james.p.lux at jpl.nasa.gov Thu Oct 21 12:07:23 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 21 Oct 2010 12:07:23 -0700 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn > Sent: Thursday, October 21, 2010 11:31 AM > To: Robert G. 
Brown > Cc: Beowulf Mailing List > Subject: Re: [Beowulf] how Google warps your brain > > > The real beauty of clusters (to me) has always been at least partly the > > fact that you could build YOURSELF a cluster, just for your own > > research, without having to have major leadership, infrastructure, > > space, or other resources. > > sure, but the question is: under what circumstances would you want to? > doing beowulf is indeed easy, and for certain scales and cost structures, > cheap. if you have a fairly constant personal demand for resources > (always have 80-100 cores busy), then doing it yourself may still make sense. > but the impetus towards centralization is sharing: if your usage is bursty, > having your own cluster would result in low utilization. and if the same > funding went towards a large, shared facility, your bursts could be higher. > > of course, there's still the issue of autonomy - you control your own > cluster. but in a sense, that's really just reflecting (un)responsiveness > on the part of whoever manages the shared resource... > > I'm pretty convinced that, ignoring granularity or political issues, shared > resources save a lot in leadership, infrastructure, space, etc. OTOH, it's just those granularity and cost accounting issues that led to Beowulfs being built in the first place. I suspect (nay, I know, but just can't cite the references) that this sort of issue is not unique to HPC, or even computing and IT. Consider libraries, which allow better utilization of books, at the cost of someone else deciding which books to have in stock. And consider the qualitatively different experience of "browsing in the stacks" vs "entering the call number in the book retrieval system".. the former leads to serendipity as you turn down the wrong aisle or find a mis-shelved volume; the latter is faster and lower cost as far as a "information retrieval" function. To me, this is the essence of personal computing. It's not that it's more cost effective to have a computer on/under my desk (almost certainly not, esp with cheap bandwidth to a server), it's that I can do things that are somewhat ill defined and require creativity, and after all, that *is* what I get paid for. And this is because they've bought a certain amount of computational resources for me, and leave it up to me to use or not, as I see fit. Compare to a front line customer service agent in a cube farm where they typically have a locked down configuration and are tightly scripted. That computing environment is hardly personal (whether it's a screen and keyboard or an actual computer) and is essentially an early 21st century assembly line or battery farm. From hahn at mcmaster.ca Thu Oct 21 15:01:23 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 21 Oct 2010 18:01:23 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: >> I'm pretty convinced that, ignoring granularity or political issues, shared >> resources save a lot in leadership, infrastructure, space, etc. > > OTOH, it's just those granularity and cost accounting issues that led to > Beowulfs being built in the first place. I'm not really sure I understand what you mean. by "granularity", I just meant that you can't really have fractional sysadmins, and a rack with 1 node consumes as much floor space as a full rack. 
in some sense, smaller clusters have their costs "rounded down" - there's a size beneath which you tend to avoid paying for power, cooling, etc. perhaps that's what you meant by cost- accounting. but do you think these were really important at the beginning? to me, beowulf is "attack of the killer micro" applied to parallelism. that is, mass-market computers that killed the traditional glass-house boxes: vector supers, minis, eventually anything non-x86. the difference was fundamental (much cheaper cycles), rather than these secondary issues. > I suspect (nay, I know, but just can't cite the references) that this sort >of issue is not unique to HPC, or even computing and IT. Consider >libraries, which allow better utilization of books, at the cost of someone >else deciding which books to have in stock. well, HPC is unique in scale of bursting. even if you go on a book binge, there's no way you can consume orders of magnitude more books as I can, or compared to your trailing-year average. but that's the big win for HPC centers - if everyone had a constant demand, a center would deliver only small advantages, not even much better than a colo site. > And consider the qualitatively >different experience of "browsing in the stacks" vs "entering the call >number in the book retrieval system".. the former leads to serendipity as >you turn down the wrong aisle or find a mis-shelved volume; the latter is >faster and lower cost as far as a "information retrieval" function. heh, OK. I think that's a bit of a stretch, since your serendipity would not scale with the size of the library, but mainly with its messiness ;) >get paid for. And this is because they've bought a certain amount of >computational resources for me, and leave it up to me to use or not, as I >see fit. I find myself using my desktop more and more as a terminal - I hardly ever run anything but xterm and google chrome. as such, I don't mind that it's a terrible old recycled xeon from a 2003 project. it would seem like a waste of money to buy something modern, (and for me to work locally) since there are basically infinite resources 1ms away as the packet flies... regards, mark. From hahn at MCMASTER.CA Thu Oct 21 15:04:44 2010 From: hahn at MCMASTER.CA (Mark Hahn) Date: Thu, 21 Oct 2010 18:04:44 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <20101021104323.GV28998@leitl.org> References: <20101021104323.GV28998@leitl.org> Message-ID: > parallel jobs on massive datasets when you have a simple interface like > MapReduce at your disposal. Forget about complex shared-memory or message > passing architectures: that stuff doesn't scale, and is so incredibly brittle > anyway (think about what happens to an MPI program if one core goes offline). this is a bit unfair - the more honest comment would be that for data-parallel workloads, it's relatively easy to replicate the work a bit, and gain substantially in robustness. you _could_ replicate the work in a traditional HPC application (CFD, chem/md, etc), but it would take a lot of extra bookkeeping because the dataflow patterns are complex and iterative. > The other Google technologies, like GFS and BigTable, make large-scale > storage essentially a non-issue for the developer. Yes, there are tradeoffs: well, I think storage is the pivot here: it's because disk storage is so embarassingly cheap that Goggle can replicate everything (3x?). once you've replicated your data, replicating work almost comes along for free. > So, printf() is your friend. 
Log everything your program does, and if > something seems to go wrong, scour the logs to figure it out. Disk is cheap, > so better to just log everything and sort it out later if something seems to this is OK for data-parallel, low-logic kinds of workflows (like Google's). it's a long way from being viable for any sort of traditional HPC, where there's far too much communication and everything runs too long to log everything. interestingly, logging might work if the norm for HPC clusters were something like gigabit-connected uni-core nodes, each with 4x 3TB disks. so in a sense we're talking across a cultural gulf: disk/data-oriented vs compute/communication-oriented. From james.p.lux at jpl.nasa.gov Thu Oct 21 15:55:12 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 21 Oct 2010 15:55:12 -0700 Subject: RE: [Beowulf] how Google warps your brain
If you needed horsepower, you could either go fight the budget battle to buy cpu seconds on the big iron OR you could buy your own supercomputer and not have to worry about the significant administrative time setting up and reconciling and reporting on those cpu seconds across all your projects. And, because the glass house box is very visible and high value, there is a lot of oversight to "make sure that we are effectively using the asset" and "that the operations cost is fairly allocated among the users". Particularly in places where there is strict cost accounting on things and not so strict on labor (e.g. your salary is paid for already by some generic bucket) this could be a big driver: you could spend your own time essentially for free. > > > I suspect (nay, I know, but just can't cite the references) that this sort > >of issue is not unique to HPC, or even computing and IT. Consider > >libraries, which allow better utilization of books, at the cost of someone > >else deciding which books to have in stock. > > well, HPC is unique in scale of bursting. even if you go on a book binge, > there's no way you can consume orders of magnitude more books as I can, > or compared to your trailing-year average. but that's the big win for HPC > centers - if everyone had a constant demand, a center would deliver only > small advantages, not even much better than a colo site. Yes.. that's why the library/book model isn't as good as it could be. > > > And consider the qualitatively > >different experience of "browsing in the stacks" vs "entering the call > >number in the book retrieval system".. the former leads to serendipity as > >you turn down the wrong aisle or find a mis-shelved volume; the latter is > >faster and lower cost as far as a "information retrieval" function. > > heh, OK. I think that's a bit of a stretch, since your serendipity would > not scale with the size of the library, but mainly with its messiness ;) > > >get paid for. And this is because they've bought a certain amount of > >computational resources for me, and leave it up to me to use or not, as I > >see fit. > > I find myself using my desktop more and more as a terminal - I hardly > ever run anything but xterm and google chrome. as such, I don't mind > that it's a terrible old recycled xeon from a 2003 project. it would seem > like a waste of money to buy something modern, (and for me to work locally) > since there are basically infinite resources 1ms away as the packet flies... And as long as there's not a direct cost to you (or your budget) of incremental use of those remote resources, then what you say is entirely true. But if you were paying for traffic, you'd think differently. When I was in Rome a year ago for a couple weeks, I had one of those USB data modems. You pay by the kilobyte, so you do all your work offline, fire up the modem, transfer your stuff, and shut it down. Shades of dial-up and paying by the minute of connect time. All of a sudden you get real interested in how much bandwidth all that javascript and cool formatting stuff flying back and forth to make the pretty website for email is. And the convenient "let's send packets to keep the VPN tunnel alive" feature is really unpleasant, because you can literally watch the money meter add up while you're sitting there thinking. With a cluster of my own, the entire cost is essentially fixed, whether I use it or not, so I can "fool around" and not worry about whether I'm being efficient. 
Which gets back to CS classes in the 70s, where you had a limited number of runs/seconds for the quarter, so great emphasis is put on "desk checking" as opposed to interactive development... I'm not sure that one didn't have higher quality code back then, but the overall productivity was lower, and I'd hate to give up Matlab (and I loved APL on an IBM5100). But then, I'm in what is essentially a perpetual prototyping environment. That is, a good part of the computational work I need is iterations of the algorithm implementation, more than the outputs of that algorithm. If I were like my wife who does banking IT or the folks doing science data processing from satellites, in a production environment, I'd probably say the big data center is a MUCH better way to go. They need the efficiency, they have large enough volume to justify the fine grained accounting (because 1% of 100 million dollars is a lot bigger in absolute terms than 10% of $100k, so you can afford to put a full time person on it for the big job) So, my wife needs the HPC data center and a staff of minions. I want the personal supercomputer which makes my life incrementally easier, but without having to spend time dealing with accounting for tens of dollars. From ellis at runnersroll.com Thu Oct 21 20:40:23 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Thu, 21 Oct 2010 23:40:23 -0400 Subject: [Beowulf] how Google warps your brain In-Reply-To: <20101021104323.GV28998@leitl.org> References: <20101021104323.GV28998@leitl.org> Message-ID: <4CC107A7.2050606@runnersroll.com> On 10/21/10 06:43, Eugen Leitl wrote: > The cloud is real. The idea that you need a physical machine close by to get > any work done is completely out the window at this point. My only machine at > Google is a Mac laptop (with a big honking monitor and wireless keyboard and > trackpad when I am at my desk). I do all of my development work on a virtual > Linux machine running in a datacenter somewhere -- I am not sure exactly > where, not that it matters. I ssh into the virtual machine to do pretty much > everything: edit code, fire off builds, run tests, etc. The systems I build > are running in various datacenters and I rarely notice or care where they are > physically located. Wide-area network latencies are low enough that this > works fine for interactive use, even when I'm at home on my cable modem. The fact that the author is using a Mac and doing development work on a virtual Linux machine in an unknown location highlights the underlying theme of the article, the resultant thread, and perhaps even this entire mailing list: Different setups work better for different workloads. Clearly, the author feels that the inconvenience incurred by having to use a virtual Linux machine to perform his development is less than the inconvenience of running Linux as his main OS. Otherwise, he would simply use Linux on his machine and sit at home in his pajamas, sipping a hot cup of Earl Grey and working out his HPC problem locally. Nonetheless, there are numerous examples of workloads in the scientific community (used here in reference to the physical sciences) and in HPC development, which unfortunately do not play nicely with such a remote and fluctuating setup. For instance, in my research, it is far easier to own the machines one runs on (or at least have root access) to develop and test breakthroughs in systems development. 
Often messing with the kernel, toolchain, or arbitrary libraries in the distribution is required to effect and test the change in which one is interested. It goes without saying that we have quite a bit of difficulty convincing "IT" types (even being computer science persons ourselves) that this is a reasonable thing to do on the average cluster, even in a university setting. Certainly, in "the cloud" alterations at this level are not tolerated. Further, it is extremely rare to find clusters tailored to system development such that they have master nodes that reboot all the slave nodes with new images and new file systems (and however many more system specific parameters, specified by the developer) for every run. That said, I do recognize that system development is a small sector in HPC and not by any means the most influential customer. However, I do feel it furthers the mantra that we should not all be pigeon-holed into one particularly "efficient" setup just because it works well for the common case (or because Google "invented it"). As totally off-topic points: it is great to see that RGB-Bot has been rebooted (even if only with limited bandwidth), and I absolutely have no idea why Eugen Leitl posted a blog entry from Matt Welsh. I scanned and scanned for some comment from Eugen or the way in which it somehow wrapped into recent conversation, but at this point I'm lost on why it originally got posted (besides being quite the fire-starter). Best, ellis From rchang.lists at gmail.com Thu Oct 21 22:18:32 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Fri, 22 Oct 2010 10:48:32 +0530 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf Message-ID: <4CC11EA8.8030602@gmail.com> Hello List, My University is going for a new HPC System. I was using Rocks + CentOS until now but someone suggested to use Redhat HPC Solution with the new system. I am not able to find good documentation to setup and use Redhat HPC. It seems, Redhat uses Platform Computing's Platform Cluster Manager re-branded with their(Redhat's) logo, though I may be wrong. For that matter, does anyone use Platform Cluster Manager also?. Thanks, Richard. From eagles051387 at gmail.com Thu Oct 21 22:54:25 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Fri, 22 Oct 2010 07:54:25 +0200 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC11EA8.8030602@gmail.com> References: <4CC11EA8.8030602@gmail.com> Message-ID: i have seen in repositories of other distros the red hat cluster suite. im not sure if that is the same thing as you mentioned below. On Fri, Oct 22, 2010 at 7:18 AM, Richard Chang wrote: > Hello List, > My University is going for a new HPC System. I was using Rocks + CentOS > until now but someone suggested to use Redhat HPC Solution with the new > system. > > I am not able to find good documentation to setup and use Redhat HPC. It > seems, Redhat uses Platform Computing's Platform Cluster Manager re-branded > with their(Redhat's) logo, though I may be wrong. For that matter, does > anyone use Platform Cluster Manager also?. > > Thanks, > Richard. > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From scrusan at UR.Rochester.edu Fri Oct 22 00:01:08 2010 From: scrusan at UR.Rochester.edu (Steve Crusan) Date: Fri, 22 Oct 2010 03:01:08 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC11EA8.8030602@gmail.com> Message-ID: I've tried it, and honestly, I wasn't impressed. The solution isn't *bad*, but I think if you want more flexibility, go with a combination of xcat + torque + maui. I guess it is a matter of preference. I haven't been doing this work more than a few years, so a senior contributor might have better guidance, but using those tools, building an HPC Cluster isn't like building the pyramids of Egypt. It really all depends on what the governing bodies of your organization want... PS: I looked for docs on that same RH cluster solution also, and had very little luck finding any. On 10/22/10 1:18 AM, "Richard Chang" wrote: > Hello List, > My University is going for a new HPC System. I was using Rocks + CentOS until > now but someone suggested to use Redhat HPC Solution with the new system. > > I am not able to find good documentation to setup and use Redhat HPC. It > seems, Redhat uses Platform Computing's Platform Cluster Manager re-branded > with their(Redhat's) logo, though I may be wrong. For that matter, does > anyone use Platform Cluster Manager also?. > > Thanks, > Richard. > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf ---------------------- Steve Crusan System Administrator Center for Research Computing University of Rochester https://www.crc.rochester.edu/ From john.hearns at mclaren.com Fri Oct 22 01:30:29 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 22 Oct 2010 09:30:29 +0100 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC11EA8.8030602@gmail.com> References: <4CC11EA8.8030602@gmail.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> > Hello List, > My University is going for a new HPC System. I was using Rocks + CentOS > until now but someone suggested to use Redhat HPC Solution with the new > system. Please, please don't "roll your own" system (pun intended). There are lots of companies out there who will provide you with a high quality, supported Beowulf cluster. Heck - some of them are respected contributors to this list! Seriously - you do not want to be making a decision up front on the cluster management stack before bringing vendors in. You can bring vendors in, let them present what their cluster offering does, and indeed you might learn a lot from them at that stage. The cluster management utilities might then influence who goes forward to the next stage of tendering. And what do I mean by support - well first off hardware - when a node fails it will be fixed or replaced. Second off software - your vendor will help in setting up queueing systems and getting codes running. Third off - storage and drivers. If you get any problems with storage, or with kernel-type problems the vendor will sort them out for you. The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.
From john.hearns at mclaren.com Fri Oct 22 04:10:37 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 22 Oct 2010 12:10:37 +0100 Subject: [Beowulf] Bechtolsheim and Arista Networks Message-ID: <68A57CCFD4005646957BD2D18E60667B12154725@milexchmb1.mil.tagmclarengroup.com> http://www.theregister.co.uk/2010/10/21/arista_speed/ He says Arista is in a race to zero latency. That is what's needed in high-frequency financial trading where "every micro-second counts", he says. Hell guys - we can do better. We've got some serious physics talent on this list. Zero-latency for making more money on the exchanges than the other brokers? Heck what we need is less than zero latency. We need switches which anticipate what the next packets are, and have already sent them on to the outgoing port. A bit of time travel - shouldn't be too hard for this list now should it? John Hearns | CFD Hardware Specialist | McLaren Racing Limited McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK T: +44 (0) 1483 261000 D: +44 (0) 1483 262352 F: +44 (0) 1483 261010 E: john.hearns at mclaren.com W: www.mclaren.com The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From kilian.cavalotti.work at gmail.com Fri Oct 22 04:58:29 2010 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Fri, 22 Oct 2010 13:58:29 +0200 Subject: [Beowulf] Bechtolsheim and Arista Networks In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154725@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B12154725@milexchmb1.mil.tagmclarengroup.com> Message-ID: Hi John, On Fri, Oct 22, 2010 at 1:10 PM, Hearns, John wrote: > Zero-latency for making more money on the exchanges than the other > brokers? Ah, what a waste, when all you really need is twitter. http://www.wired.com/wiredscience/2010/10/twitter-crystal-ball/ Cheers, -- Kilian From rgb at phy.duke.edu Fri Oct 22 05:38:38 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 22 Oct 2010 08:38:38 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: On Thu, 21 Oct 2010, Mark Hahn wrote: > I find myself using my desktop more and more as a terminal - I hardly > ever run anything but xterm and google chrome. as such, I don't mind that > it's a terrible old recycled xeon from a 2003 project. it would seem > like a waste of money to buy something modern, (and for me to work locally) > since there are basically infinite resources 1ms away as the packet flies... Again, an ancient (well, as much as anything in computing ever is:-) paradigm. The interesting thing is that people have been engineering, designing, selling lightweight/thin computing models in the personal computer game since maybe 1983 or 1984. I bought one of the very first ones -- it was a straight up PC motherboard-in-a-box with a custom (and enormously expensive) coax-based network interface that ran back to the PC. It leeched all of the devices and some of the software off of the PC. 
Then there were the Sun offerings -- SLC and ELC diskless machines (or really, any Sun system you liked run diskless) on real networks, where they still booted their OS over the network as well as the software they ran, but were consistent with the "network is the computer" slogan. There was yet another burst of enthusiasm around the time of the release of java -- java was supposed to enable a new kind of thin appliance (and in fact did IIRC -- a few were sold but were a commercial failure). However, none of these models succeeded in the long run. The only thin/remote computing model that has persisted is the xterm/rsh/ssh model on top of Unix (with its many enhancements and variations, including for the most part beowulfery, which with a few exceptions relies on e.g. ssh for remote job distribution and control). I think that this has finally changed. Google in particular is intent on fundamentally changing it and >>really<< making the network (or rather, remote computing cloud) into the computer. Finally, I think the conditions are right for them to succeed where everybody else has failed. It's interesting to think about the conditions that enable this to work (and how they differ from those that faced people in the 80's, 90's, even 00's). a) Computers are now fast enough that it is possible to create a DOUBLE breakout to isolate software from both the hardware (which is what operating sytems were supposed to do) and from the operating system itself, which hasn't done so since people learned that they could make a ton of money selling operating systems and controlling the software market. Up until the last decade, at least some of what people wanted to do required "native" code, written for and compiled for a particular operating system and often for a particular hardware environment underneath the operating system. That is still true for a few things (notably high end games) but very little of the rest of what people do isn't accessible with interpreted or emulated pseudocode. b) Networking is no longer much of a bottleneck. As you say, things are a few ms away as the packet flies. Or, as I sit here, 16 ms ping time away from my desktop at Duke, where I'm sitting at home inside a network with a 7 Mbps pipe to the world. Slashdot has Google preparing to build a 1 Gbps broadband network for Stanford undergrads. TWC and other communications companies are furiously laying fiber to neighborhoods if not homes. It's easy to see their motivation. c) HTML, which was never REALLY intended for it, has morphed into a device independent presentation layer. Browsers, which were never at all intended for it, have morphed into a de facto user extensible psuedo-operating system, capable of downloading and running software both transient and permanent (plug-in extensions as well as e.g. straight up programs). The software for this isn't all quite hardware layer independent yet, but a lot of it is and there is a SEPARATION between the hardware sensitive part and the interface that if nothing else makes it easy to write things that will run on top of plug-ins, not the actual operating system, in an operating system independent way. d) Servers were once expensive and represented a massive investment barrier to remote computing. Only crazed, uberhacker-skilled individuals would set up servers at home, for example. Those services that were remote-offered in home environments or small offices were trivia -- click-controlled shared printer or file access. 
Only Unix (and ssh/rsh) provided a real remote login/execution environment, and even a Unix tyro was uberhacker compared to a Windows Joe User or an Apple Semi-Luddite User. Providing MORE resources to an unskilled user desktop than the desktop itself could provide to the user by simply spending money on local software required an enormous investment in hardware and near-genius systems engineers -- in other words, resources that only existed inside large corporations, universities, governments, and of course crazed hacker households (like many of ours:-). Google in particular engineered a truly scalable cheap superserver, patiently building the infrastructure from the metal up so that it was virtually infinitely extensible at linear cost. I can't imagine what their server-to-human ratio must be, but I'm guessing thousands to tens of thousands -- orders of magnitude better than the best of the supercomputing centers or corporate or government or household server collectives. No doubt it was expensive to get it all started, but at this point they are reaping the benefits of infinite scalability and it isn't clear to me that ANYONE is going to ever be able to touch them in the foreseeable future. Put all of these together -- oh, and let's not forget to throw in e), the advent of phones and pads and ebooks and so on that are little more than a browser on top of a minimal software stack and a network -- and things truly have changed. Who cares if you are running Linux or Windows or MacOS any more if you are running Google Chrome and it "contains" an integrated office suite, manages your browsing, plays your music and videos, lets you run a wide range of games, and does it all transparently identically, for free, on top of any operating system? Google, and Mozilla/Firefox in direct competition, have basically replaced the operating system with an OPERATING system, because computers are finally fast enough to make 99% of all users happy with a fully emulated/isolated translation layer, because the reliance of the environment on the network is no longer bottlenecked so that many compute-intensive tasks are executed transparently remotely (with the user not knowing or caring what is done where), because the environment is powerful enough to do anything they really care about doing including playing lots of games, because they will soon be able to do most of it on a wide range of handhelds without altering their environment. Indeed, even storage isn't an issue -- Google will cheerfully provide you with as much as you are likely to ever need more cheaply than you can provide it for yourself in exchange for subtle and inobtrusive access to your heart and mind. Which they already have. An anecdote. I am shopping for a telescope, since I have a PDA at Duke that I have to spend down before next semester lest it hit the 'have to give some back' threshold and I'm teaching astronomy these days. A good telescope -- I'm planning to spend ballpark of $2600 for an 8" Celestron Schmidt-Casselgrain capable of decent astrophotography, some good lenses, a cheap (starter) CCD camera. So I've googled/browsed several vendors looking at their offerings. In my igoogle page, guess what ad is insidiously placed somewhere on my screen in one of my add ons, every day? When I visit remote sites that have nothing (on the surface) to do with Google but that have adds placed on the screen, guess what ads are there? 
It's really remarkable to pay attention to this, because my own entrepreneurial activities have often been related to predictive marketing and the paradox that however much we dislike SPAM and direct marketing advertising in general, it is really because it is all noise, little signal. Google's mobile Orion telescope ad is not noise. It is indeed directly focused on what I'm interested in buying. It isn't lingerie (hmmm, buying that might be fun too:-) or machine tools, or video cameras -- although I'll bet I could stimulate these to appear instead with the right bit of browsing. It is the most expensive (highest margin) thing I'm actually looking at and for. I can't tell if Google is the second coming, arriving at last to kick the butt of the Microsoft antichrist and usher in the millenium, or if they are the antichrist who is simply preparing to eliminate all of the lesser devils and bring about the apocalypse. The scary thing is that Google is a significant portion of my brain -- with its new type-and-it-appears answering system, all that's missing is a neural interface and the ability to back up my memories to a remote silo and I might not even notice my own dying. I cannot imagine living and working without it, but it is starting to remind me of some very scary science fiction novels as what could possibly provide a better opportunity for mind control than an interface that is effectively part of your mind? So what can one do? Google is offering up Chrome-crack with the lavish and unspoken promise -- that I have no doubt that they will keep -- that it will be the last operating environment you ever, um, don't actually buy, that inside a year or two we'll see Chromeputers that may well run linux underneath -- but no one will know or care. That through its magic window you will be able to get to all of your music and movies and personal or professional data (efficiently and remotely stored, backed up and sort-of-secure). That within it PDFs will "just display", movies and music will "just play", email will move, news will be read, documents will be word processed, games will be played, and if you borrow a friend's computer for a day or use your phone or your pad, everything will be right there with nothing more than the inconvenience or convenience of the particular hardware interface to surmount or exploit. It won't end there. Who can provide remote computing resources even for actual computations cheaper than Google? For them, adding a server costs what, five FTE person minutes plus the cost of the cheapest possible hardware itself -- assembly line server prep plus somebody plugging it in? Who can provide server management at their ratio of humans to servers? Who can fund/subsidize most of the power and management cost for your tiny slice of this resource for the right to insert subtle little advertising messages into your brain that are NOT noise, they are indeed things you are likely to buy and hence pure gold for the advertiser? Microsoft is only now starting to realize that Windows 7 might well be the last Windows ever released and is scrabbling to cut a too-little, too-late deal with Yahoo and/or Adobe to try to transform themselves into something they only dimly perceive and understand and cannot now duplicate in time. One thing that has often been discussed on this list is marketing the supercomputer center. People have proposed setting up a big supercomputer center and renting it out to companies or universities that need this sort of resource. 
In general, the few times this has been tried it has failed, for all sorts of good reasons. As Bill noted, it is difficult enough to set up a center WITHIN a closed environment with captive users and real cash flow -- even though beowulfish clusters are quite scalable, only rarely do they achieve the 1000 node/systems person scaling limit (and then there is the infrastructure cost, depreciation and maintenance and replacement and programming support and the fact that a general purpose center achieves generality at the expense of across-the-board price-performance compromise). Google, OTOH, could do it. In fact, they could do it almost as an afterthought, as a side effect. Inside a decade, I can see Google quite literally owning the data Universe, dwarfing Microsoft and Apple combined and making both of them pretty much irrelevant if not bankrupt. And not just in the United States -- worldwide. Few things in computing have actually scared me. Microsoft is pretty scary, but it is the scariness of a clown -- its monopoly was never really stable once Linux was invented and I think it may have peaked and at long last be on the long road do oblivion. Apple isn't scary -- it is genuinely innovative for which I salute them, but its innovations provide at best a transient advantage and its vision has been too local to take over the world. Even Linux with its avowed goal of world domination hasn't been scary, because ultimately linux belongs to the world and as long as the computers being run on also belong to the world, control remains where it belongs, with the people of the world. Google scares me. It has quietly ACHIEVED world domination, and is about to transform the world in a way that will be shocking, amazing, dangerous, liberating, captivating -- and supremely beyond the control of anybody but the people running Google. Be afraid, be very afraid. Happy Halloween! rgb P.S. -- C'mon, haven't y'all missed my 10K essays? Admit it...;-) Alas, now it is off to grade papers and disappear again. Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From mdidomenico4 at gmail.com Fri Oct 22 05:51:20 2010 From: mdidomenico4 at gmail.com (Michael Di Domenico) Date: Fri, 22 Oct 2010 08:51:20 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC11EA8.8030602@gmail.com> References: <4CC11EA8.8030602@gmail.com> Message-ID: Is there something Rocks is not providing that you need? I haven't used Rocks in a few years, but from what I can recall it's a decent solution for someone just starting out. There certainly are other solutions on the market. I think it comes down to which flavor of HPC software you feel most comfortable administering, not necessarily what other people "think" you should be running. On Fri, Oct 22, 2010 at 1:18 AM, Richard Chang wrote: > Hello List, > My University is going for a new HPC System. I was using Rocks + CentOS until now but someone suggested to use Redhat HPC Solution with the new system. > > I am not able to find good documentation to setup and use Redhat HPC. It seems, Redhat uses Platform Computing's Platform Cluster Manager re-branded with their(Redhat's) logo, though I may be wrong. ?For that matter, does anyone use Platform Cluster Manager also?. > > Thanks, > Richard. 
> _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From rchang.lists at gmail.com Fri Oct 22 09:23:00 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Fri, 22 Oct 2010 21:53:00 +0530 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: References: <4CC11EA8.8030602@gmail.com> Message-ID: <4CC1BA64.5050200@gmail.com> On 10/22/2010 6:21 PM, Michael Di Domenico wrote: > Is there something Rocks is not providing that you need? I haven't > used Rocks in a few years, but from what I can recall it's a decent > solution for someone just starting out. There certainly are other > solutions on the market. I think it comes down to which flavor of HPC > software you feel most comfortable administering, not necessarily what > other people "think" you should be running. > Thanks Michael, I am satisfied with Rocks, but my group people wanted something well supported that means, RHEL instead of CentOS and "some other PAID" cluster manager instead of Rocks, especially, when we have the budget. Somehow, I have not been able to convince my folks that Rocks is a much better solution than the PAID ones. Thanks for your suggestion anyway! regards, Richard. From john.hearns at mclaren.com Fri Oct 22 10:22:49 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 22 Oct 2010 18:22:49 +0100 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC1BA64.5050200@gmail.com> References: <4CC11EA8.8030602@gmail.com> <4CC1BA64.5050200@gmail.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B12154977@milexchmb1.mil.tagmclarengroup.com> > > > > Thanks Michael, > > I am satisfied with Rocks, but my group people wanted something well > supported that means, RHEL instead of CentOS I don't think that follows - other people should comment here. and "some other PAID" > cluster manager instead of Rocks, especially, when we have the budget. > > Somehow, I have not been able to convince my folks that Rocks is a much > better solution than the PAID ones. There are plenty of PAID cluster management stacks out there - as I say go talk to some vendors. Maybe take a trip to Supercomputing? Also give yourself time to think about SuSE and SLES (or SLES desktop) on the nodes. Redhat is not the only game in town. (Hint - I've put together many Suse clusters for two cluster integrators) The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From rgb at phy.duke.edu Fri Oct 22 13:12:51 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 22 Oct 2010 16:12:51 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <011801cb720f$96b08da0$c411a8e0$@comcast.net> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> Message-ID: On Fri, 22 Oct 2010, William Harman wrote: > Rgb makes many good points (and should change his profession to that of a > futurist (compliment intended)) but one thing I believe needs to be put in > place, for the omnipresence of this type of technological world - in a word > - power. Whatever device you use and wherever you use it, you need a source > of power.
Devices or appliances that have power for a few hours or days > will not cut it. I still prefer good old hard cover books to ebooks, which > I can read after the evening meal and outside with some fresh air, no > extension cord needed to keep my notebook juiced up. Now if I had a cold > fusion battery pack that lasted for years, (or extracted power from the > ether) I could take my notebook, netbook or any other device and go and live > happily ever after :-) You can read a kindle for maybe two or three weeks on a charge. e-ink consumes no power except when pages change. I don't know why they don't sell it with a solar cell covering the back. It would run forever if it recharged and stayed charged every time you put it down with the cell facing the light. A kindle holds well over 1000 books. Try jamming those into your backback when you go camping or on vacation. And this is passe -- if they'd built the kindle with a SD slot, they'd hold an infinite number of books -- a rather large book, formatted, with some pictures, is a MB. Who can read 64,000 to 100,000 books (what a Kindle with a 64 GB static memory would hold)? Finally, you can fill your Kindle for free -- most of the greatest works of human literature are out of copyright and available for free at project Gutenberg and elsewhere. If there is one more motivation required, if you do manage to read the last book stored, you can power up its wireless and suddenly it is a bookstore, and in two minutes you can be reading the latest bestseller, usually for less money than it would cost in paper. I've bought books on long bus rides, right there from my seat on the bus in motion. The Sony and Nook have basically the same advantages. Ipads I agree don't have the longevity, but there are lots of people working on ultra-low power, fast, color displays to compete with or replace relatively slow E-ink. I love books. I have a personal library with well over 1000 novels (it fills four or five full size bookshelves, most of the shelves stacked two deep with paperbacks and with stacks left out all over the floor in one of the rooms of my house. But books are deader than a doorknob. I >>wish<< I could put them all on a single device and have my library with me, the same way that my entire music collection is sitting next to my right elbow at this moment, playing Live Dead at the Fillmore, instead of being on perishable media that deteriorates over time, is easy to break or lose, and that you have to repurchase every time somebody fiddles the distribution/playback mechanism. If I could feed them into a scanner that would translate every one into latex and thence into formatted, readable pdf or epub, one page at a time, I'd be in there tearing them to pieces to feed the maw. Except for old hardbacks and books of particular value. Maybe. rgb > > - cheers > > Bill Harman, > P - (801) 572-9252; F - (801) 571-4927 > billharman at comcast.net > billharman1027 at gmail.com > skype name: harman8015729252 > skype phone: +1 (801) 938-4764 > > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On > Behalf Of Robert G. Brown > Sent: Friday, October 22, 2010 6:39 AM > To: Beowulf Mailing List > Subject: RE: [Beowulf] how Google warps your brain > > On Thu, 21 Oct 2010, Mark Hahn wrote: > >> I find myself using my desktop more and more as a terminal - I hardly >> ever run anything but xterm and google chrome. as such, I don't mind that > >> it's a terrible old recycled xeon from a 2003 project. 
> [...] > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C.
27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From jlb17 at duke.edu Fri Oct 22 15:26:11 2010 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Fri, 22 Oct 2010 18:26:11 -0400 (EDT) Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Fri, 22 Oct 2010 at 9:30am, Hearns, John wrote >> Hello List, >> My University is going for a new HPC System. I was using Rocks + > CentOS >> until now but someone suggested to use Redhat HPC Solution with the > new >> system. > > Please, please don't "roll your own" system (pun intended). > There are lots of companies out there who will provide you with a high > quality, supported > Beowulf cluster. Heck - some of them are respected controbutors to this > list! IMO, in the right set of circumstances there's absolutely nothing wrong with rolling your own cluster. For example, say you aleady have the expertise in house or are budgeting a FTE into the cluster costs (not to mention "free" undergrad or grad student labor). In those cases, why pay extra for integration you can do yourself? There's certainly a market for cluster vendors, but there's also a place for DIY. -- Joshua "of course, I could be biased" Baker-LePain QB3 Shared Cluster Sysadmin UCSF From samuel at unimelb.edu.au Fri Oct 22 16:04:39 2010 From: samuel at unimelb.edu.au (Chris Samuel) Date: Sat, 23 Oct 2010 10:04:39 +1100 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC1BA64.5050200@gmail.com> References: <4CC11EA8.8030602@gmail.com> <4CC1BA64.5050200@gmail.com> Message-ID: <201010231004.41287.samuel@unimelb.edu.au> On Sat, 23 Oct 2010 03:23:00 am Richard Chang wrote: > I am satisfied with Rocks, but my group people wanted > something well supported that means, RHEL instead of > CentOS and "some other PAID" cluster manager instead > of Rocks, especially, when we have the budget. I believe you can pay IBM and others for support for xCAT. -- Christopher Samuel Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ From rchang.lists at gmail.com Fri Oct 22 21:11:40 2010 From: rchang.lists at gmail.com (Richard Chang) Date: Sat, 23 Oct 2010 09:41:40 +0530 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: References: <4CC11EA8.8030602@gmail.com> Message-ID: <4CC2607C.40605@gmail.com> On 10/22/2010 11:26 PM, Alex Chekholko wrote: > The RH HPC mailing list suggests this project is inactive: > https://www.redhat.com/archives/rhel-hpc-list/ > Thanks Alex, I didn't check that. I never knew that an inactive mailing list means an in-active project. 
> You can get an "evaluation" version of the product, I think, have you > tried that? I'm not sure it provides you with any better > functionality/support than ROCKS. I have downloaded the evaluation version, but I don't have free machines or enough memory in my system to check it in a virtual machine. I was planning to check with the Beowulf mailing list to see if anyone else is using it. From what I see, no one has used or is using such a solution. regards, Richard. From daniel.kidger at bull.co.uk Mon Oct 25 02:54:19 2010 From: daniel.kidger at bull.co.uk (Daniel Kidger) Date: Mon, 25 Oct 2010 10:54:19 +0100 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> Message-ID: <4CC553CB.8070606@bull.co.uk> An HTML attachment was scrubbed... URL: From john.hearns at mclaren.com Mon Oct 25 04:24:28 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Mon, 25 Oct 2010 12:24:28 +0100 Subject: [Beowulf] how Google warps your brain In-Reply-To: <4CC553CB.8070606@bull.co.uk> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk> Message-ID: <68A57CCFD4005646957BD2D18E60667B12154B9E@milexchmb1.mil.tagmclarengroup.com> Ok - so this is a bit off-topic but in my opinion the *only* music format that will be guaranteed readable in say 100 years time is vinyl and the only document format that endures will be ink on paper. SD cards, CDs, DVDs et al. will all become obsolete as technology progresses, and even if they didn't then they will suffer from bit rot. Academics are already finding that the CDs they burnt of their research a few years ago are no longer readable. The same goes for a lot of digital data. We might still have the medium (tapes, optical disks etc) but the physical drives are long gone, or the computers with the OSes to hook up to the drives are gone. I have a copy of my thesis on a DEC TK50 cartridge. I haven't seen one for ages. They are still around - I spotted a MicroVAX at a major UK aerospace manufacturer a couple of years ago, but in a few years TK50 drives will be history. Same goes for the medical imaging data I used to look after on real glass magneto-optical drives - I'd guess they are all in the bin by now! The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgb at phy.duke.edu Mon Oct 25 07:53:58 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 25 Oct 2010 10:53:58 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <4CC553CB.8070606@bull.co.uk> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk> Message-ID: On Mon, 25 Oct 2010, Daniel Kidger wrote: > Ok - so this is a bit off-topic but in my opinion the *only* music format > that will be guaranteed readable in say 100 years time is vinyl and the only > document format that endures will be ink on paper. > > SD cards, CDs, DVDs et al. will all become obsolete as technology > progresses, and even if they didn't then they will suffer from bit rot. > Academics are already finding that the CDs they burnt of their research a > few years ago are no longer readable.
> > Also electronic copies of old books do not carry the depth of information > that the original had. Not just that the formatting gets changed but you > also lose the smell of an old book, the yellowing of the pages, odd pencil > notes in the margins (*) that give that work its character and depth. > > The only alternative for longevity is to post our writings on the Internet - > such posts will last until the end of our civilization (**) Actually, I think this (interesting) hypothesis is subject to information theoretic analysis. The probability of reliable transmission of a given message is related to a mix of the lifetime of the physical copy plus how often it is replicated and translated. You're arguing that vinyl records have a 100 year lifetime (although I think that this has yet to be proven and seems dubious, certain not without information degradation). Books do have a long lifetime, but the mean lifetime of most books is actually quite short. I'm remarkable in that I've got a library with lots of books that are 40-110 years old, where "lots" means "way, way fewer than I have new books". And one gets to where one doesn't dare to actually read (say) my copy of The Count of Monte Cristo from the 1800's as the pages are so yellowed and brittle, or my first edition Tarzan of the Apes. They don't suffer "bit errors" per se -- they lose whole pages of text, corners, and more. Print fades, dirt corrupts, insects eat. There are countless books that have simply vanished forever. One run of a few thousand copies, and nobody kept any. Gone. This is especially evident in the preservation of manuscript copies from the last 2500 or so years. Things that increase the probable life of information are: a) Multiple copies. Passenger pigeons may be robust, but once the number of copies drops below a critical point, they are gone. E. Coli we will always have with us (possibly in a constantly changing form) because there are so very many copies, so very widely spread. b) Robust encoding. Again, manuscript and typeset books are typically not error free, and suffer from all of the usual transmission errors (including the "transmission" from the past to the future) studied by Shannon on the appropriate basis of the media in question. There are all sorts of methods designed to reduce transmission errors (one of the best of which is multiple copies on top of any others methods used). Lossless encoding is obviously preferrable to lossy encoding -- in music with things like mp3 and ogg this is an issue, but... c) Open standards for encoding mechanisms minimize the likelihood of losing the rosetta stone that allows even lossy formats to be decoded, and hence useful. If you like, one has to also preserve the encoding scheme along with the encoded information. At the moment, the internet has if anything VASTLY INCREASED a, b and c for every single document in the public domain that has been ported to, e.g. Project Gutenberg. Right now, I'm sitting on a cache of "Saint" books, by Leslie Charteris (who was a great favorite of mine growing up and still is). Try to find a copy. I'm sure my copies aren't the ONLY copies extant of any of these books, but fewer and fewer copies can be found, and only with great effort and at great price. There were never that many hard covers, and the paperbacks just went through a few printings (and are now around 40-50 years old and weren't designed to last). Nobody is going to reprint the Saint stories. 
They are a gay fantasy
from another time, a swashbuckling series with a delightful conceit and
innocent heart.  The only way they will ever be preserved for posterity
is if they would come out of copyright so people like me could
throw them out there into the Internet.  Then indeed, as you say, they
might well last to the end of civilization.  Replicate them a few
million times, PERPETUATE them from generation to generation by renewing
the copies, and backing them up, and recopying them in formats where
they are still useful.

If you rely on the few hundred copies of some of these books that likely
still exist in the world on paper, well -- don't.  It was cheap paper.

> (*) remember Fermat's margin comment in his copy of Diophantus's Arithmetica
> - would he have written that if he had a Kimble?

Or, to put it differently, suppose every single human on the planet had
access to the modern equivalent of Diophantus's Arithmetica on their
computer, their Kindle, their Ipad -- as in fact they do -- how many more
Fermats might we have had over the centuries?  And the ability to scrawl
marginal notes is mere software, and not that far away.

   rgb

> (**) which has a faint chance of lasting those 100 years.
>
> Daniel
>
> --
> Bull, Architect of an Open World TM
>
> Dr. Daniel Kidger, HPC Technical Consultant
> daniel.kidger at bull.co.uk
> +44 (0) 7966822177
>

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu

From peter.st.john at gmail.com Mon Oct 25 08:24:09 2010
From: peter.st.john at gmail.com (Peter St. John)
Date: Mon, 25 Oct 2010 11:24:09 -0400
Subject: [Beowulf] how Google warps your brain
In-Reply-To: <4CC553CB.8070606@bull.co.uk>
References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk>
Message-ID:

Just want to add that some thousands of years ago, we had the same issue
moving from stone+chisel to paper+ink. Ink fades, paper mildews and worse,
paper is flammable. The many burnings of the library at Alexandria (and
practically every other ancient major library) could be seen as proof that
we should have stuck with stone. But paper is vastly more transmissible
(as RGB emphasizes), so we have vastly more authentic ancient texts
(copied) on paper than we do from monumental inscriptions.

It may be awkward and not cheap to read a disk with a defunct format, but
we certainly can if we want to. It's not lost technology like the recipe
for purple dye :-) it's just not worth mass producing cheap readers for
our 360K floppies. Everything worthwhile from that time has already been
uploaded to the net somewhere. But you can find a lab with a magnet that
can read anything in your basement. And in another generation we'll be
able to download a 360K floppy controller onto our desktop rapid
prototyper.

Peter

On Mon, Oct 25, 2010 at 5:54 AM, Daniel Kidger wrote:

>
> I love books. I have a personal library with well over 1000 novels (it
> fills four or five full size bookshelves, most of the shelves stacked
> two deep with paperbacks and with stacks left out all over the floor in
> one of the rooms of my house). But books are deader than a doorknob.
> I wish I could put them all on a single device and have my library > with me, the same way that my entire music collection is sitting next to > my right elbow at this moment, playing Live Dead at the Fillmore, > instead of being on perishable media that deteriorates over time, is > easy to break or lose, and that you have to repurchase every time > somebody fiddles the distribution/playback mechanism. > > > Ok - so this is a bit off-topic but in my opinion the *only* music format > that will be guaranteed readable in say 100 years time is vinyl and the only > document format that endures will be ink on paper. > > SD cards, CDs, DVDs et al. will all become obsolete as technology > progresses, and even if they didn't then they will suffer from bit rot. > Academics are already finding that the CDs they burnt of their research a > few years ago are no longer readable. > > Also electronic copies of old books do not carry the depth of information > that the original had. Not just that the formatting gets changed but you > also lose the smell of an old book, the yellowing of the pages, odd pencil > notes in the margins (*) that give that work its character and depth. > > > The only alternative for longevity is to post our writings on the Internet > - such posts will last until the end of our civilization (**) > > > (*) remember Fermat's margin comment in his copy of Diophantus's *Arithmetica > -* would he have written that if he had a Kimble? > > (**) which has a faint chance of lasting those 100 years. > > Daniel > > -- > Bull, Architect of an Open World TM > > Dr. Daniel Kidger, HPC Technical Consultantdaniel.kidger at bull.co.uk > +44 (0) 7966822177 > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.hearns at mclaren.com Mon Oct 25 08:26:30 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Mon, 25 Oct 2010 16:26:30 +0100 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <011801cb720f$96b08da0$c411a8e0$@comcast.net><4CC553CB.8070606@bull.co.uk> Message-ID: <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> As usual, a highly insightful post from RGB. > a) Multiple copies. Passenger pigeons may be robust, but once the number of copies drops below a critical point, they are gone. E. Coli we will always have > with us (possibly in a constantly changing form) because there are so very many copies, so very widely spread. I probably shouldn't mention Wikileaks here... > > At the moment, the internet has if anything VASTLY INCREASED a, b and c > for every single document in the public domain that has been ported to, > e.g. Project Gutenberg. > > Right now, I'm sitting on a cache of "Saint" books, by Leslie Charteris > (who was a great favorite of mine growing up and still is). > > Nobody is going to reprint the Saint stories. They are a gay fantasy > from another time, Simon Templar? Gay? Cough. Next you will be telling me that there are gay undertones in Top Gun, the film with the sexiest astrophysicist ever. > might well last to the end of civilization. Replicate them a few > million times, PERPETUATE them from generation to generation by > renewing > the copies, and backing them up, and recopying them in formats where > they are still useful. 
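As a toy illustration of why the number of copies matters -- and it is
only a sketch I am inventing here, not anything a real archive or cloud
store does verbatim (they use checksums and erasure codes rather than
naive triplication) -- here are a few lines of C that repair a corrupted
replica by byte-wise majority vote across three copies. The file name and
the sample strings are made up purely for the example:

/* majority_vote.c -- toy illustration only: repair a corrupted replica
 * by byte-wise majority vote across three copies of the same data.
 * Real archives use checksums and erasure coding, not this, but the
 * principle -- more independent copies, fewer unrecoverable errors --
 * is the same. */
#include <stdio.h>
#include <string.h>

/* Return the byte at least two of the three copies agree on, or the
 * byte from copy 1 if all three disagree (an unrecoverable position). */
static unsigned char vote(unsigned char a, unsigned char b, unsigned char c)
{
    if (a == b || a == c) return a;
    if (b == c) return b;
    return a;
}

int main(void)
{
    /* three "replicas" of the same record; replicas 2 and 3 each carry
     * one independent single-byte error (the X characters) */
    unsigned char r1[] = "The quick brown fox jumps over.";
    unsigned char r2[] = "The quick brXwn fox jumps over.";
    unsigned char r3[] = "The quick brown fox jumps Xver.";
    unsigned char fixed[sizeof r1];
    size_t i;

    for (i = 0; i < sizeof r1; i++)
        fixed[i] = vote(r1[i], r2[i], r3[i]);

    printf("repaired: %s\n", (char *) fixed);
    return memcmp(fixed, r1, sizeof r1) ? 1 : 0;   /* 0 == fully repaired */
}

With two copies you can only detect a disagreement; with a third you can
usually out-vote it, and the chance of an unrecoverable position falls
quickly as further independent copies are added.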
The cloud backup providers will be keeping copies of data on geographically spread sites. However, we should at this stage be asking what are the mechanisms for cloud storage companies for *) living wills - what happens when the company goes bust *) what are the strategies for migrating the data onto new storage formats > > Or, to put it differently, suppose every single human on the planet had > access to the modern equivalent of Diophantus's Arithmetica on their > computer, their Kindle, their Ipad I believe that was the original intent for the Web. Still under development! The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From kilian.cavalotti.work at gmail.com Mon Oct 25 08:56:20 2010 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Mon, 25 Oct 2010 17:56:20 +0200 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk> Message-ID: On Mon, Oct 25, 2010 at 4:53 PM, Robert G. Brown wrote: > the mean lifetime of most books With all due respect (and a lot is due), using HTML tags to mark emphasis using a console-only email client, *this* /is/ quite _twisted_. :) > ?c) Open standards for encoding mechanisms minimize the likelihood of > losing the rosetta stone that allows even lossy formats to be decoded, > and hence useful. ?If you like, one has to also preserve the encoding > scheme along with the encoded information. I would add that being able to easily reconstruct a (physical or logical) codec system is a mandatory requirement, but that being able to decode content with no other physical device than the mere support is a big plus. That's precisely the huge advantage of printed books over any electronical support you can imagine: you don't need anything but your eyes and candle light to extract content from them. It's true that languages and grammar evolve, and that content from printed books can also be lost, if the language they're printed in disappears. But that's the case for any kind of text, whatever the support is. About physical supports: optical media gets scratched, magnetic media gets demagnetized, electronic media gets obsoleted, paper media degrades, stone engraved media takes a lot of room on your shelves. There's no such thing as a universally good and eternal support. Ever-going duplication is probably the only way to preserve content on the long run. Cheers, -- Kilian From james.p.lux at jpl.nasa.gov Mon Oct 25 09:03:08 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Mon, 25 Oct 2010 09:03:08 -0700 Subject: [Beowulf] how Google warps your brain In-Reply-To: Message-ID: On 10/25/10 7:53 AM, "Robert G. Brown" wrote: > the encoded information. > Nobody is going to reprint the Saint stories. They are a gay fantasy > from another time, a swashbuckling series with a delightful conceit and > innnocent heart. The only way they will ever be preserved for posterity > is if they would come out of copyright so people like me could > throw them out there into the Internet. A bona-fide library can make single copies of a "hard to get" work, even in copyright. However, they can't "throw it out on the internet". 
And, of course, you could scan the book for your own amusement, and make arrangements that the original book (should it survive) and your scans are passed on to a single person, which I think would be legal under the first sale doctrine. And, then, assuming that sometime in the future, there isn't a repeat of the "preserve the Disney copyright forever" act, the book *would* fall out of copyright. It's interesting: I just got an iPad a few weeks ago, mostly as a reader/web-browser device, and I've been reading a variety of out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain. Thank you Gutenberg Project! And, since I am sitting/lying here with a very sore back from moving boxes of books around this weekend looking for that book that I *know* is in there somewhere, the prospect of some magic box that would scan all my books into a format usable into eternity would be quite nice. I might even think that a personal "print on demand" would be nice that could generate a cheap/quick copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but there's affordances provided by the paper edition that is nice.. But I don't need hardcover or, even, any cover..) (or, even better, a service that has scanned all the books for me, e.g. Google, and that upon receiving some proof of ownership of the physical book, lets me have an electronic copy of the same... I'd gladly pay some nominal fee for such a thing, providing it wasn't for some horrible locked, time limited format which depends on the original vendor being in business 20 years from now. I also recognize the concern about how "once in digital form, copying becomes very cheap" which I think is valid.) From kilian.cavalotti.work at gmail.com Mon Oct 25 09:13:10 2010 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Mon, 25 Oct 2010 18:13:10 +0200 Subject: [Beowulf] how Google warps your brain In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk> <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Mon, Oct 25, 2010 at 5:26 PM, Hearns, John wrote: > Next you will be telling me that there are gay undertones in Top Gun, > the film with the sexiest astrophysicist ever. I beg to differ: the sexiest physicist (albeit not the astro- type, rather the nuclear physics one) is known to "only come once a year". [http://en.wikipedia.org/wiki/Christmas_Jones] Cheers, -- Kilian From eugen at leitl.org Mon Oct 25 09:25:10 2010 From: eugen at leitl.org (Eugen Leitl) Date: Mon, 25 Oct 2010 18:25:10 +0200 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: Message-ID: <20101025162510.GI28998@leitl.org> On Mon, Oct 25, 2010 at 09:03:08AM -0700, Lux, Jim (337C) wrote: > And, since I am sitting/lying here with a very sore back from moving boxes > of books around this weekend looking for that book that I *know* is in there > somewhere, the prospect of some magic box that would scan all my books into > a format usable into eternity would be quite nice. I might even think that > a personal "print on demand" would be nice that could generate a cheap/quick > copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but > there's affordances provided by the paper edition that is nice.. But I don't > need hardcover or, even, any cover..) 
I've heard of a rumor, that while certainly free-books.dontexist.c?m there's something out there, called Library Genesis. That mythical place purportedly has some 320 k volumes, the deluge coming pouring down with a torrent of rain, and only takes a native Southern Athabascan, a set of training wheels http://blogs.loveandnature.co.za/www/perl/php.jpg and a never-forgetting elephant, in the (about 4 tebibyte-sized) room to call your own. That's all pretty strange, I know. > (or, even better, a service that has scanned all the books for me, e.g. > Google, and that upon receiving some proof of ownership of the physical > book, lets me have an electronic copy of the same... I'd gladly pay some > nominal fee for such a thing, providing it wasn't for some horrible locked, > time limited format which depends on the original vendor being in business > 20 years from now. I also recognize the concern about how "once in digital > form, copying becomes very cheap" which I think is valid.) -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From rgb at phy.duke.edu Mon Oct 25 14:04:36 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 25 Oct 2010 17:04:36 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net><4CC553CB.8070606@bull.co.uk> <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Mon, 25 Oct 2010, Hearns, John wrote: > > As usual, a highly insightful post from RGB. Aw, again. > Simon Templar? Gay? Cough. > > Next you will be telling me that there are gay undertones in Top Gun, > the film with the sexiest astrophysicist ever. What was it, one of the South Park episodes, the old geezer going on about how GAY all of the parties were in his youth, how GAY he and his friends were, what a GAY TIME Christmas was for all concerned...;-) > I believe that was the original intent for the Web. Still under > development! It's why I give money to wikipedia a couple of times a year, and a bit to project Gutenberg from time to time as well. Money where mouth is, a necessity. How are we going to lift the world out of ignorance and into enlightenment if information is still not free? But the counters of beans and protectors of profit have decreed now that every created work of the intellect is a property now, something that can be bought and sold in damn near perpetuity, long, long after the creator is dead. It won't work. There aren't enough fingers in the world to plug the leaking dyke. Oops, sorry. I mean "dike". ;-) Really(not)-Gay-Brown (but feeling a certain amount of solidarity today with my alternatively gendered brethren and sistren...:-) Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From rgb at phy.duke.edu Mon Oct 25 14:08:13 2010 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Mon, 25 Oct 2010 17:08:13 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: Message-ID: On Mon, 25 Oct 2010, Lux, Jim (337C) wrote: > It's interesting: I just got an iPad a few weeks ago, mostly as a > reader/web-browser device, and I've been reading a variety of > out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain. Thank > you Gutenberg Project! It is awesome, isn't it? > And, since I am sitting/lying here with a very sore back from moving boxes > of books around this weekend looking for that book that I *know* is in there > somewhere, the prospect of some magic box that would scan all my books into > a format usable into eternity would be quite nice. I might even think that > a personal "print on demand" would be nice that could generate a cheap/quick > copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but > there's affordances provided by the paper edition that is nice.. But I don't > need hardcover or, even, any cover..) > > (or, even better, a service that has scanned all the books for me, e.g. > Google, and that upon receiving some proof of ownership of the physical > book, lets me have an electronic copy of the same... I'd gladly pay some > nominal fee for such a thing, providing it wasn't for some horrible locked, > time limited format which depends on the original vendor being in business > 20 years from now. I also recognize the concern about how "once in digital > form, copying becomes very cheap" which I think is valid.) What a killer idea. Acceptable use, doggone it! I'd ship them books by the boxful in exchange for a movable (even DRM controlled) image, a la Ipod music. I just don't want to rebuy them, like I've now bought most of my music collection TWICE (vinyl and CD). rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From rgb at phy.duke.edu Mon Oct 25 14:15:17 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Mon, 25 Oct 2010 17:15:17 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <011801cb720f$96b08da0$c411a8e0$@comcast.net> <4CC553CB.8070606@bull.co.uk> Message-ID: On Mon, 25 Oct 2010, Kilian CAVALOTTI wrote: > On Mon, Oct 25, 2010 at 4:53 PM, Robert G. Brown wrote: >> the mean lifetime of most books > > With all due respect (and a lot is due), using HTML tags to mark > emphasis using a console-only email client, *this* /is/ quite > _twisted_. :) Perhaps you'd prefer \emph{\latex}? Look, my fingers know only two kinds of markup at the point where the bot can touchtype them. And I'm trying to {\em retrain} the bot because {\em \bf certain kinds} of latexisms are so, like, yesterday and I'm getting in trouble with pandoc as it absolutely \textbf{chokes} on them. > I would add that being able to easily reconstruct a (physical or > logical) codec system is a mandatory requirement, but that being able > to decode content with no other physical device than the mere support > is a big plus. That's precisely the huge advantage of printed books > over any electronical support you can imagine: you don't need anything > but your eyes and candle light to extract content from them. Yes, if civilization ever collapses, we'll really miss AM radio and books and we might have to play actual music ourselves on analog devices like "trumpets" and "drums". 
> About physical supports: optical media gets scratched, magnetic media > gets demagnetized, electronic media gets obsoleted, paper media > degrades, stone engraved media takes a lot of room on your shelves. > There's no such thing as a universally good and eternal support. > Ever-going duplication is probably the only way to preserve content on > the long run. That seems to be the principle adopted by biological evolution, at any rate. Works great until you break the chain of transmission, and the accumulated bit errors can work to your advantage or against it, if you implement them in a suitable selective environment.... That would be a very interesting project, actually. What would a cross-breeding of Shakespeare and Rebel without a Cause look like? Oh, wait, that would be West Side Story. rgb > > Cheers, > -- > Kilian > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From jmdavis1 at vcu.edu Mon Oct 25 14:27:11 2010 From: jmdavis1 at vcu.edu (Mike Davis) Date: Mon, 25 Oct 2010 17:27:11 -0400 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <011801cb720f$96b08da0$c411a8e0$@comcast.net><4CC553CB.8070606@bull.co.uk> <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CC5F62F.2060509@vcu.edu> Robert G. Brown wrote: > > But the counters of beans and protectors of profit have decreed now that > every created work of the intellect is a property now, something > that can be bought and sold in damn near perpetuity, long, long after > the creator is dead. > Melancholy Elephants! -- Mike Davis Technical Director (804) 828-3885 Center for High Performance Computing jmdavis1 at vcu.edu Virginia Commonwealth University "Never tell people how to do things. Tell them what to do and they will surprise you with their ingenuity." George S. Patton From ellis at runnersroll.com Mon Oct 25 16:00:10 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Mon, 25 Oct 2010 19:00:10 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CC60BFA.4080502@runnersroll.com> On 10/22/10 18:26, Joshua Baker-LePain wrote: > On Fri, 22 Oct 2010 at 9:30am, Hearns, John wrote > >>> Hello List, >>> My University is going for a new HPC System. I was using Rocks + >> CentOS >>> until now but someone suggested to use Redhat HPC Solution with the >> new >>> system. >> >> Please, please don't "roll your own" system (pun intended). >> There are lots of companies out there who will provide you with a high >> quality, supported >> Beowulf cluster. Heck - some of them are respected controbutors to this >> list! I don't think you could find a statement more orthogonal to the spirit of the Beowulf list than, "Please, please don't "roll your own" system..." Isn't Beowulfery about the drawing together of inexpensive components in an intelligent fashion suited just for your particular application while using standardized (and thereby cheap by the law of scale) hardware? 
I'm not suggesting Richard build his own NIC - but there is nothing wrong
with using even a distribution of Linux not intended for HPC (so long as
you're smart about it) and picking and choosing the software (queuing
managers, tracers, etc) he finds works best.

Also, I would argue if a company is selling you an HPC solution, it's
either:

1. A true Beowulf in terms of using COTS hardware, in which case you are
likely getting less than your money is worth, or

2. Not a true Beowulf, because it is using some custom hardware, firmware
or software (stack or applications or both) that enables it to perform a
specific task more efficiently than a simpler (COTS) solution would. In
this case you'll be getting your money's worth, but you aren't buying a
Beowulf - you're buying an HPC solution, big/medium/small iron, or some
other terminology to represent HPC-in-a-box.

There are certainly advantages (as Joshua mentions below) in getting
HPC-in-a-box.  But they aren't Beowulfs and they rarely are cheap.

> IMO, in the right set of circumstances there's absolutely nothing wrong
> with rolling your own cluster.  For example, say you already have the
> expertise in house or are budgeting a FTE into the cluster costs (not to
> mention "free" undergrad or grad student labor).  In those cases, why pay
> extra for integration you can do yourself?  There's certainly a market
> for cluster vendors, but there's also a place for DIY.

Agreed (though as a grad student, I simultaneously shiver and pat myself
on the back for such cheap labor).

I could be walking a server rack onto a limb here, but I personally don't
think the "market for cluster vendors" is (or possibly more accurately,
"should be," though it would make me sad) the Beowulf list.  Depending on
how large the HPC system is that you are considering, the costs involved
in "high quality support" are often not worth it for nodes in the
hundreds.  A reasonably skilled sysadmin can set up and handle a couple
hundred nodes without too much trouble if he's crafty enough to collect
for him or herself the right scripts and applications to make
administering it reasonably painless.  Plus, if a chunk of hardware goes
(since it's COTS) it's painless to get a new one online and put it in
yourself when you get in.

Now if your organization wants nodes in the thousands, tens of thousands,
etc. then you probably don't want a "Beowulf" anyhow.

Best,

ellis

From Bill.Rankin at sas.com Tue Oct 26 07:54:43 2010
From: Bill.Rankin at sas.com (Bill Rankin)
Date: Tue, 26 Oct 2010 14:54:43 +0000
Subject: [Beowulf] how Google warps your brain
In-Reply-To:
References:
Message-ID: <76097BB0C025054786EFAB631C4A2E3C0948F542@MERCMBX03D.na.SAS.com>

Heading completely off-topic now, but the area of digital media and
long-term archival/retrieval is something that I find very interesting.
I'll leave it to Rob to somehow eventually tie this back into a discussion
of COTS technology and HPC.

> > It's interesting: I just got an iPad a few weeks ago, mostly as a
> > reader/web-browser device, and I've been reading a variety of
> > out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain.
> > Thank you Gutenberg Project!
>
> It is awesome, isn't it?

Amazon also carries many of the out-of-copyright works in their Kindle
store for $0 (and gives credit to Gutenberg to a small extent).  It was
nice to be able to go pick up things like the Sherlock Holmes series,
Homer's Iliad and some of Einstein's works (which I don't pretend to
understand) and have them downloaded via 3G on Amazon's dime.

I will say that because of this I tend to overlook their rather high
(IMHO) price on current digital content and have probably purchased more
e-books overall as a result.

> > And, since I am sitting/lying here with a very sore back from moving boxes
> > of books around this weekend looking for that book that I *know* is in there
> > somewhere, the prospect of some magic box that would scan all my books into
> > a format usable into eternity would be quite nice.  I might even think that
> > a personal "print on demand" would be nice that could generate a cheap/quick
> > copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but
> > there's affordances provided by the paper edition that is nice.. But I don't
> > need hardcover or, even, any cover..)

There is just *something* about paper, isn't there?  And while I don't
have a library to the extent of RGB's or others, I do like having some
books around (glancing at the two bookshelves in my office).  On the other
hand, I still have boxes of books sitting around unopened since we moved
house 4-5 years ago.  I certainly need a purge, lest I end up on one of
those "hoarding" shows that seem to be popular as of late.  At some point,
I have to ask myself if I really *need* to have an old beat-up, falling
apart copy of "Voyage of the Space Beagle" laying around.

> > (or, even better, a service that has scanned all the books for me, e.g.
> > Google, and that upon receiving some proof of ownership of the physical
> > book, lets me have an electronic copy of the same... I'd gladly pay some
> > nominal fee for such a thing, providing it wasn't for some horrible locked,
> > time limited format which depends on the original vendor being in business
> > 20 years from now.  I also recognize the concern about how "once in digital
> > form, copying becomes very cheap" which I think is valid.

A scanning service would be wonderful for a lot of the books I have,
mainly those I view as reference-type material.  For current reference
material, Safari Books Online has a reasonable usage model that allows for
making hardcopy of their online content.  Now if there was only a simple
way to transcribe the same content for download to my Kindle I would be
set (something beyond the OCR+PDF approach, which is awkward and
inconsistent).

> What a killer idea.  Acceptable use, doggone it!  I'd ship them books
> by the boxful in exchange for a movable (even DRM controlled) image, a la
> Ipod music.  I just don't want to rebuy them, like I've now bought most
> of my music collection TWICE (vinyl and CD).

[let's not get started about vinyl collections - that's a whole 'nother
set of unopened boxes]

The problem is that many of the media houses are still waging an
underground war on Fair Use, despite the legal decisions handed down by
the courts.  As an example, I recently had an email exchange with one of
the customer service people at a major network.  I was trying to locate
additional interview footage from when my brother-in-law was on a certain
hour-long Sunday evening news show.  This person informed me that I did
not have their "permission" to record the over-the-air broadcast of the
show and burn it on a DVD to give to my sister, so what I was doing was
not legal.  This was news to me, since this usage model was clearly
defined as permissible by the Supreme Court many years ago in the Sony v.
Universal "Betamax Case".
While the market for online music, video and written works have forced the various publishers to acknowledge to the need to provide content in digital form, to a great extent they had to be dragged kicking and screaming into the 21st century. A lot of progress has been made but there is still a lot of resistance towards efforts to open up availability and access even further. I would like see a service where I could take bins of old books to a used book store and somehow get credits towards the purchase of e-books online. I think that could break me of my paperback hoarding habit pretty quickly. -bill From deadline at eadline.org Tue Oct 26 07:59:25 2010 From: deadline at eadline.org (Douglas Eadline) Date: Tue, 26 Oct 2010 10:59:25 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengro up.com> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net><4CC553CB.8070606@bull.co.uk> <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> Message-ID: <49886.192.168.93.213.1288105165.squirrel@mail.eadline.org> Not that there is anything wrong with that. > > As usual, a highly insightful post from RGB. > > > >> a) Multiple copies. Passenger pigeons may be robust, but once the > number of copies drops below a critical point, they are gone. E. Coli > we will always have >> with us (possibly in a constantly changing form) because there are so > very many copies, so very widely spread. > > I probably shouldn't mention Wikileaks here... > >> >> At the moment, the internet has if anything VASTLY INCREASED a, b and > c >> for every single document in the public domain that has been ported > to, >> e.g. Project Gutenberg. >> >> Right now, I'm sitting on a cache of "Saint" books, by Leslie > Charteris >> (who was a great favorite of mine growing up and still is). >> >> Nobody is going to reprint the Saint stories. They are a gay fantasy >> from another time, > > Simon Templar? Gay? Cough. > > Next you will be telling me that there are gay undertones in Top Gun, > the film with the sexiest astrophysicist ever. > > >> might well last to the end of civilization. Replicate them a few >> million times, PERPETUATE them from generation to generation by >> renewing >> the copies, and backing them up, and recopying them in formats where >> they are still useful. > > The cloud backup providers will be keeping copies of data on > geographically spread sites. > However, we should at this stage be asking what are the mechanisms for > cloud storage companies > for > *) living wills - what happens when the company goes bust > > *) what are the strategies for migrating the data onto new storage > formats > > >> >> Or, to put it differently, suppose every single human on the planet > had >> access to the modern equivalent of Diophantus's Arithmetica on their >> computer, their Kindle, their Ipad > I believe that was the original intent for the Web. Still under > development! > > > The contents of this email are confidential and for the exclusive use of > the intended recipient. If you receive this email in error you should not > copy it, retransmit it, use it or disclose its contents but should return > it to the sender immediately and delete your copy. 
> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From john.hearns at mclaren.com Tue Oct 26 01:16:47 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Tue, 26 Oct 2010 09:16:47 +0100 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC60BFA.4080502@runnersroll.com> References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> <4CC60BFA.4080502@runnersroll.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B12154E23@milexchmb1.mil.tagmclarengroup.com> > I don't think you could find a statement more orthogonal to the spirit > of the Beowulf list than, "Please, please don't "roll your own" > system..." Isn't Beowulfery about the drawing together of inexpensive > components in an intelligent fashion suited just for your particular > application while using standardized (and thereby cheap by the law of > scale) hardware? I'm not suggesting Richard build his own NIC - but > there is nothing wrong with using even a distribution of Linux not > intended for HPC (so long as you're smart about it) and picking and > choosing the software (queuing managers, tracers, etc) he finds works > best. > > Also, I would argue if a company is selling you an HPC solution, it's > either: > 1. A true Beowulf in terms of using COTS hardware, in which case you > are > likely getting less than your money is worth or Ellis, I am going to politely disagree with you - now there's a surprise! I have worked as an engineer for two HPC companies - Clustervision and Streamline. My slogan phrase on this issue is "Any fool can go down PC World and buy a bunch of PCs" By that I mean that CPU is cheap these days, but all you will get is a bunch of boxes on your loading bay. As you say, and you are right, you then have the option of installing Linux plus a cluster management stack and getting a cluster up and running. However, as regards price, I would say that actually you will be paying very, very little premium for getting a supported, tested and pre-assembled cluster from a vendor. Academic margins are razor thin - the companies are not growing fat over academic deals. They also can get special pricing from Intel/AMD if the project can be justified - probably ending up at a price per box near to what you pay at PC World. Or take (say) rack top switches. Do you want to have a situation where the company which supports your cluster has switches sitting on a shelf, so when a switch fails someone (me!) is sent out the next morning to deliver a new switch in a box, cable it in and get you running? Or do you want to deal direct with the returns department at $switch vendor, or even (shudder) take the route of using the same switches as the campus network - so you don't get to choose on the basis of performance or suitability, but just depend on the warm and fuzzies your campus IT people have. We then come to support - say you buy that heap of boxes from a Tier 1 - say it is the same company your campus IT folks have a campus wide deal with. 
You'll get the same type of support you get for general servers running Windows - and you'll deal with first line support staff on the phone every time. Me, I've been there, seen there, done it with tier 1 support like that. As a for instance, HPC workloads tend to stress the RAM in a system, and you get frequent ECC errors on a young system as it is bedding in. Try phoning support every time a light comes on, and get talked through the "have you run XXX diagnostic", it soon gets wearing. Before Tier 1 companies cry foul, of course both the above companies and all other cluster companies integrate Tier 1 servers - but that is a different scenario from getting boxes delivered through your campus agreement with $Tier1. The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From ellis at runnersroll.com Tue Oct 26 09:09:12 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Tue, 26 Oct 2010 12:09:12 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154E23@milexchmb1.mil.tagmclarengroup.com> References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> <4CC60BFA.4080502@runnersroll.com> <68A57CCFD4005646957BD2D18E60667B12154E23@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CC6FD28.1050303@runnersroll.com> On 10/26/10 04:16, Hearns, John wrote: > I have worked as an engineer for two HPC companies - Clustervision and > Streamline. > My slogan phrase on this issue is "Any fool can go down PC World and buy > a bunch of PCs" Well if you are buying PCs in bulk at retail pricing, you are a fool anyway. Plus most PC World PCs won't have ECC RAM so I wasn't really referring to those as few of us tolerate random bit flips. > However, as regards price, I would say that actually you will be paying > very, very little premium > for getting a supported, tested and pre-assembled cluster from a vendor. > Academic margins are razor thin - the companies are not growing fat over > academic deals. > They also can get special pricing from Intel/AMD if the project can be > justified - probably ending > up at a price per box near to what you pay at PC World. Again, not comparing PC World to Tier 1 bulk purchases. I'm comparing Tier 1 bulk purchases w/o an OS (so you can DIY) with specialized HPC vendor purchases where you don't have to DIY. Even then, perhaps it breaks even the first year if you get a very, very good deal from the HPC vendor. However, to get the deal you are probably contracted into four or five years of support and when considering HPC, involving more humans are the fastest way to get a really inefficient and expensive cluster. After the first year and up until the lifetime of the cluster involving human support annually will add a large cost overhead you have to account for at the beginning (and probably buy less hardware because of which). > Or take (say) rack top switches. Do you want to have a situation where > the company which supports your cluster > has switches sitting on a shelf, so when a switch fails someone (me!) is > sent out the next morning to deliver > a new switch in a box, cable it in and get you running? That's probably a hell of a lot faster than waiting on a vendor to get you a new switch through some RMA process. 
Plus you know the cabling is done right :). Optimally IMHO, in university setups physical scientists create the need for HPC. These types shouldn't (as Kilian mentions) need to inherit all of the responsibilities and overheads of cluster management to use one (or pay cluster vendors annually for support). They should simply walk over to the CS department, find system guys (who would probably drool over the potential of administering a reasonably sized cluster) and work out an agreement where the physical science types can "just use it" and the systems/CS guys administer it and can once in a while trace workloads, test new load balancing mechanisms, try different kernel settings for performance, etc. This way the physical scientists get their work done on a well supported HPC system for no extra cash and computer scientists get great, non-toy traces and workloads to further their own research. Both parties win. Now in organizations that don't have a CS department I agree that HPC vendors are the way to go. ellis From kilian.cavalotti.work at gmail.com Tue Oct 26 02:18:56 2010 From: kilian.cavalotti.work at gmail.com (Kilian CAVALOTTI) Date: Tue, 26 Oct 2010 11:18:56 +0200 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC60BFA.4080502@runnersroll.com> References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> <4CC60BFA.4080502@runnersroll.com> Message-ID: Hi, On Tue, Oct 26, 2010 at 1:00 AM, Ellis H. Wilson III wrote: > Also, I would argue if a company is selling you an HPC solution, it's > either: > 1. A true Beowulf in terms of using COTS hardware, in which case you are > likely getting less than your money is worth or Well, depends on how you value your time and the required expertise to put all those COTS and OSS pieces together to make them run smoothly and efficiently. Most scientists and HPC systems users are not professional sysadmins (which is good, they have a job to do), and the value of trained, experienced, skilled individuals who can put together a reliable and useful HPC system is sometimes overlooked (ie. undervalued). I agree with your later statement, though: > I personally don't think the "market for cluster vendors" is [...] > the Beowulf list. Cheers, -- Kilian From james.p.lux at jpl.nasa.gov Wed Oct 27 09:32:43 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Wed, 27 Oct 2010 09:32:43 -0700 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC6FD28.1050303@runnersroll.com> References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> <4CC60BFA.4080502@runnersroll.com> <68A57CCFD4005646957BD2D18E60667B12154E23@milexchmb1.mil.tagmclarengroup.com> <4CC6FD28.1050303@runnersroll.com> Message-ID: > > Optimally IMHO, in university setups physical scientists create the need > for HPC. These types shouldn't (as Kilian mentions) need to inherit all > of the responsibilities and overheads of cluster management to use one > (or pay cluster vendors annually for support). They should simply walk > over to the CS department, find system guys (who would probably drool > over the potential of administering a reasonably sized cluster) and work > out an agreement where the physical science types can "just use it" and > the systems/CS guys administer it and can once in a while trace > workloads, test new load balancing mechanisms, try different kernel > settings for performance, etc. 
> This way the physical scientists get their work done on a well supported
> HPC system for no extra cash and computer scientists get great, non-toy
> traces and workloads to further their own research.  Both parties win.
>

I don't know about this model.  This is like developing software on
prototype hardware.  The hardware guys and gals keep wanting to change the
hardware, and the software developers complain that their software keeps
breaking, or that the hardware is buggy (and it is).

The computational physics and computational biology guys get to work on
cool, nifty stuff to push their dissertation forward by using a hopefully
stable computational platform.  But I don't think the CS guys would drool
over the possibility of administering a cluster.  The CS guys get to be
sysadmin/maintenance types... not very fun for them, and not the kind of
work that would count toward their dissertation.

Now, if the two groups were doing research on new computational methods
(what's the best way to simulate X) perhaps you'd get a collaboration.

From dmitri.chubarov at gmail.com Tue Oct 19 20:51:24 2010
From: dmitri.chubarov at gmail.com (Dmitri Chubarov)
Date: Wed, 20 Oct 2010 10:51:24 +0700
Subject: [Beowulf] Looking for references for parallelization and optimization
In-Reply-To: <20101020005647.44e6e2db@vivalunalitshi.luna.local>
References: <20101020005647.44e6e2db@vivalunalitshi.luna.local>
Message-ID:

Dear Micha,

we are working on a course on the subject for Novosibirsk University.
There are several widely used books that we use as reference material for
the optimization part of the course. In particular,

* Stefan Goedecker, Adolfy Hoisie, "Performance optimization of
numerically intensive codes", SIAM, 2000.
* Kevin Wadleigh, Isom Crawford, "Software optimization for High
Performance Computing", HP Professional Books, 2000

We would like to start with more theoretical approaches, like an
introduction to dependency graph analysis and asymptotic analysis of
algorithms, and then proceed with specific optimization techniques like
the ones described in the above books.

Please compile a list from the responses you will receive from the
Beowulf community. I would definitely find such a list very helpful.

Best regards,
Dima

On Wed, Oct 20, 2010 at 5:56 AM, Micha wrote:

> A bit off topic, so sorry, but it looks like a place where people who
> learned these things at some point hang out ...
>
> I've been asked to write a course on the subject of optimizing code. As
> it's hard to translate knowledge into an actual course, I was wondering
> if anyone here has references to either books, online tutorials or
> course syllabuses on the subjects of parallelization (OpenMP, MPI, also
> Matlab's parallel computing toolbox) and optimization (SSE, caches,
> memory access patterns, etc.). It's less on the subject of this list,
> but I also need references regarding testing (unit and project), design
> and profiling.
>
> I'm trying to build a coherent syllabus, and having some reference texts
> really helps the process, and all my uni course materials are long dead.
>
> Thanks
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
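To give a concrete flavour of the memory-access-pattern material such a
course usually covers -- this is an illustrative sketch only, not an
excerpt from either of the books above, and the file name, sizes and
function names are invented -- the classic first example is loop order
over a matrix stored in row-major order. Both kernels below do exactly
the same arithmetic, but the strided version typically runs several times
slower on cached hardware:

/* loop_order.c -- illustrative sketch of how memory access patterns
 * interact with the cache.  sum_rowwise() walks the matrix in the order
 * it is laid out in memory (row-major in C); sum_colwise() strides
 * through it with a step of N doubles and misses cache far more often.
 * Build with something like:  gcc -O2 -std=c99 loop_order.c -o loop_order */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 4096                      /* 4096 x 4096 doubles = 128 MiB */

static double sum_rowwise(const double *a)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)     /* contiguous, cache-friendly */
        for (int j = 0; j < N; j++)
            s += a[i * N + j];
    return s;
}

static double sum_colwise(const double *a)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)     /* stride of N doubles per access */
        for (int i = 0; i < N; i++)
            s += a[i * N + j];
    return s;
}

int main(void)
{
    double *a = malloc((size_t) N * N * sizeof *a);
    if (!a) return 1;
    for (size_t k = 0; k < (size_t) N * N; k++)
        a[k] = 1.0;

    clock_t t0 = clock();
    double s1 = sum_rowwise(a);
    clock_t t1 = clock();
    double s2 = sum_colwise(a);
    clock_t t2 = clock();

    printf("row-wise %.2fs  col-wise %.2fs  (sums %.0f %.0f)\n",
           (double) (t1 - t0) / CLOCKS_PER_SEC,
           (double) (t2 - t1) / CLOCKS_PER_SEC, s1, s2);
    free(a);
    return 0;
}

A natural follow-on exercise is to put an OpenMP
"#pragma omp parallel for reduction(+:s)" on each outer loop and have
students measure how the two access patterns scale across cores.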
From jack at crepinc.com Thu Oct 21 08:48:12 2010
From: jack at crepinc.com (Jack Carrozzo)
Date: Thu, 21 Oct 2010 11:48:12 -0400
Subject: [Beowulf] how Google warps your brain
In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com>
References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com>
Message-ID:

To add my $0.02 to Bill's points, it becomes more difficult also when
dealing with multiple groups to decide on the type of setup and whatnot.
Where I went to school, the Math dept had a huge shared-memory SGI setup
whilst the Physics department had a standard Beowulf cluster. Both groups
used their systems rarely, and other departments had been asking for HPC
hardware also. However, after long debates by all parties, a single
infrastructure couldn't be decided upon and each independent dept just
got a little money to fix up their current systems.

-Jack

On Thu, Oct 21, 2010 at 11:19 AM, Bill Rankin wrote:

> Good points by Jim, and while I generally try and avoid "me too" posts, I
> just wanted to add my two cents.
>
> In my previous life I worked on building a central HPC cluster facility at
> Duke. The single biggest impediment to creating this resource was actually
> trying to justify its expense and put an actual number on the cost savings of
> having a centrally managed system. This was extremely difficult to do given
> the way the university tracked its infrastructure and IT costs.
>
> If a research group bought a rack or two of nodes then they were usually
> hosted in the local school/department facilities and supported by local IT
> staff. The cost of power/cooling and staff time became part of a larger
> departmental budget and effectively disappeared from the financial radar.
> They were not tracked at that level of granularity. They were effectively
> invisible.
>
> Put all those systems together into a shared facility and all of a sudden
> those costs become very visible. You can track the power and cooling costs.
> You now have salaries for dedicated IT/HPC staff. And ultimately you have
> one person having to cut some very large checks. And because of the
> university funding model and the associated politics it is extremely
> difficult, if not impossible, to actually recoup funds from the departments
> or even the research groups who would be saving money.
>
> In order to make it work, you really need the senior leadership of the
> university to commit to making central HPC infrastructure an absolute
> requirement, and sticking to that commitment when it comes budget time and
> the politics are running hot and heavy over who gets how much.
>
> Now to most of us this is a rehash of a conversation that we have had often
> before. And with clusters and HPC pretty much established as a necessity
> for any major research university, the development of central facilities
> would seem to be the obvious solution. I find it somewhat concerning that
> institutions like Harvard are apparently still dealing with this issue.
>
> -bill
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jeff.johnson at aeoncomputing.com Thu Oct 21 09:56:50 2010 From: jeff.johnson at aeoncomputing.com (Jeff Johnson) Date: Thu, 21 Oct 2010 09:56:50 -0700 Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> <4CC065FB.1060905@scalableinformatics.com> Message-ID: <4CC070D2.8040905@aeoncomputing.com> On 10/21/10 9:37 AM, Robert G. Brown wrote: > Of course (he chimes in from the bleachers where he is quietly sitting > and wishing he were drinking a beer while watching students -- also > wishing they were drinking a beer -- take a physics exam:-) this simply > puts the world back where it was so long ago before the beowulf concept > was invented... Did someone say 'beer'? I'll take one (two) please... > The real beauty of clusters (to me) has always been at least partly the > fact that you could build YOURSELF a cluster, just for your own > research, without having to have major leadership, infrastructure, > space, or other resources. We are seeing a number of university research locations where there is a top down push to consolidate HPC under a unified campus IT umbrella. To pull localized HPC resources out of departments and labs, pull the gear into a centralized and managed enterprise location and convert the researchers into resource customers of the unified HPC resource. In some cases to the point where the university administration plans to no longer fund localized expenditures on increased HVAC or room retrofitting for HPC. -- ------------------------------ Jeff Johnson Manager Aeon Computing jeff.johnson at aeoncomputing.com www.aeoncomputing.com t: 858-412-3810 x101 f: 858-412-3845 m: 619-204-9061 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117 From daniel.challen at ocsl.co.uk Fri Oct 22 01:31:42 2010 From: daniel.challen at ocsl.co.uk (Daniel Challen) Date: Fri, 22 Oct 2010 09:31:42 +0100 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: References: <4CC11EA8.8030602@gmail.com> Message-ID: <1287736302.2871.10.camel@khorium.ocsl.local> On Fri, 2010-10-22 at 07:54 +0200, Jonathan Aquilina wrote: > i have seen in repositories of other distros the red hat cluster > suite. im not sure if that is the same thing as you mentioned below. Red Hat Cluster Suite (RHCS) is a high availability cluster solution built upon OpenAIS plus other RH elements. It's an entirely different thing to HPC. Red Hat HPC Solution is based upon Lava, which I *believe* to be an OSS version of Platform's LSF resource manager. It also includes management and provisioning tools, but I couldn't tell you what those are. Personally, I've built a number of clusters running RHEL, but usually with TORQUE+Maui or Moab, plus one or two with the commercial Platform LSF or PBS Pro (depending on customers). I probably ought to at least look at it, but I've not yet touched RH's HPC offering. - Dan From trainor at presciencetrust.org Fri Oct 22 10:11:57 2010 From: trainor at presciencetrust.org (Douglas J. Trainor) Date: Fri, 22 Oct 2010 13:11:57 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC1BA64.5050200@gmail.com> References: <4CC11EA8.8030602@gmail.com> <4CC1BA64.5050200@gmail.com> Message-ID: There is also the situation when you ask for a quote from sales and you never get a quote back! (I asked one vendor for a quote for two Beowulfs, one owned by a company and one owned by a nonprofit...) 
douglas On Oct 22, 2010, at 12:23 PM, Richard Chang wrote: > On 10/22/2010 6:21 PM, Michael Di Domenico wrote: >> Is there something Rocks is not providing that you need? I haven't >> used Rocks in a few years, but from what I can recall it's a decent >> solution for someone just starting out. There certainly are other >> solutions on the market. I think it comes down to which flavor of HPC >> software you feel most comfortable administering, not necessarily what >> other people "think" you should be running. >> > > Thanks Michael, > > I am satisfied with Rocks, but my group people wanted something well supported that means, RHEL instead of CentOS and "some other PAID" cluster manager instead of Rocks, especially, when we have the budget. > > Somehow, I have not been able to convince my folks that Rocks is a much better solution than the PAID ones. > > Thanks for your suggestion anyway!!. > > regards, > Richard. From vallard at benincosa.com Fri Oct 22 09:41:51 2010 From: vallard at benincosa.com (Vallard Benincosa) Date: Fri, 22 Oct 2010 09:41:51 -0700 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC1BA64.5050200@gmail.com> References: <4CC11EA8.8030602@gmail.com> <4CC1BA64.5050200@gmail.com> Message-ID: You can buy paid support for ROCKS from clustercorp. In addition you can buy paid support for xCAT (with RedHat) through Sumavi. And of course, which ever vendor you decide to buy hardware from will have a supported offering as well. On Fri, Oct 22, 2010 at 9:23 AM, Richard Chang wrote: > On 10/22/2010 6:21 PM, Michael Di Domenico wrote: > > Is there something Rocks is not providing that you need? I haven't > > used Rocks in a few years, but from what I can recall it's a decent > > solution for someone just starting out. There certainly are other > > solutions on the market. I think it comes down to which flavor of HPC > > software you feel most comfortable administering, not necessarily what > > other people "think" you should be running. > > > > Thanks Michael, > > I am satisfied with Rocks, but my group people wanted something well > supported that means, RHEL instead of CentOS and "some other PAID" cluster > manager instead of Rocks, especially, when we have the budget. > > Somehow, I have not been able to convince my folks that Rocks is a much > better solution than the PAID ones. > > Thanks for your suggestion anyway!!. > > regards, > Richard. > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: From billharman at comcast.net Fri Oct 22 10:35:57 2010 From: billharman at comcast.net (William Harman) Date: Fri, 22 Oct 2010 11:35:57 -0600 Subject: [Beowulf] how Google warps your brain Message-ID: <011801cb720f$96b08da0$c411a8e0$@comcast.net> Rgb makes many good points (and should change his profession to that of a futurist (compliment intended)) but one thing I believe needs to be put in place, for the omnipresence of this type of technological world - in a word - power. Whatever device you use and wherever you use it, you need a source of power. Devices or appliances that have power for a few hours or days will not cut it. 
I still prefer good old hard cover books to ebooks, which I can read after the evening meal and outside with some fresh air, no extension cord needed to keep my notebook juiced up. Now if I had a cold fusion battery pack that lasted for years, (or extracted power from the ether) I could take my notebook, netbook or any other device and go and live happily ever after :-) - cheers Bill Harman, P - (801) 572-9252; F - (801) 571-4927 billharman at comcast.net billharman1027 at gmail.com skype name: harman8015729252 skype phone: +1 (801) 938-4764 -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Robert G. Brown Sent: Friday, October 22, 2010 6:39 AM To: Beowulf Mailing List Subject: RE: [Beowulf] how Google warps your brain On Thu, 21 Oct 2010, Mark Hahn wrote: > I find myself using my desktop more and more as a terminal - I hardly > ever run anything but xterm and google chrome. as such, I don't mind that > it's a terrible old recycled xeon from a 2003 project. it would seem > like a waste of money to buy something modern, (and for me to work locally) > since there are basically infinite resources 1ms away as the packet flies... Again, an ancient (well, as much as anything in computing ever is:-) paradigm. The interesting thing is that people have been engineering, designing, selling lightweight/thin computing models in the personal computer game since maybe 1983 or 1984. I bought one of the very first ones -- it was a straight up PC motherboard-in-a-box with a custom (and enormously expensive) coax-based network interface that ran back to the PC. It leeched all of the devices and some of the software off of the PC. Then there were the Sun offerings -- SLC and ELC diskless machines (or really, any Sun system you liked run diskless) on real networks, where they still booted their OS over the network as well as the software they ran, but were consistent with the "network is the computer" slogan. There was yet another burst of enthusiasm around the time of the release of java -- java was supposed to enable a new kind of thin appliance (and in fact did IIRC -- a few were sold but were a commercial failure). However, none of these models succeeded in the long run. The only thin/remote computing model that has persisted is the xterm/rsh/ssh model on top of Unix (with its many enhancements and variations, including for the most part beowulfery, which with a few exceptions relies on e.g. ssh for remote job distribution and control). I think that this has finally changed. Google in particular is intent on fundamentally changing it and >>really<< making the network (or rather, remote computing cloud) into the computer. Finally, I think the conditions are right for them to succeed where everybody else has failed. It's interesting to think about the conditions that enable this to work (and how they differ from those that faced people in the 80's, 90's, even 00's). a) Computers are now fast enough that it is possible to create a DOUBLE breakout to isolate software from both the hardware (which is what operating sytems were supposed to do) and from the operating system itself, which hasn't done so since people learned that they could make a ton of money selling operating systems and controlling the software market. 
Up until the last decade, at least some of what people wanted to do required "native" code, written for and compiled for a particular operating system and often for a particular hardware environment underneath the operating system. That is still true for a few things (notably high end games) but very little of the rest of what people do isn't accessible with interpreted or emulated pseudocode. b) Networking is no longer much of a bottleneck. As you say, things are a few ms away as the packet flies. Or, as I sit here, 16 ms ping time away from my desktop at Duke, where I'm sitting at home inside a network with a 7 Mbps pipe to the world. Slashdot has Google preparing to build a 1 Gbps broadband network for Stanford undergrads. TWC and other communications companies are furiously laying fiber to neighborhoods if not homes. It's easy to see their motivation. c) HTML, which was never REALLY intended for it, has morphed into a device independent presentation layer. Browsers, which were never at all intended for it, have morphed into a de facto user extensible psuedo-operating system, capable of downloading and running software both transient and permanent (plug-in extensions as well as e.g. straight up programs). The software for this isn't all quite hardware layer independent yet, but a lot of it is and there is a SEPARATION between the hardware sensitive part and the interface that if nothing else makes it easy to write things that will run on top of plug-ins, not the actual operating system, in an operating system independent way. d) Servers were once expensive and represented a massive investment barrier to remote computing. Only crazed, uberhacker-skilled individuals would set up servers at home, for example. Those services that were remote-offered in home environments or small offices were trivia -- click-controlled shared printer or file access. Only Unix (and ssh/rsh) provided a real remote login/execution environment, and even a Unix tyro was uberhacker compared to a Windows Joe User or an Apple Semi-Luddite User. Providing MORE resources to an unskilled user desktop than the desktop itself could provide to the user by simply spending money on local software required an enormous investment in hardware and near-genius systems engineers -- in other words, resources that only existed inside large corporations, universities, governments, and of course crazed hacker households (like many of ours:-). Google in particular engineered a truly scalable cheap superserver, patiently building the infrastructure from the metal up so that it was virtually infinitely extensible at linear cost. I can't imagine what their server-to-human ratio must be, but I'm guessing thousands to tens of thousands -- orders of magnitude better than the best of the supercomputing centers or corporate or government or household server collectives. No doubt it was expensive to get it all started, but at this point they are reaping the benefits of infinite scalability and it isn't clear to me that ANYONE is going to ever be able to touch them in the foreseeable future. Put all of these together -- oh, and let's not forget to throw in e), the advent of phones and pads and ebooks and so on that are little more than a browser on top of a minimal software stack and a network -- and things truly have changed. 
Who cares if you are running Linux or Windows or MacOS any more if you are running Google Chrome and it "contains" an integrated office suite, manages your browsing, plays your music and videos, lets you run a wide range of games, and does it all transparently identically, for free, on top of any operating system? Google, and Mozilla/Firefox in direct competition, have basically replaced the operating system with an OPERATING system, because computers are finally fast enough to make 99% of all users happy with a fully emulated/isolated translation layer, because the reliance of the environment on the network is no longer bottlenecked so that many compute-intensive tasks are executed transparently remotely (with the user not knowing or caring what is done where), because the environment is powerful enough to do anything they really care about doing including playing lots of games, because they will soon be able to do most of it on a wide range of handhelds without altering their environment. Indeed, even storage isn't an issue -- Google will cheerfully provide you with as much as you are likely to ever need more cheaply than you can provide it for yourself in exchange for subtle and inobtrusive access to your heart and mind. Which they already have. An anecdote. I am shopping for a telescope, since I have a PDA at Duke that I have to spend down before next semester lest it hit the 'have to give some back' threshold and I'm teaching astronomy these days. A good telescope -- I'm planning to spend ballpark of $2600 for an 8" Celestron Schmidt-Casselgrain capable of decent astrophotography, some good lenses, a cheap (starter) CCD camera. So I've googled/browsed several vendors looking at their offerings. In my igoogle page, guess what ad is insidiously placed somewhere on my screen in one of my add ons, every day? When I visit remote sites that have nothing (on the surface) to do with Google but that have adds placed on the screen, guess what ads are there? It's really remarkable to pay attention to this, because my own entrepreneurial activities have often been related to predictive marketing and the paradox that however much we dislike SPAM and direct marketing advertising in general, it is really because it is all noise, little signal. Google's mobile Orion telescope ad is not noise. It is indeed directly focused on what I'm interested in buying. It isn't lingerie (hmmm, buying that might be fun too:-) or machine tools, or video cameras -- although I'll bet I could stimulate these to appear instead with the right bit of browsing. It is the most expensive (highest margin) thing I'm actually looking at and for. I can't tell if Google is the second coming, arriving at last to kick the butt of the Microsoft antichrist and usher in the millenium, or if they are the antichrist who is simply preparing to eliminate all of the lesser devils and bring about the apocalypse. The scary thing is that Google is a significant portion of my brain -- with its new type-and-it-appears answering system, all that's missing is a neural interface and the ability to back up my memories to a remote silo and I might not even notice my own dying. I cannot imagine living and working without it, but it is starting to remind me of some very scary science fiction novels as what could possibly provide a better opportunity for mind control than an interface that is effectively part of your mind? So what can one do? 
Google is offering up Chrome-crack with the lavish and unspoken promise -- that I have no doubt that they will keep -- that it will be the last operating environment you ever, um, don't actually buy, that inside a year or two we'll see Chromeputers that may well run linux underneath -- but no one will know or care. That through its magic window you will be able to get to all of your music and movies and personal or professional data (efficiently and remotely stored, backed up and sort-of-secure). That within it PDFs will "just display", movies and music will "just play", email will move, news will be read, documents will be word processed, games will be played, and if you borrow a friend's computer for a day or use your phone or your pad, everything will be right there with nothing more than the inconvenience or convenience of the particular hardware interface to surmount or exploit. It won't end there. Who can provide remote computing resources even for actual computations cheaper than Google? For them, adding a server costs what, five FTE person minutes plus the cost of the cheapest possible hardware itself -- assembly line server prep plus somebody plugging it in? Who can provide server management at their ratio of humans to servers? Who can fund/subsidize most of the power and management cost for your tiny slice of this resource for the right to insert subtle little advertising messages into your brain that are NOT noise, they are indeed things you are likely to buy and hence pure gold for the advertiser? Microsoft is only now starting to realize that Windows 7 might well be the last Windows ever released and is scrabbling to cut a too-little, too-late deal with Yahoo and/or Adobe to try to transform themselves into something they only dimly perceive and understand and cannot now duplicate in time. One thing that has often been discussed on this list is marketing the supercomputer center. People have proposed setting up a big supercomputer center and renting it out to companies or universities that need this sort of resource. In general, the few times this has been tried it has failed, for all sorts of good reasons. As Bill noted, it is difficult enough to set up a center WITHIN a closed environment with captive users and real cash flow -- even though beowulfish clusters are quite scalable, only rarely do they achieve the 1000 node/systems person scaling limit (and then there is the infrastructure cost, depreciation and maintenance and replacement and programming support and the fact that a general purpose center achieves generality at the expense of across-the-board price-performance compromise). Google, OTOH, could do it. In fact, they could do it almost as an afterthought, as a side effect. Inside a decade, I can see Google quite literally owning the data Universe, dwarfing Microsoft and Apple combined and making both of them pretty much irrelevant if not bankrupt. And not just in the United States -- worldwide. Few things in computing have actually scared me. Microsoft is pretty scary, but it is the scariness of a clown -- its monopoly was never really stable once Linux was invented and I think it may have peaked and at long last be on the long road do oblivion. Apple isn't scary -- it is genuinely innovative for which I salute them, but its innovations provide at best a transient advantage and its vision has been too local to take over the world. 
Even Linux with its avowed goal of world domination hasn't been scary, because ultimately linux belongs to the world and as long as the computers being run on also belong to the world, control remains where it belongs, with the people of the world. Google scares me. It has quietly ACHIEVED world domination, and is about to transform the world in a way that will be shocking, amazing, dangerous, liberating, captivating -- and supremely beyond the control of anybody but the people running Google. Be afraid, be very afraid. Happy Halloween! rgb P.S. -- C'mon, haven't y'all missed my 10K essays? Admit it...;-) Alas, now it is off to grade papers and disappear again. Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From alex.chekholko at gmail.com Fri Oct 22 10:56:15 2010 From: alex.chekholko at gmail.com (Alex Chekholko) Date: Fri, 22 Oct 2010 13:56:15 -0400 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC11EA8.8030602@gmail.com> References: <4CC11EA8.8030602@gmail.com> Message-ID: On Fri, Oct 22, 2010 at 1:18 AM, Richard Chang wrote: > Hello List, > My University is going for a new HPC System. I was using Rocks + CentOS until now but someone suggested to use Redhat HPC Solution with the new system. > > I am not able to find good documentation to setup and use Redhat HPC. It seems, Redhat uses Platform Computing's Platform Cluster Manager re-branded with their(Redhat's) logo, though I may be wrong. ?For that matter, does anyone use Platform Cluster Manager also?. > The RH HPC mailing list suggests this project is inactive: https://www.redhat.com/archives/rhel-hpc-list/ You can get an "evaluation" version of the product, I think, have you tried that? I'm not sure it provides you with any better functionality/support than ROCKS. Regards, Alex From htor at illinois.edu Sat Oct 23 13:28:21 2010 From: htor at illinois.edu (Torsten Hoefler) Date: Sat, 23 Oct 2010 16:28:21 -0400 (EDT) Subject: [Beowulf] [hpc-announce] International Workshop on High-Level Parallel Programming Models and Supportive Environments Call for Papers (HIPS'11, 1st CFP) Message-ID: <20101023202821.BFF33ABDFF@benten.cs.indiana.edu> Dear Sir or Madam, (We apologize if you receive multiple copies of this message) --------------------------------------------------------------- ** CALL FOR PAPERS ** International Workshop on High-Level Parallel Programming Models and Supportive - HIPS 2011 Anchorage, Alaska May 16, 2011 http://www.unixer.de/hips2011/ -------------------------------------------------------------- To be held in conjunction with IPDPS 2011, the 25th IEEE International Parallel & Distributed Processing Symposium IMPORTANT DATES Paper Submission: December 2, 2010 Author Notification: February 1, 2011 HIPS is a forum for leading work on high-level programming of multiprocessors, compute clusters, and massively parallel machines. 
Like previous workshops in the series, which was established in 1996, this event serves as a forum for research in the areas of parallel applications, language design, compilers, runtime systems, and programming tools. It provides a timely and lightweight forum for scientists and engineers to present the latest ideas and findings in these rapidly changing fields. This year we especially encourage innovative approaches in the areas of emerging programming models for large-scale parallel systems and many-core architectures. The topics include but are not limited to:

* New programming languages and constructs for exploiting parallelism and locality
* Experience with and improvements for existing parallel languages and run-time environments such as MPI, OpenMP, Cilk, UPC, and Co-array Fortran
* Parallel compilers, programming tools, and environments
* (Scalable) tools for performance analysis, modeling, monitoring, and debugging
* OS and architectural support for parallel programming and debugging
* Software and system support for extreme scalability including fault tolerance
* Programming environments for heterogeneous multicore systems and accelerators such as GPUs, FPGAs, and Cell

The HIPS workshop proceedings will be published electronically along with the IPDPS conference proceedings via IEEE Xplore. Submitted manuscripts should be formatted according to IPDPS proceedings guidelines: 10-point fonts, single-spaced, and two-column format. The page size is US letter (8.5x11 inch). The maximum length is 8 pages. All papers must be in English. The best papers in the area of parallel computing will be considered for inclusion in a special issue of Elsevier Parallel Computing (PARCO).

--Organizing Committee--

Workshop Chair
Torsten Hoefler, University of Illinois at Urbana-Champaign, USA

Steering Committee
Rudolf Eigenmann, Purdue University, USA
Michael Gerndt, Technische Universität München, Germany
Frank Müller, North Carolina State University, USA
Craig Rasmussen, Los Alamos National Laboratory, USA
Martin Schulz, Lawrence Livermore National Laboratory, USA

Program Committee
Sadaf Alam, Swiss National Supercomputing Centre, Switzerland
Pavan Balaji, Argonne National Laboratory, USA
Richard Barrett, Sandia National Laboratories, USA
Brett Bode, National Center for Supercomputing Applications, USA
Greg Bronevetsky, Lawrence Livermore National Laboratory, USA
Bronis de Supinski, Lawrence Livermore National Laboratory, USA
Chen Ding, University of Rochester, USA
Michael Gerndt, Technische Universität München, Germany
Thomas Fahringer, University of Innsbruck, Austria
Yutaka Ishikawa, University of Tokyo, Japan
Andreas Knüpfer, Technische Universität Dresden, Germany
Bernd Mohr, Forschungszentrum Jülich, Germany
Craig Rasmussen, Los Alamos National Laboratory, USA
Sven-Bodo Scholz, University of Hertfordshire, UK
Martin Schulz, Lawrence Livermore National Laboratory, USA
Tony Skjellum, University of Alabama Birmingham, USA
Marc Snir, University of Illinois at Urbana-Champaign, USA
Fabian Tillier, Microsoft, USA
Jesper Larsson Träff, University of Vienna, Austria
Jeremiah Willcock, Indiana University, USA
Felix Wolf, German Research School for Simulation Sciences, Germany

From bruce.allen at aei.mpg.de Sun Oct 24 01:28:49 2010 From: bruce.allen at aei.mpg.de (Bruce Allen) Date: Sun, 24 Oct 2010 10:28:49 +0200 Subject: [Beowulf] 48-port 10gig switches?
In-Reply-To: <20100902011737.GB13598@bx9.net> References: <20100902011737.GB13598@bx9.net> Message-ID: <5F691688-E115-4C46-9373-111930B0A1A5@aei.mpg.de> Hi Greg, We have some switches made by Fortinet (model FS500) which we like. They were previously designed and marketed by Woven Systems, which was then acquired by Fortinet. Cheers, Bruce On Sep 2, 2010, at 3:17 AM, Greg Lindahl wrote: > I'm in the market for 48-port 10gig switches (preferably not a > chassis), and was wondering if anyone other than Arista and (soon) > Voltaire makes them? Force10 seems to only have a chassis that big? > Cisco isn't my favorite vendor anyway. One would think that the > availability of a single-chip 48-port 10gig chip would lead to more > than just 2 vendors selling 'em. > > -- greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tvssarma.omega9 at gmail.com Wed Oct 27 10:03:32 2010 From: tvssarma.omega9 at gmail.com (Sarma Tangirala) Date: Wed, 27 Oct 2010 17:03:32 +0000 Subject: [Beowulf] Interesting In-Reply-To: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> Message-ID: <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> The recent digests that I am getting are quite interesting (bad google) and I have a question. What I'd like to know is, is it possible to have a our history captured in its entirety so that none of the future generations have to run around (like Hari Seldon) because information from waaaay back is corrupt and not take care of? Do you guys know of any existing sources that you can point me to? Is this under distributed systems or under compression algorithms? Any other two cents on this is welcome! Sent from my BlackBerry -----Original Message----- From: beowulf-request at beowulf.org Sender: beowulf-bounces at beowulf.org Date: Wed, 27 Oct 2010 09:36:13 To: Reply-To: beowulf at beowulf.org Subject: Beowulf Digest, Vol 80, Issue 22 Send Beowulf mailing list submissions to beowulf at beowulf.org To subscribe or unsubscribe via the World Wide Web, visit http://www.beowulf.org/mailman/listinfo/beowulf or, via email, send a message with subject or body 'help' to beowulf-request at beowulf.org You can reach the person managing the list at beowulf-owner at beowulf.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Beowulf digest..." Today's Topics: 1. RE: how Google warps your brain (Bill Rankin) 2. RE: how Google warps your brain (Douglas Eadline) 3. RE: Anybody using Redhat HPC Solution in their Beowulf (Hearns, John) 4. Re: Anybody using Redhat HPC Solution in their Beowulf (Ellis H. Wilson III) 5. Re: Anybody using Redhat HPC Solution in their Beowulf (Kilian CAVALOTTI) 6. RE: Anybody using Redhat HPC Solution in their Beowulf (Lux, Jim (337C)) ---------------------------------------------------------------------- Message: 1 Date: Tue, 26 Oct 2010 14:54:43 +0000 From: Bill Rankin Subject: RE: [Beowulf] how Google warps your brain To: Beowulf Mailing List Cc: "Robert G. 
Brown" Message-ID: <76097BB0C025054786EFAB631C4A2E3C0948F542 at MERCMBX03D.na.SAS.com> Content-Type: text/plain; charset="us-ascii" Heading completely off-topic now, but the area of digital media and long-term archival/retrieval is something that I find very interesting. I'll leave it to Rob to somehow eventually tie this back into a discussion of COTs technology and HPC. > > It's interesting: I just got an iPad a few weeks ago, mostly as a > > reader/web-browser device, and I've been reading a variety of > > out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain. > Thank > > you Gutenberg Project! > > It is awesome, isn't it? Amazon also carries many of the out-of-copyright works in their Kindle store for $0 (and gives credit to Gutenburg to a small extent). It was nice to be able to go pickup things like the Sherlock Holmes series, Homer's Illiad and some of Einstein's works (which I don't pretend to understand) and have them downloaded via 3G on Amazon's dime. I will say that because of this I tend to overlook their rather high (IMHO) price on current digital content and have probably purchased more e-books overall as a result. > > And, since I am sitting/lying here with a very sore back from moving boxes > > of books around this weekend looking for that book that I *know* is in there > > somewhere, the prospect of some magic box that would scan all my books into > > a format usable into eternity would be quite nice. I might even think that > > a personal "print on demand" would be nice that could generate a cheap/quick > > copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but > > there's affordances provided by the paper edition that is nice.. But I don't > > need hardcover or, even, any cover..) There is just *something* about paper, isn't there? And while I don't have a library to the extent of RGBs or others, I do like having some books around (glancing at the two bookshelves in my office). On the other hand, I still have boxes of books sitting around unopened since we moved house 4-5 years ago. I certainly need a purge, lest I end up on one of those "hoarding" shows that seem to be popular as of late. At some point, I have to ask myself if I really *need* to have a old beat-up, falling apart copy of "Voyage of the Space Beagle" laying around. > > (or, even better, a service that has scanned all the books for me, e.g. > > Google, and that upon receiving some proof of ownership of the physical > > book, lets me have an electronic copy of the same... I'd gladly pay some > > nominal fee for such a thing, providing it wasn't for some horrible locked, > > time limited format which depends on the original vendor being in business > > 20 years from now. I also recognize the concern about how "once in digital > > form, copying becomes very cheap" which I think is valid. A scanning service would be wonderful for a lot of the books I have, mainly those I view as reference-type material. For current reference material, Safari Books Online has a reasonable usage model that allows for making hardcopy of their online content. Now if there was only a simple way to transcribe the same content for download to my Kindle I would be set (something beyond the OCR+PDF approach, which is awkward and inconsistent). > What a killer idea. Acceptable use, doggone it! I'd ship them books > by the boxful in exchange for a movable (even DRM controlled) image, a la > Ipod music. I just don't want to rebuy them, like I've now bought most > of my music collection TWICE (vinyl and CD). 
[let's not get started about vinyl collections - that's a whole 'nother set of unopened boxes] The problem is that many of the media houses are still waging an underground war on Fair Use, despite the legal decisions handed down by the courts. As an example, I recently had a email exchange with one of the customer service people at a major network. I was trying to locate additional interview footage from when my brother-in-law was on a certain hour-long Sunday evening news show. This person informed me that I did not have their "permission" to recorded the over-the-air broadcast of the show and burn it on a DVD to give to my sister, so what I was doing was not legal. This was news to me, since this usage model was clearly defined as permissible by the Supreme Court many years ago in the Sony v. Universal "Betamax Case". While the market for online music, video and written works have forced the various publishers to acknowledge to the need to provide content in digital form, to a great extent they had to be dragged kicking and screaming into the 21st century. A lot of progress has been made but there is still a lot of resistance towards efforts to open up availability and access even further. I would like see a service where I could take bins of old books to a used book store and somehow get credits towards the purchase of e-books online. I think that could break me of my paperback hoarding habit pretty quickly. -bill ------------------------------ Message: 2 Date: Tue, 26 Oct 2010 10:59:25 -0400 (EDT) From: "Douglas Eadline" Subject: RE: [Beowulf] how Google warps your brain To: "Hearns, John" Cc: beowulf at beowulf.org, "Robert G. Brown" Message-ID: <49886.192.168.93.213.1288105165.squirrel at mail.eadline.org> Content-Type: text/plain;charset=iso-8859-1 Not that there is anything wrong with that. > > As usual, a highly insightful post from RGB. > > > >> a) Multiple copies. Passenger pigeons may be robust, but once the > number of copies drops below a critical point, they are gone. E. Coli > we will always have >> with us (possibly in a constantly changing form) because there are so > very many copies, so very widely spread. > > I probably shouldn't mention Wikileaks here... > >> >> At the moment, the internet has if anything VASTLY INCREASED a, b and > c >> for every single document in the public domain that has been ported > to, >> e.g. Project Gutenberg. >> >> Right now, I'm sitting on a cache of "Saint" books, by Leslie > Charteris >> (who was a great favorite of mine growing up and still is). >> >> Nobody is going to reprint the Saint stories. They are a gay fantasy >> from another time, > > Simon Templar? Gay? Cough. > > Next you will be telling me that there are gay undertones in Top Gun, > the film with the sexiest astrophysicist ever. > > >> might well last to the end of civilization. Replicate them a few >> million times, PERPETUATE them from generation to generation by >> renewing >> the copies, and backing them up, and recopying them in formats where >> they are still useful. > > The cloud backup providers will be keeping copies of data on > geographically spread sites. 
> However, we should at this stage be asking what are the mechanisms for > cloud storage companies > for > *) living wills - what happens when the company goes bust > > *) what are the strategies for migrating the data onto new storage > formats > > >> >> Or, to put it differently, suppose every single human on the planet > had >> access to the modern equivalent of Diophantus's Arithmetica on their >> computer, their Kindle, their Ipad > I believe that was the original intent for the Web. Still under > development! > > > The contents of this email are confidential and for the exclusive use of > the intended recipient. If you receive this email in error you should not > copy it, retransmit it, use it or disclose its contents but should return > it to the sender immediately and delete your copy. > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ------------------------------ Message: 3 Date: Tue, 26 Oct 2010 09:16:47 +0100 From: "Hearns, John" Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf To: "Ellis H. Wilson III" , Message-ID: <68A57CCFD4005646957BD2D18E60667B12154E23 at milexchmb1.mil.tagmclarengroup.com> Content-Type: text/plain; charset="us-ascii" > I don't think you could find a statement more orthogonal to the spirit > of the Beowulf list than, "Please, please don't "roll your own" > system..." Isn't Beowulfery about the drawing together of inexpensive > components in an intelligent fashion suited just for your particular > application while using standardized (and thereby cheap by the law of > scale) hardware? I'm not suggesting Richard build his own NIC - but > there is nothing wrong with using even a distribution of Linux not > intended for HPC (so long as you're smart about it) and picking and > choosing the software (queuing managers, tracers, etc) he finds works > best. > > Also, I would argue if a company is selling you an HPC solution, it's > either: > 1. A true Beowulf in terms of using COTS hardware, in which case you > are > likely getting less than your money is worth or Ellis, I am going to politely disagree with you - now there's a surprise! I have worked as an engineer for two HPC companies - Clustervision and Streamline. My slogan phrase on this issue is "Any fool can go down PC World and buy a bunch of PCs" By that I mean that CPU is cheap these days, but all you will get is a bunch of boxes on your loading bay. As you say, and you are right, you then have the option of installing Linux plus a cluster management stack and getting a cluster up and running. However, as regards price, I would say that actually you will be paying very, very little premium for getting a supported, tested and pre-assembled cluster from a vendor. Academic margins are razor thin - the companies are not growing fat over academic deals. They also can get special pricing from Intel/AMD if the project can be justified - probably ending up at a price per box near to what you pay at PC World. Or take (say) rack top switches. 
Do you want to have a situation where the company which supports your cluster has switches sitting on a shelf, so when a switch fails someone (me!) is sent out the next morning to deliver a new switch in a box, cable it in and get you running? Or do you want to deal direct with the returns department at $switch vendor, or even (shudder) take the route of using the same switches as the campus network - so you don't get to choose on the basis of performance or suitability, but just depend on the warm and fuzzies your campus IT people have. We then come to support - say you buy that heap of boxes from a Tier 1 - say it is the same company your campus IT folks have a campus wide deal with. You'll get the same type of support you get for general servers running Windows - and you'll deal with first line support staff on the phone every time. Me, I've been there, seen there, done it with tier 1 support like that. As a for instance, HPC workloads tend to stress the RAM in a system, and you get frequent ECC errors on a young system as it is bedding in. Try phoning support every time a light comes on, and get talked through the "have you run XXX diagnostic", it soon gets wearing. Before Tier 1 companies cry foul, of course both the above companies and all other cluster companies integrate Tier 1 servers - but that is a different scenario from getting boxes delivered through your campus agreement with $Tier1. The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. ------------------------------ Message: 4 Date: Tue, 26 Oct 2010 12:09:12 -0400 From: "Ellis H. Wilson III" Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf To: "Hearns, John" Cc: beowulf at beowulf.org Message-ID: <4CC6FD28.1050303 at runnersroll.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 10/26/10 04:16, Hearns, John wrote: > I have worked as an engineer for two HPC companies - Clustervision and > Streamline. > My slogan phrase on this issue is "Any fool can go down PC World and buy > a bunch of PCs" Well if you are buying PCs in bulk at retail pricing, you are a fool anyway. Plus most PC World PCs won't have ECC RAM so I wasn't really referring to those as few of us tolerate random bit flips. > However, as regards price, I would say that actually you will be paying > very, very little premium > for getting a supported, tested and pre-assembled cluster from a vendor. > Academic margins are razor thin - the companies are not growing fat over > academic deals. > They also can get special pricing from Intel/AMD if the project can be > justified - probably ending > up at a price per box near to what you pay at PC World. Again, not comparing PC World to Tier 1 bulk purchases. I'm comparing Tier 1 bulk purchases w/o an OS (so you can DIY) with specialized HPC vendor purchases where you don't have to DIY. Even then, perhaps it breaks even the first year if you get a very, very good deal from the HPC vendor. However, to get the deal you are probably contracted into four or five years of support and when considering HPC, involving more humans are the fastest way to get a really inefficient and expensive cluster. 
After the first year and up until the lifetime of the cluster involving human support annually will add a large cost overhead you have to account for at the beginning (and probably buy less hardware because of which). > Or take (say) rack top switches. Do you want to have a situation where > the company which supports your cluster > has switches sitting on a shelf, so when a switch fails someone (me!) is > sent out the next morning to deliver > a new switch in a box, cable it in and get you running? That's probably a hell of a lot faster than waiting on a vendor to get you a new switch through some RMA process. Plus you know the cabling is done right :). Optimally IMHO, in university setups physical scientists create the need for HPC. These types shouldn't (as Kilian mentions) need to inherit all of the responsibilities and overheads of cluster management to use one (or pay cluster vendors annually for support). They should simply walk over to the CS department, find system guys (who would probably drool over the potential of administering a reasonably sized cluster) and work out an agreement where the physical science types can "just use it" and the systems/CS guys administer it and can once in a while trace workloads, test new load balancing mechanisms, try different kernel settings for performance, etc. This way the physical scientists get their work done on a well supported HPC system for no extra cash and computer scientists get great, non-toy traces and workloads to further their own research. Both parties win. Now in organizations that don't have a CS department I agree that HPC vendors are the way to go. ellis ------------------------------ Message: 5 Date: Tue, 26 Oct 2010 11:18:56 +0200 From: Kilian CAVALOTTI Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf To: "Ellis H. Wilson III" Cc: beowulf at beowulf.org Message-ID: Content-Type: text/plain; charset=UTF-8 Hi, On Tue, Oct 26, 2010 at 1:00 AM, Ellis H. Wilson III wrote: > Also, I would argue if a company is selling you an HPC solution, it's > either: > 1. A true Beowulf in terms of using COTS hardware, in which case you are > likely getting less than your money is worth or Well, depends on how you value your time and the required expertise to put all those COTS and OSS pieces together to make them run smoothly and efficiently. Most scientists and HPC systems users are not professional sysadmins (which is good, they have a job to do), and the value of trained, experienced, skilled individuals who can put together a reliable and useful HPC system is sometimes overlooked (ie. undervalued). I agree with your later statement, though: > I personally don't think the "market for cluster vendors" is [...] > the Beowulf list. Cheers, -- Kilian ------------------------------ Message: 6 Date: Wed, 27 Oct 2010 09:32:43 -0700 From: "Lux, Jim (337C)" Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf To: "Ellis H. Wilson III" , "Hearns, John" Cc: "beowulf at beowulf.org" Message-ID: Content-Type: text/plain; charset="us-ascii" > > Optimally IMHO, in university setups physical scientists create the need > for HPC. These types shouldn't (as Kilian mentions) need to inherit all > of the responsibilities and overheads of cluster management to use one > (or pay cluster vendors annually for support). 
They should simply walk > over to the CS department, find system guys (who would probably drool > over the potential of administering a reasonably sized cluster) and work > out an agreement where the physical science types can "just use it" and > the systems/CS guys administer it and can once in a while trace > workloads, test new load balancing mechanisms, try different kernel > settings for performance, etc. This way the physical scientists get > their work done on a well supported HPC system for no extra cash and > computer scientists get great, non-toy traces and workloads to further > their own research. Both parties win. > I don't know about this model. This is like developing software on prototype hardware. The hardware guys and gals keep wanting to change the hardware, and the software developers complain that their software keeps breaking, or that the hardware is buggy (and it is). The computational physics and computational biology guys get to work on cool, nifty stuff to push their dissertation forward by using a hopefully stable computational platform. But I don't think the CS guys would drool over the possibility of administering a cluster. The CS guys get to be sysadmin/maintenance types...not very fun for them, and not the kind of work that would work for their dissertation. Now, if the two groups were doing research on new computational methods (what's the best way to simulate X) perhaps you'd get a collaboration. ------------------------------ _______________________________________________ Beowulf mailing list Beowulf at beowulf.org http://www.beowulf.org/mailman/listinfo/beowulf End of Beowulf Digest, Vol 80, Issue 22 *************************************** From rgb at phy.duke.edu Wed Oct 27 12:48:57 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 27 Oct 2010 15:48:57 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: References: <20101021104323.GV28998@leitl.org> <76097BB0C025054786EFAB631C4A2E3C0948BCAB@MERCMBX03D.na.SAS.com> Message-ID: On Thu, 21 Oct 2010, Jack Carrozzo wrote: > To add my $0.02 to Bills points, it becomes more difficult also when dealing > with multiple groups to decide on the type of setup and whatnot.? > Where I went to school, the Math dept had a huge shared-memory SGI setup > whilst the Physics department had a standard Beowulf cluster. Both groups > used their systems rarely, and other departments had been asking for HPC > hardware also. However, after long debates by all parties, a > single?infrastructure?couldn't be decided upon and each?independent?dept > just got a little money to fix up their curent systems. It's not a trivial question. I (when I do HPC at all) run trivially parallel simulations that run for a long time independently and then produce only a few numbers and could (and once upon a time, did) use sneakernet for my IPCs and job control and still get excellent speedup. Other people need bleeding edge networks in unusual topologies, or enormous disk arrays with high speed access, or huge amounts of memory (in any combination) in order to do their work. Where I would care only about building a bigger pile of fast enough big enough cheap PCs, they would care about building a real beowulf with more (maybe even a lot more) spent on memory and network and disk than on lots of cores at best possible FLOPS/Dollar. 
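To make that contrast concrete, here is a minimal sketch of the embarrassingly parallel pattern described above -- hypothetical Python, not code from anyone in this thread -- in which each run is long, fully independent, and hands back only a few numbers at the end:

from multiprocessing import Pool
import random

def run_simulation(seed):
    # Stand-in for one long, independent Monte Carlo run.
    rng = random.Random(seed)
    samples = 1000000
    total = sum(rng.random() for _ in range(samples))
    return seed, total / samples    # only "a few numbers" come back

if __name__ == "__main__":
    seeds = range(32)        # one task per core, or per node slot
    with Pool() as pool:     # on a cluster this could just as well be ssh or a queue system
        for seed, mean in pool.map(run_simulation, seeds):
            print("run %d: mean = %.6f" % (seed, mean))

Nothing here touches the interconnect until the final collection step, which is why FLOPS/dollar dominates the purchase decision for this kind of workload while tightly coupled codes live or die by the network.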
Then there are differences in utilization patterns, robustness of the programs in a large cluster, how deep your pockets are, how expensive systems management is, whether or not you can DIY, whether the cluster you need require renovations such as devoted AC and power and space or can stack up a pile of processors in a corner of your office without either blowing a fuse or melting down, whether you are a computer geek yourself or if you are slightly scared of actually touching a mouse because it might bite. I personally got into cluster computing because the large computing center built at enormous expense by the state was -- "useless" to me doesn't do it justice, doesn't convey the waste of money and resources that my utilization of the resource even "for free" (with a small grant of heavily shared time) represented. Ultimately, it is as true today as it was then that YMMV, that people's needs vary wildly (and the cluster architecture that represents CBA optimum varies along with it), that optimizing all of this across a large group of users doing very different kinds of work is more difficult still (and introduces politics, new economic costs and benefits, questions of the relative "value" of the research being done by different groups, and much more into the not-terribly trivial equations involved). In the end, for many people I'm quite certain that the best possible solution (from a CB point of view) is to build their own cluster for their own use (this is almost a no-brainer if they can use 100% of the duty cycle of any cluster in the Universe that they can afford in the first place). Sure, they may WANT to participate in a shared cluster but if they do it is only to get the free cycles without any need or intention of contributing free cycles of their own as their needs are infinite, or any reasonable shared architecture cluster that will be either too lacking in some key resource to be useful or will have too much spent on the "standard" cluster nodes so that buying in wastes more value than they gain relative to buying just what THEY need. For others the opposite is true. This is particularly true when there are lots of people who ARE doing embarrassingly parallel computations (say), especially ones where their work pattern represents only 20-30% utilization. Work very hard for three months, then do something else for four (writing the papers, getting more experimental data, whatever). Then sharing with other people with similar requirements can cut the three months down to one, perhaps, if things work out just right. So yup, YMMV. Cluster "centers" done right can be great. "Clouds" of desktops, mediated by things like Condor, can be great. Personal clusters can be great. Small/departmental clusters can be ideal, especially if computational needs are homogeneous within the department (but not between departments). Where great in context means "cost-benefit near-optimal given your resources and needs". One size, or architecture, does not fit all. But "cluster computing", and the beowulfery studyied and discussed on this list, has long had the virtue of cutting ACROSS the diversity, with smart people sharing ideas and experiences that give you a decent chance of putting together a >>good<< CBA solution if not a >>perfect<< one, for a very wide range of possible tasks and attendant architectures and political/economic resource constraints. 
rgb

Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu

From rgb at phy.duke.edu Wed Oct 27 13:08:14 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Wed, 27 Oct 2010 16:08:14 -0400 (EDT) Subject: [Beowulf] Interesting In-Reply-To: <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> Message-ID: On Wed, 27 Oct 2010, Sarma Tangirala wrote: > The recent digests that I am getting are quite interesting (bad google) and I have a question. > > What I'd like to know is, is it possible to have a our history captured in its entirety so that none of the future generations have to run around (like Hari Seldon) because information from waaaay back is corrupt and not take care of? > > Do you guys know of any existing sources that you can point me to?
> > Is this under distributed systems or under compression algorithms? Sure. It's very simple. Obtain a very large store of platinum plates (or maybe gold -- one of the more or less corrosion-proof, durable metals at any rate). Engrave our history, in pictures that require no language to understand or encoded in such a way that the documentation contains a universal rosetta stone allowing language to be bootstrapped (there are theories for how to go about doing this; they were used to engrave messages on the Voyager and some of the other spacecraft we shot out of the solar system IIRC). Go to the moon. Dig a really big hole at one of the poles (to avoid thermal extremes) and build a bunker out of fused glass three meters thick and one kilometer underground. Place the individually shock wrapped and thermally isolated plates in neat racks. Seal it hermetically (melt the glass back over the hole you put them in through). Backfill the hole, fusing multiple sets of three meter thick layers of steel reinforced glass separated by ten or so meters of dirt across an area many times larger than the vault itself. Cap the whole thing with a large basin of "soft" sand 100 meters thick. With luck, it will then last until the sun reaches the point where it starts to turn into a red giant and engulfs the earth and the moon, unless of course one is unlucky and an asteroid big enough to overwhelm the shock absorber strikes dead on. To outlast the sun, consider duplicating this process on several of the planetoids out in the Oort cloud or just shooting mid-sized asteroids containing vaults out of the solar system entirely. But out there one has to worry about gravitational resonance and so on -- the inner planets (including the moon) are actually a lot more stable and safer at this point individually. Beyond that, the only solution is not to rigidly resist entropy, but to flow with, to adapt to, the process of entropy. Copy, copy, and copy again. Copy in many formats, in many locations. Avoid Library of Alexandria solutions as they are not robust to events like fire and political turmoil -- put the Library of Alexandria in the hands of >>every citizen<< and constantly update them to keep them in sync with a widely dispersed set of master copies, all nicely checksummed and so on. Surf the technology as the waves dictate, rather than being crushed by the tides of change. No matter what you face thermal chaos, the inexorable force of entropy. One day the hot winds will blow over the ashes of our species, our very civilization, and we will become mere dust in the wind once again, and very probably, long before that time arrives, all that we now know and say and do will have been discovered and lost over and over again. The second law of thermodynamics has no mercy, and entropy will eventually prevail. rgb > > Any other two cents on this is welcome! 
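As a small companion to the "copy, copy, and copy again ... all nicely checksummed" strategy above -- a hypothetical Python sketch with placeholder directory names, not an existing tool -- one way to keep a dispersed replica honest is to compare SHA-256 manifests against a master copy:

import hashlib
from pathlib import Path

def manifest(root):
    # Map each file's path (relative to root) to its SHA-256 digest.
    digests = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digests[str(path.relative_to(root))] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests

def drift(master_root, replica_root):
    # Files that are missing from, or differ in, the replica.
    master = manifest(master_root)
    replica = manifest(replica_root)
    return [name for name, digest in master.items() if replica.get(name) != digest]

if __name__ == "__main__":
    for name in drift("master_library", "replica_library"):    # placeholder paths
        print("needs re-copy:", name)

Run periodically against every copy, this is the "keep them in sync with a widely dispersed set of master copies" loop in miniature; real archival systems layer redundancy, versioning, and format migration on top of it.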
> Sent from my BlackBerry > > -----Original Message----- > From: beowulf-request at beowulf.org > Sender: beowulf-bounces at beowulf.org > Date: Wed, 27 Oct 2010 09:36:13 > To: > Reply-To: beowulf at beowulf.org > Subject: Beowulf Digest, Vol 80, Issue 22 > > Send Beowulf mailing list submissions to > beowulf at beowulf.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://www.beowulf.org/mailman/listinfo/beowulf > or, via email, send a message with subject or body 'help' to > beowulf-request at beowulf.org > > You can reach the person managing the list at > beowulf-owner at beowulf.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Beowulf digest..." > > > Today's Topics: > > 1. RE: how Google warps your brain (Bill Rankin) > 2. RE: how Google warps your brain (Douglas Eadline) > 3. RE: Anybody using Redhat HPC Solution in their Beowulf > (Hearns, John) > 4. Re: Anybody using Redhat HPC Solution in their Beowulf > (Ellis H. Wilson III) > 5. Re: Anybody using Redhat HPC Solution in their Beowulf > (Kilian CAVALOTTI) > 6. RE: Anybody using Redhat HPC Solution in their Beowulf > (Lux, Jim (337C)) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Tue, 26 Oct 2010 14:54:43 +0000 > From: Bill Rankin > Subject: RE: [Beowulf] how Google warps your brain > To: Beowulf Mailing List > Cc: "Robert G. Brown" > Message-ID: > <76097BB0C025054786EFAB631C4A2E3C0948F542 at MERCMBX03D.na.SAS.com> > Content-Type: text/plain; charset="us-ascii" > > Heading completely off-topic now, but the area of digital media and long-term archival/retrieval is something that I find very interesting. I'll leave it to Rob to somehow eventually tie this back into a discussion of COTs technology and HPC. > > >>> It's interesting: I just got an iPad a few weeks ago, mostly as a >>> reader/web-browser device, and I've been reading a variety of >>> out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain. >> Thank >>> you Gutenberg Project! >> >> It is awesome, isn't it? > > Amazon also carries many of the out-of-copyright works in their Kindle store for $0 (and gives credit to Gutenburg to a small extent). It was nice to be able to go pickup things like the Sherlock Holmes series, Homer's Illiad and some of Einstein's works (which I don't pretend to understand) and have them downloaded via 3G on Amazon's dime. > > I will say that because of this I tend to overlook their rather high (IMHO) price on current digital content and have probably purchased more e-books overall as a result. > > >>> And, since I am sitting/lying here with a very sore back from moving boxes >>> of books around this weekend looking for that book that I *know* is in there >>> somewhere, the prospect of some magic box that would scan all my books into >>> a format usable into eternity would be quite nice. I might even think that >>> a personal "print on demand" would be nice that could generate a cheap/quick >>> copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but >>> there's affordances provided by the paper edition that is nice.. But I don't >>> need hardcover or, even, any cover..) > > There is just *something* about paper, isn't there? And while I don't have a library to the extent of RGBs or others, I do like having some books around (glancing at the two bookshelves in my office). On the other hand, I still have boxes of books sitting around unopened since we moved house 4-5 years ago. 
I certainly need a purge, lest I end up on one of those "hoarding" shows that seem to be popular as of late. > > At some point, I have to ask myself if I really *need* to have a old beat-up, falling apart copy of "Voyage of the Space Beagle" laying around. > > >>> (or, even better, a service that has scanned all the books for me, e.g. >>> Google, and that upon receiving some proof of ownership of the physical >>> book, lets me have an electronic copy of the same... I'd gladly pay some >>> nominal fee for such a thing, providing it wasn't for some horrible locked, >>> time limited format which depends on the original vendor being in business >>> 20 years from now. I also recognize the concern about how "once in digital >>> form, copying becomes very cheap" which I think is valid. > > A scanning service would be wonderful for a lot of the books I have, mainly those I view as reference-type material. For current reference material, Safari Books Online has a reasonable usage model that allows for making hardcopy of their online content. Now if there was only a simple way to transcribe the same content for download to my Kindle I would be set (something beyond the OCR+PDF approach, which is awkward and inconsistent). > > >> What a killer idea. Acceptable use, doggone it! I'd ship them books >> by the boxful in exchange for a movable (even DRM controlled) image, a la >> Ipod music. I just don't want to rebuy them, like I've now bought most >> of my music collection TWICE (vinyl and CD). > > [let's not get started about vinyl collections - that's a whole 'nother set of unopened boxes] > > The problem is that many of the media houses are still waging an underground war on Fair Use, despite the legal decisions handed down by the courts. As an example, I recently had a email exchange with one of the customer service people at a major network. I was trying to locate additional interview footage from when my brother-in-law was on a certain hour-long Sunday evening news show. This person informed me that I did not have their "permission" to recorded the over-the-air broadcast of the show and burn it on a DVD to give to my sister, so what I was doing was not legal. > > This was news to me, since this usage model was clearly defined as permissible by the Supreme Court many years ago in the Sony v. Universal "Betamax Case". > > While the market for online music, video and written works have forced the various publishers to acknowledge to the need to provide content in digital form, to a great extent they had to be dragged kicking and screaming into the 21st century. A lot of progress has been made but there is still a lot of resistance towards efforts to open up availability and access even further. > > > I would like see a service where I could take bins of old books to a used book store and somehow get credits towards the purchase of e-books online. I think that could break me of my paperback hoarding habit pretty quickly. > > > -bill > > > > > ------------------------------ > > Message: 2 > Date: Tue, 26 Oct 2010 10:59:25 -0400 (EDT) > From: "Douglas Eadline" > Subject: RE: [Beowulf] how Google warps your brain > To: "Hearns, John" > Cc: beowulf at beowulf.org, "Robert G. Brown" > Message-ID: > <49886.192.168.93.213.1288105165.squirrel at mail.eadline.org> > Content-Type: text/plain;charset=iso-8859-1 > > > Not that there is anything wrong with that. > > >> >> As usual, a highly insightful post from RGB. >> >> >> >>> a) Multiple copies. 
Passenger pigeons may be robust, but once the >> number of copies drops below a critical point, they are gone. E. Coli >> we will always have >>> with us (possibly in a constantly changing form) because there are so >> very many copies, so very widely spread. >> >> I probably shouldn't mention Wikileaks here... >> >>> >>> At the moment, the internet has if anything VASTLY INCREASED a, b and >> c >>> for every single document in the public domain that has been ported >> to, >>> e.g. Project Gutenberg. >>> >>> Right now, I'm sitting on a cache of "Saint" books, by Leslie >> Charteris >>> (who was a great favorite of mine growing up and still is). >>> >>> Nobody is going to reprint the Saint stories. They are a gay fantasy >>> from another time, >> >> Simon Templar? Gay? Cough. >> >> Next you will be telling me that there are gay undertones in Top Gun, >> the film with the sexiest astrophysicist ever. >> >> >>> might well last to the end of civilization. Replicate them a few >>> million times, PERPETUATE them from generation to generation by >>> renewing >>> the copies, and backing them up, and recopying them in formats where >>> they are still useful. >> >> The cloud backup providers will be keeping copies of data on >> geographically spread sites. >> However, we should at this stage be asking what are the mechanisms for >> cloud storage companies >> for >> *) living wills - what happens when the company goes bust >> >> *) what are the strategies for migrating the data onto new storage >> formats >> >> >>> >>> Or, to put it differently, suppose every single human on the planet >> had >>> access to the modern equivalent of Diophantus's Arithmetica on their >>> computer, their Kindle, their Ipad >> I believe that was the original intent for the Web. Still under >> development! >> >> >> The contents of this email are confidential and for the exclusive use of >> the intended recipient. If you receive this email in error you should not >> copy it, retransmit it, use it or disclose its contents but should return >> it to the sender immediately and delete your copy. >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf >> >> -- >> This message has been scanned for viruses and >> dangerous content by MailScanner, and is >> believed to be clean. >> > > > -- > Doug > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > > > > ------------------------------ > > Message: 3 > Date: Tue, 26 Oct 2010 09:16:47 +0100 > From: "Hearns, John" > Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their > Beowulf > To: "Ellis H. Wilson III" , > > Message-ID: > <68A57CCFD4005646957BD2D18E60667B12154E23 at milexchmb1.mil.tagmclarengroup.com> > > Content-Type: text/plain; charset="us-ascii" > >> I don't think you could find a statement more orthogonal to the spirit >> of the Beowulf list than, "Please, please don't "roll your own" >> system..." Isn't Beowulfery about the drawing together of inexpensive >> components in an intelligent fashion suited just for your particular >> application while using standardized (and thereby cheap by the law of >> scale) hardware? 
I'm not suggesting Richard build his own NIC - but >> there is nothing wrong with using even a distribution of Linux not >> intended for HPC (so long as you're smart about it) and picking and >> choosing the software (queuing managers, tracers, etc) he finds works >> best. >> >> Also, I would argue if a company is selling you an HPC solution, it's >> either: >> 1. A true Beowulf in terms of using COTS hardware, in which case you >> are >> likely getting less than your money is worth or > > > Ellis, I am going to politely disagree with you - now there's a > surprise! > > I have worked as an engineer for two HPC companies - Clustervision and > Streamline. > My slogan phrase on this issue is "Any fool can go down PC World and buy > a bunch of PCs" > By that I mean that CPU is cheap these days, but all you will get is a > bunch of boxes > on your loading bay. As you say, and you are right, you then have the > option of installing > Linux plus a cluster management stack and getting a cluster up and > running. > > However, as regards price, I would say that actually you will be paying > very, very little premium > for getting a supported, tested and pre-assembled cluster from a vendor. > Academic margins are razor thin - the companies are not growing fat over > academic deals. > They also can get special pricing from Intel/AMD if the project can be > justified - probably ending > up at a price per box near to what you pay at PC World. > > Or take (say) rack top switches. Do you want to have a situation where > the company which supports your cluster > has switches sitting on a shelf, so when a switch fails someone (me!) is > sent out the next morning to deliver > a new switch in a box, cable it in and get you running? > Or do you want to deal direct with the returns department at $switch > vendor, or even (shudder) take the route > of using the same switches as the campus network - so you don't get to > choose on the basis of performance or > suitability, but just depend on the warm and fuzzies your campus IT > people have. > > > We then come to support - say you buy that heap of boxes from a Tier 1 - > say it is the same company your > campus IT folks have a campus wide deal with. You'll get the same type > of support you get for general > servers running Windows - and you'll deal with first line support staff > on the phone every time. > Me, I've been there, seen there, done it with tier 1 support like that. > As a for instance, HPC workloads tend to stress the RAM in a system, and > you get frequent ECC errors on > a young system as it is bedding in. Try phoning support every time a > light comes on, and get talked through > the "have you run XXX diagnostic", it soon gets wearing. > Before Tier 1 companies cry foul, of course both the above companies and > all other cluster companies integrate > Tier 1 servers - but that is a different scenario from getting boxes > delivered through your campus agreement with > $Tier1. > > > > > > > > > > > > > The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. > > > > ------------------------------ > > Message: 4 > Date: Tue, 26 Oct 2010 12:09:12 -0400 > From: "Ellis H. 
Wilson III" > Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their > Beowulf > To: "Hearns, John" > Cc: beowulf at beowulf.org > Message-ID: <4CC6FD28.1050303 at runnersroll.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > On 10/26/10 04:16, Hearns, John wrote: >> I have worked as an engineer for two HPC companies - Clustervision and >> Streamline. >> My slogan phrase on this issue is "Any fool can go down PC World and buy >> a bunch of PCs" > > Well if you are buying PCs in bulk at retail pricing, you are a fool > anyway. Plus most PC World PCs won't have ECC RAM so I wasn't really > referring to those as few of us tolerate random bit flips. > >> However, as regards price, I would say that actually you will be paying >> very, very little premium >> for getting a supported, tested and pre-assembled cluster from a vendor. >> Academic margins are razor thin - the companies are not growing fat over >> academic deals. >> They also can get special pricing from Intel/AMD if the project can be >> justified - probably ending >> up at a price per box near to what you pay at PC World. > > Again, not comparing PC World to Tier 1 bulk purchases. I'm comparing > Tier 1 bulk purchases w/o an OS (so you can DIY) with specialized HPC > vendor purchases where you don't have to DIY. Even then, perhaps it > breaks even the first year if you get a very, very good deal from the > HPC vendor. However, to get the deal you are probably contracted into > four or five years of support and when considering HPC, involving more > humans are the fastest way to get a really inefficient and expensive > cluster. After the first year and up until the lifetime of the cluster > involving human support annually will add a large cost overhead you have > to account for at the beginning (and probably buy less hardware because > of which). > >> Or take (say) rack top switches. Do you want to have a situation where >> the company which supports your cluster >> has switches sitting on a shelf, so when a switch fails someone (me!) is >> sent out the next morning to deliver >> a new switch in a box, cable it in and get you running? > > That's probably a hell of a lot faster than waiting on a vendor to get > you a new switch through some RMA process. Plus you know the cabling is > done right :). > > Optimally IMHO, in university setups physical scientists create the need > for HPC. These types shouldn't (as Kilian mentions) need to inherit all > of the responsibilities and overheads of cluster management to use one > (or pay cluster vendors annually for support). They should simply walk > over to the CS department, find system guys (who would probably drool > over the potential of administering a reasonably sized cluster) and work > out an agreement where the physical science types can "just use it" and > the systems/CS guys administer it and can once in a while trace > workloads, test new load balancing mechanisms, try different kernel > settings for performance, etc. This way the physical scientists get > their work done on a well supported HPC system for no extra cash and > computer scientists get great, non-toy traces and workloads to further > their own research. Both parties win. > > Now in organizations that don't have a CS department I agree that HPC > vendors are the way to go. 
> > ellis > > > ------------------------------ > > Message: 5 > Date: Tue, 26 Oct 2010 11:18:56 +0200 > From: Kilian CAVALOTTI > Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their > Beowulf > To: "Ellis H. Wilson III" > Cc: beowulf at beowulf.org > Message-ID: > > Content-Type: text/plain; charset=UTF-8 > > Hi, > > On Tue, Oct 26, 2010 at 1:00 AM, Ellis H. Wilson III > wrote: >> Also, I would argue if a company is selling you an HPC solution, it's >> either: >> 1. A true Beowulf in terms of using COTS hardware, in which case you are >> likely getting less than your money is worth or > > Well, depends on how you value your time and the required expertise to > put all those COTS and OSS pieces together to make them run smoothly > and efficiently. > Most scientists and HPC systems users are not professional sysadmins > (which is good, they have a job to do), and the value of trained, > experienced, skilled individuals who can put together a reliable and > useful HPC system is sometimes overlooked (ie. undervalued). > > I agree with your later statement, though: > >> I personally don't think the "market for cluster vendors" is [...] >> the Beowulf list. > > Cheers, > -- > Kilian > > > ------------------------------ > > Message: 6 > Date: Wed, 27 Oct 2010 09:32:43 -0700 > From: "Lux, Jim (337C)" > Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their > Beowulf > To: "Ellis H. Wilson III" , "Hearns, John" > > Cc: "beowulf at beowulf.org" > Message-ID: > > > Content-Type: text/plain; charset="us-ascii" > > >> >> Optimally IMHO, in university setups physical scientists create the need >> for HPC. These types shouldn't (as Kilian mentions) need to inherit all >> of the responsibilities and overheads of cluster management to use one >> (or pay cluster vendors annually for support). They should simply walk >> over to the CS department, find system guys (who would probably drool >> over the potential of administering a reasonably sized cluster) and work >> out an agreement where the physical science types can "just use it" and >> the systems/CS guys administer it and can once in a while trace >> workloads, test new load balancing mechanisms, try different kernel >> settings for performance, etc. This way the physical scientists get >> their work done on a well supported HPC system for no extra cash and >> computer scientists get great, non-toy traces and workloads to further >> their own research. Both parties win. >> > > > I don't know about this model. > This is like developing software on prototype hardware. The hardware guys and gals keep wanting to change the hardware, and the software developers complain that their software keeps breaking, or that the hardware is buggy (and it is). > > The computational physics and computational biology guys get to work on cool, nifty stuff to push their dissertation forward by using a hopefully stable computational platform. > But I don't think the CS guys would drool over the possibility of administering a cluster. The CS guys get to be sysadmin/maintenance types...not very fun for them, and not the kind of work that would work for their dissertation. > > Now, if the two groups were doing research on new computational methods (what's the best way to simulate X) perhaps you'd get a collaboration. 
> > > > > ------------------------------ > > _______________________________________________ > Beowulf mailing list > Beowulf at beowulf.org > http://www.beowulf.org/mailman/listinfo/beowulf > > > End of Beowulf Digest, Vol 80, Issue 22 > *************************************** > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From samuel at unimelb.edu.au Wed Oct 27 18:14:25 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 28 Oct 2010 12:14:25 +1100 Subject: [Beowulf] Interesting In-Reply-To: <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> Message-ID: <4CC8CE71.2030703@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 28/10/10 04:03, Sarma Tangirala wrote: > What I'd like to know is, is it possible to have a our > history captured in its entirety so that none of the > future generations have to run around (like Hari Seldon) > because information from waaaay back is corrupt and not > take care of? For a lesson in the dangers of digital preservation the 1986 BBC Domesday Book project and the battle to be able to read it again after 15 years is a very good case in point. Here's the situation in 2002: http://www.guardian.co.uk/uk/2002/mar/03/research.elearning A background on the original project, and a bit about the recovery efforts: http://www.atsf.co.uk/dottext/domesday.html The project that started to recover access to the data: http://www2.si.umich.edu/CAMILEON/domesday/domesday.html For a while the data was available on a website here: http://www.domesday1986.com/ But in another lesson the owner of that site died in 2008 and the site went away (there's a small site about it there now, but none of the content). You can see the original front page courtesy of The Wayback Machine but it appears the owner (ironically a company called Long Lived Data) pointed to another server via javascript and so the javascript window that appears to access it just pops up a hosting company website now. :-( http://web.archive.org/web/*/http://www.domesday1986.com/ Finally, and probably most instructively, a message from the RISKS digest from one of the originators of the BBC project about the choices they made and what went wrong: http://catless.ncl.ac.uk/Risks/25.44.html#subj7 # In sharp contrast to the way we are portrayed now by # some commentators, we were always acutely aware of the # volatility of the hardware and software we had used to # implement the Domesday Project and the need to preserve # this unique archive for the future. Knowing that our # project was coming to an end we transferred the master # tapes and server files for everything we had compiled, # including all our working documents and enabling software # to the National Data Archive under the supervision of # Professor Newby. 
But 18 years later, when the recovery project began:

# I immediately went to the National Data Archive website
# to assure myself that our original masters had been
# preserved, only to find no record of them!

Worse still, it appears that the records of the recovery project have also gone missing from both the UK National Data Archive and the UK National Archives. :-(

Depressing really..

cheers,
Chris

- --
 Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computational Initiative
 Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzIznEACgkQO2KABBYQAh/P7ACeMnpxes7OIIiy3IVVGO/wynyd
9MAAnjJO1CE5C4aXc/9hPDcZUOIdYyQz
=i9bp
-----END PGP SIGNATURE-----

From rgb at phy.duke.edu Wed Oct 27 19:24:36 2010
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Wed, 27 Oct 2010 22:24:36 -0400 (EDT)
Subject: [Beowulf] Interesting
In-Reply-To: <4CC8CE71.2030703@unimelb.edu.au>
References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> <4CC8CE71.2030703@unimelb.edu.au>
Message-ID: 

On Thu, 28 Oct 2010, Christopher Samuel wrote:

> http://www.guardian.co.uk/uk/2002/mar/03/research.elearning

I've already lived through whole generations of this process from as far back as 1979. I have chunks of code (mostly very old fortran) that I carefully preserved from punched cards, through disk packs on IBMs, through disk packs on a Harris 800 minicomputer, and onto floppy drives on an IBM PC, onto bigger/newer floppy drives and hideously expensive hard disks, onto a Sun 386i, a Sparcstation 2, a Linux PC with dual Pentium Pros (and simultaneously propagating through several generations of Sun servers in the department) until, eventually, they made their way to the laptop where I'm typing this, backed up in the department. I've gone back to the well to look at the algorithms for generating e.g. 3j, 6j, 9j coefficients in angular momentum coupling theory, ported them to C, and written whole new programs using the crumbs of that work.

I've also failed. I had for a long time a QIC for the IBM 5100 with Mastermind written in APL on it. I'd kill to be able to get to the code for a variety of reasons, not the least of which being that because I mentioned it once on this very list I was for a while accused of being this dude who claimed to be a time traveller on uunet. The other is that it would be fun to port the result to C under subversion, given that the version I wrote in Fortran and the version I wrote in Basica have also fled. I have a 9 track tape reel with LCAO code from the dark ages (maybe 1978?) that I don't think I will ever be able to play even if the tape hasn't degraded over 32 years.

I've lost stories I've written on paper, and a really cool poem that I wrote with a pen popular in the 70's that turned out to have ink that faded to clear over 20 years, with or without the help of ambient UV. I have spiral notebooks from graduate school with barely visible orange lines that might or might not once have been figures and words and equations.
I've tried to rescue old WordStar and old Word documents -- the latter by going in and chopping the ascii out of the corrupt binary (early Word and many other early WPs used the 8th bit as a kind of markup delimiter), with some success, but it is like breaking a code or solving a puzzle. And yes, I'm very, very concerned about things like ODF formats that do the RIGHT thing -- save everything as straight up ASCII inside pure XML markup that one can always write filters to decode even if XML itself and the WP that created it is long gone -- and then COMPRESS the document, producing a result that might as well be encrypted (compression IS a kind of encryption) unless one knows the algorithm used to do the compressing. I've salvaged gradesheets the hard way years after the open source tool I used to produce them has disappeared only because they DIDN'T do this -- they basically stuck the data in a sort of custom ascii human readable markup where one can "see" how to get it back out again without anything but a straight up text editor.

The "Microsoft Word" problem at this point is huge. There are an enormous number of documents that were written with old versions of Word (and Works) and are now all but impossible to retrieve (if only their owners realized it). Important stuff. Oops. One reason Europe has largely endorsed XML-only document encodings, one reason MS "suddenly" made Word and Office XML-compliant (and hence, to their chagrin, impossible to jerk around the way they'd jerked Office around for a decade or so previously).

For me, I now write every single thing I write using jove (an absolutely trivial, wonderful, text only editor), and with the exception of email, if it is important enough to preserve it is in a version control system on a solidly backed up server, with multiple cloned images of the repository on my personal machines in different places. Nuclear war, I lose it. A really bad solar flare or magnetic storm or terrorist EMP attack on campus, I MIGHT lose it (although our server room is deep in the bowels of the physics building, the building is full of steel, the basement is like a faraday cage as far as e.g. cell phones etc are concerned, and some of the disks or backups might survive).

And with all of that, if I died in the next ten minutes, what of all of the gigabytes of text I've generated over the last decade or three would survive a decade more? Maybe a few tens of megabytes. Maybe. Probably not, though. Who is going to be able to keep them, move them along through format changes, update the media they are stored on? Who will care? In a few centuries, even my actual publications will be most unlikely to survive.

Thus cries the humble cell contemplating its own inevitable death as the vast superorganismal being of which it is a very tiny part lumbers on to ITS inevitable destiny, no different on the macroscale from the cell's fate on the microscale. In time all of the marvelous structure and information that is us, our thoughts, our civilization, our knowledge, will succumb to entropy, to processes that are always more likely to take one from a state of relative organization to a much more probable state of disorganization. Sad indeed, but there it is.

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
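A minimal sketch of the brute-force rescue described above -- clear the high "markup" bit and keep only runs that look like prose. It assumes nothing about the real WordStar or Word on-disk formats (a proper converter would have to parse them), and the file name in the usage comment is made up:

# extract_legacy_text.py -- crude text rescue for old 8-bit-markup
# word-processor files.  Not a converter: it clears bit 7 of every
# byte, then keeps only reasonably long runs of printable text.
import re
import sys

def rescue(path, min_run=4):
    with open(path, "rb") as fh:
        raw = fh.read()
    # Early WPs (WordStar among them) set the high bit on some
    # characters as a flag; clearing bit 7 recovers plain ASCII.
    stripped = bytes(b & 0x7F for b in raw)
    text = stripped.decode("ascii", errors="replace")
    # Keep printable runs of at least min_run characters; shorter
    # fragments are assumed to be leftover binary structure.
    runs = re.findall(r"[ -~\t\r\n]{%d,}" % min_run, text)
    return "\n".join(runs)

if __name__ == "__main__":
    # Usage (hypothetical file): python extract_legacy_text.py old_letter.ws
    sys.stdout.write(rescue(sys.argv[1]))

The zipped-XML worry has a similar escape hatch: an OpenDocument file is just a zip archive, so something like "unzip -p thesis.odt content.xml" (again, a hypothetical file name) pulls the markup back out for as long as zip itself is still understood.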
From deadline at eadline.org Thu Oct 28 05:00:31 2010
From: deadline at eadline.org (Douglas Eadline)
Date: Thu, 28 Oct 2010 08:00:31 -0400 (EDT)
Subject: Re: [Beowulf] Interesting
In-Reply-To: <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry>
References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry>
Message-ID: <57981.192.168.93.213.1288267231.squirrel@mail.eadline.org>

Just reboot the matrix, probably turn out the same in any case

--
Doug

> The recent digests that I am getting are quite interesting (bad google)
> and I have a question.
>
> What I'd like to know is, is it possible to have a our history captured in
> its entirety so that none of the future generations have to run around
> (like Hari Seldon) because information from waaaay back is corrupt and not
> take care of?
>
> Do you guys know of any existing sources that you can point me to?
>
> Is this under distributed systems or under compression algorithms?
>
> Any other two cents on this is welcome!
> Sent from my BlackBerry

From ellis at runnersroll.com Thu Oct 28 07:09:25 2010
From: ellis at runnersroll.com (Ellis H. Wilson III)
Date: Thu, 28 Oct 2010 10:09:25 -0400
Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf
In-Reply-To: 
References: <4CC11EA8.8030602@gmail.com> <68A57CCFD4005646957BD2D18E60667B12154605@milexchmb1.mil.tagmclarengroup.com> <4CC60BFA.4080502@runnersroll.com> <68A57CCFD4005646957BD2D18E60667B12154E23@milexchmb1.mil.tagmclarengroup.com> <4CC6FD28.1050303@runnersroll.com>
Message-ID: <4CC98415.3070006@runnersroll.com>

On 10/27/10 12:32, Lux, Jim (337C) wrote:
> I don't know about this model.
> This is like developing software on prototype hardware. The hardware guys and gals keep wanting to change the hardware, and the software developers complain that their software keeps breaking, or that the hardware is buggy (and it is).

I wasn't suggesting the CS guys affect the correctness of the stack or kernel, my comment was purely performance-specific:

"CS guys...can once in a while trace workloads, test new load balancing mechanisms, try different kernel settings for performance, etc."

Obviously if you are altering things that endanger the correctness of the scientific workload people will be upset. If your tracer fails, your load balancer degrades performance slightly, or your new cache replacement policy sucks then the program might run slow but it should complete correctly.

> But I don't think the CS guys would drool over the possibility of administering a cluster. The CS guys get to be sysadmin/maintenance types...not very fun for them, and not the kind of work that would work for their dissertation.
The difficulty I have getting access to alter and research root-level stuff on clusters is so great that administration by me or my adviser would allow my dissertation to move forward much more rapidly. Instead systems researchers try and simulate large systems, which as you can imagine often leads to inaccurate or downright incorrect results and consequent publications. Frankly, I'd be the rock-star of the CS department if I had administrative control of a reasonably-sized cluster. Everyone (in CS) would be coming to me to get their research done. So it requires a little administration?? With all my spare cycles not having to write simulation codes for an entire I/O stack it would be totally worth it. ellis From Bill.Rankin at sas.com Thu Oct 28 07:24:48 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Thu, 28 Oct 2010 14:24:48 +0000 Subject: [Beowulf] Interesting In-Reply-To: References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949AAA7@MERCMBX03R.na.SAS.com> > Go to the moon. Dig a really big hole > at one of the poles (to avoid thermal extremes) and build a bunker out > of fused glass three meters thick and one kilometer underground. Ahh, I see that RGB has now revealed the plans for his villainous moon-base lair from which he will launch his nefarious plot to take over the worlds HPC resources so that he may run more Monte-Carlo simulations. It is all starting to make sense now. -b From james.p.lux at jpl.nasa.gov Thu Oct 28 07:32:55 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 28 Oct 2010 07:32:55 -0700 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC98415.3070006@runnersroll.com> Message-ID: On 10/28/10 7:09 AM, "Ellis H. Wilson III" wrote: > On 10/27/10 12:32, Lux, Jim (337C) wrote: >> I don't know about this model. >> This is like developing software on prototype hardware. The hardware guys >> and gals keep wanting to change the hardware, and the software developers >> complain that their software keeps breaking, or that the hardware is buggy >> (and it is). > > I wasn't suggesting the CS guys affect the correctness of the stack or > kernel, my comment was purely performance-specific: > > "CS guys...can once in a while trace workloads, test new load balancing > mechanisms, try different kernel settings for performance, etc." > > Obviously if you are altering things that endanger the correctness of > the scientific workload people will be upset. If your tracer fails, > your load balancer degrades performance slightly, or your new cache > replacement policy sucks then the program might run slow but it should > complete correctly. And I agree with you here, but the problem is what I next commented on: You're asking the CS department (full of researchers wanting to do novel research for their dissertation or to move them towards tenure) to be sysadmins. Being an SA is fun, once. > >> But I don't think the CS guys would drool over the possibility of >> administering a cluster. The CS guys get to be sysadmin/maintenance >> types...not very fun for them, and not the kind of work that would work for >> their dissertation. > > The difficulty I have getting access to alter and research root-level > stuff on clusters is so great that administration by me or my adviser > would allow my dissertation to move forward much more rapidly. 
Instead > systems researchers try and simulate large systems, which as you can > imagine often leads to inaccurate or downright incorrect results and > consequent publications. > > Frankly, I'd be the rock-star of the CS department if I had > administrative control of a reasonably-sized cluster. Everyone (in CS) > would be coming to me to get their research done. So it requires a > little administration?? With all my spare cycles not having to write > simulation codes for an entire I/O stack it would be totally worth it. > Yes, but that would mean more like "sharing a cluster" as opposed to CS providing support and SA services. And "sharing a cluster" means that the cluster architecture has to be appropriate for both people, which is challenging as has been addressed here recently. Then there's the "if you're getting benefit, can you kick in part of the cash needed" which gets into a whole other area of complexity. It works like this.. You (A) need 1000 units of resources but can only afford 500 by yourself. However, you don't need your 1000 units all the time. So you find someone else who has similar needs who can share, call them B.. They've only got 300 resource units of cash, so you both go find someone else who needs similar stuff (call them C), and they need 1200 units, but only 20% of the time, but they've got enough cash, along with A, to do the deal. You don't want to leave B out in the cold so you make the cluster a little bit bigger (say, 1500), and use A, B, and C money. You're moving along, got the procurements in place, etc. Now C gets the unhappy news that their funding stream has been "rephased" so you need to find a fourth party "D" to pick up the slack. Meanwhile, B is unhappy about C coming and going, because B was excited about getting a bigger cluster and revised their research plans to take advantage of it, to their sponsor's delight and pleasure. Now that they won't get the bigger cluster, they have to go back to the sponsor and descope from their recent upscope. D saves the day, for a while, but because their research needs a different interconnect than A, B, and C were going to use, we have to change the cluster architecture, just a bit. Meanwhile A, who started the whole thing, gets real tired of spending all their time negotiating cluster usage agreements and looking for funding, and they throw up their hands and bails out of the project, buying their own cluster with half as many computers that are 3 times as fast. B,C, and D are now gently twisting in the wind, trying to figure out what to do next, because the deadline for the paper and grant applications is coming up soon. The institution steps in and says, cease this wasteful squabbling, henceforth all computing resources will be managed by the institution: "to each according to their needs", and we'll come up with a fair way to do the "from each according to their ability". Just submit your computing resource request to the steering committee and we'll decide what's best for the institution overall. Yes.. Local control of a "personal supercomputer" is a very, very nice thing. And so it goes... 
> ellis > From Bill.Rankin at sas.com Thu Oct 28 07:36:52 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Thu, 28 Oct 2010 14:36:52 +0000 Subject: [Beowulf] Looking for references for parallelization and optimization In-Reply-To: References: <20101020005647.44e6e2db@vivalunalitshi.luna.local> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949AAD7@MERCMBX03R.na.SAS.com> If you are looking for more theoretical approaches, there is always John Reif's book: John Reif (ed), "Synthesis of Parallel Algorithms", published by Morgan Kaufmann, Spring, 1993. http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=562546 It is a weighty tome and perhaps one of the more theoretical books on parallel codes that I have read. -bill From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Dmitri Chubarov Sent: Tuesday, October 19, 2010 11:51 PM To: Micha Cc: Beowulf List Subject: Re: [Beowulf] Looking for references for parallelization and optimization Dear Micha, we are working on a course on the subject for Novosibirsk University. There are several widely used books that we use as reference material for the optimization part of the course. In particular, * Stefan Goedecker, Adolfy Hoisie, "Performance optimization of numerically intensive codes", SIAM, 2000. * Kevin Wadleigh, Isom Crawford, "Software optimization for High Performance Computing", HP Professional Books, 2000 We would like to start with more theoretical approaches, like an introduction to dependency graph analysis and asymptotic analysis of algorithms, and then proceed with specific optimization techniques like the ones described in the above books. Please compile a list from the responses you will receive from the Beowulf community. I would definitely find such a list very helpful. Best regards, Dima From prentice at ias.edu Thu Oct 28 07:36:33 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Thu, 28 Oct 2010 10:36:33 -0400 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. Message-ID: <4CC98A71.5050701@ias.edu> http://www.nytimes.com/2010/10/28/technology/28compute.html -- Prentice Bisbal Linux Software Support Specialist/System Administrator School of Natural Sciences Institute for Advanced Study Princeton, NJ From ellis at runnersroll.com Thu Oct 28 07:36:54 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Thu, 28 Oct 2010 10:36:54 -0400 Subject: [Beowulf] Interesting In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949AAA7@MERCMBX03R.na.SAS.com> References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> <76097BB0C025054786EFAB631C4A2E3C0949AAA7@MERCMBX03R.na.SAS.com> Message-ID: <4CC98A86.7050203@runnersroll.com> On 10/28/10 10:24, Bill Rankin wrote: >> Go to the moon. Dig a really big hole >> at one of the poles (to avoid thermal extremes) and build a bunker out >> of fused glass three meters thick and one kilometer underground. > > Ahh, I see that RGB has now revealed the plans for his villainous moon-base lair from which he will launch his nefarious plot to take over the worlds HPC resources so that he may run more Monte-Carlo simulations. Nay, this is just the optimal location for the final resting place of the epic tomes of the RGB-Bot. Emphasize the "really big hole" part, and they might all fit ;).
Perhaps if there is some left-over space we can fit the remaining works of humanity. ellis From Bill.Rankin at sas.com Thu Oct 28 08:26:14 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Thu, 28 Oct 2010 15:26:14 +0000 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CC98A71.5050701@ias.edu> References: <4CC98A71.5050701@ias.edu> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> I was just going to post the same thing, but with the HPCWire link instead. http://www.hpcwire.com/blogs/New-China-GPGPU-Super-Outruns-Jaguar-105987389.html A few comments: A 42% increase (2.5 PF v. 1.75 PF for Jaguar) is not a "wide margin". If I did my math right, that represents about 7 months worth of gain on the growth curve for the Top500 list (peak performance growth is ~2x every 13 months). Linpack is very forgiving of low-bandwidth networks (or PCI-x busses in this case). This strikes me as a machine that will most likely never see a single application that runs on the full system. There is nothing wrong with that per se, but it must be taken into consideration when comparing it to machines that have run real production applications across the entire processor set (i.e. Jaguar). What this machine does do is validate to some extent the continued use and development of GPUs in an HPC/cluster setting. I will admit that I have been very skeptical in the past as to whether GPU-based computing had any long-term traction. In my defense I have seen many past examples of specialized computing approaches that did not survive past the first generation and eventually lost out to the general purpose microprocessor. I will now admit that GPU technology may have a bigger long-term impact than I had originally imagined. -b > http://www.nytimes.com/2010/10/28/technology/28compute.html > > -- > Prentice Bisbal > Linux Software Support Specialist/System Administrator > School of Natural Sciences > Institute for Advanced Study > Princeton, NJ > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin > Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From gus at ldeo.columbia.edu Thu Oct 28 08:57:51 2010 From: gus at ldeo.columbia.edu (Gus Correa) Date: Thu, 28 Oct 2010 11:57:51 -0400 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> Message-ID: <4CC99D7F.6030309@ldeo.columbia.edu> Hi Bill, list Here it is from The Register: http://www.theregister.co.uk/2010/10/28/china_tianhe_1a_supercomputer/ Which real production applications ran on Jaguar across all processors? I heard of large high-resolution climate models that ran on Jaguar and on Kraken, using about 6000 cores, but not the full set available. There certainly are other large applications, though. Thanks, Gus Correa Bill Rankin wrote: > I was just going to post the same thing, but with the HPCWire link instead. > > http://www.hpcwire.com/blogs/New-China-GPGPU-Super-Outruns-Jaguar-105987389.html > > A few comments: > > A 42% increase (2.5TF v. 1.75TF for jaguar) is not a "wide margin". If I did my math right, that represents about 7 months worth of gain on the growth curve for the Top500 list (peak performance growth is ~2x every 13 months).
> > Linpack is very forgiving of low-bandwidth networks (or PCI-x busses in this case). > > This strikes me as a machine that will most likely never see a single application that runs on the full system. There is nothing wrong with that per-se, but it must be taken into consideration when comparing it to machines that have run real production applications across the entire processor set (ie. Jaguar). > > What this machine does do is validate to some extent the continued use and development of GPUs in an HPC/cluster setting. I will admit that I have been very skeptic in the past as to whether GPU-based computing had any long-term traction. In my defense I have seen many past examples of specialized computing approaches that did not survive past the first generation and eventually lost out to the general purpose microprocessor. I will now admit that GPU technology may have a bigger long-term impact that I had originally imagined. > > -b > >> http://www.nytimes.com/2010/10/28/technology/28compute.html >> >> -- >> Prentice Bisbal >> Linux Software Support Specialist/System Administrator >> School of Natural Sciences >> Institute for Advanced Study >> Princeton, NJ >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >> Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From brockp at umich.edu Thu Oct 28 09:48:18 2010 From: brockp at umich.edu (Brock Palen) Date: Thu, 28 Oct 2010 12:48:18 -0400 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CC99D7F.6030309@ldeo.columbia.edu> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <4CC99D7F.6030309@ldeo.columbia.edu> Message-ID: <5512DFDE-E0A8-4F56-A5D4-C16B22D27CD2@umich.edu> I have an account on Kraken (not as big as jaguar) but my jobs have been blocked by jobs using 99,000+ cores several times. What this job does I don't know, Brock Palen www.umich.edu/~brockp Center for Advanced Computing brockp at umich.edu (734)936-1985 On Oct 28, 2010, at 11:57 AM, Gus Correa wrote: > Hi Bill, list > > Here is it from The Register: > http://www.theregister.co.uk/2010/10/28/china_tianhe_1a_supercomputer/ > > Which real real production applications ran on Jaguar across all processors? > I heard of large high-resolution climate models that ran on Jaguar > and on Kraken, using about 6000 cores, but not the full set available. > There certainly are other large applications, though. > > Thanks, > > Gus Correa > > Bill Rankin wrote: >> I was just going to post the same thing, but with the HPCWire link instead. >> http://www.hpcwire.com/blogs/New-China-GPGPU-Super-Outruns-Jaguar-105987389.html >> A few comments: >> A 42% increase (2.5TF v. 1.75TF for jaguar) is not a "wide margin". > If I did my math right, that represents about 7 months > worth of gain on the growth curve for the Top500 list > (peak performance growth is ~2x every 13 months). >> Linpack is very forgiving of low-bandwidth networks > (or PCI-x busses in this case). >> This strikes me as a machine that will most likely > never see a single application that runs on the full system. 
> There is nothing wrong with that per-se, but it must be taken > into consideration when comparing it to machines that have run > real production applications across the entire processor set (ie. Jaguar). >> What this machine does do is validate to some extent the > continued use and development of GPUs in an HPC/cluster setting. > I will admit that I have been very skeptic in the past as to whether > GPU-based computing had any long-term traction. > In my defense I have seen many past examples of > specialized computing approaches that did not > survive past the first generation and eventually > lost out to the general purpose microprocessor. > > I will now admit that GPU technology may have a bigger > long-term impact that I had originally imagined. >> -b >>> http://www.nytimes.com/2010/10/28/technology/28compute.html >>> >>> -- >>> Prentice Bisbal >>> Linux Software Support Specialist/System Administrator >>> School of Natural Sciences >>> Institute for Advanced Study >>> Princeton, NJ >>> _______________________________________________ >>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin >>> Computing >>> To change your subscription (digest mode or unsubscribe) visit >>> http://www.beowulf.org/mailman/listinfo/beowulf >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > > From i.n.kozin at googlemail.com Thu Oct 28 09:51:26 2010 From: i.n.kozin at googlemail.com (Igor Kozin) Date: Thu, 28 Oct 2010 17:51:26 +0100 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CC99D7F.6030309@ldeo.columbia.edu> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <4CC99D7F.6030309@ldeo.columbia.edu> Message-ID: > http://www.hpcwire.com/blogs/New-China-GPGPU-Super-Outruns-Jaguar-105987389.html I have been wondering what use, if any, Tianhe-1 made of the Radeon HD 4870 X2. I had a card like that and it died three times during its one-year warranty. Needless to add, it perished shortly after the warranty ran out. Perhaps we'll see 2560 of those (or whatever was left of them) on eBay shortly. Incidentally, does anyone know of existing _production_ GPU clusters used not for development but to run jobs by ordinary users? TSUBAME 2.0 is supposed to be operational shortly. All the other places that I know of are for development, e.g. Keeneland. Igor From rgb at phy.duke.edu Thu Oct 28 10:54:46 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Thu, 28 Oct 2010 13:54:46 -0400 (EDT) Subject: [Beowulf] Interesting In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949AAA7@MERCMBX03R.na.SAS.com> References: <201010271636.o9RGa3Bq013346@bluewest.scyld.com> <276820005-1288199010-cardhu_decombobulator_blackberry.rim.net-1964510620-@bda263.bisx.produk.on.blackberry> <76097BB0C025054786EFAB631C4A2E3C0949AAA7@MERCMBX03R.na.SAS.com> Message-ID: On Thu, 28 Oct 2010, Bill Rankin wrote: >> Go to the moon.
Dig a really big hole >> at one of the poles (to avoid thermal extremes) and build a bunker out >> of fused glass three meters thick and one kilometer underground. > > Ahh, I see that RGB has now revealed the plans for his villainous moon-base lair from which he will launch his nefarious plot to take over the worlds HPC resources so that he may run more Monte-Carlo simulations. > > It is all starting to make sense now. Actually, I want to fill it with air at a few atmospheres and strap on some wings and fly (Heinlein, anyone)? rgb > > -b > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From mathog at caltech.edu Thu Oct 28 11:25:02 2010 From: mathog at caltech.edu (David Mathog) Date: Thu, 28 Oct 2010 11:25:02 -0700 Subject: [Beowulf] Re: Interesting Message-ID: "Robert G. Brown" wrote: > I've lost stories I've > written on paper, and a really cool poem that I wrote with a pen popular > in the 70's that turned out to have ink that faded to clear over 20 > year, with or without the help of ambient UV. I have spiral notebooks > from graduate school with barely visible orange lines that might or > might not once have been figures and words and equations. It isn't necessary to wait that long for documents to fade away. I have seen cash register receipts from both Home Depot and Harbor Freight fade to nothingness in less than a year. Which wouldn't be a problem except one must present these same receipts in order to return any "lifetime guarantee" tools that did not, in fact, last a lifetime. Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From hahn at mcmaster.ca Thu Oct 28 14:00:09 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Thu, 28 Oct 2010 17:00:09 -0400 (EDT) Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CC98A71.5050701@ias.edu> References: <4CC98A71.5050701@ias.edu> Message-ID: > http://www.nytimes.com/2010/10/28/technology/28compute.html "wrests", bah. you get exactly the rank on top500 that you pay for. this url: http://www.eetimes.com/electronics-news/4210223/Interconnect-pushed-China-super-to--1 mentions 160 Gbps as the speed of the interconnect, but also says "twice QDR". afaik QDR is 40 GB before encoding, so Galaxy would be 80, not 160. anyone have further details on what the "galaxy" interconnect is? From james.p.lux at jpl.nasa.gov Thu Oct 28 16:31:51 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 28 Oct 2010 16:31:51 -0700 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: References: <4CC98A71.5050701@ias.edu> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn > Sent: Thursday, October 28, 2010 2:00 PM > To: Beowulf Mailing List > Subject: Re: [Beowulf] China Wrests Supercomputer Title From U.S. > > > http://www.nytimes.com/2010/10/28/technology/28compute.html > > "wrests", bah. you get exactly the rank on top500 that you pay for. > In a way, that's kind of a cool verification that Beowulfery works.. it's scalable and configurable, so you *can* buy exactly what you need. 
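Two of the back-of-the-envelope figures in these posts are easy to check. The short Python sketch below re-derives Bill's "about 7 months of Top500 growth" estimate from the quoted 2.5 vs. 1.75 PFLOPS figures, and works through Mark's QDR question under the usual assumption that an InfiniBand QDR 4X link signals at 4 x 10 Gbit/s with 8b/10b encoding; the inputs are simply the numbers quoted in the thread, not independent measurements.

    import math

    # Bill's estimate: how many months of ~2x-every-13-months Top500 growth
    # does a 2.5 vs. 1.75 PFLOPS Linpack lead correspond to?
    ratio = 2.5 / 1.75
    months = 13 * math.log(ratio, 2)
    print("lead of %.2fx ~ %.1f months of list growth" % (ratio, months))

    # Mark's question: QDR 4X is 4 lanes x 10 Gbit/s = 40 Gbit/s of signalling,
    # and 8b/10b encoding leaves 8/10 of that as payload bandwidth.
    signal = 4 * 10                  # Gbit/s on the wire
    data = signal * 8 // 10          # Gbit/s after 8b/10b
    print("QDR 4X: %d Gbit/s signalling, %d Gbit/s data" % (signal, data))
    print("twice QDR: %d Gbit/s signalling, %d Gbit/s data" % (2 * signal, 2 * data))

On those assumptions the 1.43x lead works out to roughly 6.7 months of list growth, and "twice QDR" lands at 80 Gbit/s of signalling (64 Gbit/s of payload), not 160.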
From lindahl at pbm.com Thu Oct 28 16:43:57 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Thu, 28 Oct 2010 16:43:57 -0700 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <4CC99D7F.6030309@ldeo.columbia.edu> Message-ID: <20101028234357.GL597@bx9.net> On Thu, Oct 28, 2010 at 05:51:26PM +0100, Igor Kozin wrote: > Incidentally, does anyone know existing _production_ GPU clusters used not > for development but to run jobs by ordinary users? That depends on whether you call Cell a GPU. And whether you think RoadRunner has reached production, and if its users are ordinary or not. -- greg From alscheinine at tuffmail.us Thu Oct 28 20:55:32 2010 From: alscheinine at tuffmail.us (Alan Louis Scheinine) Date: Thu, 28 Oct 2010 22:55:32 -0500 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CC98A71.5050701@ias.edu> References: <4CC98A71.5050701@ias.edu> Message-ID: <4CCA45B4.90700@tuffmail.us> With regard to networks, a near-future fork in the road between Beowulf clusters versus supercomputers may be the intelligence concerning global memory added to the network interface chip for upcoming models of supercomputers. An example is the SGI Ultraviolet series, but also, other supercomputer vendors will have something not too dissimilar ... without being too specific. Moreover, execution paradigms for petascale and exascale computing need that kind of intelligent network interface. My point is that the key question about the interconnect may not be the bandwidth, but rather, how much support it provides for global addressing and process migration. These declarative statements should be seen as questions. I welcome comments from better informed members of the mailing list concerning intelligent interconnection interfaces for modestly-priced Beowulf clusters. Regards, Alan -- Alan Scheinine 200 Georgann Dr., Apt. E6 Vicksburg, MS 39180 Email: alscheinine at tuffmail.us Mobile phone: 225 288 4176 http://www.flickr.com/photos/ascheinine From lindahl at pbm.com Thu Oct 28 23:46:28 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Thu, 28 Oct 2010 23:46:28 -0700 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CCA45B4.90700@tuffmail.us> References: <4CC98A71.5050701@ias.edu> <4CCA45B4.90700@tuffmail.us> Message-ID: <20101029064626.GE26764@bx9.net> On Thu, Oct 28, 2010 at 10:55:32PM -0500, Alan Louis Scheinine wrote: > With regard to networks, a near-future fork in the road between Beowulf > clusters versus supercomputers may be the intelligence concerning global > memory added to the network interface chip for upcoming models of supercomputers. Yawn. First we had the attack of the killer micros. Now we have the attack of the commodity interconnects. Do these special interconnects with fancy global memory justify their cost vs. InfiniBand + HPC tweaks? Do you really think you're going to program these machines with something other than MPI? Make your bets, wait, see if you win. -- greg ps. I am pleased to see that the new EXTOLL interconnect has a non-RDMA short message system. pps. Blekko launches November 1. Check out the /hpc slashtag. From Bill.Rankin at sas.com Fri Oct 29 06:53:41 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Fri, 29 Oct 2010 13:53:41 +0000 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. 
In-Reply-To: <20101028234357.GL597@bx9.net> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <4CC99D7F.6030309@ldeo.columbia.edu> <20101028234357.GL597@bx9.net> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949BD3D@MERCMBX03R.na.SAS.com> > [...]and if its users are ordinary or not. > > -- greg In my experience very few people in this business would ever be called "ordinary". :-) -b From ellis at runnersroll.com Fri Oct 29 06:50:40 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Fri, 29 Oct 2010 09:50:40 -0400 Subject: [Beowulf] Re: Interesting In-Reply-To: References: Message-ID: <4CCAD130.2050004@runnersroll.com> Interestingly, I found "Keeping Bits Safe: How Hard Can It Be?" by David Rosenthal in the November Communications of the ACM just released. It does discuss data retention at the centuries level, but unfortunately does not consider the moon-based strategy proposed by Rob. Nonetheless is a good read for any out there who are now interested in this area. However, I do wish flash (or any technology besides normal 3.5in hard drives) was considered. I would expect dormant flash-based technology to last quite a while at controlled temperatures. ellis From Shainer at mellanox.com Fri Oct 29 07:08:40 2010 From: Shainer at mellanox.com (Gilad Shainer) Date: Fri, 29 Oct 2010 07:08:40 -0700 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. References: <4CC98A71.5050701@ias.edu> Message-ID: <9FA59C95FFCBB34EA5E42C1A8573784F030B350C@mtiexch01.mti.com> Its their own new proprietary interconnect, runs at 80Gb/s (similar speed to IB 8x), switches are low port count. What the article missed is that they are also using their own CPUs and host chipset (along with Intel) in that system as well. Gilad -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Mark Hahn Sent: Thursday, October 28, 2010 2:00 PM To: Beowulf Mailing List Subject: Re: [Beowulf] China Wrests Supercomputer Title From U.S. > http://www.nytimes.com/2010/10/28/technology/28compute.html "wrests", bah. you get exactly the rank on top500 that you pay for. this url: http://www.eetimes.com/electronics-news/4210223/Interconnect-pushed-Chin a-super-to--1 mentions 160 Gbps as the speed of the interconnect, but also says "twice QDR". afaik QDR is 40 GB before encoding, so Galaxy would be 80, not 160. anyone have further details on what the "galaxy" interconnect is? _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From Bill.Rankin at sas.com Fri Oct 29 07:22:04 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Fri, 29 Oct 2010 14:22:04 +0000 Subject: [Beowulf] Re: Interesting In-Reply-To: References: Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949BD60@MERCMBX03R.na.SAS.com> > "Robert G. Brown" wrote: > > > I've lost stories I've > > written on paper, and a really cool poem that I wrote with a pen popular > > in the 70's that turned out to have ink that faded to clear over 20 > > year, with or without the help of ambient UV. I have spiral notebooks > > from graduate school with barely visible orange lines that might or > > might not once have been figures and words and equations. Something to be said for the simple pencil. 
As long as you don't smudge it too much, carbon tends to hang onto its molecular structure pretty well over time. :-) I guess that we don't think too much these days about the archival properties of paper and pen, simply because it's seemingly so much more stable than the various computer formats. I wonder how resistant to aging modern printer/copier paper is versus its older equivalent? I know for example that newspaper quality went way down in the latter half of the last century to where copies of pre-WWII vintage editions survived much better than stuff out of the 60s and 70s, which deteriorated very quickly. I tend to make lots of hand-written notes and have several fountain pens I use. There are several brands of archival-class inks available and much debate over which ones are "best". Because of their nature they tend to be difficult to use and a mess to clean up, which is not something that makes for a good general-use consumer product. -bill From Bill.Rankin at sas.com Fri Oct 29 07:40:24 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Fri, 29 Oct 2010 14:40:24 +0000 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <404CAE66-4D29-412A-91B8-3BC9AD221231@presciencetrust.org> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <404CAE66-4D29-412A-91B8-3BC9AD221231@presciencetrust.org> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949BD89@MERCMBX03R.na.SAS.com> Douglas: > > [...] > > What this machine does do is validate to some extent the continued > use and development of GPUs in an HPC/cluster setting. > > [...] > > Nvidia claims Tianhe-1A's 4.04 megawatts of CUDA GPUs and Xeon CPUs is > three times more power efficient than CPUs alone. The Nvidia press > release is at http://bit.ly/d9VNtY Numbers game. Lies, damned lies, and benchmarks. :-) I imagine if they had quartered the number of CPU cores and doubled the number of GPUs per node, they could have gotten even larger HPL numbers without significantly increasing their power footprint. But they didn't. Why? Perhaps because even though it would have been "more powerful" on paper, it probably would not run real applications any faster. Making "revolutionary" claims like these that are based on extrapolation of a single (and very questionable) data point like HPL illustrates that someone has not yet put in the time and/or effort to properly analyze the system and understand what is being presented. But it makes for good news copy. -b From john.hearns at mclaren.com Fri Oct 29 07:40:17 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 29 Oct 2010 15:40:17 +0100 Subject: [Beowulf] Re: Interesting In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949BD60@MERCMBX03R.na.SAS.com> References: <76097BB0C025054786EFAB631C4A2E3C0949BD60@MERCMBX03R.na.SAS.com> Message-ID: <68A57CCFD4005646957BD2D18E60667B121ACFC2@milexchmb1.mil.tagmclarengroup.com> > -----Original Message----- > > I guess that we don't think too much these days about the archival > properties of paper and pen, simply because it's seemingly so much more > stable than the various computer formats. I wonder how resistant to > aging modern printer/copier paper is versus its older equivalent? I > know for example that newspaper quality went way down in the latter > half of the last century to where copies of pre-WWII vintage editions > survived much better than stuff out of the 60s and 70s, which > deteriorated very quickly.
EE, bah gum: http://www.yorkshireeveningpost.co.uk/news/Leeds-British-Library-unveils -new.5882827.jp The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From james.p.lux at jpl.nasa.gov Fri Oct 29 07:49:16 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 29 Oct 2010 07:49:16 -0700 Subject: [Beowulf] Re: Interesting In-Reply-To: <4CCAD130.2050004@runnersroll.com> Message-ID: On 10/29/10 6:50 AM, "Ellis H. Wilson III" wrote: > Interestingly, I found "Keeping Bits Safe: How Hard Can It Be?" by David > Rosenthal in the November Communications of the ACM just released. > > It does discuss data retention at the centuries level, but unfortunately > does not consider the moon-based strategy proposed by Rob. Nonetheless > is a good read for any out there who are now interested in this area. > However, I do wish flash (or any technology besides normal 3.5in hard > drives) was considered. I would expect dormant flash-based technology > to last quite a while at controlled temperatures. > Flash memory probably has a life time of around 10 years (at reasonable temperatures). The data is stored as charge on capacitors, and the capacitors aren't perfect. Errors tend to be transient. That is, you read a page again and it reads ok the second time. (That is, it's not like DRAM, where a bit flips and stays flipped, so word level ECC works quite well) So, if you want your flash to hold forever, you'll have to periodically rewrite it. Say you rewrote every year, you'd get 10,000-100,000 years before you "wore out" the flash. There are other aging effects: diffusion of metal ions, etc. You'd want to keep your flash cold, (but not too cold, or it will break... No liquid nitrogen) I think your best bet is real CDs... That is, the mechanically stamped variety. They're dense, and nothing beats a mechanical change. You can still read Jacquard punch cards from the early 19th century (in fact, I was reading an article recently about there being a dearth of loom programmers.. So when your job at the buggy whip factory finally goes away...) Some sort of photographic technique would also have good archival properties (e.g. Silver clumps). There are lots of photographic negatives 100 years old around with little or no degradation. And it's denser than ink on paper. 100 lines/mm is an easy resolution to get. Or, how about something like the UNICON aka "terabit memory" (TBM) from Illiac IV days. It's a stable polyester base with a thin film of rhodium that was ablated by a laser making 3 micron holes to write the bits. $3.5M to store a terabit in 1975. From cbergstrom at pathscale.com Fri Oct 29 08:13:34 2010 From: cbergstrom at pathscale.com (=?ISO-8859-1?Q?=22C=2E_Bergstr=F6m=22?=) Date: Fri, 29 Oct 2010 22:13:34 +0700 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949BD89@MERCMBX03R.na.SAS.com> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <404CAE66-4D29-412A-91B8-3BC9AD221231@presciencetrust.org> <76097BB0C025054786EFAB631C4A2E3C0949BD89@MERCMBX03R.na.SAS.com> Message-ID: <4CCAE49E.3040205@pathscale.com> Bill Rankin wrote: > Douglas: > > >>> [...] 
>>> What this machine does do is validate to some extent the continued >>> >> use and development of GPUs in an HPC/cluster setting. >> >>> [...] >>> >> Nvidia claims Tianhe-1A's 4.04 megawatts of CUDA GPUs and Xeon CPUs is >> three times more power efficient than CPUs alone. The Nvidia press >> release is at http://bit.ly/d9VNtY >> > > Numbers game. Lies, damned lies, and benchmarks. :-) > > I image if they had quartered the number of CPU cores and doubled the number of GPUs per node, they could have gotten even larger HPL numbers without significantly increasing their power footprint. But they didn't. Why? Perhaps because even though it would have been "more powerful" on paper, it probably would not run real applications any faster. > Define "real" applications, but to give my guess at your question "But they didn't. Why?" One word - cost From trainor at presciencetrust.org Thu Oct 28 11:45:43 2010 From: trainor at presciencetrust.org (Douglas J. Trainor) Date: Thu, 28 Oct 2010 14:45:43 -0400 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> Message-ID: <404CAE66-4D29-412A-91B8-3BC9AD221231@presciencetrust.org> On Oct 28, 2010, at 11:26 AM, Bill Rankin wrote: > [...] > What this machine does do is validate to some extent the continued use and development of GPUs in an HPC/cluster setting. > [...] Nvidia claims Tianhe-1A's 4.04 megawatts of CUDA GPUs and Xeon CPUs is three times more power efficient than CPUs alone. The Nvidia press release is at http://bit.ly/d9VNtY douglas From SMorton at hess.com Thu Oct 28 20:14:18 2010 From: SMorton at hess.com (Morton, Scott) Date: Thu, 28 Oct 2010 22:14:18 -0500 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. Message-ID: <5FB4750C65D3134DAC5003F67800F26D0F71C5@hacssex002.ihess.com> Yes, the petroleum industry is using GPUs in production. At Hess Corporation, we have our main seismic imaging codes running in production on nvidia GPUs. I have several presentations you can find on the internet, if you want details. There are several other companies with some similar success. WesternGeco/Schlumberger gave two presentations at nvidia's recent GPU technology conference on their efforts and successes. Scott Morton Hess Corporation ----- Original Message ----- From: beowulf-bounces at beowulf.org To: beowulf at beowulf.org Sent: Thu Oct 28 18:43:57 2010 Subject: Re: [Beowulf] China Wrests Supercomputer Title From U.S. On Thu, Oct 28, 2010 at 05:51:26PM +0100, Igor Kozin wrote: > Incidentally, does anyone know existing _production_ GPU clusters used not > for development but to run jobs by ordinary users? That depends on whether you call Cell a GPU. And whether you think RoadRunner has reached production, and if its users are ordinary or not. -- greg _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf This e-mail and any attachments are for the sole use of the intended recipient(s) and may contain information that is confidential. If you are not the intended recipient(s) and have received this e-mail in error, please immediately notify the sender by return e-mail and delete this e-mail from your computer. 
Any distribution, disclosure or the taking of any other action by anyone other than the intended recipient(s) is strictly prohibited. From trainor at presciencetrust.org Thu Oct 28 20:56:13 2010 From: trainor at presciencetrust.org (Douglas J. Trainor) Date: Thu, 28 Oct 2010 23:56:13 -0400 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: References: <4CC98A71.5050701@ias.edu> Message-ID: <6889B265-6ABF-44AC-BC7E-2C33C726431E@presciencetrust.org> a Dongarra interview [with a nice photo on Dongarra] only stated, "The Chinese designed their own interconnect. It's not commodity. It's based on chips, based on a router, based on a switch that they produce." http://news.cnet.com/8301-13924_3-20021122-64.html On Oct 28, 2010, at 5:00 PM, Mark Hahn wrote: >> http://www.nytimes.com/2010/10/28/technology/28compute.html > > "wrests", bah. you get exactly the rank on top500 that you pay for. > > this url: > http://www.eetimes.com/electronics-news/4210223/Interconnect-pushed-China-super-to--1 > > mentions 160 Gbps as the speed of the interconnect, but also says "twice QDR". afaik QDR is 40 GB before encoding, so Galaxy would be 80, not 160. > > anyone have further details on what the "galaxy" interconnect is? From pcgrid2011 at gmail.com Fri Oct 29 02:25:49 2010 From: pcgrid2011 at gmail.com (Eric Heien) Date: Fri, 29 Oct 2010 02:25:49 -0700 Subject: [Beowulf] PCGrid 2011 CFP - Deadline Extended Message-ID: <1c3fe7b64a77fa0067d03da1d2d28244@heien.org> Due to requests from some contributors, we've extended the manuscript submission deadline to November 15. The abstract submission deadline is November 1. ###################################################################### CALL FOR PAPERS Fifth Workshop on Desktop Grids and Volunteer Computing Systems (PCGrid 2011) held in conjunction with the IEEE International Parallel & Distributed Processing Symposium (IPDPS) May 16-20, 2011 Abstract submission deadline: November 1, 2010 Manuscript submission deadline: November 15, 2010 Anchorage, Alaska, USA web site: http://pcgrid.imag.fr/ Keynote speaker Prof. Henri Casanova University of Hawaii at Manoa, USA ###################################################################### *********************** CALL FOR PAPERS *********************** OVERVIEW/SCOPE: Desktop grids and volunteer computing systems (DGVCS's) utilize the free resources available in Intranet or Internet environments for supporting large-scale computation and storage. For over a decade, DGVCS's have been one of the largest and most powerful distributed computing systems in the world, offering a high return on investment for applications from a wide range of scientific domains (including computational biology, climate prediction, and high-energy physics). While DGVCS's sustain up to PetaFLOPS of computing power from hundreds of thousands to millions of resources, fully leveraging the platform's computational power is still a major challenge because of the immense scale, high volatility, and extreme heterogeneity of such systems. The purpose of the workshop is to provide a forum for discussing recent advances and identifying open issues for the development of scalable, fault-tolerant, and secure DGVCS's. The workshop seeks to bring desktop grid researchers together from theoretical, system, and application areas to identify plausible approaches for supporting applications with a range of complexity and requirements on desktop environments. 
This year's workshop will have special emphasis on DGCVS's relationship and integration with Clouds. We invite submissions on DGVCS topics including the following: - cloud computing over unreliable enterprise or Internet resources - DGVCS middleware and software infrastructure (including management), with emphasis on virtual machines - incorporation of DGVCS's with Grid infrastructures - DGVCS programming environments and models - modeling, simulation, and emulation of large-scale, volatile environments - resource management and scheduling - resource measurement and characterization - novel DGVCS applications - data management (strategies, protocols, storage) - security on DGVCS's (reputation systems, result verification) - fault-tolerance on shared, volatile resources - peer-to-peer (P2P) algorithms or systems applied to DGVCS's With regard to the last topic, we strongly encourage authors of P2P-related paper submissions to emphasize the applicability to DGVCS's in order to be within the scope of the workshop. The workshop proceedings will be published through the IEEE Computer Society Press as part of the IPDPS CD-ROM. ###################################################################### IMPORTANT DATES Abstract submission deadline: November 1, 2010 Manuscript submission deadline: November 15, 2010 Acceptance Notification: December 28, 2010 Camera-ready paper deadline: February 1, 2011 Workshop: May 20, 2011 ###################################################################### SUBMISSIONS Manuscripts will be evaluated based on their originality, technical strength, quality of presentation, and relevance to the workshop scope. Only manuscripts that have neither appeared nor been submitted previously for publication are allowed. Authors are invited to submit a manuscript of up to 8 pages in IEEE format (10pt font, two-columns, single-spaced). 
The procedure for electronic submissions will be posted at: http://pcgrid.imag.fr/submission.html ##################################################################### ORGANIZATION General Chairs Derrick Kondo, INRIA, France Gilles Fedak, INRIA, France Program Chair Eric Heien, University of California, Davis, USA Program Committee David Abramson, Monash University, Australia David Anderson, University of California at Berkeley, USA Artur Andrzejak, Zuse Institute of Berlin, Germany Filipe Araujo, University of Coimbra, Portugal Henri Bal, Vrije Universiteit, The Netherlands Zoltan Balaton, SZTAKI, Hungary Adam Beberg, Stanford University, USA Francisco Brasileiro, Federal University of Campina Grande, Brazil Massimo Canonico, University of Piemonte Orientale, Italy Henri Casanova, University of Hawaii at Manoa, USA Abhishek Chandra, University of Minnesota, USA Edgar Gabriel, University of Houston, USA Haiwu He, INRIA, France Bahman Javadi, University of Melbourne, Australia Yang-Suk Kee, University of Southern California, USA Arnaud Legrand, CNRS, France Grzegorz Malewicz, University of Alabama, USA Alan Sussman, University of Maryland, USA Michela Taufer, University of Delaware, USA David Toth, Merrimack College, USA Bernard Traversat, Oracle Corporation, USA Carlos Varela, Rensselaer Polytechnic Institute, USA Sebastien Varrette, University of Luxembourg, Luxembourg Jon Weissman, University of Minnesota, USA Zhiyuan Zhan, Microsoft, USA -- If you do not want to receive any more newsletters, http://heien.org/lists/?p=unsubscribe&uid=91cf1cd00d8108810df8ddca7f4ab015 To update your preferences and to unsubscribe visit http://heien.org/lists/?p=preferences&uid=91cf1cd00d8108810df8ddca7f4ab015 Forward a Message to Someone http://heien.org/lists/?p=forward&uid=91cf1cd00d8108810df8ddca7f4ab015&mid=17 -- Powered by PHPlist, www.phplist.com -- From Shai at ScaleMP.com Fri Oct 29 05:34:47 2010 From: Shai at ScaleMP.com (Shai Fultheim (Shai@ScaleMP.com)) Date: Fri, 29 Oct 2010 08:34:47 -0400 Subject: [Beowulf] MPI-IO + nfs - alternatives? In-Reply-To: <1285777453.1665.170.camel@moelwyn> References: <1285777453.1665.170.camel@moelwyn> Message-ID: <9B14D1490DDECA4E974F6B9FC9EBAB310EC7065C79@VMBX108.ihostexchange.net> Robert, I would go for virtualizing the cluster as single system (virtual SMP) and then using local I/O - just as we all run MPI apps on large SMPs years ago. Specifically, vSMP Foundation (www.scalemp.com) provides great scratch performance with local drives (use Linux raid utilities to make RAID0). Best regards, shai --Shai Visit us at SC10, Nov 13-19,? New Orleans, booth 3239 -----Original Message----- From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Robert Horton Sent: Wednesday, September 29, 2010 18:24 To: Beowulf Mailing List Subject: [Beowulf] MPI-IO + nfs - alternatives? Hi, I've been running some benchmarks on a new fileserver which we are intending to use to serve scratch space via nfs. In order to support MPI-IO I need to mount with the "noac" option. Unfortunately this takes the write block performance from around 100 to 20MB/s which is a bit annoying given that most of the workload isn't MPI-IO. 1) Does anyone have any hints for improving the nfs performance under these circumstances? I've tried using jumbo frames, different filesystems, having the log device on an SSD and increasing the nfs block size to 1MB, none of which have any significant effect. 2) Are there any reasonable alternatives to nfs in this situation? 
The main possibilities seem to be: - PVFS or similar with a single IO server. Not sure what performance I should expect from this though, and it's a lot more complex than nfs. - Sharing a block device via iSCSI and using GFS, although this is also going to be somewhat complex and I can't find any evidence that MPI-IO will even work with GFS. Otherwise it looks as though the best bet would be to export two volumes via nfs, only one of which is mounted with noac. Any other suggestions? Rob _______________________________________________ Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From ckpurvis at ua.edu Thu Oct 28 08:51:25 2010 From: ckpurvis at ua.edu (Purvis, Cameron) Date: Thu, 28 Oct 2010 10:51:25 -0500 Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: References: <4CC98415.3070006@runnersroll.com> Message-ID: <63F9C2B1CE3A1B49B0EDAE18F7B072511D2A885651@MAIL1.ua-net.ua.edu> > You're asking the CS department (full of researchers wanting > to do novel research for their dissertation or to move them > towards tenure) to be sysadmins. Being an SA is fun, once. An IT guy here: A challenge at my institution is that these systems are usually built by faculty or by undergrad/grad students. Students eventually leave, and here that often ends up with IT closing the gap on management and support. There's a lot of good to that, from my IT perspective. I'd like to keep system administration 'out of the way' of faculty so they can focus on research. Centrally managed HPC helps manage that; here we just aren't able to deliver the same level of support to standalone clusters since there's a lot of variation in software, hardware, schedulers, etc. We end up a mile wide and an inch deep with our departmental skills. > Yes, but that would mean more like "sharing a cluster" as > opposed to CS providing support and SA services. And "sharing > a cluster" means that the cluster architecture has to be > appropriate for both people, which is challenging as has been > addressed here recently. Then there's the "if you're getting > benefit, can you kick in part of the cash needed" which > gets into a whole other area of complexity. For our shared system we're going to use a good scheduler for fair-share based on user contributions to the system, but let non-contributors use the system just at a lower priority than the funding partners. We're also managing different node types/builds into different node groups, though we have to limit the different node types to keep things manageable. > The institution steps in and says, cease this wasteful squabbling, > henceforth all computing resources will be managed by the > institution: "to each according to their needs", and we'll come > up with a fair way to do the "from each according to their ability". > Just submit your computing resource request to the steering > committee and we'll decide what's best for the institution overall. > Yes.. Local control of a "personal supercomputer" is a very, very nice thing. We're doing exactly this. Researchers have been rolling their own systems, and running them in separate labs, for years. We are hitting power and cooling limitations in those spaces as cluster needs grow. The centralized system (usually) lets us get higher utilization but it inherently demands resource sharing - that's more a political problem than a technical one.
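The contribution-weighted fair-share arrangement described above is normally expressed directly in the scheduler configuration rather than in custom code. The fragment below is only a rough sketch of one way it can look, assuming a Torque/Maui-style scheduler; the post does not say which scheduler is actually in use, and the account names and targets here are invented for illustration.

    # maui.cfg fragment (hypothetical accounts and fair-share targets)
    FSPOLICY             DEDICATEDPS   # charge usage as dedicated processor-seconds
    FSDEPTH              7             # keep seven fair-share windows
    FSINTERVAL           24:00:00      # one window per day
    FSDECAY              0.80          # older windows count for less

    FSWEIGHT             1
    FSACCOUNTWEIGHT      100           # let per-account targets dominate priority

    ACCOUNTCFG[partnerA] FSTARGET=45   # funding partners get targets sized to their buy-in
    ACCOUNTCFG[partnerB] FSTARGET=45
    ACCOUNTCFG[guest]    FSTARGET=10   # non-contributors still run, just at lower priority

The same idea maps onto most other schedulers (Moab, SLURM's multifactor priority plugin, LSF fair-share and so on); "partners first, guests at lower priority" is a configuration problem rather than a development project.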
But if we can't build policies to make the shared platform useful, we haven't really created much value, have we? Waiting on a resource is annoying when you just want your job to run NOW, especially if you put money into the system. Physically hosting clusters in the data center requires some controls (defending against physical sprawl, mainly), but it helps the owner retain control of the system and gives them the data center facilities, while limiting physical access to the hardware. Deskside clusters and single-user or departmental viz systems aren't really candidates for relocation because the user HAS to touch them. We're hoping to focus on smaller jobs and test runs on those, with the bigger jobs in the data center. Of course, this isn't appealing to everyone. Even with central IT support of HPC resources, I don't think it's realistic to centralize ALL cluster services. I anticipate a shared system for our faculty who don't have a lot of money for HPC or who don't have the time or technical resources to administer it. The heavy hitters who do a LOT of HPC will probably still have needs for their own clusters, especially if they have unique requirements that don't fit in a central system. ---------------------------------------- Cameron Purvis University of Alabama Office of Information Technology Research Support From stuartb at 4gh.net Fri Oct 29 08:29:59 2010 From: stuartb at 4gh.net (Stuart Barkley) Date: Fri, 29 Oct 2010 11:29:59 -0400 (EDT) Subject: [Beowulf] how Google warps your brain In-Reply-To: <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> References: <011801cb720f$96b08da0$c411a8e0$@comcast.net><4CC553CB.8070606@bull.co.uk> <68A57CCFD4005646957BD2D18E60667B12154CC6@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Mon, 25 Oct 2010 at 11:26 -0000, Hearns, John wrote: > > might well last to the end of civilization. Replicate them a few > > million times, PERPETUATE them from generation to generation by > > renewing the copies, and backing them up, and recopying them in > > formats where they are still useful. > > The cloud backup providers will be keeping copies of data on > geographically spread sites. > > However, we should at this stage be asking what are the mechanisms > for cloud storage companies for > > *) living wills - what happens when the company goes bust > > *) what are the strategies for migrating the data onto new storage > formats Continuing way off topic, but a pet peeve of mine. *) What are their processes for returning this information to the owners (the public in many cases)? I'm very annoyed that a lot of the "cloud" companies have taken public domain information and hidden it behind unusual access methods. Google bought the dejanews archives and then hid them behind a proprietary access method. Things like mailing list and usenet archives may exist at various cloud sites (google groups, mail-archive.com and even local archives), but the information isn't easily available in bulk form. I prefer to import standard mailbox format archives of these things and use my own search processes on the information. For example, for this list the archives at beowulf.org have a pretty horrible search engine. The individual monthly archives are available in (almost) mailbox format. I need to slightly unmunge some things to bring the archives into my email so I can do my preferred archive browsing and searching. This is marginally acceptable.
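For anyone who wants to do the same thing, pulling one month of this list into a local mbox is only a few lines of scripting. The sketch below makes two assumptions that may need adjusting: the usual Mailman/pipermail URL layout for the monthly .txt.gz files, and the standard " at " address munging in the message headers.

    # Fetch one month of the pipermail archive and undo the "user at host"
    # address munging so an ordinary mail client will read it as an mbox.
    # The URL pattern is an assumption about the archive layout; adjust to taste.
    import gzip, re, urllib.request

    url = "http://www.beowulf.org/pipermail/beowulf/2010-October.txt.gz"
    raw = gzip.decompress(urllib.request.urlopen(url).read()).decode("latin-1")

    fixed = []
    for line in raw.splitlines(True):
        # only rewrite header-ish lines, so addresses quoted in bodies are left alone
        if line.startswith(("From ", "From: ", "To: ", "Cc: ", "Reply-To: ")):
            line = re.sub(r"(\S+) at ([\w.-]+\.\w+)", r"\1@\2", line)
        fixed.append(line)

    with open("beowulf-2010-October.mbox", "w", encoding="latin-1") as out:
        out.writelines(fixed)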
However, at google groups or mail-archive, which have some degree of "it's in the cloud" claim, there is no apparent usable way to get bulk information out of their archives. You need to use their sucky search index. You also have no way to locally preserve the long history of your favorite mailing list or usenet group. Oh, and the new sourceforge mailing list archives are next to useless. Stuart Barkley Curmudgeon -- I've never been lost; I was once bewildered for three days, but never lost! -- Daniel Boone From ashley at pittman.co.uk Fri Oct 29 08:33:10 2010 From: ashley at pittman.co.uk (Ashley Pittman) Date: Fri, 29 Oct 2010 16:33:10 +0100 Subject: [Beowulf] Re: Interesting In-Reply-To: References: Message-ID: <950AB77D-62D6-469D-8B18-04E40136D84D@pittman.co.uk> On 29 Oct 2010, at 15:49, Lux, Jim (337C) wrote: > So, if you want your flash to hold forever, you'll have to periodically > rewrite it. Say you rewrote every year, you'd get 10,000-100,000 years > before you "wore out" the flash. > > There are other aging effects: diffusion of metal ions, etc. You'd want to > keep your flash cold, (but not too cold, or it will break... No liquid > nitrogen) > > I think your best bet is real CDs... That is, the mechanically stamped > variety. They're dense, and nothing beats a mechanical change. You can > still read Jacquard punch cards from the early 19th century (in fact, I was > reading an article recently about there being a dearth of loom programmers.. > So when your job at the buggy whip factory finally goes away...) With digital data it strikes me as somewhat easier, Posix isn't going to go away in the next hundred years so keeping access is just a case of transferring to new media every five to ten years, whatever that media or filesystem may be. Expensive maybe as the storage requirements go up but a simple enough problem to solve. The challenge with digital data is in being able to parse the contents of the files into something meaningful and this is where open standards become essential IMHO. This is many, many times harder than simply storing it and the thought of trying to open a word document in fifty years gives me the shudders, it's bad enough getting them to format correctly over a five year period. One of the projects I've worked on has a large word doc that is so fragile we had an afternoon's meeting to discuss how to handle it and the conclusion was that we all needed to install a specific version of Windows in a VM and use a single version of word to view and amend the document. It's also easy to point the finger at people who've lost data, I seem to recall a project to store digital data on stonehenge that was subject to a similar restoration project as the BBC domesday book and of course Nasa are famous for this but at the end of the day it's something that we've probably all done, I don't own a DVD player any more and neglected to back up all my DVDs before it broke. With audio tapes and vinyl I'm not so bad, the challenging one for me would be all the Hi8 camcorder footage my parents have lying around in a drawer somewhere. Ashley. -- Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing http://padb.pittman.org.uk From stuartb at 4gh.net Fri Oct 29 08:49:26 2010 From: stuartb at 4gh.net (Stuart Barkley) Date: Fri, 29 Oct 2010 11:49:26 -0400 (EDT) Subject: [Beowulf] Anybody using Redhat HPC Solution in their Beowulf In-Reply-To: <4CC2607C.40605@gmail.com> References: <4CC11EA8.8030602@gmail.com> <4CC2607C.40605@gmail.com> Message-ID: On Sat, 23 Oct 2010 at 00:11 -0000, Richard Chang wrote: > On 10/22/2010 11:26 PM, Alex Chekholko wrote: > > The RH HPC mailing list suggests this project is inactive: > > https://www.redhat.com/archives/rhel-hpc-list/ > > I didn't check that. I never knew that an inactive mailing list > means an in-active project. But an idle mailing list is often (not always) a very good indication of the quality of the project. Some things may be old, well known, rock solid and have usable documentation. Otherwise an idle mailing list is a strong beware sign. Likewise, web sites that say "coming soon" and are dated more than a year ago are good warning signs. Web sites without dates (including years) are another caution sign. From rgb at phy.duke.edu Fri Oct 29 08:59:03 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 29 Oct 2010 11:59:03 -0400 (EDT) Subject: [Beowulf] Re: Interesting In-Reply-To: <4CCAD130.2050004@runnersroll.com> References: <4CCAD130.2050004@runnersroll.com> Message-ID: On Fri, 29 Oct 2010, Ellis H. Wilson III wrote: > Interestingly, I found "Keeping Bits Safe: How Hard Can It Be?" by David > Rosenthal in the November Communications of the ACM just released. > > It does discuss data retention at the centuries level, but unfortunately does > not consider the moon-based strategy proposed by Rob. Nonetheless is a good > read for any out there who are now interested in this area. However, I do > wish flash (or any technology besides normal 3.5in hard drives) was > considered. I would expect dormant flash-based technology to last quite a > while at controlled temperatures. IIRC flash has serious bit-level problems in its read-write cycle, although this is improving. But fundamentally, what we're dealing with is the good old second law of thermodynamics and microscopic statistical mechanics. The smaller one makes "bits" in any kind of storage system, the more susceptible they are to "random" (thermal or non-thermal) processes that erase their contents -- cosmic rays or other radiation, pure thermal decay, quantum decay, physical accidents or electronic accidents. In order to prevent this sort of progressive diffusion of disinformation, one has to a) lower the temperature of the storage medium to as low as one can make it. Pluto would therefore make a great repository, if it weren't for the possibility of a gravitational resonance that might one day send it e.g. crashing into Neptune or the Sun (something we cannot predict as it is a many body problem with unknown masses and chaotic solutions:-); and b) use large objects with lots of atoms in them in highly stable and non-reactive configurations to store the information. Cuneiform on carefully stored, thick, baked clay tablets stored in a dry environment that rarely experiences frost -- a good way to make it through 6000 years. Fossil imprints buried deep in the earth in not-particularly-geologically-active rock layers in dry, thermally stable mountainsides are pretty good out to a few hundred million years, with a lot of degradation of course. Not much is good out to a billion years.
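(To put a rough number on how fast that sort of random decay eats an archive, here is a toy calculation -- the per-bit upset rate is invented purely for illustration and real media differ by orders of magnitude; the point is only how the odds scale with size and time:)

# Toy bit-rot model: every bit independently flips with some small
# probability per year.  The rate below is an assumption for the sake
# of illustration, not a measured number for any real medium.
import math

P_FLIP_PER_YEAR = 1.0e-12      # assumed per-bit upsets per year
ARCHIVE_BITS = 8.0e12          # a 1 TB archive

for years in (1, 10, 100, 1000):
    expected_flips = P_FLIP_PER_YEAR * ARCHIVE_BITS * years
    p_any_corruption = 1.0 - math.exp(-expected_flips)
    print("%4d years: expected bad bits ~ %8.1f, P(at least one) ~ %.4f"
          % (years, expected_flips, p_any_corruption))

Even at that (assumed) rate an unscrubbed terabyte is all but certain to pick up errors within a year, which is why periodic scrubbing, ECC, and recopying are part of any serious retention plan.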
But as for electronic storage -- well, that's pretty much a joke. Not only are the media themselves cheap mass produced pieces of crap, but the technologies that underlie them have the expected lifetime of a love-sick moth trapped in a box full of lit candles. A decade or perhaps two, tops. At that point the very interface they depend on is likely to go away, and go away forever. Do they even MAKE floppy controllers any more, on motherboards? In a few more years, will they even make motherboards and desktop computers with ROOM for a floppy drive? And then, is the information saved on those "lifetime" verbatim floppy disks still there, or have cosmic rays erased key bits so that it is no longer accessible if you HAD a controller? This may change. In fact, there is room for an entrepreneurial thing, here. Really Long Term storage is valuable. But it ain't available yet. rgb > > ellis > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From alscheinine at tuffmail.us Fri Oct 29 09:14:23 2010 From: alscheinine at tuffmail.us (Alan Louis Scheinine) Date: Fri, 29 Oct 2010 11:14:23 -0500 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <20101029064626.GE26764@bx9.net> References: <4CC98A71.5050701@ias.edu> <4CCA45B4.90700@tuffmail.us> <20101029064626.GE26764@bx9.net> Message-ID: <4CCAF2DF.50801@tuffmail.us> With regard to the comment: > Do you really think you're going to program these machines > with something other than MPI? Exactly, that is another facet of the same debate about the future: the need for a shift that encompasses both programming language, execution model and interconnect hardware. I'm not necessarily advocating such a shift, but rather, I'm trying to see/understand what is on the horizon. For example, Chapel seems nice but with so few developers, it is likely that its fate will be the same as High Performance Fortran. What facts should I be aware of that might indicate a different future? More generally, the point is to evaluate the news with regard to whether there are new capabilities related to this possible shift. Greg Lindahl wrote: > On Thu, Oct 28, 2010 at 10:55:32PM -0500, Alan Louis Scheinine wrote: > >> With regard to networks, a near-future fork in the road between Beowulf >> clusters versus supercomputers may be the intelligence concerning global >> memory added to the network interface chip for upcoming models of supercomputers. > > Yawn. First we had the attack of the killer micros. Now we have the > attack of the commodity interconnects. Do these special interconnects > with fancy global memory justify their cost vs. InfiniBand + HPC > tweaks? Do you really think you're going to program these machines > with something other than MPI? > > Make your bets, wait, see if you win. > > -- greg > > ps. I am pleased to see that the new EXTOLL interconnect has a > non-RDMA short message system. > > pps. Blekko launches November 1. Check out the /hpc slashtag. 
> > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- Alan Scheinine 200 Georgann Dr., Apt. E6 Vicksburg, MS 39180 Email: alscheinine at tuffmail.us Mobile phone: 225 288 4176 http://www.flickr.com/photos/ascheinine From john.hearns at mclaren.com Fri Oct 29 09:42:39 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 29 Oct 2010 17:42:39 +0100 Subject: [Beowulf] Storage - the end of RAID? Message-ID: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Quite a perceptive article on ZDnet http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 Class, discuss. John Hearns | CFD Hardware Specialist | McLaren Racing Limited McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK T: +44 (0) 1483 261000 D: +44 (0) 1483 262352 F: +44 (0) 1483 261010 E: john.hearns at mclaren.com W: www.mclaren.com The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From Bill.Rankin at sas.com Fri Oct 29 09:48:48 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Fri, 29 Oct 2010 16:48:48 +0000 Subject: [Beowulf] China Wrests Supercomputer Title From U.S. In-Reply-To: <4CCAE49E.3040205@pathscale.com> References: <4CC98A71.5050701@ias.edu> <76097BB0C025054786EFAB631C4A2E3C0949AB3A@MERCMBX03R.na.SAS.com> <404CAE66-4D29-412A-91B8-3BC9AD221231@presciencetrust.org> <76097BB0C025054786EFAB631C4A2E3C0949BD89@MERCMBX03R.na.SAS.com> <4CCAE49E.3040205@pathscale.com> Message-ID: <76097BB0C025054786EFAB631C4A2E3C0949BEDC@MERCMBX03R.na.SAS.com> > Define "real" applications, Something that produces tangible, scientifically useful results that would not have otherwise been realized without the availability and capability of that machine. > but to give my guess at your question "But they didn't. Why?" > > One word - cost Well, that's the obvious (and universal) given. But it's not a useful answer in this context. Cost is always a limiting factor. Optimizing capability within the budget envelope is the challenge. Now to be fair, my question was somewhat leading (and my argument is somewhat reduction-to-the-absurd) but what if the system designers had reduced the number of CPU cores per node and used the money saved there to purchase additional GPU nodes? Make the system really CPU light and GPU heavy. You would be left with something that would potentially have a higher HPL number while maintaining the overall system cost. But why didn't they? Why instead did they spend their money on things like a custom high-perf interconnect (which tends not to be a limiting factor in HPL performance) and lots of cores on each node? And IIRC 20% of their nodes don't even have GPUs? My point is that while GPUs are certainly a potent tool to use in HPC, trying to draw some sort of universal claim about their efficacy and general usefulness based upon a single contrived benchmark is essentially the same as trying to extrapolate any conclusion from a single data point. Unfortunately many people in the media do not seem to have any reservations about doing exactly that. Take care, and have a great weekend.
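P.S. To make the reduction-to-the-absurd version concrete, here is a toy budget exercise -- every price and per-node HPL figure below is invented, and the only point is the direction the arithmetic pushes you:

# Toy exercise: fix a budget, then see which (invented) node design
# maximizes HPL.  None of these numbers describe any real system.
BUDGET = 10000000    # $10M, assumed

configs = [
    # (label, assumed cost per node ($), assumed HPL TFLOPS per node)
    ("CPU-heavy: 2 sockets, no GPU",  4000, 0.10),
    ("Balanced:  2 sockets, 1 GPU",   6500, 0.45),
    ("GPU-heavy: 1 socket,  2 GPUs",  7000, 0.80),
]

for label, cost, tflops in configs:
    nodes = BUDGET // cost
    print("%-32s %5d nodes  ~%7.1f TFLOPS (HPL)" % (label, nodes, nodes * tflops))

On HPL-per-dollar alone the GPU-heavy column wins easily; the fact that real procurements don't look like that is exactly the point about real applications.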
-b From lindahl at pbm.com Fri Oct 29 10:18:41 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 29 Oct 2010 10:18:41 -0700 Subject: [Beowulf] Storage - the end of RAID? In-Reply-To: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Message-ID: <20101029171841.GC17525@bx9.net> On Fri, Oct 29, 2010 at 05:42:39PM +0100, Hearns, John wrote: > Quite a perceptive article on ZDnet > > http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 This has been going on for a long time. Blekko has 5 petabytes of disk, and no RAID anywhere. RAID went out with SQL. Kinda funny that HPC is slower to abandon RAID than other kinds of computing... -- greg From rgb at phy.duke.edu Fri Oct 29 10:36:31 2010 From: rgb at phy.duke.edu (Robert G. Brown) Date: Fri, 29 Oct 2010 13:36:31 -0400 (EDT) Subject: [Beowulf] Re: Interesting In-Reply-To: References: Message-ID: On Fri, 29 Oct 2010, Lux, Jim (337C) wrote: > Or, how about something like the UNICON aka "terabit memory" (TBM) from > Illiac IV days. It's a stable polyester base with a thin film of rhodium > that was ablated by a laser making 3 micron holes to write the bits. $3.5M > to store a terabit in 1975. Burned RO laser disks should in principle be as stable, if the medium used is thick enough. The problem is that CDs tend to be mass produced with very thin media, cheap plastic, and are even susceptible to corrosion through the plastic over time. If one made a CD with tempered glass and a moderately thick slice of e.g. stainless steel or platinum... But then your problem is the reader. CD readers give way to DVD and are still backwards compatible, sort of. But what about the 2020 equivalent? Will there even be one? Nobody will buy actual CDs any more. Nobody will buy movies on DVDs any more (seriously, I doubt that they will). Will there BE a laser drive that is backwards compatible to CD, or will it go the way of reel to reel tapes, 8 track tapes, cassette tapes, QIC tapes, floppy drives of all flavors (including high capacity drives like the ones I have carefully saved at home in case I ever need one), magnetic core memories, large mountable disk packs, exabyte tape drives, DA tapes, and so on? I rather think it will be gone. It isn't even clear if hard disk drives will still be available (not that any computer around would be able to interface with the 5 or 10 MB drives of my youth anyway). This is the problem with electronics. You have to have BOTH long time scale stability AND an interface for the ages. And the latter is highly incompatible with e.g. Moore's Law -- not even the humble serial port has made it through thirty years unscathed. Is the Universal Serial Bus really Universal? I doubt it. And yet, that is likely to be the only interface available AT ALL (except for perhaps some sort of wireless network that isn't even VISIBLE to old peripherals) on the vast bulk of the machines sold in a mere five years. A frightening trend in computing these days is that we may be peaking in the era where one's computer (properly equipped with a sensible operating system) is symmetrically capable of functioning as a client and a server. Desktop computers were clients, servers, or both as one wished, from the days of Sun workstations through to the present, with any sort of Unixoid operating system and adequate resources. 
From the mid 90's on, with Linux, pure commodity systems were both at the whim of the system owner -- anybody could add more memory, more disks, a backup device, and the same chassis was whatever you needed it to be. Now, however, this general purpose desktop is all but dead, supplanted by laptops that are just as powerful, but that lack the expandability and repurposeability. And laptops are themselves an endangered species all of a sudden -- in five years a "laptop" could very well be a single "pad" (touchscreen) of whatever size with or without an external keyboard, all wireless, smooth as a baby's bottom as far as actual plugs are concerned (or maybe, just maybe, with a single USB charger/data port or a couple of slots for SD-of-the-day or USB peripherals). Actual data storage may well migrate into servers that are completely different beasts, far away, accessible only over a wireless network, and controlled by others. An enormous step backwards, in other words. A risk to our political freedom. And yet so seductive, so economical, so convenient, that we may willingly dance down a primrose path to an information catastrophe that is more or less impossible still with the vast decentralization of stored knowledge. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu From james.p.lux at jpl.nasa.gov Fri Oct 29 10:56:15 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 29 Oct 2010 10:56:15 -0700 Subject: [Beowulf] Re: Interesting In-Reply-To: References: Message-ID: Jim Lux +1(818)354-2075 > -----Original Message----- > From: Robert G. Brown [mailto:rgb at phy.duke.edu] > Sent: Friday, October 29, 2010 10:37 AM > To: Lux, Jim (337C) > Cc: Ellis H. Wilson III; beowulf at beowulf.org > Subject: Re: [Beowulf] Re: Interesting > > On Fri, 29 Oct 2010, Lux, Jim (337C) wrote: > > > Or, how about something like the UNICON aka "terabit memory" (TBM) from > > Illiac IV days. It's a stable polyester base with a thin film of rhodium > > that was ablated by a laser making 3 micron holes to write the bits. $3.5M > > to store a terabit in 1975. > > Burned RO laser disks should in principle be as stable, if the medium > used is thick enough. The problem is that CDs tend to be mass produced > with very thin media, cheap plastic, and are even susceptible to > corrosion through the plastic over time. If one made a CD with tempered > glass and a moderately thick slice of e.g. stainless steel or > platinum... > > But then your problem is the reader. CD readers give way to DVD and are > still backwards compatible, sort of. But what about the 2020 > equivalent? Will there even be one? Nobody will buy actual CDs any > more. Nobody will buy movies on DVDs any more (seriously, I doubt that > they will). Will there BE a laser drive that is backwards compatible to > CD, or will it go the way of reel to reel tapes, 8 track tapes, cassette > tapes, QIC tapes, floppy drives of all flavors (including high capacity > drives like the ones I have carefully saved at home in case I ever need > one), magnetic core memories, large mountable disk packs, exabyte tape > drives, DA tapes, and so on? I rather think it will be gone. It isn't > even clear if hard disk drives will still be available (not that any > computer around would be able to interface with the 5 or 10 MB drives of > my youth anyway). True.. 
but at least for something like the UNICON, the actual media is in a form that would be fairly easy to fabricate a reader from scratch. The "dot pitch" was, I think 5-10 microns, and the piece of plastic was about 2 feet long and 4 inches wide. You can "see" the dots in a microscope, so you could read it by hand, if you had to (although reading a terabit at 1 bit/second would take a mere 31,700 years or so). In reality, I think that with a machine shop, I could probably build a reader in a month that would read out at several megabits/second. You'd need to read some of it by hand to make sure your high speed reader was working. However, the format has the virtue of being simple and durable. And the reader is readily reconstructable... one of the problems with newer higher density formats is that it might rely on some exotic technology to read it (this is why I think holographic storage might be tricky, it depends on a lot of other stuff). One could probably fabricate a CD ROM reader in about the same 1 month time frame. The format is pretty simple, and reading the disk isn't an optical challenge. > > This is the problem with electronics. You have to have BOTH long time > scale stability AND an interface for the ages. And the latter is highly > incompatible with e.g. Moore's Law -- not even the humble serial port > has made it through thirty years unscathed. Is the Universal Serial Bus > really Universal? I doubt it. And yet, that is likely to be the only > interface available AT ALL (except for perhaps some sort of wireless > network that isn't even VISIBLE to old peripherals) on the vast bulk of > the machines sold in a mere five years. I think that 10/100baseT Ethernet might be a longer term bet than USB. But your point is well taken. Firewire/1394 is essentially dead, except in some niche markets. > Now, however, this general purpose desktop is all but dead, supplanted > by laptops that are just as powerful, but that lack the expandability > and repurposeability I think the lack of expandability is actually a benefit. It forces people to use abstracted interfaces. Yes USB and Ethernet will go away, but they're a lot more durable and flexible (particularly Ethernet) than ISA,EISA,PCI, PCIx, etc. This is a curse to us in the space business, where we have 10-20 year lifetimes for equipment, if not longer. You want to maintain a testbed in the lab, and you have to have decades old computers "under glass" to support some custom peripheral that has a, for example, ISA bus interface, and provides a test interface to the spacecraft hardware. At least if you have provided an Ethernet interface, I can get a new computer that has an Ethernet, and rewrite whatever software is needed to talk to the device (assuming you documented the interface, and I can find the documentation in all those boxes in records storage) From james.p.lux at jpl.nasa.gov Fri Oct 29 11:06:03 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 29 Oct 2010 11:06:03 -0700 Subject: [Beowulf] RE: Storage - the end of RAID? In-Reply-To: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Hearns, John > Sent: Friday, October 29, 2010 9:43 AM > To: beowulf at beowulf.org > Subject: [Beowulf] Storage - the end of RAID? 
> > Quite a perceptive article on ZDnet > > http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 > > Class, discuss. > Yes, indeed, his comments make sense.. After all, the acronym was "Redundant Arrays of Inexpensive Disks" Granted, these implementations had useful side effects (e.g. improving read speed by sharing) The real question is whether drive reliability has improved commensurate with the drive capacity (that is, is the failure rate per drive basically constant, as opposed to the "bit error rate") RAID was designed to solve the "failed drive" problem, more than the "bad bit" problem. And to do it using a less than "rate 1/2" code.. that is, rather than store 2 copies of your data, you could store, essentially, 11/8ths copies of your data (using a Hamming code to generate 3 syndrome bits for each 8 data bits for instance), thereby saving money. However, if drives get cheap, then using 2 copies (or 3) isn't a big deal. From ellis at runnersroll.com Fri Oct 29 11:46:35 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Fri, 29 Oct 2010 14:46:35 -0400 Subject: [Beowulf] RE: Storage - the end of RAID? In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CCB168B.6000002@runnersroll.com> On 10/29/10 14:06, Lux, Jim (337C) wrote: >> -----Original Message----- >> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Hearns, John >> Sent: Friday, October 29, 2010 9:43 AM >> To: beowulf at beowulf.org >> Subject: [Beowulf] Storage - the end of RAID? >> >> Quite a perceptive article on ZDnet >> >> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 >> >> Class, discuss. >> > > Yes, indeed, his comments make sense.. > > After all, the acronym was "Redundant Arrays of Inexpensive Disks" > > Granted, these implementations had useful side effects (e.g. improving read speed by sharing) > > The real question is whether drive reliability has improved commensurate with the drive capacity (that is, is the failure rate per drive basically constant, as opposed to the "bit error rate") > > RAID was designed to solve the "failed drive" problem, more than the "bad bit" problem. And to do it using a less than "rate 1/2" code.. that is, rather than store 2 copies of your data, you could store, essentially, 11/8ths copies of your data (using a Hamming code to generate 3 syndrome bits for each 8 data bits for instance), thereby saving money. > > However, if drives get cheap, then using 2 copies (or 3) isn't a big deal. Drives (of the commodity variety) are pretty darn cheap already. I'd be surprised if this (RAID 1) isn't the better solution today (rather than RAID 2-6), not just at some point in the future. The major issue I see with the article is that the author refers to RAID being "dead" when really he should be saying RAID 2-6 is less preferable to RAID 1 (but it does make for a "catchier" article title). RAID 0 will always be around to soften the bottleneck created by the gap in performance between CPU and disk. I would actually be surprised if it wasn't common in big HPC in five years to have CPU nodes talking to I/O forwarding nodes that had RAID1 caches of SSDs in them who in turn talked to server nodes connected directly to LUNs (who also have RAID, although I cannot say whether it would be 1/10/01/etc).
This setup lessens the need for tons of expensive RAM at the client or forwarding nodes since SSD is closer to CPU speed than disk in terms of latency for reads and fixes some of the canonical "durability" problems in HPC. Also, I think he would be hard-pressed to make a case against varieties of hybrid RAID which use 0 and 1. In those situations, on failure you are basically performing a straightforward copy - and it can happen from/to multiple disks at once. Slight performance degradation, but nothing as serious as parity-based rebuilds. I personally do not see certain versions of RAID going away anytime soon - they are just too basic a concept for performance/redundancy to kill them off. ellis From ellis at runnersroll.com Fri Oct 29 12:02:45 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Fri, 29 Oct 2010 15:02:45 -0400 Subject: Re: [Beowulf] Storage - the end of RAID? In-Reply-To: <20101029171841.GC17525@bx9.net> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> <20101029171841.GC17525@bx9.net> Message-ID: <4CCB1A55.8090506@runnersroll.com> On 10/29/10 13:18, Greg Lindahl wrote: > On Fri, Oct 29, 2010 at 05:42:39PM +0100, Hearns, John wrote: > >> Quite a perceptive article on ZDnet >> >> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 > > This has been going on for a long time. Blekko has 5 petabytes of > disk, and no RAID anywhere. RAID went out with SQL. Kinda funny that > HPC is slower to abandon RAID than other kinds of computing... I think it's making a pretty wild assumption to say search engines and HPC have the same I/O needs (and thus can use the same I/O setups). If RAID isn't gone from the domain, there is probably a pretty good reason for it. Also, I'd be blown away if Blekko wasn't doing its own striping/redundancy - even if they aren't using RAID 0 or 1 by the book, they probably are using the same concepts (though hand-spun for search engine needs). I don't think the "whole internet" takes up 5 petabytes, so they probably have a couple copies for redundancy and performance or heterogeneous disk arrays to service more/less accessed items on the net.
Wilson III wrote: >>> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 > The major issue I see with the article is that the author refers to RAID > being "dead" when really he should be saying RAID 2-6 is less preferable > to RAID 1 (but it does make for a "catchier" article title). What I think original article author, Robin Harris, probably knew, but didn't point out explicitly, is that all the official "RAID" algorithms are ** block oriented **, RAID-1 just as much as RAID-5, and *that's* why they're "dead". The only question how long it'll take for better solutions to become widespread enough for us to ditch block-oriented RAID entirely. Everything else Harris mentions is just supporting detail. ZFS and Btrfs seem to have the right approach; end-to-end data assurance provided by the file system. I look forward to the time when I can easily use it for real on all my computers. -- Andrew Piskorski http://www.piskorski.com/ From lindahl at pbm.com Fri Oct 29 12:48:29 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 29 Oct 2010 12:48:29 -0700 Subject: [Beowulf] Storage - the end of RAID? In-Reply-To: <4CCB1A55.8090506@runnersroll.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> <20101029171841.GC17525@bx9.net> <4CCB1A55.8090506@runnersroll.com> Message-ID: <20101029194829.GA1676@bx9.net> On Fri, Oct 29, 2010 at 03:02:45PM -0400, Ellis H. Wilson III wrote: > I think it's making a pretty wild assumption to say search engines and > HPC have the same I/O needs (and thus can use the same I/O setups). Well, I'm an HPC guy doing infrastructure for a search engine, so I'm not assuming much. And I didn't say the setup would be the same -- just that Lustre/PVFS would probably be more reliable and higher performance if they stored copies on multiple servers instead of using local or SAN RAID. (Or did they implement this while I wasn't looking?) > Also, I'd be blown away if Blekko wasn't doing it's own > striping/redundancy - even if they aren't using RAID 0 or 1 by the book, > they probably are using the same concepts (though hand-spun for search > engine needs). We do the usual thing: store 3 copies on 3 different servers, locality picked such that a single network or power failure won't take out more than 1 copy. Since we are very concerned about transfer rates, it's well worth buying more disks because the read speed increases. > I don't think the "whole internet" takes up 5 petabytes, The internet is infinite in size thanks to websites that generate data (or crap). Our 3 billion page crawl (1/5 of the size we dream of) is 257 tbytes (compressed), and the corresponding index is 77 terabytes (very compressed). (Yes, we have a lot of disk space empty at the moment.) -- greg From landman at scalableinformatics.com Fri Oct 29 13:10:11 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Fri, 29 Oct 2010 16:10:11 -0400 Subject: [Beowulf] Storage - the end of RAID? In-Reply-To: <4CCB1A55.8090506@runnersroll.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> <20101029171841.GC17525@bx9.net> <4CCB1A55.8090506@runnersroll.com> Message-ID: <4CCB2A23.8010400@scalableinformatics.com> On 10/29/2010 03:02 PM, Ellis H. 
Wilson III wrote: > On 10/29/10 13:18, Greg Lindahl wrote: >> On Fri, Oct 29, 2010 at 05:42:39PM +0100, Hearns, John wrote: >> >>> Quite a perceptive article on ZDnet >>> >>> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 >> >> This has been going on for a long time. Blekko has 5 petabytes of >> disk, and no RAID anywhere. RAID went out with SQL. Kinda funny that >> HPC is slower to abandon RAID than other kinds of computing... The danger in broad sweeping generalizations is that they tend to be incorrect (yes, a recursive joke ... I went there ...) More seriously, much of business is decidedly *not* abandoning RAID (note: we don't care, we sell storage either way, with or without RAID). More to the point, many folks can't get their head around "losing" storage to RAID10 (e.g. mirroring with striping). Actually, the business folks are generally fairly averse to the concept of such replication. I explain it like this. RAID (for resiliency) is there to simply buy you time to replace a failed drive. Nothing else. RAID for performance (various combinations of striping with varying resiliency) is there to reduce the impact of a single slow drive on the RAID calculations. You can effectively parallelize the computation across multiple drives all speaking about 50-150 MB/s (in the case of spinning rust), and hide the latency of multiple writes being queued. With the RAID5/RAID6 calculation, you also have some level of erasure coding. ... but .... RAID IS NOT A BACKUP (can't say how many times I've had to say this to customers). It can (and does) occasionally fail. The only *guaranteed* way to prevent the failure from increasing entropy significantly in the universe is to have a recent copy of all the relevant data. Which is RAID1 all over again. RAID (re)builds take a long time. This has to do with the design of RAID. There are some techniques that will only rebuild used blocks, which is great, though irrelevant once you cross the 50% utilization line on your storage. Your data is at higher risk during these rebuilds unless you have a recent backup (e.g. mirror bit level copy). Neat how it always gets back to making a copy. This said, many businesses buy a single RAID and then never back it up. We try warning them. No use. That is, until something happens, and we get calls to our support line. > I think it's making a pretty wild assumption to say search engines and > HPC have the same I/O needs (and thus can use the same I/O setups). If > RAID isn't gone from the domain, there is probably a pretty good reason > for it. Also, I'd be blown away if Blekko wasn't doing it's own > striping/redundancy - even if they aren't using RAID 0 or 1 by the book, > they probably are using the same concepts (though hand-spun for search > engine needs). I don't think the "whole internet" takes up 5 petabytes, > so they probably have a couple copies for redundancy and performance or > heterogeneous disk arrays to service more/less accessed items on the net. > It almost doesn't matter how you replicate, as long as a) you do, and b) they are bit level copies, and c) they are recent enough to be meaningful. RAID1 is "instantaneous" copying. There are degrees outside of that (snapshots and backups of same). -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics Inc. 
email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From ellis at runnersroll.com Fri Oct 29 13:18:52 2010 From: ellis at runnersroll.com (Ellis H. Wilson III) Date: Fri, 29 Oct 2010 16:18:52 -0400 Subject: Re: [Beowulf] Storage - the end of RAID? In-Reply-To: <20101029194829.GA1676@bx9.net> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> <20101029171841.GC17525@bx9.net> <4CCB1A55.8090506@runnersroll.com> <20101029194829.GA1676@bx9.net> Message-ID: <4CCB2C2C.5040904@runnersroll.com> On 10/29/10 15:48, Greg Lindahl wrote: > On Fri, Oct 29, 2010 at 03:02:45PM -0400, Ellis H. Wilson III wrote: > >> I think it's making a pretty wild assumption to say search engines and >> HPC have the same I/O needs (and thus can use the same I/O setups). > > Well, I'm an HPC guy doing infrastructure for a search engine, so I'm > not assuming much. And I didn't say the setup would be the same -- > just that Lustre/PVFS would probably be more reliable and higher > performance if they stored copies on multiple servers instead of using > local or SAN RAID. (Or did they implement this while I wasn't looking?) Setting up a parallel file system on multiple servers is fine for really chunky or really independent workloads (such as independent searches where one search running slowly will not degrade the performance of a concurrent search for something else). This is not at all the case in most HPC situations, where latency between nodes during computation is the limiting factor. Yes, you might get higher reliability power-wise and better performance bandwidth-wise (assuming you have some very wide links over distances between the servers) but you won't get reliability security-wise or performance latency-wise, both of which are critical for HPC. When I said "assumption" I just meant that saying "HPC is slower to abandon RAID than other kinds of computing," having just mentioned Blekko, was drawing an invalid comparison between the two very different domains. Best, ellis From james.p.lux at jpl.nasa.gov Fri Oct 29 13:40:27 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 29 Oct 2010 13:40:27 -0700 Subject: Re: [Beowulf] Storage - the end of RAID? In-Reply-To: <4CCB2A23.8010400@scalableinformatics.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> <20101029171841.GC17525@bx9.net> <4CCB1A55.8090506@runnersroll.com> <4CCB2A23.8010400@scalableinformatics.com> Message-ID: > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Joe Landman > Sent: Friday, October 29, 2010 1:10 PM > To: beowulf at beowulf.org > Subject: Re: [Beowulf] Storage - the end of RAID? > > RAID IS NOT A BACKUP (can't say how many times I've had to say this to > customers). It can (and does) occasionally fail. The only *guaranteed* > way to prevent the failure from increasing entropy significantly in the > universe is to have a recent copy of all the relevant data. > > Which is RAID1 all over again. > Could one not design a coding strategy that uses a bit more redundancy than the (8,3) Hamming code and that essentially doesn't need to be rebuilt.. that is, say you have 12 drives to store 8 drives worth of data, and for ease of talking, one bit/byte is written across the array.
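(For concreteness, a toy sketch of the read-repair idea follows -- simplified to a single XOR parity drive, i.e. 9 drives for 8 drives' worth of data, with the replaced drive treated as a known erasure, rather than the full 12-for-8 Hamming arrangement; the mechanism is what matters here, not the exact rate.)

# Toy model: 8 "data drives" plus one XOR parity drive, one byte per
# drive per stripe.  A replaced drive starts out full of garbage; reads
# reconstruct its bytes from the surviving drives plus parity, and
# ordinary writes gradually overwrite the garbage with good data, so
# there is no separate rebuild pass.
import os
import random

NDATA, STRIPES = 8, 1024
drives = [bytearray(os.urandom(STRIPES)) for _ in range(NDATA)]
parity = bytearray(STRIPES)
for s in range(STRIPES):
    for d in drives:
        parity[s] ^= d[s]

failed = 3                                       # drive 3 dies...
truth = bytearray(drives[failed])                # (remember its real contents for checking)
drives[failed] = bytearray(os.urandom(STRIPES))  # ...and is replaced by a drive full of junk
suspect = set(range(STRIPES))                    # stripes not yet rewritten on the new drive

def read(drive, stripe):
    if drive == failed and stripe in suspect:
        x = parity[stripe]                       # reconstruct instead of trusting junk
        for i in range(NDATA):
            if i != drive:
                x ^= drives[i][stripe]
        return x
    return drives[drive][stripe]

def write(stripe, values):                       # values: one new byte per data drive
    p = 0
    for i, v in enumerate(values):
        drives[i][stripe] = v
        p ^= v
    parity[stripe] = p
    suspect.discard(stripe)                      # the new drive now holds a good byte here

# Reads are correct immediately -- no rebuild pass needed:
assert all(read(failed, s) == truth[s] for s in range(STRIPES))
# ...and normal write traffic slowly "heals" the replacement:
for s in random.sample(range(STRIPES), 200):
    write(s, [random.randrange(256) for _ in range(NDATA)])
print("stripes still relying on reconstruction:", len(suspect))

(The explicit "suspect" set is a crutch for the sketch; with a real error-correcting code the ordinary read path does that job, which is the scheme described next.)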
A drive fails, and you can still read the data ok from the remaining 11 (and, in fact, tolerate another failure). You put in a new drive, which contains all "wrong" bits (actually, half the bits are wrong and half are right, but you don't know which are which)... You read from the full array, and the bits that are wrong on the new array just get corrected during the read in the usual way. You write to the array, and all 12 bits get written. So gradually, the new drive gets filled with "correct" bits. As long as you don't get another TWO failures before all the bits are ok, you're in good shape. (yeah, you could do writeback on error to fill in erroneous bits in the background, etc. but I assume that's not an option because of performance). From deadline at eadline.org Fri Oct 29 18:19:33 2010 From: deadline at eadline.org (Douglas Eadline) Date: Fri, 29 Oct 2010 21:19:33 -0400 (EDT) Subject: [Beowulf] SC10 Beowulf Bash In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Message-ID: <43163.192.168.93.213.1288401573.squirrel@mail.eadline.org> It is that time of the year. If you are attending SC10, here is what you have been waiting for: The Big Wheels Keep On Turning Beowulf Bash http://www.xandmarketing.com/beobash10/ -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From peter.st.john at gmail.com Sun Oct 31 22:19:58 2010 From: peter.st.john at gmail.com (Peter St. John) Date: Mon, 1 Nov 2010 01:19:58 -0400 Subject: [Beowulf] "Go" programming language Message-ID: Has anyone tried the "Go" programming language on a beowulf? The language's homepage says, " Its concurrency mechanisms make it easy to write programs that get the most out of multicore and networked machines..." (from http://golang.org/) The wiki is http://en.wikipedia.org/wiki/Go_%28programming_language%29 I'm sure I'll use MPI but Google hired some pretty cool language designers. Peter P.S. described somewhere as "merging C++ with Python" which maybe explains an odd white-space rule (open curly bracket can't begin a line because it would confuse automated semicolon line endings), Yuk. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeff.johnson at aeoncomputing.com Fri Oct 29 10:32:17 2010 From: jeff.johnson at aeoncomputing.com (Jeff Johnson) Date: Fri, 29 Oct 2010 10:32:17 -0700 Subject: [Beowulf] Storage - the end of RAID? In-Reply-To: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B121AD077@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4CCB0521.8010804@aeoncomputing.com> On 10/29/10 9:42 AM, Hearns, John wrote: > Quite a perceptive article on ZDnet > > http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539 I don't see RAID disappearing from the computing landscape anytime soon. The architecture and performance of some RAID controllers are starting to rival the systems they are connected to. There is a case to be made for simplifying things and bringing the storage and filesystem together and ditching the RAID controller. Distributed i/o and better latency would be a great thing for demanding apps and talented administrators. One service RAID controllers provide is fencing a complex low level command/control environment from direct user interaction. Some people are better served by having their data sequestered behind a RAID controller. 
--Jeff -- ------------------------------ Jeff Johnson Manager Aeon Computing jeff.johnson at aeoncomputing.com www.aeoncomputing.com t: 858-412-3810 x101 f: 858-412-3845 m: 619-204-9061 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117