From gdjacobs at gmail.com Thu Feb 1 02:49:27 2007 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <60632.192.168.1.1.1170268112.squirrel@mail.eadline.org> References: <45BE8E7E.4010808@brookes.ac.uk> <45C0791C.5080904@brookes.ac.uk> <60632.192.168.1.1.1170268112.squirrel@mail.eadline.org> Message-ID: <45C1C5B7.9080608@gmail.com> Douglas Eadline wrote: > If you want to do a little development and impress your friends, > try playing with pgapack (Parallel Genetic Algorithm Library) > > http://www-fp.mcs.anl.gov/CCST/research/reports_pre1998/comp_bio/stalk/pgapack.html > > You can develop a GA on single computer then run it on > a cluster. > > -- > Doug I see this and think "stock market" or "sports betting". -- Geoffrey D. Jacobs From gerry.creager at tamu.edu Thu Feb 1 03:43:32 2007 From: gerry.creager at tamu.edu (Gerry Creager) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070131212014.0304b408@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <6.2.3.4.2.20070131212014.0304b408@mail.jpl.nasa.gov> Message-ID: <45C1D264.4000209@tamu.edu> Jim Lux wrote: > At 02:03 PM 1/31/2007, Robert G. Brown wrote: >> On Wed, 31 Jan 2007, Mitchell Wisidagamage wrote: >> >>> Thank you very much for the fire dynamics idea. I will have a look at >>> it. >>> >>> I did try to contact many e-science projects including some >>> researchers at Oxford. But I got no reply. Then I went to get some >>> contacts from a tutor who worked at a e-science project himself. He >>> told me people, especially scientists are "very jealous" of their >>> data. And not replying is a kind way of saying "no". And there's the >>> problem of "who's this guy wanting my data", "what will he do with it?". 
>>> >>> I have given up the e-science idea. Now looking for other real world >>> applications. >> >> Remember, NASA puts all (or at least a lot) of its e.g. weather data >> online. > > Well.. not exactly NASA.. operational "weather" data is the province of > NOAA. NASA does research, not operational, data, so there's typically a > time lag, especially for processed and calibrated data. > > By and large, most environmental data collected by NASA winds up in > DAACs (Distributed Active Archiving Centers). Physical Oceanography > data, for instance, winds up at PO-DAAC... > http://www-podaac.jpl.nasa.gov/ which has data for sea surface > temperature, sea surface topography, and ocean vector winds acquired by > NASA instruments. This whole process is very well documented, and the > data moves through the various levels of processing and into the > archives in a regular and stately fashion. > > But, for instance, the live data from a single instrument (e.g. QuikSCAT > for ocean winds, on which I worked) also gets fed to a realtime process > at NOAA within about an hour after it's received on the ground every 100 > minutes, and thence to folks like NCAR who run numerical models, which > then winds up at the NWS and makes the weather predictions more accurate > on the evening news. This is a bit harder to find in a reliable online > source, especially if you want things gridded into standard geographic > grids, etc. It's all out there, but since the funding stream for > distribution is more tenuous (NOAA doesn't have as much money as NASA > for this sort of thing, but they do have "real time" requirements), the > data tends to be a bit more "raw" or idiosyncratic, and not necessarily > in HDF files, etc. It tends to be in whatever format is convenient for > them, which may or may not be convenient for you. 
For research purposes, the National Centers for Environmental Prediction (ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/) makes available all their model runs on a 6-hourly schedule. These data are available for ~3 days, then expire off the servers here. Historical data subsets are available via the National Climatic Data Center NOMADS portal (http://nomads.ncdc.noaa.gov/) which was designed to facilitate access to the datasets. The National Centers for Atmospheric Research (http://www.ncar.ucar.edu/) allows access to some limited historic data in their archives without restriction and facilitates scientific research with accounts to scientists. >> And there are many things one can do with it. Look for the >> NOAA sites. You can get sunspot data, proxy temperature data, and much >> more, and build your very own climate model. If you do, don't be >> surprised if it fails to agree with the current one (due to be >> re-released today, IIRC, from the IPCC). > > James Lux, P.E. > Spacecraft Radio Frequency Subsystems Group > Flight Communications Systems Section > Jet Propulsion Laboratory, Mail Stop 161-213 > 4800 Oak Grove Drive > Pasadena CA 91109 > tel: (818)354-2075 > fax: (818)393-6875 > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From 06002352 at brookes.ac.uk Thu Feb 1 03:56:19 2007 From: 06002352 at brookes.ac.uk (Mitchell Wisidagamage) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070131211133.03400140@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> 
<45C07AE4.1020508@brookes.ac.uk> <6.2.3.4.2.20070131211133.03400140@mail.jpl.nasa.gov> Message-ID: <45C1D563.5060201@brookes.ac.uk> > > Optimum path routing of ships and/or airplanes, taking into account the > winds, currents, sea state, temperatures, etc. > > Large realtime and climatological databases are available. > The path optimization algorithms are simple and fairly well known (A and > A-star are two to start with). The challenge is in suitable heuristics > to prune the search space. > > You can optimize for minimum time in transit, or minimum fuel cost, or > minimum probability of delay, etc. Very nice example! Thank you. I requested some JPL CDs of images when I was a teenager. I was very impressed at the time. :o) From 06002352 at brookes.ac.uk Thu Feb 1 04:04:59 2007 From: 06002352 at brookes.ac.uk (Mitchell Wisidagamage) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070131213744.03074e00@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <36397.192.168.1.1.1170189086.squirrel@mail.eadline.org> <45BFE17F.5080901@brookes.ac.uk> <45C0E921.7090600@tempemusic.com> <45C14798.9040304@brookes.ac.uk> <6.2.3.4.2.20070131213744.03074e00@mail.jpl.nasa.gov> Message-ID: <45C1D76B.5070204@brookes.ac.uk> >> >> Now I'm not sure what to do with these data sets. I should program my >> own application. But how should I be processing them?...without the >> algorithms for processing I'm lost. :o) > > > http://www.ocean-systems.com/VOSS.htm > www.weather.navy.mil/paoweb/starsams.ppt > > http://realdistance.com/ > > I'll need some time to digest these models. Hope it's not too complicated. Thank you everyone for taking the time to help me. Or Cheers, as everyone here calls it. Wow, lots of people with scientific backgrounds on here. I thought this was a geeky mailing list with programmers trying to solve cluster problems. 
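The route-optimization suggestion quoted above (A-star plus a heuristic that prunes the search space) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not anything posted to the list: the function name a_star, the 4-connected grid, and the per-cell "cost to enter" values (standing in for fuel or time given local wind and current) are all hypothetical. The heuristic is the Manhattan distance scaled by the cheapest cell cost, which never overestimates the remaining cost.

```python
import heapq


def a_star(cost, start, goal):
    """Return (total_cost, path) for the cheapest 4-connected route on a grid.

    cost[r][c] is a hypothetical price of entering cell (r, c), for example
    fuel burned given local wind and current; None marks an impassable cell.
    Returns None when the goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    # Cheapest passable cell: scaling the Manhattan distance by it keeps the
    # heuristic admissible (it never overestimates the remaining cost).
    floor = min(c for row in cost for c in row if c is not None)

    def heuristic(cell):
        return floor * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    best = {start: 0.0}      # cheapest known cost to reach each cell
    parent = {start: None}   # back-pointers for path reconstruction
    frontier = [(heuristic(start), start)]

    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return best[goal], path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # off the grid
            if cost[nr][nc] is None:
                continue  # impassable cell
            new_cost = best[cell] + cost[nr][nc]
            if new_cost < best.get((nr, nc), float("inf")):
                best[(nr, nc)] = new_cost
                parent[(nr, nc)] = cell
                heapq.heappush(
                    frontier, (new_cost + heuristic((nr, nc)), (nr, nc))
                )
    return None
```

Calling a_star(grid, (0, 0), (2, 3)) on a small grid of costs returns the total cost and the list of cells visited. For real routing the grid cells would be replaced by geographic waypoints and the costs derived from the wind and current datasets mentioned in the thread; choosing a tighter heuristic than the one assumed here is where most of the pruning effort would go.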
From 06002352 at brookes.ac.uk Thu Feb 1 04:25:38 2007 From: 06002352 at brookes.ac.uk (Mitchell Wisidagamage) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <45C1542A.4030701@tamu.edu> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> Message-ID: <45C1DC42.90604@brookes.ac.uk> > > Please don't fall into the trap of thinking "e-Science" requires a tie > to the Globus Toolkit to be valid. > I do not think this (anymore). I queried Matthew Haynos from IBM who's an expert in this area some time ago as I'm new to grid computing. The silly questions are from me :o) Answers are his. Because at the moment distributed computing is only popular in the academic research and highly specialized part of the industry...at least that's what I think. Any professional and personal comments from your experience? Not true. Distributed computing is more and more mainstream. I think too that you are looking at distributed computing perhaps too narrowly. Even if you are referring to supercomputing, witness that more and more of the Top 500 supercomputing sites are increasingly commercial (as opposed to academic or public institutions). Anyhow I just read it again and you stated that "Grid computing becoming more of a de facto standard for distributed computing in enterprises". May I ask why do you think that? I would say b/c of the growing ubiquity of scale-out computing (lots of machines, lots of resources, etc.) What's happening here is that scheduling, etc. is going from the machine into the network. People no longer know where things are going to run with hundreds / thousands of blade processors. This is a sea change. People used to say run this piece of work on this machine, now it's just run this work, I have no idea where. 
I've written an article series for IBM's grid site on developerWorks: Check out: http://www-128.ibm.com/developerworks/search/searchResults.jsp?searchType=1&pageLang=&displaySearchScope=dW&searchSite=dW&lastUserQuery1=perspectives+on+grid&lastUserQuery2=&lastUserQuery3=&lastUserQuery4=&query=perspectives+on+grid+haynos&searchScope=dW particularly the "Next-generation distributed computing" article for a primer. I think you'll find the five or so articles in the series interesting. From gerry.creager at tamu.edu Thu Feb 1 04:52:16 2007 From: gerry.creager at tamu.edu (Gerry Creager) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <45C1DC42.90604@brookes.ac.uk> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: <45C1E280.9000502@tamu.edu> Mitchell Wisidagamage wrote: >> >> Please don't fall into the trap of thinking "e-Science" requires a tie >> to the Globus Toolkit to be valid. >> > I do not think this (anymore). I queried Matthew Haynos from IBM who's > an expert in this area some time ago as I'm new to grid computing. The > silly questions are from me :o) Answers are his. > > Because at the moment distributed computing is only popular in the > academic research and highly specialized part of the industry...atleast > that's what I think. Any professional and personal comments from your > expereince? > > Not true. Distributed computing is more and more mainstream. I think > too that you are looking at distributed computing perhaps too narowly. > Even if you are referring to supercomputing, witness that more and more > of the Top 500 supercomputing sites are increasingly commerical (as > opposed to academic or public institutions). 
> > Anyhow I just read it again and you stated that "Grid computing becoming > more of a defacto standard for distributed computing in enterprises". > > May I ask why do you think that? > I would say b/c of the growing ubiquity of scale-out computing (lots of > machines, lots of resources, etc.) What's happening here is that > scheduling, etc. is going from the machine into the network. People no > longer know where things are going to run with hundreds / thousands of > blade processors. This is a sea change. People use to say run this > piece of work on this machine, now it's just run this work, I have no > idea where. I've written an article series for IBM's grid site on > developerWorks: > > Check out: > http://www-128.ibm.com/developerworks/search/searchResults.jsp?searchType=1&pageLang=&displaySearchScope=dW&searchSite=dW&lastUserQuery1=perspectives+on+grid&lastUserQuery2=&lastUserQuery3=&lastUserQuery4=&query=perspectives+on+grid+haynos&searchScope=dW > > > particularly the "Next-generation distributed computing" article for a > primer. I think you'll find the five or so articles in the series > interesting. I've read the article series and it is interesting. And, I'm not completely given over to anti-grid sentiment. The problem remains, however, and is embodied by a colleague recounting his experience in running an ocean circulation model: "We only had a 13% slowdown running this as a grid application when compared to our local cluster." Now, there are several things to consider that go unsaid here. One is the degree of coupling in the code. Another is the size of the datasets that have to be moved to the various sites to facilitate operations. Some codes will perform well when distributed broadly, while others will die a horrid death waiting for pieces of the result to come back from that P3 installation in Outer Geekdom. Some will suffer simply from communications latency. Others will just continue to chug along. 
By way of illustration, we benchmarked my MM5 semi-production run of 72 forecast hours for 3 domains of increasing resolution across the United States. To complete in the same timeframe as a locally submitted job, we found a requirement to double the number of processors when it was distributed out to the "grid". This is an extreme example, of course, and not one I propose to repeat anytime soon... It's much easier to run MM5 and WRF locally and not have to worry quite so much about resource reservation and odd processors failing mid-run. -- Gerry Creager -- gerry.creager@tamu.edu Texas Mesonet -- AATLT, Texas A&M University Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983 Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843 From deadline at clustermonkey.net Thu Feb 1 05:28:25 2007 From: deadline at clustermonkey.net (Douglas Eadline) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <45C1C5B7.9080608@gmail.com> References: <45BE8E7E.4010808@brookes.ac.uk> <45C0791C.5080904@brookes.ac.uk> <60632.192.168.1.1.1170268112.squirrel@mail.eadline.org> <45C1C5B7.9080608@gmail.com> Message-ID: <36197.192.168.1.1.1170336505.squirrel@mail.eadline.org> > Douglas Eadline wrote: >> If you want to do a little development and impress your friends, >> try playing with pgapack (Parallel Genetic Algorithm Library) >> >> http://www-fp.mcs.anl.gov/CCST/research/reports_pre1998/comp_bio/stalk/pgapack.html >> >> You can develop a GA on single computer then run it on >> a cluster. >> >> -- >> Doug > > I see this and think "stock market" or "sports betting". Good "luck" with that. In any case, GA's and cluster design are not that foreign http://aggregate.org/FNN/ For those interested in other engineering and scientific uses take a look at: http://www.talkorigins.org/faqs/genalg/genalg.html -- Doug > > -- > Geoffrey D. Jacobs > > > !DSPAM:45c1c5f1232511543480883! 
> -- Doug From peter.st.john at gmail.com Thu Feb 1 06:08:57 2007 From: peter.st.john at gmail.com (Peter St. John) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <36197.192.168.1.1.1170336505.squirrel@mail.eadline.org> References: <45BE8E7E.4010808@brookes.ac.uk> <45C0791C.5080904@brookes.ac.uk> <60632.192.168.1.1.1170268112.squirrel@mail.eadline.org> <45C1C5B7.9080608@gmail.com> <36197.192.168.1.1.1170336505.squirrel@mail.eadline.org> Message-ID: Since I had wanted a cluster to run my GA, I thought about using the GA to configure the cluster. So that's a great link for me! Peter On 2/1/07, Douglas Eadline wrote: > > > Douglas Eadline wrote: > >> If you want to do a little development and impress your friends, > >> try playing with pgapack (Parallel Genetic Algorithm Library) > >> > >> > http://www-fp.mcs.anl.gov/CCST/research/reports_pre1998/comp_bio/stalk/pgapack.html > >> > >> You can develop a GA on single computer then run it on > >> a cluster. > >> > >> -- > >> Doug > > > > I see this and think "stock market" or "sports betting". > > Good "luck" with that. In any case, GA's and cluster design > are not that foreign > > http://aggregate.org/FNN/ > > For those interested in other engineering and scientific uses > take a look at: > > http://www.talkorigins.org/faqs/genalg/genalg.html > > -- > Doug > > > > > > -- > > Geoffrey D. Jacobs > > > > > > !DSPAM:45c1c5f1232511543480883! > > > > > -- > Doug > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.scyld.com/pipermail/beowulf/attachments/20070201/544d16d3/attachment.html From rgb at phy.duke.edu Thu Feb 1 06:20:11 2007 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <36197.192.168.1.1.1170336505.squirrel@mail.eadline.org> References: <45BE8E7E.4010808@brookes.ac.uk> <45C0791C.5080904@brookes.ac.uk> <60632.192.168.1.1.1170268112.squirrel@mail.eadline.org> <45C1C5B7.9080608@gmail.com> <36197.192.168.1.1.1170336505.squirrel@mail.eadline.org> Message-ID: On Thu, 1 Feb 2007, Douglas Eadline wrote: > For those interested in other engineering and scientific uses > take a look at: > > http://www.talkorigins.org/faqs/genalg/genalg.html Fabulous article, actually. Thanks! I've actually written a parallel GA embedded in a NN training program, and have been working for years in a desultory fashion on building a "super"-GA that can get past several of the "problems" GAs have -- primarily premature convergence, which actually has a rather nasty scaling structure as one tries to find the "better" local optima in a problem with a complex/rugged fitness landscape in high dimensionality, and a few other problems that aren't well known (or at least aren't published much, possibly because they are worth a lot of money:-). This review of GAs is one of the best I've read, even better than Wikipedia's which is saying a lot! rgb -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From hahn at mcmaster.ca Thu Feb 1 07:40:39 2007 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <45C1DC42.90604@brookes.ac.uk> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: > Not true. Distributed computing is more and more mainstream. 
I think too oh, one other snide comment about grid: I suspect the grid-fad could not have happened without the fraud perpetrated by worldcom and others during the internet bubble. in those days, it was popular to claim that the network was becoming truly ubiquitous and incomprehensibly fast. for instance: http://www-128.ibm.com/developerworks/grid/library/gr-heritage/#N100A6 I don't know about you, but in the 6 years since then, my home net connection has stayed the same speed, possibly a bit more expensive. desktop/LANs are still mostly at 100bT, with 1000bT in limited use. I do notice that grabbing large files off the net (ftp, RPMs, etc) often runs at O(MBps) which is about a 10x improvement over the past 10-15 years. so the doubling time turns out to be more like 3 years rather than 9 months. in-cluster networking has improved somewhat faster, but not dramatically so. From atp at piskorski.com Thu Feb 1 07:54:41 2007 From: atp at piskorski.com (Andrew Piskorski) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] clusters in gaming In-Reply-To: <20070131164304.GB21677@leitl.org> References: <20070131164304.GB21677@leitl.org> Message-ID: <20070201155441.GA46052@tehun.pair.com> On Wed, Jan 31, 2007 at 05:43:04PM +0100, Eugen Leitl wrote: > I've been looking at Second Life recently, which does most > things server-side (in fact, running a distributed world > with game physics) unlike games like WoW, where the intelligence Why? Is there some compelling underlying reason they can't make use of all those desktop cycles like other massively multiplayer games do? > What I didn't like is that most of the game is purportedly > based on a byte-compiled language, with some long-term plans What language? Some ad-hoc thing of their own? > to switch to .Net (Mono, actually), which should result in > much improved performance. 
Current performance is > rather ridiculous, even high-priority simulations like > private islands only tolerate few 10 avatars before severe > performance degradation, and even crashes. > Can things be compiled in realtime by passing code snippets > in conventional compiled languages, or is this always limited Well, sure, I think that's been done, although I don't know if anyone's using it for real in a production setting. Here are a few links to related subjects - tcc, CriTcl, and LuaJIT: http://fabrice.bellard.free.fr/tcc/ http://wiki.tcl.tk/2523 http://luajit.luaforge.net/luajit.html But why would you think that just-in-time compilation of C or the like would be central in fixing Second Life's performance problems, rather than just doing a better job of software engineering in general? I know nothing about Second Life, but from your description, if they're looking to change programming languages, Erlang (or something like it) might be the best fit. -- Andrew Piskorski http://www.piskorski.com/ From peter.st.john at gmail.com Thu Feb 1 08:25:12 2007 From: peter.st.john at gmail.com (Peter St. John) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: Moore's Law (which has grown in scope since Moore) applies to the aggregate effect of many technologies. Individual techs proceed in fits and starts. Predictions about FLOPS/dollar seem to be sustainable, but e.g. I predict a jump in chip density when the price point of vapor deposition manufactured diamond gets low enough (diamond conducts heat way better than silicon, and chips are suffering from thermodynamics limits). When AT&T divested, you could not get a decent telephone anymore; they were too expensive to make so well. 
Then after years of crummy phones, suddenly everyone had a cell-phone just like Captain Kirk's. Sure I want fiber optics to my house. But maybe the power company will carry data on the wasted bandwidth of power lines. Keep the faith :-) Peter On 2/1/07, Mark Hahn wrote: > > > Not true. Distributed computing is more and more mainstream. I think > too > > oh, one other snide comment about grid: I suspect the grid-fad could not > have happened without the fraud perpetrated by worldcom and others during > the internet bubble. in those days, it was popular to claim that the > network > was becoming truely ubiquitous and incomprehensibly fast. for instance: > > http://www-128.ibm.com/developerworks/grid/library/gr-heritage/#N100A6 > > I don't know about you, but in the 6 years since then, my home net > connection has stayed the same speed, possibly a bit more expensive. > desktop/LANs are still mostly at 100bT, with 1000bT in limited use. > I do notice that grabbing large files off the net (ftp, RPMs, etc) > often runs at O(MBps) which is about a 10x improvement over the past > 10-15 years. so the doubling time turns out to be more like 3 years > rather than 9 months. in-cluster networking has improved somewhat > faster, but not dramatically so. > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20070201/475922cd/attachment.html From hahn at mcmaster.ca Thu Feb 1 08:50:50 2007 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue May 13 01:05:43 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: > Moore's Law (which has grown in scope since Moore) applies to the aggregate > effect of many technologies. Individual techs proceed in fits and starts. well, specifically it applies to fields where the primary metric is a function of density. for instance, disk capacity is on an exponential, since it's a product of in-track and inter-track density. just like chips, where each linear shrink of 1/sqrt(2) leads to a doubling of devices in the same area. in both cases, these curves are sometimes strongly modulated by "quantum" shifts in the technology (perhaps multi-gate transistors, or the succeeding generations of disk heads - perhaps patterned media upcoming.) in networking, I see generational shifts, but no area-driven exponential. so I think the application of moore's law to networking is mistaken... > Predictions about FLOPS/dollar seem to be sustainable, but e.g. I predict a > jump in chip density when the price point of vapor deposition manufactured > diamond gets low enough (diamond conducts heat way better than silicon, and > chips are suffering from thermodynamics limits). excellent example of a generational shift, rather than part of the relentless sequence of shrinks. (I guess you could argue that there are generational aspects to the shrink/area thing too, since, for instance, visible-optical gave way to UV and presumably eventually immersion litho. or maybe it'll be imprint litho next.) 
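The density argument above, and the doubling-time estimate given earlier in the thread (roughly a 10x improvement in download rates over 10-15 years), can be written out as a short worked calculation. This is only a sketch: the symbols s, A, N, k, t and T_2 are notation assumed here for illustration and do not come from the thread.

```latex
% Linear shrink by a factor s = 1/sqrt(2): area per device scales as s^2,
% so the number of devices in a fixed area doubles.
\[
  s = \frac{1}{\sqrt{2}}
  \quad\Longrightarrow\quad
  A_{\text{device}} \to s^{2} A_{\text{device}} = \tfrac{1}{2}\, A_{\text{device}},
  \qquad
  N \to \frac{N}{s^{2}} = 2N .
\]
% Doubling time T_2 implied by a k-fold improvement over t years.
\[
  T_{2} = \frac{t}{\log_{2} k},
  \qquad
  k = 10:\quad
  \frac{10\ \text{yr}}{\log_{2} 10} \approx 3.0\ \text{yr}
  \;\le\; T_{2} \;\le\;
  \frac{15\ \text{yr}}{\log_{2} 10} \approx 4.5\ \text{yr}.
\]
```

With those inputs the implied doubling time is about 3 to 4.5 years, in line with the "more like 3 years rather than 9 months" figure quoted for wide-area transfer rates.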
> When AT&T divested, you could not get a decent telephone anymore; they were > too expensive to make so well. Then after years of crummy phones, suddenly > everyone had a cell-phone just like Captain Kirk's. I guess that's more of an economic network effect. but am I alone in thinking that cellphones are one of the suckiest products on the market? (the phones themselves are OK; it's the bundling and customer-screwage I'm not fond of. imagine if your phone was an ipv6 device and contained an agent that simply negotiated quality*byte rates with whatever connectivity supplier happened to have good signal strength locally...) > Sure I want fiber optics to my house. But maybe the power company will carry > data on the wasted bandwidth of power lines. Keep the faith :-) call me an unrealistic idealist, but I'm hoping for wimax-like stuff (perhaps with some nice subversive/grassroots mesh routing) to eliminate the incredibly annoying cell monopolies. regards, mark. From jlb17 at duke.edu Thu Feb 1 09:15:57 2007 From: jlb17 at duke.edu (Joshua Baker-LePain) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: On Thu, 1 Feb 2007 at 11:50am, Mark Hahn wrote > but am I alone in > thinking that cellphones are one of the suckiest products on the market? No. -- Joshua Baker-LePain Department of Biomedical Engineering Duke University From peter.st.john at gmail.com Thu Feb 1 09:34:12 2007 From: peter.st.john at gmail.com (Peter St. 
John) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: Mark, On 2/1/07, Mark Hahn wrote: > > > Moore's Law ... (I guess you could argue that there are > generational aspects to the shrink/area thing too, since, for instance, > visible-optical gave way to UV and presumably eventually immersion litho. > or maybe it'll be imprint litho next.) Yeah, I'm thinking the smooth curve (to which we can apply cubic splines) is the combined effect of many discrete step functions. I guess that's more of an economic network effect. but am I alone in > thinking that cellphones are one of the suckiest products on the market? > (the phones themselves are OK; it's the bundling and customer-screwage > I'm not fond of. Yes indeed; cell phones cool, cell phone companies less so. Voice over IP ought to be free by now :-) > Sure I want fiber optics to my house. But maybe the power company will > carry > > data on the wasted bandwidth of power lines. Keep the faith :-) > > call me an unrealistic idealist, but I'm hoping for wimax-like stuff > (perhaps with some nice subversive/grassroots mesh routing) to eliminate > the incredibly annoying cell monopolies. Me too. I want a small laser on my rooftop, with prisms splitting to receivers on the roofs of two or four neighbors, with a uucp type friendly free protocol. I guess they should be MASERs but I'm no physicist. regards, mark. regards, Pete. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://www.scyld.com/pipermail/beowulf/attachments/20070201/713ac333/attachment.html From jmiguel at hpcc-usa.org Thu Feb 1 08:36:50 2007 From: jmiguel at hpcc-usa.org (John Miguel) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] HPCC'07 Government Supercomputer Conference April 3-5, 2007 - Please Post Message-ID: The National High Performance Computing and Communications Council will hold its 21st annual Computing and Information Technology conference April 3-5, 2007 at the Hyatt Regency Hotel and Spa in Newport, RI. Over the years, this high level conference has become known for its content, prominent speakers and networking, not commercial glitter and hype. ? The Council was established pursuant to the authority of and in accordance with the desires of the President of the United States as expressed in a White House memorandum to the heads of Departments and Agencies, dated 28 June 1966, and in the instructions of the 89th Congress as expressed in the summary of H.R. 4845. Its mission is education and training in computer and information technology for the public sector. ? The audience consists of Government, Industry and University CIOs, CTOs, CEOs, Technology and Business decision makers, IT and IRM Directors, System Managers, Department Heads, Computer Scientists, Computer Security Officers. ? Attendance is limited to 200 and all attendees receive the Government hotel room rate. Sample conference evaluation : "Hotel Awesome, Speakers Outstanding, Chocolate Fondue ... Priceless." ? Conference information is available at: WWW.HPCC-USA.ORG, or by phone at 401-624-1732. John Miguel Ph. D. President National HPCC Council 480-895-1326 ? Tentative Program for HPCC?07 January 1, 2007 HPCC?07 WWW.HPCC-USA.ORG 401-624-1723 April 3-5, 2007 Hyatt Hotel and Resort Newport, RI Theme: ?Supercomputing: Innovation, Imagination and Application? Tuesday April 3, 2007 Chairman: Steve Louis, Lawrence Livermore National Laboratory 9:00 Dr. Stephen R. 
Wheat, Executive Director, HPC Platform Organization, Intel Corporation
9:45 Dr. William Harrod (invited), High Productivity Program Office, Defense Research Projects Agency: "High Productivity Computing Program"
10:30 AM Break
11:00 Dr. Douglas B. Kothe, Oak Ridge National Laboratory
11:45 Dr. Andrew B. White Jr., Deputy Associate Director, Theory, Simulation and Computing, Los Alamos National Laboratory: "Project Road Runner: Petaflop Computing"
12:30 Lunch
2:00 Dr. Daniel E. Atkins, Director, Office of Cyberinfrastructure, National Science Foundation
2:45 Dr. Kelvin K. Droegemeier (invited), Associate Vice President for Research, University of Oklahoma
3:30 PM Break
4:00 Panel Discussion: "Supercomputing: An Over the Horizon View"
     Dr. Kelley B. Gaither, Moderator, Associate Director, TACC, University of Texas
     Dr. Jose Munoz, NSF
     Dr. Douglass E. Post, DOD HPC Modernization Office
     Dr. Karl Schulz, TACC, UTexas
     Dr. Walter Brooks, NASA
     Dr. Robert Graybill, USC, ISI
5:30 Networking Reception
7:00 Birds of a Feather Break Out Sessions

Wednesday April 4, 2007
Chairman: Stephen Schneller, NUWC/DOD HPC MOD Office

9:00 Debra Goldfarb, CEO, Tabor Communications
9:45 "Tools For Debugging Multicore Applications", Chris Gottbrath, Product Manager, Etnus, LLC
10:30 AM Break
11:00 Dr. Georges E. Karniadakis, Brown University: "HPC in Medicine: The Computational Man"
11:45 Dr. John E. West, Director, HPC Major Shared Resource Center, U. S. Army Engineer Research and Development Center
12:30 Lunch
2:00 Dr. Charles Romine (invited), Director, National Coordination Office, Networking, IT R&D: "National Plans, Programs and Initiatives"
2:45 Microsoft: "Scaling Out Excel on Windows CCS Clusters"
3:30 PM Break
4:00 Panel Discussion: "Future Requirements for Storage and Backup"
     Dr. Robert Chadduck, National Archives & Records Administration
     Dr. Robert Ross, Argonne National Laboratory
     Ellen M. Salmon, NASA
     John L.
Cole, Army Research Laboratory
     Lee West, Sandia National Laboratory
     Joshua Lubell, National Institute of Standards and Technology
6:30 Reception and Dinner

Thursday April 5, 2007

8:30 Complimentary Full Breakfast
9:00 Director, Data Center Technology, American Power Conversion: "21st Century Data Center Design"
10:30 High-End Computing Market Trends, Dan Little, CTO, High-End Computing Market Services
11:00 Conference Close
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 4537 bytes
Desc: not available
Url : http://www.scyld.com/pipermail/beowulf/attachments/20070201/780c9e7d/attachment.bin
From mikhailberis at gmail.com Thu Feb 1 09:42:53 2007
From: mikhailberis at gmail.com (Dean Michael Berris)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] clusters in gaming
In-Reply-To: <20070131164304.GB21677@leitl.org>
References: <20070131164304.GB21677@leitl.org>
Message-ID: <45C2269D.1090202@gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Eugen Leitl wrote:
>
> While I do see what a usual C/C++ MPI approach wouldn't
> be probably enough for a highly dynamic and flexible virtual
> environment, the result still strikes me as inelegant,
> and killing architectural deficiencies by throwing enough
> hardware at it (not necessarily always wrong, mark, just
> not in this case).
>

I don't see why a usual C/C++ MPI approach wouldn't work, though the scaling issues of adding a new node to the cluster is certainly a problem that may be a hindrance from the implementation -- but one that can be remedied by having local clusters "gridded" together using some protocol.

As for throwing hardware at it, I don't think that's a problem -- that's actually a good solution. That being said, if the implementation was already good to start with then adding more hardware would have (supposedly) better effect on the overall performance/experience.
> Can things be compiled in realtime by passing code snippets > in conventional compiled languages, or is this always limited > to highly dynamic environments like Smalltalk (which OpenCroquet > is based on) or Lisp (with sbcl and cmucl there are now great > compilers for Lisp, though I don't know about MPI support)? > Short answer is yes: it can even be done in C++. However what I would rather use in these situations would be a dynamic language like as you mention Lisp or things like Python (embedded in C++ or the other way around, see Boost.Python). I think it's an architecture problem more than anything as far as the SL server side is concerned. But then when you're faced with a problem like full-3D physics engine in the server side, that's not something "as easy as Hello, World" to implement (or fix for the matter). Though it certainly is not "impossible", "hard" would be an understatement especially now that it's in full-deployment with thousands of people getting on it at any given time. - -- Dean Michael C. 
Berris
http://cplusplus-soup.blogspot.com/
mikhailberis AT gmail DOT com
+63 928 7291459
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFwiadBGP1EujvLlsRAqm0AJ4poLgPs0dFqGSFfoNLn5qhe3h7sgCgrIoB
sbwpSOkwDAlEWHBnbxbz/Vc=
=sdze
-----END PGP SIGNATURE-----
From eugen at leitl.org Thu Feb 1 12:19:59 2007
From: eugen at leitl.org (Eugen Leitl)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] clusters in gaming
In-Reply-To: <45C2269D.1090202@gmail.com>
References: <20070131164304.GB21677@leitl.org> <45C2269D.1090202@gmail.com>
Message-ID: <20070201201959.GF21677@leitl.org>

On Fri, Feb 02, 2007 at 01:42:53AM +0800, Dean Michael Berris wrote:

> I don't see why a usual C/C++ MPI approach wouldn't work, though the

In theory, there is no reason why these (or even Fortran) wouldn't be adequate either, but in practice it would be very difficult to accommodate user-contributed scripted objects into a rigid array/pointer framework. Adding new methods in C to a brand new object instantiated at runtime is certainly possible, but it sounds intensely painful.

From the point of view of running a massively parallel realtime (fake) physics simulation with many 10 k simultaneous viewers/points of input, it looks as if it requires massive numerical performance, which suggests C (less C++). Common Lisp now has very good compilers, but I wonder how well that translates into numerics, and, similar to C++, the unwary programmer can produce very slow code (CONSing, GC, etc).

> scaling issues of adding a new node to the cluster is certainly a

There are two types of regions, isolated islands, and addition to the main "continent". Both look quite suitable for geometric problem tessellation (one node, one region) and incremental node addition as the terrain grows.
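
A minimal sketch of the one-node-per-region idea, assuming regions sit on an integer grid and a fresh node is simply handed out whenever a region is added. The class and method names are hypothetical, and nothing here reflects how SL is actually implemented.

```python
class RegionMap:
    """Hypothetical one-region-per-node tessellation on an integer grid."""

    def __init__(self):
        self.node_of = {}  # (x, y) region coordinate -> node id

    def add_region(self, x, y):
        """Incremental growth: each new region is given a fresh node id."""
        if (x, y) in self.node_of:
            raise ValueError("region already exists")
        node = len(self.node_of)
        self.node_of[(x, y)] = node
        return node

    def neighbours(self, x, y):
        """Node ids owning the edge-adjacent regions of region (x, y)."""
        steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
        return sorted(self.node_of[(x + dx, y + dy)]
                      for dx, dy in steps
                      if (x + dx, y + dy) in self.node_of)


world = RegionMap()
for xy in ((0, 0), (1, 0), (0, 1)):  # a small "continent"
    world.add_region(*xy)
world.add_region(10, 10)             # an isolated island

print(world.neighbours(0, 0))        # nodes bordering region (0, 0)
print(world.neighbours(10, 10))      # the island borders nobody
```

In this toy model an island never needs to exchange state with anyone, and a continent region only talks to the nodes owning adjacent regions, which is what makes the decomposition look attractive on paper.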
> problem that may be a hindrance from the implementation -- but one that > can be remedied by having local clusters "gridded" together using some > protocol. As far as I know SL is run on one local cluster, which is why I thought of how a Beowulf approach would help. They're complicating it by using virtual machines, and packing several VMs on one physical server. After (frequent) upgrades servers are restarted in a rolling fashion, and I presume snapshot/resume migration is a useful function here. But then, there are cluster-wide process migration available, which are a lot more fine-grained. > As for throwing hardware at it, I don't think that's a problem -- that's > actually a good solution. That being said, if the implementation was I thought the cluster had some 1000 nodes, but http://gwynethllewelyn.net/article119visual1layout1.html claims there are just 5000 virtual servers. Maybe they just run 5 VServers/node, and there are really 1 kNodes, which is a reasonably large cluster for just 16 kUsers at peak (not for your garden-variety Beowulf, but for a game server). > already good to start with then adding more hardware would have > (supposedly) better effect on the overall performance/experience. It would be really interesting to learn how current SL scales. > Short answer is yes: it can even be done in C++. However what I would > rather use in these situations would be a dynamic language like as you > mention Lisp or things like Python (embedded in C++ or the other way > around, see Boost.Python). Thanks for the link. > I think it's an architecture problem more than anything as far as the SL > server side is concerned. But then when you're faced with a problem like > full-3D physics engine in the server side, that's not something "as easy > as Hello, World" to implement (or fix for the matter). 
OpenCroquet uses a deterministic computation model, which replicates worlds to the end user nodes a la P2P, and synchronizes differing inputs so that each simulation instance doesn't diverge. It can also do a master/slave type of state replication, if I understand it correctly, so in theory it could use physics accelerators, and clone state to slower nodes. SL in comparison does about anything but primitive rendering cluster-side. Given current asymmetric broadband, this seems to be a superior approach to doing everything P2P. (And I would imagine OpenCroquet hasn't even begun to deal with the nasty problem of NAT penetration).

> Though it certainly is not "impossible", "hard" would be an
> understatement especially now that it's in full-deployment with
> thousands of people getting on it at any given time.

It's really interesting. I wish there was more information flow out of Linden Labs, on how they're doing it.

-- 
Eugen* Leitl leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
From rgb at phy.duke.edu Thu Feb 1 12:26:17 2007
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] massive parallel processing application required
In-Reply-To:
References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk>
Message-ID:

On Thu, 1 Feb 2007, Peter St. John wrote:

> Moore's Law (which has grown in scope since Moore) applies to the aggregate
> effect of many technologies. Individual techs proceed in fits and starts.
> Predictions about FLOPS/dollar seem to be
> sustainable, but e.g.
I predict a > jump in chip density when the price point of vapor deposition manufactured > diamond gets low enough (diamond conducts heat way better than silicon, and > chips are suffering from thermodynamics limits). > > When AT&T divested, you could not get a decent telephone anymore; they were > too expensive to make so well. Then after years of crummy phones, suddenly > everyone had a cell-phone just like Captain Kirk's. > > Sure I want fiber optics to my house. But maybe the power company will carry > data on the wasted bandwidth of power lines. Keep the faith :-) I'm not certain and am too lazy to plot it out and check, but it seems to me that communications has consistently lagged computation in the time constant used in a Moore's-type law. For CPUs it has been a fairly predictable 18-20 month doubling time (at constant cost, connected to the doubling time of VLSI for a long time but now more complex), which means a factor of ten takes somewhere around five or six years to accomplish. That's three orders of magnitude in 15 to 20 years. It took those same twenty years to go from 10 Mbps to 1 Gbps ethernet, only two orders of magnitude, at anything like constant cost. Most things I've read on the subject suggest that if anything the CPU/Communications gap is widening, forcing systems designers to use methodology developed for clusters and cluster communications even within a system (e.g. Hypertransport). Also, phone companies ARE gradually laying fiber everywhere, and while they may or may not take it right up to your house they'll certainly take it to your neighborhood, and maybe only "finish off" with copper. It's just that installing fiber is expensive, and takes time, and customers won't pay much of a premium for it. They "have" to do it anyway to compete with e.g. 
cable, and they are all doubtless running scared in front of the possibility that nobody will own non-cell phones anymore in a year or five so that either they are in a position to deliver streaming media to the home in competition with the cable company or they all belly right up in that market. A bit of a race, in other words, where they are ahead and behind at the same time.

It won't be done for computer users, though. Not enough money in it, and what there is is already developed. Delivering entertainment, on the other hand -- there aren't any visible upper bounds on what one can use there. If you treble the bandwidth, you just make HDTV cheaper and permit more stations and make it more feasible to deliver movies on demand in real time -- bleep through 4-5 GB in a minute or two, then display it at your leisure...

rgb

>
> Peter
>
>
> On 2/1/07, Mark Hahn wrote:
>>
>> > Not true. Distributed computing is more and more mainstream. I think
>> too
>>
>> oh, one other snide comment about grid: I suspect the grid-fad could not
>> have happened without the fraud perpetrated by worldcom and others during
>> the internet bubble. in those days, it was popular to claim that the
>> network
>> was becoming truly ubiquitous and incomprehensibly fast. for instance:
>>
>> http://www-128.ibm.com/developerworks/grid/library/gr-heritage/#N100A6
>>
>> I don't know about you, but in the 6 years since then, my home net
>> connection has stayed the same speed, possibly a bit more expensive.
>> desktop/LANs are still mostly at 100bT, with 1000bT in limited use.
>> I do notice that grabbing large files off the net (ftp, RPMs, etc)
>> often runs at O(MBps) which is about a 10x improvement over the past
>> 10-15 years. so the doubling time turns out to be more like 3 years
>> rather than 9 months. in-cluster networking has improved somewhat
>> faster, but not dramatically so.
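
The doubling-time figures traded in this thread are easy to sanity-check. What follows is a minimal sketch, assuming plain exponential growth; the function names are made up for illustration, and the inputs are only the round numbers quoted in this message (18-20 month CPU doubling, 10 Mbps to 1 Gbps ethernet in about twenty years, roughly 10x faster downloads in 10-15 years).

```python
import math


def doubling_time(factor, years):
    """Doubling time in years implied by growing `factor`-fold over `years`."""
    return years / math.log2(factor)


def years_to_grow(factor, doubling_years):
    """Years needed to grow `factor`-fold at the given doubling time."""
    return doubling_years * math.log2(factor)


# CPUs: an 18-20 month doubling time -> years per factor of ten
print(years_to_grow(10, 18 / 12), years_to_grow(10, 20 / 12))

# Ethernet: 10 Mbps -> 1 Gbps (100x) over roughly 20 years
print(doubling_time(100, 20))

# Downloads: roughly 10x over 10 to 15 years
print(doubling_time(10, 10), doubling_time(10, 15))
```

With these inputs the CPU case works out to roughly 5 to 5.5 years per factor of ten, while both network cases land at roughly 3 to 4.5 years per doubling, which is consistent with the compute/communication gap being discussed.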
>> _______________________________________________
>> Beowulf mailing list, Beowulf@beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
>

-- 
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu
From rgb at phy.duke.edu Thu Feb 1 12:28:20 2007
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] massive parallel processing application required
In-Reply-To:
References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk>
Message-ID:

On Thu, 1 Feb 2007, Peter St. John wrote:

> Me too. I want a small laser on my rooftop, with prisms splitting to
> receivers on the roofs of two or four neighbors, with a uucp type friendly
> free protocol. I guess they should be MASERs but I'm no physicist.

Oh, just chop up your microwave oven, line up an old umbrella lined with foil, and beam away. Just keep your head out from in front and don't let your children or pets anywhere near it...;-)

rgb

>
> regards, mark.
>
>
> regards, Pete.
>

-- 
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu
From coutinho at dcc.ufmg.br Thu Feb 1 12:09:26 2007
From: coutinho at dcc.ufmg.br (Bruno Rocha Coutinho)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] failure rates
Message-ID: <45C248F6.6080807@dcc.ufmg.br>

Most fault-tolerance literature assumes that system components have exponential failure rates. But software sometimes doesn't have exponential failure rates if the cause of the failure is related to a timer, an overflow or resource leaks.
In this case the failure rate could be fixed and you end up with all processes failing at the same time. I think it is safe to assume exponential failure rates for hardware, and although most machine crashes today are OS (not hardware) related, most people assume that OSs are well behaved and don't suffer from fixed-rate failures.

2007/1/30, enver ever :

Hello there

I am a PhD student working on mathematical looking to the availability of Beowulf clusters. I was looking whether or not it is possible to take exponential failure rates for the nodes. That's the case in these publications:

1- "A Realistic Evaluation of Consistency Algorithms for Replicated Files", Proceedings of the 21st Annual Symposium on Simulation, Tampa, Florida, United States, pages 121-130, 1988, ISBN 0-8186-0845-5

2- "Availability Modeling and Analysis on High Performance Cluster Computing Systems", The First International Conference on Availability, Reliability and Security (ARES 2006), 20-22 April 2006

3- "A Failure Predictive and Policy-Based High Availability Strategy for Linux High Performance Computing Cluster", Chokchai Leangsuksun, Tong Liu, Tirumala Rao, Stephen L. Scott, and Richard Libby, LCI 5th International Linux Cluster Conference.

I think it can be taken as exponentially distributed since in many multi-server systems this was the approach followed. I would appreciate it if you could add any comments.

Many Regards
_________________________________________________________________
MSN Hotmail is evolving -
check out the new Windows Live Mail http://ideas.live.com _______________________________________________ Beowulf mailing list, Beowulf@beowulf.org To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mikhailberis at gmail.com Thu Feb 1 13:13:48 2007 From: mikhailberis at gmail.com (Dean Michael Berris) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] clusters in gaming In-Reply-To: <20070201201959.GF21677@leitl.org> References: <20070131164304.GB21677@leitl.org> <45C2269D.1090202@gmail.com> <20070201201959.GF21677@leitl.org> Message-ID: <45C2580C.5040801@gmail.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Eugen Leitl wrote: > On Fri, Feb 02, 2007 at 01:42:53AM +0800, Dean Michael Berris wrote: > >> I don't see why a usual C/C++ MPI approach wouldn't work, though the > > In theory, there is no reason why these (or even Fortran) wouldn't be adequate either, > but in practice it would be very difficult to accomodate user-contributed > scripted objects into a rigid array/pointer framework. Adding new > methods in C to a brand new object instantiated at runtime is certainly > possible, but it sounds intensely painful. > Actually I was thinking more of having just primitive operations being implemented as either free functions or functors in C++, and having a chaining approach to making more complex functors. The idea is that once complex operations can be "generated" and "(de)serialized", new stuff is apparently just a combination of old/primitive stuff. >>From the point of view of running a massively parallel realtime > (fake) physics simulation with many 10 k simultaneous viewers/ > points of input it looks as if requires a massive numerical > performance, which suggests C (less C++). Common Lisp now has > very good compilers, but I wonder how well that translates > into numerics, and similiar to C++ the unwary programmer can > produce very slow code (CONSing, GC, etc). 
> With the advances in C++ optimizing compilers and using modern C++ programming approaches (template metaprogramming, policy-driven programming, lazy-functional programming, etc.) there's a very good chance that a lot of the "slow code" can be avoided. But of course, there has to be a conscious effort to profile->benchmark->optimize C++ code which can only be done if you had 1) time and 2) resources at hand. But seeing how much money's being put into SL right now, I think it's just a matter of time before the resources will be available. :) >> scaling issues of adding a new node to the cluster is certainly a > > There are two types of regions, isolated islands, and addition to the > main "continent". Both look quite suitable for geometric problem tesselation > (one node, one region) and incremental node addition as the terrain > grows. > Sounds simple, but now that leads to non-optimal resource allocation. If it was made that one node was allocated to one island, then you run into scaling problems when you have very high traffic regions. That's why an architectural solution should be found, because mapping regions to nodes 1-1 doesn't seem to work: because if you have 1000 regions 1:1 to nodes and 20k people in one region, what are the 999 nodes going to do? >> problem that may be a hindrance from the implementation -- but one that >> can be remedied by having local clusters "gridded" together using some >> protocol. > > As far as I know SL is run on one local cluster, which is why I thought > of how a Beowulf approach would help. They're complicating it by using > virtual machines, and packing several VMs on one physical server. > After (frequent) upgrades servers are restarted in a rolling fashion, > and I presume snapshot/resume migration is a useful function here. > But then, there are cluster-wide process migration available, > which are a lot more fine-grained. 
> I don't have this information available, though it would be interesting to note how this would really work. As early as now, they're encountering scalability problems having hundreds of people packed into a region. Apparently it does work, because people can still (somehow) bear with the performance degradation in these areas. >> As for throwing hardware at it, I don't think that's a problem -- that's >> actually a good solution. That being said, if the implementation was > > I thought the cluster had some 1000 nodes, but > http://gwynethllewelyn.net/article119visual1layout1.html > claims there are just 5000 virtual servers. Maybe they > just run 5 VServers/node, and there are really 1 kNodes, > which is a reasonably large cluster for just 16 kUsers > at peak (not for your garden-variety Beowulf, but > for a game server). > But the problem is, the physics in areas where there are a lot of objects is still performed all in the cluster. So adding more people and more objects will overload the physics engine on their end, and at 16kUsers at peak, can definitely overload certain nodes allocated for certain regions. But then I don't have any idea how they have it coded or implemented, so I can only speculate. >> already good to start with then adding more hardware would have >> (supposedly) better effect on the overall performance/experience. > > It would be really interesting to learn how current SL scales. > I'll look forward to reading something about that. >> I think it's an architecture problem more than anything as far as the SL >> server side is concerned. But then when you're faced with a problem like >> full-3D physics engine in the server side, that's not something "as easy >> as Hello, World" to implement (or fix for the matter). > > OpenCroquet uses a deterministic computation model, which replicates > worlds to the end unser nodes a la P2P, and synchronizes differing inputs > so that each simulation instance doesn't diverge. 
It can also do a master/slave > type of state replication, if I understand it correctly, so in > theory it could use physics accelerators, and clone state to slower > nodes. SL in comparison does about anything but primitive rendering > cluster-side. Given current assymetric broadband, this seems > to be a superior approach to do everything P2P. (And I would imagine > OpenCroquet hasn't even begun to deal with the nasty problem of NAT > penetration). > Doing everything server side is a good idea, especially for giving better client-side experience IF the server can handle it. Apparently, SL on the server side is hitting the limits that their architecture have either explicitly or implicitly defined. Sounds still like the architecture might need more help to improve current performance. >> Though it certainly is not "impossible", "hard" would be an >> understatement especially now that it's in full-deployment with >> thousands of people getting on it at any given time. > > It's really interesting. I wish there was more information flow out > of Linden Labs, on how they're doing it. > I wish the same too... They've open sourced the viewer, I just hope they open source the server too. - -- Dean Michael C. 
Berris http://cplusplus-soup.blogspot.com/ mikhailberis AT gmail DOT com +63 928 7291459 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFFwlgMBGP1EujvLlsRAj6ZAKCzSSXKrGU2RaKeTDhB/Tf3vgLKfwCfWszt nrL+cl7CvnRMaSm2QWQg6Tk= =owi6 -----END PGP SIGNATURE----- From James.P.Lux at jpl.nasa.gov Thu Feb 1 13:43:34 2007 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> At 07:40 AM 2/1/2007, Mark Hahn wrote: >>Not true. Distributed computing is more and more mainstream. I think too > >oh, one other snide comment about grid: I suspect the grid-fad could >not have happened without the fraud perpetrated by worldcom and others during >the internet bubble. in those days, it was popular to claim that the network >was becoming truely ubiquitous and incomprehensibly fast. for instance: > >http://www-128.ibm.com/developerworks/grid/library/gr-heritage/#N100A6 In the long run, ubiquitous and fast IS going to be true (however, latency is something you can't get around... speed of light and all that). As long ago as 1993, I was at a conference where a speaker from AT&T commented that historical telecom pricing methods (longer distances cost more) were obsolete, since the dominant cost was in the termination, with, even then, a gross oversupply of fiber across the Atlantic. Hence the availability of cheap flat rate long distance (5c a minute anywhere, anytime).. the bulk of the system is no longer capacity limited. 
>I don't know about you, but in the 6 years since then, my home net >connection has stayed the same speed, possibly a bit more expensive. Interestingly, they've just rolled out FiOS (fiber to the home) in my area, which is a HUGE jump in potential bandwidth from the existing DSL or Cable Modem delivery methods. And, moderately competitive in price (5 Mbps is $40/month, including the bundled ISP kinds of features). What's fascinating is the faster tiers.. you can get 15 Mbps down/2 up for $50/mo and 30 M down/5 up for $180 Granted, these are consumer offerings and have all the usual network congestion caveats, but hey, at least they are offering 30 Mbps for the last mile, which is quite impressive. >desktop/LANs are still mostly at 100bT, with 1000bT in limited use. But that's more driven by replacement cycles and the lack of real demand for faster speeds to the desktop. If your facility has a 1.5 Mbps pipe to the internet, giving users a 1 Gb/s won't change their performance much compared to 100 Mb/s. There's also a wiring infrastructure issue. While desktops are typically replaced on a 3 year cycle, the wiring infrastructure cycles through a bit slower, especially in smaller businesses and residential (that is, I'm not likely to start ripping out the drywall to replace the Cat 5 wiring I put in back in 1998)... and frankly, since right now, I have maybe 700 kbps at home to the internet (one way), and then a wireless connection from laptop to home network, there's not much to be gained by improving the home wiring infrastructure. (If I go with the FiOS offering though, that may prompt some re-evaluation) Likewise, a small business with half a dozen or a dozen desktops and a couple servers isn't going to see a huge benefit from faster networking, because they're throttled by the server's disk speed, more than anything else. (assuming they're not hosting a big website, etc.) 
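
The bottleneck argument in this message (a 1.5 Mbps pipe behind either a 100 Mb/s or a 1 Gb/s LAN) can be put in rough numbers. A minimal sketch, assuming the end-to-end rate is simply that of the slowest hop and ignoring protocol overhead and contention; the hop names and the 100 MB file size are illustrative only.

```python
def effective_mbps(hops):
    """End-to-end rate is bounded by the slowest hop (rates in Mbit/s)."""
    return min(hops.values())


def transfer_seconds(size_mbytes, hops):
    """Idealized time to move `size_mbytes` megabytes along the path."""
    return size_mbytes * 8 / effective_mbps(hops)


fast_lan = {"desktop_lan": 1000.0, "internet_pipe": 1.5}
slow_lan = {"desktop_lan": 100.0, "internet_pipe": 1.5}

for name, hops in (("1 Gb/s LAN", fast_lan), ("100 Mb/s LAN", slow_lan)):
    rate = effective_mbps(hops)
    secs = transfer_seconds(100, hops)
    print(f"{name}: {rate} Mbit/s end to end, about {secs:.0f} s for 100 MB")
```

Under these assumptions both configurations give the same answer, since the internet pipe is the slowest hop in each; the same reasoning applies when the slowest hop is a server's disk rather than the uplink.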
So, you're looking at GigE making a difference in two areas: replacing cable TV (all those 20 Mbps HDTV streams) and in big companies. But even in big companies, GigE to the desktop doesn't necessarily buy you much, if you're all competing for the same server resources. >I do notice that grabbing large files off the net (ftp, RPMs, etc) >often runs at O(MBps) which is about a 10x improvement over the past >10-15 years. so the doubling time turns out to be more like 3 years >rather than 9 months. Which is probably consistent with equipment refurbishment cycles. > in-cluster networking has improved somewhat faster, but not > dramatically so. >_______________________________________________ >Beowulf mailing list, Beowulf@beowulf.org >To change your subscription (digest mode or unsubscribe) visit >http://www.beowulf.org/mailman/listinfo/beowulf James Lux, P.E. Spacecraft Radio Frequency Subsystems Group Flight Communications Systems Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 From hahn at mcmaster.ca Thu Feb 1 15:25:16 2007 From: hahn at mcmaster.ca (Mark Hahn) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> Message-ID: >> the internet bubble. in those days, it was popular to claim that the >> network >> was becoming truely ubiquitous and incomprehensibly fast. for instance: > > In the long run, ubiquitous and fast IS going to be true (however, latency is in the long run, everything is true ;) > gross oversupply of fiber across the Atlantic. 
Hence the availability of > cheap flat rate long distance (5c a minute anywhere, anytime).. the bulk of > the system is no longer capacity limited. interesting - I assumed that long-distance became cheap not due to oversupply of fiber and bandwidth, but rather transition away from old-fashioned circuit switching (ie, towards digital compressed voice over packets.) I know that buying fiber/lambdas/bandwidth is still very much not what I'd call cheap, though I have no doubt it's much better/cheaper than in the past. >> I don't know about you, but in the 6 years since then, my home net >> connection has stayed the same speed, possibly a bit more expensive. > > Interestingly, they've just rolled out FiOS (fiber to the home) in my area, > which is a HUGE jump in potential bandwidth from the existing DSL or Cable > Modem delivery methods. And, moderately competitive in price (5 Mbps is > $40/month, including the bundled ISP kinds of features). What's fascinating > is the faster tiers.. you can get 15 Mbps down/2 up for $50/mo and 30 M > down/5 up for $180 seems strange to me - what kind of residential customer would pay for that kind of thing (and remain free of the RIAA/MPAA)? some smart form of wireless seems like an obvious good solution for residential last-mile. maybe that's a disruptive innovation that will finally put the telco/cableco's out of their misery. > Likewise, a small business with half a dozen or a dozen desktops and a couple > servers isn't going to see a huge benefit from faster networking, because > they're throttled by the server's disk speed, more than anything else. if their servers disks are only 100bT speed, they're broken. it may well be that most SMB servers are that crappy, in spite of the fact that a recycled linux box and one disk will deliver 40 MB/s... > So, you're looking at GigE making a difference in two areas: replacing cable > TV (all those 20 Mbps HDTV streams) how many 20Mb streams does a typical endpoint need? 
either residential or commercial? > and in big companies. But even in big > companies, GigE to the desktop doesn't necessarily buy you much, if you're > all competing for the same server resources. wow, dim view of the competence of server admins, but you may be right... regards, mark hahn. From James.P.Lux at jpl.nasa.gov Thu Feb 1 16:00:18 2007 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: <6.2.3.4.2.20070201152945.030e8c08@mail.jpl.nasa.gov> At 08:25 AM 2/1/2007, Peter St. John wrote: >Moore's Law (which has grown in scope since Moore) applies to the >aggregate effect of many technologies. Individual techs proceed in >fits and starts. Predictions about FLOPS/dollar seem to be >sustainable, but e.g. I predict a jump in chip density when the >price point of vapor deposition manufactured diamond gets low enough >(diamond conducts heat way better than silicon, and chips are >suffering from thermodynamics limits). > >When AT&T divested, you could not get a decent telephone anymore; >they were too expensive to make so well. Then after years of crummy >phones, suddenly everyone had a cell-phone just like Captain Kirk's. > >Sure I want fiber optics to my house. But maybe the power company >will carry data on the wasted bandwidth of power lines. What wasted bandwidth on power lines? Wires of random composition and topology, some over 100 years old, strung hither and yon, above and below ground doesn't sound like a particularly good propagation medium for wideband signals. Sure, signal processing and adaptive processing can do some good, but it's still a shared medium (i.e. 
that same power line that serves you also serves 8 of your neighbors). Twisted pairs of wires, coaxial cable, optical waveguides.. that's a consistent broadband propagation medium. Data over powerlines might be useful for time of use electricity metering, etc... Jim, W6RMK From James.P.Lux at jpl.nasa.gov Thu Feb 1 16:24:13 2007 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> Message-ID: <6.2.3.4.2.20070201161644.03125d90@mail.jpl.nasa.gov> At 12:26 PM 2/1/2007, Robert G. Brown wrote: >On Thu, 1 Feb 2007, Peter St. John wrote: > > >Also, phone companies ARE gradually laying fiber everywhere, and while >they may or may not take it right up to your house they'll certainly >take it to your neighborhood, and maybe only "finish off" with copper. >It's just that installing fiber is expensive, About $900 per house, where I live, according to some acquaintances in the telco. >and takes time, and >customers won't pay much of a premium for it. They "have" to do it >anyway to compete with e.g. cable, and they are all doubtless running >scared in front of the possibility that nobody will own non-cell phones >anymore in a year or five so that either they are in a position to >deliver streaming media to the home in competition with the cable >company or they all belly right up in that market. A bit of a race, in >other words, where they are ahead and behind at the same time. The term of art is "triple play"... phone, entertainment, internet access all from one provider. >It won't be done for computer users, though. Not enough money in it, >and what there is is already developed. 
Delivering entertainment, on
>the other hand -- there aren't any visible upper bounds on what one can use
>there. If you treble the bandwidth, you just make HDTV cheaper and
>permit more stations and make it more feasible to deliver movies on
>demand in real time -- bleep through 4-5 GB in 1 minute or two, then
>display it at your leisure...

Subject to a raft of content management requirements (maybe I don't
want you fast forwarding through commercials? Maybe I want to charge
you "per viewing").

The big question/challenge in that business is how do you monetize
individual uses of something that has previously been consumed as a
utility stream, e.g. rather than broadcasting a program for all, or
none, to view, and charging advertisers by using statistical measures
(Nielsen ratings), can I actually measure the viewership (with
demographic breakdowns) and charge on that basis.. Yes, Mr. Vendor,
354,313.5 people watched your commercial, of which 516 were in your
target demographic....

Or, rather than charging you $10/month for HBO, and you can watch that
movie as many times as you want, we can charge you only $0.99 per
viewing of Movie #A (so that we can pay studio X their fee) and $0.89
for a viewing of Movie #B (because Studio Y didn't give as many gross
points to their star, so they can discount it).

Jim

From James.P.Lux at jpl.nasa.gov Thu Feb 1 16:56:12 2007
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] massive parallel processing application required
In-Reply-To:
References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov>
Message-ID: <6.2.3.4.2.20070201162503.02d3be80@mail.jpl.nasa.gov>

At 03:25 PM 2/1/2007, Mark Hahn wrote:
>>>the internet bubble.
in those days, it was popular to claim that >>>the network >>>was becoming truely ubiquitous and incomprehensibly fast. for instance: >> >>In the long run, ubiquitous and fast IS going to be true (however, latency is > >in the long run, everything is true ;) > >>gross oversupply of fiber across the Atlantic. Hence the >>availability of cheap flat rate long distance (5c a minute >>anywhere, anytime).. the bulk of the system is no longer capacity limited. > >interesting - I assumed that long-distance became cheap not due to >oversupply of fiber and bandwidth, but rather transition away from >old-fashioned circuit switching (ie, towards digital compressed voice >over packets.) Not much compression going on for voice traffic. It's carried as 64 kbps data at 8 ksamples/second, pretty much. There is some statistical multiplexing possible (TASI) because people don't talk at 100% duty cycle, but not a huge amount. >I know that buying fiber/lambdas/bandwidth is still very much not >what I'd call cheap, though I have no doubt it's much better/cheaper >than in the past. In 1993, the capital cost of "one voice channel" worth (64 kbps) of capacity across the atlantic was less than $10, as I recall. Compare that to leased line T-1 rates back then of many dollars per eighth of a mile per month, and that's before you bought the CSU/DSU to connect to the copper. Mind you, the ATT guy thought that it would be 155 Mbps ATM to the desktop, and we see where that went. >>>I don't know about you, but in the 6 years since then, my home net >>>connection has stayed the same speed, possibly a bit more expensive. >> >>Interestingly, they've just rolled out FiOS (fiber to the home) in >>my area, which is a HUGE jump in potential bandwidth from the >>existing DSL or Cable Modem delivery methods. And, moderately >>competitive in price (5 Mbps is $40/month, including the bundled >>ISP kinds of features). What's fascinating is the faster tiers.. 
>>you can get 15 Mbps down/2 up for $50/mo and 30 M down/5 up for $180 > >seems strange to me - what kind of residential customer would pay >for that kind of thing (and remain free of the RIAA/MPAA)? An interesting question.. I think the upper tiers are there to complement similar offerings in the commercial/business market. Or, there IS a burgeoning market for live video feeds from adult entertainment providers. Without them, the VCR market would never have taken off. Contemplate Youtube type applications, but in HD.. 20 Mbps is the basic rate for HD. I might be interested, for instance, in seeing the Mentos and Pepsi artists in HD, rather than lowfi 15 fps QCIF. But, also, consider something like streaming audio at CD quality (not MP3 compressed).. A stereo 44.1ksps 16 bit stream is about 1.5 Mbps, and say I, my wife, and my daughters all want to listen to different programs at the same time. There will also be video that is not afflicted by MPAA. NasaTV is free to all and streamed over the network as well as being shoved out over C-band transponders. I can see using 15M+ sorts of rates in bursts for myself (downloading the aforementioned climate databases, for instance...) >some smart form of wireless seems like an obvious good solution for >residential last-mile. maybe that's a disruptive innovation that will >finally put the telco/cableco's out of their misery. Nobody has come up with a *good* wireless solution that is as cheap and reliable as pulling a physical media. There's a raft of spectrum occupancy issues, etc. Let's assume you've got a neighborhood with 400 houses in it at a density of, say, 500 square meters/house (roughly 8 houses/acre). It's perhaps 10-20 meters between houses on average. Say each house needs 50 Mbps of bandwidth (e.g. two cable channels worth). 
If you use a short range wireless scheme (notional range of 50 m) a
given transmitter is going to cover half a dozen houses, so each
transmitter would need a bandwidth of about 300 Mbps (which is fairly
hefty, but not out of the question). AND there would need to be some
smart switching in the system that feeds that transmitter the correct
subset of the Terabits/second available... And, some way to cleverly
do spectrum reuse (so that if you have houses A, B, C, D, and E lined
up, A can use channel 1, B can use channel 6, C can use channel 11,
and by house D, the signal for Channel 1 going to house A has faded
enough that we can reuse it for D, 6 for E, etc.) This is highly
nontrivial, and nobody has come up with an automagic way to do it that
is efficient and self organizing.

Right now, though, the Cable TV folks feed 1 GHz of bandwidth to you
and YOU do the channel selection, which reduces their physical plant
cost... all they need is power distribution with no intelligence, just
management of SNR. (this breaks down in the upstream case, which is a
fundamental problem with Cable Modems)

>>Likewise, a small business with half a dozen or a dozen desktops
>>and a couple servers isn't going to see a huge benefit from faster
>>networking, because they're throttled by the server's disk speed,
>>more than anything else.
>
>if their servers' disks are only 100bT speed, they're broken. it may well
>be that most SMB servers are that crappy, in spite of the fact that
>a recycled linux box and one disk will deliver 40 MB/s...

It's not so much that kind of limitation as that the 10 desktops
aren't all going to be hitting the server at exactly the same time,
most of the time. Relatively few business desktops are doing things
like streaming video. They're just moving documents to and from the
server, and that's a sort of bursty traffic, so it's not a big deal.
And 40 MB/s implies 13 Megatransfers/second across a 32 bit bus; with
a 33 MHz bus, a transfer from the disk and a transfer to the NIC
doesn't leave a whole lot of time for fetching instructions from RAM,
etc.

Now, if your office is comprised of diskless clients....that's another
story.

>>So, you're looking at GigE making a difference in two
>>areas: replacing cable TV (all those 20 Mbps HDTV streams)
>
>how many 20Mb streams does a typical endpoint need? either residential
>or commercial?

I can see at least 3 streams for residential. 1 for live viewing, 1
for recording on the TIVO, 1 for the second TV.

>>and in big companies. But even in big companies, GigE to the
>>desktop doesn't necessarily buy you much, if you're all competing
>>for the same server resources.
>
>wow, dim view of the competence of server admins, but you may be right...

No.. it's that network traffic from desktop to server just isn't all
that high in most environments. For instance, I consume almost NO
network bandwidth most of the time at work, because most of what I
work with is on the local machine.

Even in a high transaction rate call center, there's just not that
many bytes flying back and forth.

"yes, Mr. Lux, and your account number is? ...." blurp there's 100
bytes to the server in a SQL query, and maybe a kilobyte coming back.

10 seconds pass, "And you'd like the bullion delivered where?"
blurp.. after 30 seconds, the operator sends the delivery address
with a few hundred bytes to the transaction processor.

blurp...100 bytes come back "your confirmation number is 2.71828,
Thank you for calling"

Then, that triggers a few more kbytes of traffic to the vault and the
delivery truck company, etc.

But overall, that's what, an average of 1 kb per second, at most? So
the call center has 1000 people.. we're up to only 1 Megabit/second.
Even if they do complete screen paints at every step over the
network, it's still not that much traffic.
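
As a rough check on the call-center estimate above, here is a minimal sketch in Python. Only the 100-byte query, the roughly 1 kilobyte reply, the 100-byte confirmation and the 1000 operators come from the message; the sizes marked "assumed" and the 45-second call length are illustrative guesses, and the names are hypothetical.

```python
# Rough per-call traffic in bytes, following the call-center story above.
# Entries marked "assumed" are illustrative guesses, not figures from the thread.
CALL_MESSAGES_BYTES = {
    "account lookup query": 100,
    "account lookup reply": 1_000,           # "maybe a kilobyte coming back"
    "delivery address": 300,                 # assumed: "a few hundred bytes"
    "confirmation reply": 100,
    "vault and trucking follow-up": 3_000,   # assumed: "a few more kbytes"
}
CALL_DURATION_S = 45   # assumed length of one call, in seconds
OPERATORS = 1_000      # call-center size used in the message


def average_bits_per_second(messages_bytes, duration_s):
    """Average rate for one operator: total bits moved divided by call length."""
    return sum(messages_bytes.values()) * 8 / duration_s


def site_megabits_per_second(per_operator_bps, operators):
    """Aggregate average rate if every operator is on a call at once."""
    return per_operator_bps * operators / 1e6


per_operator = average_bits_per_second(CALL_MESSAGES_BYTES, CALL_DURATION_S)
site = site_megabits_per_second(per_operator, OPERATORS)
print(f"per operator: {per_operator:.0f} bit/s")
print(f"whole site:   {site:.2f} Mbit/s")
```

With these assumed inputs the per-operator average stays under the 1 kb per second quoted above, and a 1000-seat site stays under 1 Megabit/second. The estimate is an average; it says nothing about how quickly any single burst completes.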
Some sort of call center where they look at scanned images might be an example of a bigger volume user. "Yes.. I'm looking at your bearer bonds now, and we'll be able to execute that sell order for 100,000 shares MSFT." or, more realistically, "I'm looking through your loan application now and on page 32, there's a problem with the property description you submitted three years ago." or "Yes, Mr. Lux, that IS a big dent that we need to fix in your bumper" But even then, a full screen image is only a few megabytes, at most, unless you're totally profligate with uncompressed 24 bit TIFF images. The big advantage of GigE to the desktop is that when you do send big files (say a full screen image), it takes less time. But the average rate is still low. >regards, mark hahn. James Lux, P.E. Spacecraft Radio Frequency Subsystems Group Flight Communications Systems Section Jet Propulsion Laboratory, Mail Stop 161-213 4800 Oak Grove Drive Pasadena CA 91109 tel: (818)354-2075 fax: (818)393-6875 From deadline at eadline.org Thu Feb 1 18:34:46 2007 From: deadline at eadline.org (Douglas Eadline) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070201162503.02d3be80@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> <6.2.3.4.2.20070201162503.02d3be80@mail.jpl.nasa.gov> Message-ID: <48680.192.168.1.1.1170383686.squirrel@mail.eadline.org> --snip-- > There is some > statistical multiplexing possible (TASI) because people don't talk at > 100% duty cycle, but not a huge amount. You have not met my ... 
(never mind, you never know where these emails end up) -- Doug From James.P.Lux at jpl.nasa.gov Thu Feb 1 19:31:57 2007 From: James.P.Lux at jpl.nasa.gov (Jim Lux) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <48680.192.168.1.1.1170383686.squirrel@mail.eadline.org> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> <6.2.3.4.2.20070201162503.02d3be80@mail.jpl.nasa.gov> <48680.192.168.1.1.1170383686.squirrel@mail.eadline.org> Message-ID: <6.2.3.4.2.20070201193107.02d30f48@mail.jpl.nasa.gov> At 06:34 PM 2/1/2007, Douglas Eadline wrote: > --snip-- > > > There is some > > statistical multiplexing possible (TASI) because people don't talk at > > 100% duty cycle, but not a huge amount. > >You have not met my ... > >(never mind, you never know where these emails end up) Hah.. I need 5 Mbps at home, just to keep with the traffic on this list. Talk about 100% duty cycle. Jim From gdjacobs at gmail.com Fri Feb 2 00:57:34 2007 From: gdjacobs at gmail.com (Geoff Jacobs) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> Message-ID: <45C2FCFE.7020303@gmail.com> Jim Lux wrote: > (If I go with the FiOS offering though, that may prompt > some re-evaluation) Why? Only a third of the bandwidth of fast ethernet at peak speeds (which you aren't going to see). Hell, an rtl8139 could handle that. 
> Likewise, a small business with half a dozen or a dozen desktops and a > couple servers isn't going to see a huge benefit from faster networking, > because they're throttled by the server's disk speed, more than anything > else. (assuming they're not hosting a big website, etc.) More likely throttled by the operators. > So, you're looking at GigE making a difference in two areas: replacing > cable TV (all those 20 Mbps HDTV streams) and in big companies. But > even in big companies, GigE to the desktop doesn't necessarily buy you > much, if you're all competing for the same server resources. Certain areas, such as digital video content development, are much more accessible with high speed interconnect going commodity. However, very few companies have the concentrated, high volume databases which would really tax a network. > James Lux, P.E. > Spacecraft Radio Frequency Subsystems Group > Flight Communications Systems Section > Jet Propulsion Laboratory, Mail Stop 161-213 > 4800 Oak Grove Drive > Pasadena CA 91109 > tel: (818)354-2075 > fax: (818)393-6875 -- Geoffrey D. Jacobs From rgb at phy.duke.edu Fri Feb 2 04:29:06 2007 From: rgb at phy.duke.edu (Robert G. 
Brown) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] massive parallel processing application required In-Reply-To: <6.2.3.4.2.20070201193107.02d30f48@mail.jpl.nasa.gov> References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> <6.2.3.4.2.20070201162503.02d3be80@mail.jpl.nasa.gov> <48680.192.168.1.1.1170383686.squirrel@mail.eadline.org> <6.2.3.4.2.20070201193107.02d30f48@mail.jpl.nasa.gov> Message-ID: On Thu, 1 Feb 2007, Jim Lux wrote: > At 06:34 PM 2/1/2007, Douglas Eadline wrote: > >> --snip-- >> >> > There is some >> > statistical multiplexing possible (TASI) because people don't talk at >> > 100% duty cycle, but not a huge amount. >> >> You have not met my ... >> >> (never mind, you never know where these emails end up) > > > Hah.. I need 5 Mbps at home, just to keep with the traffic on this list. > > Talk about 100% duty cycle. Aw, c'mon, I don't type THAT fast... rgb > > > Jim > > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu

From James.P.Lux at jpl.nasa.gov Fri Feb 2 06:19:47 2007
From: James.P.Lux at jpl.nasa.gov (Jim Lux)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] massive parallel processing application required
In-Reply-To: <45C2FCFE.7020303@gmail.com>
References: <45BE8E7E.4010808@brookes.ac.uk> <45C04FE9.5050502@streamline-computing.com> <45C07AE4.1020508@brookes.ac.uk> <45C14358.8030800@brookes.ac.uk> <45C1542A.4030701@tamu.edu> <45C1DC42.90604@brookes.ac.uk> <6.2.3.4.2.20070201132817.030f61e8@mail.jpl.nasa.gov> <45C2FCFE.7020303@gmail.com>
Message-ID: <6.2.3.4.2.20070202061200.030f4090@mail.jpl.nasa.gov>

At 12:57 AM 2/2/2007, Geoff Jacobs wrote:
>Jim Lux wrote:
>
> > (If I go with the FiOS offering though, that may prompt
> > some re-evaluation)
>Why? Only a third of the bandwidth of fast ethernet at peak speeds
>(which you aren't going to see). Hell, an rtl8139 could handle that.
>
> > Likewise, a small business with half a dozen or a dozen desktops and a
> > couple servers isn't going to see a huge benefit from faster networking,
> > because they're throttled by the server's disk speed, more than anything
> > else. (assuming they're not hosting a big website, etc.)
>More likely throttled by the operators.

The operators of the desktops, I assume. The business offerings have
committed information rates, etc.

> > So, you're looking at GigE making a difference in two areas: replacing
> > cable TV (all those 20 Mbps HDTV streams) and in big companies. But
> > even in big companies, GigE to the desktop doesn't necessarily buy you
> > much, if you're all competing for the same server resources.
>Certain areas, such as digital video content development, are much more
>accessible with high speed interconnect going commodity. However, very
>few companies have the concentrated, high volume databases which would
>really tax a network.
One comment the guy from ATT made back in the 90s was that it's
impossible to predict what really might happen when you do have real
ubiquitous high speed access to the desktop (which is only just now
becoming available, in the sense that the network connection is
faster than the disk or CPU). It's that paradigm shift thing. The
current software model and the conceptual models of the vast majority
of application developers (or users who want things done) tend to be
framed by the assumption that network access is slow and/or expensive
(hence my comment about having everything locally).

If you have a very fat, low latency, cheap pipe, all of a sudden,
there are classes of applications (some of which we, by definition,
cannot anticipate) that might become possible. For instance, the
vision of "per use pricing" for office tools with very thin clients
becomes possible. With a fat pipe, you could go back to the 60s
timesharing model, with the desktop being just a display and a
keyboard.

Jim

From rgb at phy.duke.edu Mon Feb 5 04:04:32 2007
From: rgb at phy.duke.edu (Robert G. Brown)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] wulfware update...
Message-ID:

In case anybody on list currently cares, I spent the weekend
repackaging the wulfware suite (not to be confused with the warewulf
suite:-): xmlsysd, libwulf, wulfstat, wulflogger, and wulf2html
(previously wulfweb). They are now in a single source tree in a single
source rpm or tarball, which builds rpms for each of these packages
all at once.

I also worked on making wulf2html into a chkconfig controllable
service. Basically, if you install it and configure it (edit scripts
and the wulfhosts file in /etc/wulfware) on a system that can write to
webspace, you can chkconfig it on and it will automatically start up
on boot. It's probably not the most robust application of this sort
ever written yet but it works automagically for me -- it comes up with
a page that shows localhost only by default.
This repackaging should make it easier to develop UIs in a single tree
that also contains the library, even on systems that don't have the
rpms installed. It was a bit of a pain to work on the library and a UI
for testing it at the same time. Hopefully this will facilitate my
work (long suspended) on gwulfstat.

The new one-stop shop link is:

  http://www.phy.duke.edu/~rgb/Beowulf/wulfware.php

and the old links have gone away. I bumped the revision numbers to a
notch above the highest number in the collective tree so that the rpms
can be dropped into a yum repo and update happily -- from now on all
numbers will advance together as a unit.

   rgb

--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu

From walid.shaari at gmail.com Mon Feb 5 09:48:33 2007
From: walid.shaari at gmail.com (Walid)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] failure rates
In-Reply-To: <45C248F6.6080807@dcc.ufmg.br>
References: <45C248F6.6080807@dcc.ufmg.br>
Message-ID:

Hi,

I do not know if I can really help answer the original question, but
most of the failures we see from the system side are, in this order:

  hard disks
  interconnect cards
  misconfigured node
  Uncorrected Memory errors
  system board failures
  Unexplainable failures

Failures related to the application itself we do not see, as the user
will resubmit his job and will correct their mistakes quietly.

The question is: clusters by definition are not highly available
systems. They are made up of commodity hardware, and if most of these
clusters are using the standard mpi implementation then they will work
on the principle "if it fails, stop". And most of the time failure
investigation is minimal, as the priority is getting the node back to
work. So is failure rate really of concern? If it were, we would see
more fault tolerance layers in clusters and failure rate metrics in
monitoring tools and reports.
I am interested in reducing these failure rates as user demands are
growing: instead of using a few nodes, they are now using as many as
possible and requesting even more, and the more you give them, the
more failures we will get!

What will you be trying to achieve with your thesis? Will the question
of how to reduce or manage the failures be part of it?

regards

Walid.

From i.kozin at dl.ac.uk Tue Feb 6 10:57:23 2007
From: i.kozin at dl.ac.uk (Kozin, I (Igor))
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] benchmarking database
Message-ID:

Dear All,

we have recently launched an on-line benchmarking database
http://www.cse.clrc.ac.uk/disco/dbd/
The emphasis is on clusters but not exclusively so.

At the moment it has two simple interfaces:
one for applications
http://www.cse.clrc.ac.uk/disco/dbd/search-parallel.php
and another for communication benchmarks IMB/PMB
http://www.cse.clrc.ac.uk/disco/dbd/search-pmb.php

Internally the database treats everything equally. The most basic unit
is an independent "processing element" (PE) which can be a single-core
CPU, a core in a multi-core CPU, GPU, cell or whatever. PEs can be
oversubscribed, i.e. run more than one thread (e.g. when HT or SMT is
enabled). PEs aggregate into nodes between which communication takes
place via some sort of interconnect. Application performance is
compared against the same number of PEs.

Hopefully we will improve the interface eventually and grow the number
of applications and benchmarks. All your feedback is highly
appreciated.

If you would like to share your benchmarking data please contact me
off the list. We are happy to accommodate results from trusted
sources.

Regards,

Igor I. Kozin (i.kozin at dl.ac.uk)
CCLRC Daresbury Laboratory, WA4 4AD, UK
skype: in_kozin
tel: +44 (0) 1925 603308
http://www.cse.clrc.ac.uk/disco

From mathog at caltech.edu Tue Feb 6 11:08:41 2007
From: mathog at caltech.edu (David Mathog)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] S2466 permanent poweroff, round 2
Message-ID:

The Tyan S2466N-4M motherboards bite once again.

For various reasons I need to upgrade the software on our cluster
from Mandrake 10.1 to something a bit more modern, so I tried
Mandriva 2007. Wiped / and /boot on a test node, did
a clean install of Mandriva 2007, and pretty much
everything worked as it should. Unfortunately
this resurrected the old problem where "poweroff" leaves the machine
in a dead state: it doesn't respond to the front panel button
until the power is unplugged, 20 seconds pass, and the power is
restored. This problem was resolved the first time it showed up
many years ago by upgrading to BIOS 4.06. There's no newer BIOS,
so that isn't going to fix it this time. It isn't a Mandriva 2007
problem per se because we have another machine (a very old
Athlon 850 with a Gigabyte motherboard) running that OS and it
does "poweroff" correctly. The two machines (poweroff working and
not working) have exactly the same versions of every RPM package.
LILO is pretty basic on both of them too:

image=/boot/vmlinuz
        label="linux"
        root=/dev/hda5
        initrd=/boot/initrd.img
        append="resume=/dev/hda2"

I suppose I could try a vanilla kernel next, but maybe there's some
way to diagnose what state (S5, S0, whatever) the machine is going
to on poweroff, and why? The documentation for ACPI itself is
humongous, and for the linux implementations essentially absent,
so I don't know what tool to run to find or modify this info.
Interestingly /proc/acpi/sleep is missing on Mandriva 2007, but that
doesn't seem to hurt anything on the working machine, so maybe that's
(yet another) change in the ACPI interface?
Thanks,

David Mathog
mathog@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

From landman at scalableinformatics.com Tue Feb 6 11:34:02 2007
From: landman at scalableinformatics.com (Joe Landman)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] S2466 permanent poweroff, round 2
In-Reply-To:
References:
Message-ID: <45C8D82A.2090101@scalableinformatics.com>

David Mathog wrote:
> The Tyan S2466N-4M motherboards bite once again.
>
> For various reasons I need to upgrade the software on our cluster
> from Mandrake 10.1 to something a bit more modern, so I tried
> Mandriva 2007. Wiped / and /boot on a test node, did
> a clean install of Mandriva 2007, and pretty much
> everything worked as it should. Unfortunately
> this resurrected the old problem where "poweroff" leaves the machine
> in a dead state: it doesn't respond to the front panel button
> until the power is unplugged, 20 seconds pass, and the power is
> restored. This problem was resolved the first time it showed up

Hi Dave:

We have seen this on lots of Tyan boards in general. Kind of hard to
recommend steering clear if you have a room full of them.

> many years ago by upgrading to BIOS 4.06. There's no newer BIOS,
> so that isn't going to fix it this time. It isn't a Mandriva 2007
> problem per se because we have another machine (a very old
> Athlon 850 with a Gigabyte motherboard) running that OS and it
> does "poweroff" correctly. The two machines (poweroff working and
> not working) have exactly the same versions of every RPM package.
> LILO is pretty basic on both of them too:
>
>
> image=/boot/vmlinuz
>         label="linux"
>         root=/dev/hda5
>         initrd=/boot/initrd.img
>         append="resume=/dev/hda2"

I seem to remember having to do a noacpi option to make them behave.
Something about acpi on these boards was horribly broken.

FWIW we have seen this with a few late model Opteron (Tyan) boards as well.
:(

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615

From ballen at gravity.phys.uwm.edu Tue Feb 6 12:17:03 2007
From: ballen at gravity.phys.uwm.edu (Bruce Allen)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] important WDC firmware update
Message-ID:

Here is an important firmware update for WDC WDXXXXYS series drives on
RAID controllers. Without this update you will see periodic drive
dropouts and rebuilds on the RAID sets. We've been seeing this a lot
with some Areca controllers; I am hoping that this firmware update
will fix the problem.

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1493&p_created=1168299631&p_sid=La1EsAti&p_accessibility=0&p_redirect=&p_lva=&p_sp=cF9zcmNoPTEmcF9zb3J0X2J5PSZwX2dyaWRzb3J0PSZwX3Jvd19jbnQ9MjImcF9wcm9kcz0mcF9jYXRzPSZwX3B2PSZwX2N2PSZwX3NlYXJjaF90eXBlPXNlYXJjaF9mbmwmcF9wYWdlPTEmcF9zZWFyY2hfdGV4dD1maXJtd2FyZQ**&p_li=&p_topview=1

Cheers,
	Bruce

From jlb17 at duke.edu Tue Feb 6 12:34:29 2007
From: jlb17 at duke.edu (Joshua Baker-LePain)
Date: Tue May 13 01:05:44 2008
Subject: [Beowulf] important WDC firmware update
In-Reply-To:
References:
Message-ID:

On Tue, 6 Feb 2007 at 2:17pm, Bruce Allen wrote

> Here is an important firmware update for WDC WDXXXXYS series drives on RAID
> controllers. Without this update you will see periodic drive dropouts and
> rebuilds on the RAID sets. We've been seeing this a lot with some Areca
> controllers; I am hoping that this firmware update will fix the problem.

Again!? You'd think they would have learned from last time.

http://www.3ware.com/KB/article.aspx?id=10240

Note that more drives than just those mentioned in that link were
affected, and note the date -- 2003. *sigh* What's old is new again,
apparently.

Thanks for the heads up.
-- Joshua Baker-LePain Department of Biomedical Engineering Duke University From ballen at gravity.phys.uwm.edu Tue Feb 6 12:40:14 2007 From: ballen at gravity.phys.uwm.edu (Bruce Allen) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] important WDC firmware update In-Reply-To: References: Message-ID: On Tue, 6 Feb 2007, Joshua Baker-LePain wrote: > On Tue, 6 Feb 2007 at 2:17pm, Bruce Allen wrote > >> Here is an important firmware update for WDC WDXXXXYS series drives on RAID >> controllers. Without this update you will see periodic drive dropouts and >> rebuilds on the RAID sets. We've been seeing this a lot with some Areca >> controllers; I am hoping that this firmware update will fix the problem. > > Again!? You'd think they would have learned from last time. > > http://www.3ware.com/KB/article.aspx?id=10240 > > Note that more drives than just those mentioned in that link were affected, > and note the date -- 2003. *sigh* What's old is new again, apparently. > > Thanks for the heads up. You're welcome! The old problem was the infamous acoustic noise reduction setting. There I think the only change needed was to modify the default value of the firmware setting, which could also have been done with hdparm. The new problem seems to be related to the SMART auto offline test which the drive periodically runs to update its SMART data. But this is just an educated guess based on what WDC has written in their FAQ. Cheers, Bruce From rgb at phy.duke.edu Tue Feb 6 18:33:52 2007 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] S2466 permanent poweroff, round 2 In-Reply-To: References: Message-ID: On Tue, 6 Feb 2007, David Mathog wrote: > The Tyan S2466N-4M motherboards bite once again. > > I suppose I could try a vanilla kernel next, but maybe there's some > way to diagnose what state (S5, S0, whatever) the machine is going > to on poweroff, and why?
The documentation for ACPI itself is > humongous, and for the linux implementations essentially absent, > so I don't know what tool to run to find or modify this info. > Interestingly /proc/acpi/sleep is missing on Mandriva 2007, but that > doesn't seem to hurt anything on the working machine, so maybe that's > (yet another) change in the ACPI interface? Basically, good luck. This is why we left our 2466N's running RH 7.3 basically "forever". They were so damn touchy and difficult to get running so that they actually were stable and so that the buttons worked and so on that once we finally got there, I'd have taken a hammer to the head of anybody that tried to change them. Besides, they worked. Quite well and all the time. They were isolated so kernel security wasn't a major issue, so why change? Just put back the old OS image. Or is there some specific thing that you need to do that you can't on the old kernels? rgb > > Thanks, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech > _______________________________________________ > Beowulf mailing list, Beowulf@beowulf.org > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb at phy.duke.edu Tue Feb 6 19:06:15 2007 From: rgb at phy.duke.edu (Robert G. Brown) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] important WDC firmware update In-Reply-To: References: Message-ID: On Tue, 6 Feb 2007, Bruce Allen wrote: >>> Here is an important firmware update for WDC WDXXXXYS series drives on >>> RAID controllers. Without this update you will see periodic drive dropouts ... > The new problem seems to be related to the SMART auto offline test which the > drive periodically runs to update its SMART data.
But this is just an > educated guess based on what WDC has written in their FAQ. I take it it isn't an issue with md raid? Linux can monitor via smartd and not get confused, we can hope? rgb > > Cheers, > Bruce > -- Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From i.kozin at dl.ac.uk Wed Feb 7 02:59:13 2007 From: i.kozin at dl.ac.uk (Kozin, I (Igor)) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] benchmarking database In-Reply-To: Message-ID: Apologies for an error. It was pointed out to me that the 2nd and 3rd links are incorrect. They should read as http://www.cse.clrc.ac.uk/disco/database/search-parallel.php http://www.cse.clrc.ac.uk/disco/database/search-pmb.php respectively. You may have found them from the main page anyway. > -----Original Message----- > From: beowulf-bounces@beowulf.org > [mailto:beowulf-bounces@beowulf.org]On > Behalf Of Kozin, I (Igor) > Sent: 06 February 2007 18:57 > To: Beowulf Mailing List (E-mail) > Subject: [Beowulf] benchmarking database > > > Dear All, > we have launched recently an on-line benchmarking database > http://www.cse.clrc.ac.uk/disco/dbd/ > The emphasis is on clusters but not exclusively so. > At the moment it has two simple interfaces: one for applications > http://www.cse.clrc.ac.uk/disco/dbd/search-parallel.php > and another for communication benchmarks IMB/PMB > http://www.cse.clrc.ac.uk/disco/dbd/search-pmb.php > > Internally the database treats everything equally. > The most basic unit is an independent "processing element" (PE) > which can be a single-core CPU, a core in a multi-core CPU, GPU, cell > or whatever. PEs can be oversubscribed i.e. run more than one thread > (e.g.
when HT or SMT is enabled). PEs aggregate into nodes > between which communication takes place via some sort of interconnect. > Application performance is compared against the same number of PEs. > > Hopefully we will improve the interface eventually and grow the > number of applications and benchmarks. > All your feedback is highly appreciated. > > If you would like to share your benchmarking data please contact > me off the list. We are happy to accommodate results from trusted > sources. > > Regards, > Igor > > I. Kozin (i.kozin at dl.ac.uk) > CCLRC Daresbury Laboratory, WA4 4AD, UK > skype: in_kozin > tel: +44 (0) 1925 603308 > http://www.cse.clrc.ac.uk/disco From mathog at caltech.edu Wed Feb 7 11:15:59 2007 From: mathog at caltech.edu (David Mathog) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] S2466 permanent poweroff, round 2 Message-ID: > We have seen this on lots of Tyan boards in general. This probably doesn't help, on the problem machine: % cd /tmp % cp /proc/acpi/dsdt . % iasl -d dsdt % iasl -tc dsdt.dsl Intel ACPI Component Architecture ASL Optimizing Compiler version 20060707 [Sep 8 2006] Copyright (C) 2000 - 2006 Intel Corporation Supports ACPI Specification Revision 3.0a dsdt.dsl 234: Store (Local0, Local0) Error 4049 - ^ Method local variable is not initialized (Local0) dsdt.dsl 239: Store (Local0, Local0) Error 4049 - ^ Method local variable is not initialized (Local0) dsdt.dsl 244: Store (Local0, Local0) Error 4049 - ^ Method local variable is not initialized (Local0) dsdt.dsl 295: Method (\_WAK, 1, NotSerialized) Warning 1079 - ^ Reserved method must return a value (_WAK) dsdt.dsl 309: Store (Local0, Local0) Error 4049 - ^ Method local variable is not initialized (Local0) dsdt.dsl 314: Store (Local0, Local0) Error 4049 - ^ Method local variable is not initialized (Local0) ASL Input: dsdt.dsl - 2550 lines, 85340 bytes, 671 keywords Compilation complete. 
5 Errors, 1 Warnings, 0 Remarks, 346 Optimizations The _WAK warning is suspicious but I see that on other machines where the powerbutton does work, so that alone is not sufficient to cause the permanent poweroff. There's a note somewhere that at least older versions of linux ACPI did not check the return value in any case. However the 5 instances where uninitialized variables are used would go a long way towards explaining the flakiness of this Tyan board. That said, to date I've *never* seen a BIOS whose DSDT could be dumped and then recompiled cleanly. The best so far was a SuperMicro motherboard with only 1 error and 7 warnings. This is what comes, I believe, of 500 page specs like that for ACPI. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From mathog at caltech.edu Wed Feb 7 11:26:42 2007 From: mathog at caltech.edu (David Mathog) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] S2466 permanent poweroff, round 2 Message-ID: > However the 5 > instances where uninitialized variables are used would go a long > way towards explaining the flakiness of this Tyan board. On second thought, no. I checked these code sections and each instance is like this one: Method (_MSG, 1, NotSerialized) { Store (Local0, Local0) } Apparently they had to put something into the body of the method and used "store a value back onto itself" as a sort of no-op. Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From peter.st.john at gmail.com Wed Feb 7 11:59:42 2007 From: peter.st.john at gmail.com (Peter St. John) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] S2466 permanent poweroff, round 2 In-Reply-To: References: Message-ID: That makes it sound like instead of NOOP, it's "test if value is initialized, and raise an error if not" which may not have been intended. You might try commenting out that line. 
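For anyone repeating the DSDT check discussed in this thread on several machines, the compiler diagnostics can be tallied mechanically. What follows is only a minimal sketch: the function name tally_iasl and the sample text are made up for illustration, and the regular expression assumes the "Error NNNN -" / "Warning NNNN -" markers seen in the iasl 20060707 output quoted earlier; other iasl versions may format diagnostics differently.

```python
import re
from collections import Counter

# Matches the marker printed with each diagnostic, e.g. "Error    4049 -"
# or "Warning  1079 -". Assumption: format as in the output quoted in this thread.
DIAG = re.compile(r"\b(Error|Warning|Remark)\s+(\d+)\s+-")


def tally_iasl(output: str) -> Counter:
    """Count (severity, code) pairs found in captured iasl compiler output."""
    return Counter((m.group(1), int(m.group(2))) for m in DIAG.finditer(output))


# Sample abridged from the compiler output quoted in this thread;
# the exact whitespace layout is a guess and does not matter to the regex.
SAMPLE = """\
dsdt.dsl   234:     Store (Local0, Local0)
Error    4049 -                ^ Method local variable is not initialized (Local0)
dsdt.dsl   295:     Method (\\_WAK, 1, NotSerialized)
Warning  1079 -                 ^ Reserved method must return a value (_WAK)
dsdt.dsl   309:     Store (Local0, Local0)
Error    4049 -                ^ Method local variable is not initialized (Local0)
"""

if __name__ == "__main__":
    for (severity, code), count in sorted(tally_iasl(SAMPLE).items()):
        print(f"{severity} {code}: {count}")
```

Feeding it the saved output of "iasl -tc dsdt.dsl" from each board gives a quick way to compare BIOSes; the summary line ("5 Errors, 1 Warnings, ...") is deliberately not matched, so counts are not doubled.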
Peter On 2/7/07, David Mathog wrote: > > > However the 5 > > instances where uninitialized variables are used would go a long > > way towards explaining the flakiness of this Tyan board. > > On second thought, no. I checked these code sections and > each instance is like this one: > > Method (_MSG, 1, NotSerialized) > { > Store (Local0, Local0) > } > > Apparently they had to put something into the body of the method > and used "store a value back onto itself" as a sort of no-op. > > Regards, > > David Mathog > mathog@caltech.edu > Manager, Sequence Analysis Facility, Biology Division, Caltech From landman at scalableinformatics.com Wed Feb 7 18:11:50 2007 From: landman at scalableinformatics.com (Joe Landman) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] fast compiler question (pathscale/portland group/gcc) Message-ID: <45CA86E6.5030201@scalableinformatics.com> Folks: Rebuilding a code that uses sse2 inlines. Apart from setting up the appropriate include path for the intrinsic headers, are there any magic switches I need to set? I had done this a while ago, and now I am rebuilding someone-elses-code, and trying to remember what I did before. Most interested in gcc/pathscale. Have PGI locally, and others on remote system. Pointers, clues, and larts welcome.
Joe -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics LLC, email: landman@scalableinformatics.com web : http://www.scalableinformatics.com phone: +1 734 786 8423 fax : +1 734 786 8452 or +1 866 888 3112 cell : +1 734 612 4615 From mathog at caltech.edu Thu Feb 8 13:18:18 2007 From: mathog at caltech.edu (David Mathog) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] RE: S2466 permanent poweroff, round 2 Message-ID: After a lot of work, and much help at the kernel level from Alexey Starikovskiy, the solution turned out to be using chkconfig --del to turn off all of these: acpi, acpid, harddrake, haldaemon, wltool, messagebus, mandi and also to move asus_acpi.ko out of the /lib/modules tree. I have no idea why the asus module was loading (this being a Tyan motherboard) but it was. Along the way, with various combinations of the above services turned on I observed some incredibly bizarre misbehavior on this system. While logged onto the console (not in X11) either "reboot" or "poweroff" would often lock at "Sending all processes the KILL signal...", which is killall5. Once or twice it locked at the message before, "Sending all processes the TERM signal...". In one instance it rebooted and then crashed in the BIOS. With all of these services disabled it seems to run reliably now. Additionally, when acpid was running it was possible to shutdown the system by pushing the front panel button, but then the next "poweroff" would lock at the "KILL signal" message. 
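The service list above is easy to mistype when repeating the fix across many nodes, so here is a minimal sketch of how it might be scripted. It only covers the "chkconfig --del" step named in the message; moving asus_acpi.ko is left out. The helper names are hypothetical, and the script defaults to a dry run that prints the commands without executing anything.

```python
import subprocess

# Service names exactly as listed in the message above.
SERVICES = ["acpi", "acpid", "harddrake", "haldaemon", "wltool", "messagebus", "mandi"]


def chkconfig_del_commands(services):
    """Return one `chkconfig --del <service>` argument list per service."""
    return [["chkconfig", "--del", name] for name in services]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: prints the seven commands and changes nothing on the system.
    run(chkconfig_del_commands(SERVICES))
```

Calling run(..., dry_run=False) as root would actually remove the services from the init configuration, so it is worth reviewing the printed list on one test node first.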
Regards, David Mathog mathog@caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From elken at pathscale.com Thu Feb 8 14:07:40 2007 From: elken at pathscale.com (Tom Elken) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] Re: fast compiler question (pathscale/portland group/gcc) Message-ID: <45CB9F2C.6040805@pathscale.com> > Date: Wed, 07 Feb 2007 21:11:50 -0500 > From: Joe Landman > Subject: [Beowulf] fast compiler question (pathscale/portland > group/gcc) > Rebuilding a code that uses sse2 inlines. Apart from setting up the > appropriate include path for the intrinsic headers, are there any magic > switches I need to set? I had done this a while ago, and now I am > rebuilding someone-elses-code, and trying to remember what I did before. > > Most interested in gcc/pathscale. Hi Joe, Regarding PathScale Compilers, I have this from one of our compiler engineers: ----------------------- If the code already uses SSE2 intrinsics, the PathScale compiler does not need any "magic switches" for SSE2 intrinsics to be enabled. Some applications may need a configuration switch like --enable-sse for the application to use sse intrinsics. This is just a configure switch for the application, and not an option for the compiler. 
----------------------- Cheers, Tom -- ~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Tom Elken Manager, Performance Engineering tom.elken@qlogic.com QLogic Corporation 650.934.8056 System Interconnect Group From dkondo at lri.fr Wed Feb 7 01:40:54 2007 From: dkondo at lri.fr (Derrick Kondo) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] [CFP] EuroPVM/MPI'07 Message-ID: <60ec14620702070140n5eb27a1frf3f68d798b68cfc1@mail.gmail.com> *** CALL FOR PAPERS *** EuroPVM/MPI 2007 14th European PVM/MPI Users' Group Meeting Paris, France, September 30 - October 3, 2007 web: http://www.pvmmpi07.org e-mail: chairs@pvmmpi07.org organized by Project Grand-Large (http://grand-large.lri.fr/index.php/Accueil) from INRIA Futurs (http://www-futurs.inria.fr) BACKGROUND AND TOPICS PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) have evolved into the standard interfaces for high-performance parallel programming in the message-passing paradigm. EuroPVM/MPI is the most prominent meeting dedicated to the latest developments of PVM and MPI such as new support tools, implementation and applications using these interfaces. The EuroPVM/MPI meeting naturally encourages discussions of new message-passing and other parallel and distributed programming paradigms beyond MPI and PVM. The 14th European PVM/MPI Users' Group Meeting will be a forum for users and developers of PVM, MPI, and other message-passing programming environments. Through the presentation of contributed papers, vendor presentations, poster presentations and invited talks, attendees will have the opportunity to share ideas and experiences to contribute to the improvement and furthering of message-passing and related parallel programming paradigms.
Topics of interest for the meeting include, but are not limited to: * PVM and MPI implementation issues and improvements * Latest extensions to PVM and MPI * PVM and MPI for high-performance computing, clusters and grid environments * New message-passing and hybrid parallel programming paradigms * Interaction between message-passing software and hardware * Fault tolerance in message-passing programs * Performance evaluation of PVM and MPI applications * Tools and environments for PVM and MPI * Algorithms using the message-passing paradigm * Applications in science and engineering based on message-passing This year special emphasis will be put on large-scale issues, such as those related to hardware and interconnect technologies, or the potential or demonstrated shortcomings of PVM or MPI. As in the preceding years, the special session 'ParSim' will focus on numerical simulation for parallel engineering environments. EuroPVM/MPI 2007 will also hold the new 'Outstanding Papers' session introduced in 2006, where the best papers selected by the program committee will be presented. SUBMISSION INFORMATION Contributors are invited to submit a full paper as a PDF (or Postscript) document not exceeding 8 pages in English (2 pages for poster abstracts and Late and Breaking Results). The title page should contain an abstract of at most 100 words and five specific keywords. The paper needs to be formatted according to the Springer LNCS guidelines [2]. The usage of LaTeX for preparation of the contribution as well as the submission in camera ready format is strongly recommended. Style files can be found at the URL [2]. New work that is not yet mature for a full paper, short observations, and similar brief announcements are invited for the poster session. Contributions to the poster session should be submitted in the form of a two-page abstract. All these contributions will be fully peer reviewed by the program committee.
Submissions to the special session 'Current Trends in Numerical Simulation for Parallel Engineering Environments' (ParSim 2007) are handled and reviewed by the respective session chairs. For more information please refer to the ParSim website [1]. All accepted submissions are expected to be presented at the conference by one of the authors, which requires registration for the conference. IMPORTANT DATES Submission of full papers and poster abstracts: May 7th, 2007; Notification of authors: June 11th, 2007; Camera-ready papers: July 2nd, 2007; Submission of Late and Breaking Results: September 15th, 2007; Tutorials: September 30th, 2007; Conference: October 1st-3rd, 2007. For up-to-date information, visit the conference web site at http://www.pvmmpi07.org. PROCEEDINGS In addition, selected papers of the conference, including those from the 'Outstanding Papers' session, will be considered for publication in a special issue of Parallel Computing in an extended format. GENERAL CHAIR * Jack Dongarra (University of Tennessee) PROGRAM CHAIRS * Franck Cappello (INRIA Futurs) * Thomas Herault (Universite Paris Sud-XI / INRIA Futurs) CONFERENCE VENUE The conference will be held in the historical, cultural and economic center of Paris, the capital of France. The city, which is renowned for its neo-classical architecture, hosts many museums and galleries and has an active nightlife. The symbol of Paris is the 324 metre (1,063 ft) Eiffel Tower on the banks of the Seine. Dubbed "the City of Light" (la Ville Lumiere) since the 19th century, Paris is regarded by many as one of the most beautiful and romantic cities in the world. It is also the most visited city in the world with more than 30 million foreign visitors per year. Paris is easily reachable from any European capital and most of the large European, American and Asian cities. It is an ideal starting point for visiting European institutes and cities.
REFERENCES [1] ParSim 2007: http://wwwbode.in.tum.de/Par/arch/events/parsim07/ [2] Springer Guidelines: http://www.springer.de/comp/lncs/authors.html From =?utf-8?Q?Pablo_Hern=C3=A1n_Rodr?= Wed Feb 7 07:10:09 2007 From: =?utf-8?Q?Pablo_Hern=C3=A1n_Rodr?= (=?utf-8?Q?Pablo_Hern=C3=A1n_Rodr?=) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] mpich ch_p4mpd problem.. Message-ID: Hello, my name is Pablo. I'm having problems with MPI. When I execute an MPI program, this error occurs: MPI_INIT : MPIRUN chose the wrong device ch_p4; program needs device ch_p4mpd From your post, I believe that you know how to change from using ch_p4mpd to ch_p4. I'd be glad if you could tell me how you did that. Thanks Pablo From wavelet at iutlecreusot.u-bourgogne.fr Wed Feb 7 08:24:10 2007 From: wavelet at iutlecreusot.u-bourgogne.fr (Wavelet colloque) Date: Tue May 13 01:05:44 2008 Subject: [Beowulf] Call for papers : Wavelet Applications in Industrial Processing V Message-ID: *** Call for Papers and Announcement *** Wavelet Applications in Industrial Processing V (SA109) Part of SPIE's International Symposium on Optics East 2007 9-12 September 2007 • Seaport World Trade Center • Boston, MA, USA --- Abstract Due Date: 26 February 2007 --- --- Manuscript Due Date: 13 August 2007 --- Web site http://spie.org/Conferences/Calls/07/oe/submitAbstract/index.cfm?fuseaction=SA109 Conference Chairs: Frédéric Truchetet, Univ. de Bourgogne (France); Olivier Laligant, Univ. de Bourgogne (France) Program Committee: Patrice Abry, École Normale Supérieure de Lyon (France); Radu V. Balan, Siemens Corporate Research; Atilla M. Baskurt, Univ.
Claude Bernard Lyon 1 (France); Amel Benazza-Benyahia, Ecole Supérieure des Communications de Tunis (Tunisia); Albert Bijaoui, Observatoire de la Côte d'Azur (France); Seiji Hata, Kagawa Univ. (Japan); Henk J. A. M. Heijmans, Ctr. for Mathematics and Computer Science (Netherlands); William S. Hortos, Associates in Communication Engineering Research and Technology; Jacques Lewalle, Syracuse Univ.; Wilfried R. Philips, Univ. Gent (Belgium); Alexandra Pizurica, Univ. Gent (Belgium); Guoping Qiu, The Univ. of Nottingham (United Kingdom); Hamed Sari-Sarraf, Texas Tech Univ.; Peter Schelkens, Vrije Univ. Brussel (Belgium); Paul Scheunders, Univ. Antwerpen (Belgium); Kenneth W. Tobin, Jr., Oak Ridge National Lab.; Günther K. G. Wernicke, Humboldt-Univ. zu Berlin (Germany); Gerald Zauner, Fachhochschule Wels (Austria) The wavelet transform, multiresolution analysis, and other space-frequency or space-scale approaches are now considered standard tools by researchers in image and signal processing. Promising practical results in machine vision and sensors for industrial applications and nondestructive testing have been obtained, and a lot of ideas can be applied to industrial imaging projects. This conference is intended to bring together practitioners, researchers, and t