new SGI Origin & Onyx 3k?
Robert G. Brown rgb at phy.duke.edu
Wed Jul 26 21:56:16 PDT 2000
- Previous message: new SGI Origin & Onyx 3k?
- Next message: initial ramdisk
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 26 Jul 2000, Kragen Sitaker wrote:

> "W Bauske" bloviates:
> > Exactly how often do you think people want to run 1000 nodes on
> > a single problem? When was the last time you did that?
> >
> > Realistically, the majority of parallel jobs are in range 4-128
> > processors, depending on what you want to do and how the code scales.
> > SGI's do fine in that space. So do other non-beowulf systems or
> > no one with sense would buy them.
>
> "I think there is a world market for maybe five computers."
> - Thomas Watson, Chairman of IBM, 1943
>
> Presumably people were getting by with tabulating machines and desk
> calculators when he made this remark.
>
> He probably would have been right. Except that computers vastly
> increased the market for calculations by decreasing their cost.
>
> Numerically, the majority of parallel jobs are probably "in the range
> of" 2 processors, and probably always will be. That doesn't make
> 4-processor machines or 1000-processor machines useless.

I'm not sure that I'd make >>that<< categorical a statement. Remember, most single CPU computers already use 2-4 processors in parallel -- CPU, video, controllers... a lot of this is just hidden firmware based parallelism.

We are also at a relatively primitive stage of parallel hardware and software development. There have been hardware development efforts described on this list in the past that might drop the cost of "a processor" to put in a system to $5 or $10, so that the "power" of a future system is completely derived from multiprocessor parallelism instead of clockspeed increase, in a beowulfish architecture. In the future, upgrading a computer may be snapping another "node" into a slot on a commodity bus/switch with a master node handling I/O and primary interface.

I also don't have any idea what "the majority of parallel jobs" will consist of in the future.
Of course, I'm not even sure what it is now;-)

For example, let us all not forget that even now there are whole LARGE categories of code/problems that scale "to infinity and beyond". Well, not REALLY infinity, but anything in the embarrassingly parallel (EP) category (e.g. SETI, RC5, a heck of a lot of importance sampling Monte Carlo, rendering, image processing, etc.) can be scaled as high as you like, sometimes using a floppy disk and a pair of rollerblades as a COTS IPC/job management channel. I'd cheerfully use 1000 systems if I could afford them (including their care and feeding) to do my Monte Carlo. I'd just get work done in 1/10 the time it would take me using 100 machines and 1/100 the time it would take me using 10 -- scaling is no problem even with 10BT networks. The only thing that limits my scaling now is the tools I use to manage runs, and I'd cheerfully modify them to work on 1000 nodes if somebody wants to give me a couple or three $M for the hardware and the nontrivial infrastructure...;-)

There are far more 1000 node problems (real or potential) than have been mentioned on the list. There is at least one 1000 node GA system (at Stanford, IIRC) although I could scale my own GA's up that far and have tremendous fun if somebody pops for the boxes. Der Ubercracker would love to have 1000 nodes or even more, as many encryption problems are just a matter of how fast you can work through the keyspace and can be run EP. Rendering (making movies like Toy Story, Dinosaur, etc.) can burn 1000 or 10,000 nodes if somebody will pay for the suckers. More nodes = faster development time = more money.

Indeed, rendering alone might one day be THE cycle consumer on a desktop workstation -- in that future day where our operating systems and GUI's present an entire 2d/3d rendered view in a virtual space, with tools that make authoring an animated rendered GUI a straightforward matter.
That "desktop" rendering might well occur on a whole array of cheap parallel processors rather than a single expensive one, interconnected with a COTS network (built into the system) and running a descendent of bproc -- a desktop beowulf.

So there are (at least) TWO categories of beowulf user that are interested in very large beowulfish systems. One is the class that is using beowulf-type architectures for HPC problems, like those that Greg has referred to. These users have LARGE problems with some nontrivial granularity and require high speed processors and high end IPC channels to profitably scale out as far as possible. They buy/build systems that are inexpensive compared to dedicated boxes (usually) but are still quite costly compared to the "standard beowulf design" of the Intel-du-jour PC node plus switched 100BT. Their design decisions are dominated by performance and scaling, not cost per se except the factor of ten or so cost advantage relative to dedicated iron.

The other class is those folks doing relatively low-tech EP work, who have always been able to scale out to fill their available cluster resources. There are likely a lot more of the latter than the former, although their problems are not necessarily as sexy and don't get as much grant money to support them as they are "too easy" (forgetting the fact that they may be easy computationally but certainly aren't necessarily "done"). These users often get by with systems that are cheap even as beowulfs go on a cost/node basis -- stripped Celeron nodes on cheap hubs or switches. Their design decisions are dominated by economics -- how can I get the most work done for my inevitably limited budget.

In between are the rest of the beowulf users, who have a wide range of possible constraints that limit the size of the beowulf they can use, usually with respect to their fixed budget but not always. In a lot of cases their problems may well scale only to <O(100) nodes, at least given their budgets.
Still, if you gave those same folks 1000 nodes and a team of programmers, postdocs, and grad students to help out, I'll bet a whole lot of them would find some way of running 10 EP independent threads of O(100) scaled calculations. Or something...the resource would be used.

To conclude, while many people are indeed limited in the size of the beowulf they can use by the scaling characteristics of their problem and hardware, there are ALSO many, many people who are limited primarily by the size of their pocketbook, at least until their budgets are increased by orders of magnitude over where they are now and they approach their problem's real scaling limits. Like me. (I'd pass a hat around to collect just a wee bit from everybody if only I could:-).

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
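
[Editorial sketch, not part of the original message.] The EP scaling claim in the message above (1000 machines finishing in 1/10 the time of 100) can be illustrated with a toy Monte Carlo job. This is only a minimal sketch under stated assumptions: the function names (`estimate_pi_chunk`, `run_ep`, `ideal_wall_time`) are hypothetical, the pi estimate stands in for a real importance sampling calculation, the "nodes" are simulated by a sequential loop, and job management overhead is assumed to be zero.

```python
import random


def estimate_pi_chunk(seed, samples):
    """One independent work unit: count random points that land inside
    the unit quarter circle. It needs nothing from any other chunk."""
    rng = random.Random(seed)  # each chunk gets its own seeded generator
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def run_ep(total_samples, nodes):
    """Split the job into `nodes` chunks and combine the counts at the end.
    Here the chunks run one after another in a single process; on a real
    cluster each chunk would run on its own node (an assumption of this
    sketch), and the only communication is the final sum."""
    per_node = total_samples // nodes
    hits = sum(estimate_pi_chunk(seed, per_node) for seed in range(nodes))
    return 4.0 * hits / (per_node * nodes)


def ideal_wall_time(serial_time, nodes):
    """Ideal EP wall-clock time: no communication cost, so time ~ 1/nodes."""
    return serial_time / nodes


if __name__ == "__main__":
    print("pi estimate from 10 chunks:", run_ep(200_000, 10))
    for n in (10, 100, 1000):
        print(n, "nodes ->", ideal_wall_time(1000.0, n), "time units")
```

Because the chunks never talk to each other until the final sum, the network speed barely matters, which is the point made above about 10BT being sufficient for this class of problem.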
More information about the Beowulf mailing list
