HD cloning
Robert G. Brown rgb at phy.duke.edu
Tue Dec 5 11:59:33 PST 2000
On Wed, 6 Dec 2000, Bruce Janson wrote:

> And to you Robert Brown: speak for yourself please when you say
> (in your message of Sun, 3 Dec 2000 16:03:39 -0500 (EST)):
>
> A beowulf is a high performance computing
> cluster, not a data or web server cluster.
>
> This kind of supercomputer elitism, this fascination with fine-grained
> parallelism and linear speed-up has held back the progress of single
> system image multicomputing for long enough. I disagree with your
> claim, so much so that I wouldn't even fight for your right to make it
> (well, not with much conviction).

Whew! Harsh words! I don't know how you read "fine-grained parallelism" into the phrase "high performance computing cluster" -- I'm actually an embarrassingly parallel Monte Carlo kind of guy (and linear speedup is what you get when you're NOT fine-grained, so I make out just fine there;-). I'd therefore be deeply hurt if I weren't wearing teflon-coated asbestos over a kevlar vest;-)

However, I'm not speaking for myself and it's not what I say that matters. It is what e.g. Sterling and Becker say:

http://www.beowulf.org/intro.html

(first paragraph). As the constructors of the original beowulf and coiners of the very term, their definition and utilization is the one that matters, although there is also a consensual element of the list participants associated with it.

You clearly need to read a bit while properly sedated. I recommend a nice cold beer, or even two or three, so go grab a few and have a seat (and so will I;-). Now, here's a reading list.
Let's see, sitting right next to me I have:

In Search of Clusters, by Greg Pfister
(http://www.phptr.com/ptrbooks/ptr_0138997098.html)

This book will teach you that "cluster computing" is a venerable and generic term that includes high availability and failover clusters (suitable for use as webservers and distributed databases) as well as at least certain kinds of (parallel as opposed to vector) supercomputers or generic SMP systems themselves, which Pfister views correctly as being a cluster of processors united by some sort of distributed/common IPC and memory system.

Beowulfs are a specific subclass of cluster computers and (IIRC) aren't even discussed (at least in any detail -- they didn't make the index) in Pfister's book, which was probably largely (being) written at the time the original beowulf was being built and winning its builders Bell prizes for the most cost-beneficial high performance computer design. Of course all the beowulf "glue" -- PVM, MPI, and all that -- did exist and was in use by me among many many others years before the beowulf project, so it isn't surprising that beowulfs are described in all but name. It does (amusingly to me, at least:-) mention Microsoft's Cluster Services "Wolfpack".

Consequently we find that pre-empting the word "cluster" in favor of "beowulf" for all linux clusters seems a bit presumptive. Clusters were around long before PC's, linux, and even ethernet. As I said, all beowulfs are clusters, but not all clusters are beowulfs. Not even all linux clusters. Not even all rackmount or shelfmount dedicated linux clusters.

Frankly, not even the mostly shelfmounted linux cluster on a private network I'm sitting at right now at home is technically a "real beowulf", although I usually do speak of it as my "home beowulf" because it "almost" qualifies.
However, it contains 2-3 workstations that double as compute nodes.

And on the net I find:

The Beowulf FAQ, compiled and maintained by Kragen Sitaker:
(http://www.dnaco.net/~kragen/beowulf-faq.txt)

I quote:

    1. What's a Beowulf? [1999-05-13]

    It's a kind of high-performance massively parallel computer built primarily out of commodity hardware components, running a free-software operating system like Linux or FreeBSD, interconnected by a private high-speed network. It consists of a cluster of PCs or workstations dedicated to running high-performance computing tasks. The nodes in the cluster don't sit on people's desks; they are dedicated to running cluster jobs. It is usually connected to the outside world through only a single node.

    Some Linux clusters are built for reliability instead of speed. These are not Beowulfs.

This is a summary of a fairly extended list discussion -- it is a fair representation of the consensual view of the list at that time. This is of course a fairly particular (although accurate enough) definition and there are those on the list with a broader view. In fact (amusingly enough) I'm one of them and have had some interesting discussions with the more passionate defenders of the original "tight" definition in the past.

From the Beowulf HowTo, for example, we get:

    There are probably as many Beowulf definitions as there are people who build or use Beowulf Supercomputer facilities. Some claim that one can call their system Beowulf only if it is built in the same way as the NASA's original machine. Others go to the other extreme and call Beowulf any system of workstations running parallel code. My definition of Beowulf fits somewhere between the two views described above, and is based on many postings to the Beowulf mailing list...

By no great coincidence, another book I happen to have at hand is "How to Build a Beowulf", by Sterling, Salmon, Becker and Savarese (SSBS), which has to be viewed as a sort of "horse's mouth" view of beowulfery.
In it, they take a surprisingly inclusionary view of beowulfery at the application level, while being much stricter on the hardware architecture side (where they clearly differentiate a "true beowulf" from e.g. a NOW or COW or POPC like the one I run at home and mislabel a "beowulf":-).

For example, in section 10.2, "New Opportunities", the authors acknowledge that while historically beowulfs have been primarily used for scientific and technological applications (traditional "supercomputing chores"), the hardware architecture itself is amenable to new domains of application including databases and web servers and hyperrealistic simulational online gaming and virtual realities and process control and AI and genetic programming.

Who could argue? A rackmount/shelfmount linux cluster is an (undeniably useful in all of these venues) rackmount/shelfmount linux cluster, and the architectural glue that they view as being a core element of the beowulf can be used to stick together many kinds of parallel applications. I even do some AI and work on parallel genetic optimization code on my home 'wulf, and my gateway node has a (non-parallelized:-( webserver running on it (oops, I did it again).

One still has to ask if it is fair to call any old rack of linux boxes in an ISP a "beowulf", or a rack of linux boxes in a webfarm, or a rack of linux boxes running any sort of distributed database, or an office full of linux workstations running a background computation at the same time they provide console access and word processing to foreground users.

In the past, this has been a point where list opinions have diverged somewhat, partly because not everybody uses the same glue to the same extent. A beowulf >>can<< be viewed as nearly any dedicated rack or stack of linux boxes on a private network because there is no sine qua non of beowulf on the kernel/glue level. With Scyld as a sort of unifying glue, that may slowly become less of an issue, although I doubt it.
It isn't clear that a unified process id space is desirable in all circumstances, for example, and one does give up one sort of power and flexibility in a node in exchange for another sort of power when one configures nodes so that they can no longer support a login process (for example). There is also no real advantage in being >>too<< narrow in a definition.

I personally think (and this is now MY opinion, not accepted definitions or even necessarily in agreement with SSBS), from being on the list for years and reading all of these books and more fairly carefully, that it isn't really fair. ISP's discovered stacks of linux boxes independently and have written their own glue. So did a lot of the webserver folks. They use a largely independent software base and overlap remarkably little with the message-passing sort of software/networking technology that seems to be an essential element to beowulfery. Databases I'm more open minded about, but again the problems being solved often transcend just parallel computation and communication on COTS hardware.

I'm just not comfortable with every rackmount or shelfmount cluster known to man that happens to run linux (or freebsd, or WinNT, or DOS, or Solaris -- where does one draw a line?) being suddenly relabelled "a beowulf". It would be like calling toilet paper and paper towels and paper napkins and even wet naps "Kleenex" just because one kind of facial tissue was particularly successful at branding. Diversity keeps us from having rolls of toilet paper on hand for use as picnic napkins; specificity keeps me from bringing home paper napkins instead of facial tissue.

There are also practical reasons to maintain at least a teeny bit of focus on the list, regardless of just what a beowulf "really" is. For one, this list has one of the best signal-to-noise ratios of any list I've ever been on (and I'm probably singlehandedly some of the worst of the noise;-).
Parallel database discussions alone could make this list as bad as the linux kernel list (which is practically unusable at this point without an attack-dog adaptive procmail filter) and would be utterly uninteresting to, um, "many" of the list participants. Possibly violently uninteresting.

For better or worse a lot of the list members are

  a) Physical Scientists >>using<< beowulfs for numerical computations. We tend to be less concerned about whether a given cluster is really a "beowulf" in the precise sense defined by SSBS and in the FAQ and more concerned with whether the particular cluster we are working with or trying to design will accomplish the real work at hand that we need done at affordable cost. In other people's money, of course;-)

  b) Real Computer Scientists working on beowulfery as their primary research interest -- see the Clemson group, with Rob Ross, Walter Ligon, and others and the PVFS they are building as well as the beowulf underground site, for example. There are papers published on this stuff and prizes awarded for this stuff. Heck, I dabble in this as well myself as it is quite fun, but it is really an avocation, not a vocation.

  c) Professionals running (turnkey and otherwise) beowulf support businesses -- Paralogics' Doug Eadline and HPTi's Greg Lindahl, for example. Scyld's Dan Ridge, Erik Hendriks, Don Becker and others. Note that many of these guys are Real Computer Scientists who graduated to the "real world" to get rich on their startups. They are NOT competing on e.g. webserver RFPs (as far as I know, anyway).

  d) Sundry Interested Parties. At Expos and online I've met corporate folks from IBM, publishers, entrepreneurs, oil prospectors, and many, many students. Many of these folks just listen and learn and ask rare questions.

  e) A very significant fraction of the list membership is overseas with much of it in developing countries.
This makes sense -- the beowulf concept is by definition and design >>the<< bleeding edge cost/benefit winner for supercomputer design. The USA still has deep pocket funding agencies that will spring for a multimillion dollar big iron supercomputer to solve a grand challenge problem in four or five years, four or five years before the current generation of PC's can do it as a screensaver. Overseas, they often don't. Students in Korea, in Pakistan, in Malaysia can put together a real supercomputer for a few thousand dollars or even less, using recycled or obsolete parts.

Many of these folks (outside of d) and e)) have been on the list almost since the beginning. They provide both perspective and continuity and an instant, free consulting service to those in groups d) and e), who come and go.

They'll tolerate a discussion of embarrassingly parallel applications and grid computing and so forth because it is high performance computing on a COTS open source architecture (in the sense that many flops are being expended on a legitimate parallel supercomputing application, at least). They'll (mostly) tolerate my calling my heterogeneous beowulfish cluster at home or in the Duke physics department a "beowulf" because it is too tedious to write "heterogeneous beowulfish cluster consisting of a mix of dedicated and desktop nodes running an appropriate parallel task mix based loosely on e.g. MOSIX, PVM, MPI, sockets and other IPC mechanisms that keeps it busy" every time I want to refer to it.

BUT do they have to listen to every person who builds a web farm and wants to call it a beowulf talk about transparent forwarding of connections? Do they want to delve into the wonders of (parallel) MCIF systems, SQL statements (plus extensions) executed in parallel on a distributed database? Do they want to hear about failover in web-based distributed applications, about (parallel) CRM systems, about (parallel) B2B efforts and about (parallel) server appliances?
Even if they think some of this interesting (as I obviously do) is there time to cover all this on this list? Not in two lifetimes. A moderate degree of focus is what allows me to write long answers to newbie questions. A moderate degree of focus allows me to read the discussion written by others and find it useful instead of Huh?

I'd be happy to listen to a discussion of e.g. PVFS as a platform for a database (I already have, actually). I think integrating a webfarm with a distributed computational base (beowulf) is a totally nifty idea for providing certain kinds of services (and have a startup company working on the idea). However, high availability, failover, ISP issues per se, webfarm issues per se, I just don't have time to learn about all of this and get ANYTHING done.

Is this fair? Am I being crazy here? I'm not even a Real Computer Scientist, so casting me as a fine-grained computing fiend who is somehow obstructing "true, unbridled" beowulf development is just not correct...

   rgb

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
