COTS was Re: [Beowulf] 96 Processors Under Your Desktop
Robert G. Brown rgb at phy.duke.edu
Thu Sep 2 07:35:38 PDT 2004
On Wed, 1 Sep 2004, Jim Lux wrote:

> Interesting point..
> At what point does "turnkey" turn into COTS?
> Maybe if COTS were really Consumer Commercial Off The Shelf or Consumer Off
> The Shelf?
> I would think that the intent of COTS is a sort of non-customized,
> non-unique, catalog item.
>
> But RGB makes a valid point that one can buy a complete turnkey cluster,
> software all installed, etc. However, these are not really COTS, in that,
> I doubt any of the vendors has a warehouse full of them all sitting on the
> shelf ready to be shipped.

Nowadays, a lot of vendors (e.g. Dell) don't have a warehouse full of >>PCs<< sitting on the shelf ready to be shipped. Just in time, semi-custom assembly has become a commodity itself; even vanilla PC vendors often have a web configurator.

Although this is all really nit-picking about whether or not there needs to be a real shelf and a commodity market with brokers and everything for something to be COTS as it stands in the original beowulf definition. Obviously there doesn't.

Let's instead go with the fairly clear purpose for including COTS in the original definition of beowulf. The idea was, and still is, to exploit the fact that computers built out of components that are sold by the tens and hundreds of millions of units, ideally components that are available from several competing manufacturers, have profit margins determined by the scale benefits of large scale manufacturing AND suppressed from above by competition in the marketplace. This was in direct contrast to the big iron supercomputers of the day, which were generally hand engineered and used many parts that were custom engineered for just the one system and manufactured in limited runs at consequent high cost.

Even six or seven years ago there was nothing contradictory about building and selling a turnkey beowulf.
Rackmount or not, they were built out of readily available COTS PARTS (not necessarily readily available COTS SYSTEMS, as cluster people have nearly always microspecified the configurations of their nodes in such a way that you would be very unlikely to find them on the actual shelf of an actual store). We add memory, alter disk, select a faster CPU, dump the high end video card, add a better NIC or even a cluster-specific custom NIC.

So I'd have to disagree that beowulf cluster nodes have EVER been "catalog items" in practice. They have ALWAYS been more or less custom assembled according to specification, but they have been built out of COTS >>parts<<. No fair using a fancy motherboard with exotic communications or memory pathways of use only to cluster builders. No fair doing a custom ASIC and designing your own motherboard or card just for your one cluster. Just an off the shelf disk (however nice or cheap a shelf), an off the shelf motherboard (sold by the hundreds of thousands in nice identical boxes) equipped with memory and network etc ditto, and packed up in a standard case.

Now, in recent years, the COTS concept has been bent a little in both beowulf and generic cluster engineering in several respects. Custom cases have become commonplace among vendors selling turnkey clusters or selling vanilla "cluster nodes" (as always, built to your specification within reason). This is partly because server class motherboards run hotter than a 1U packaging permits. I'd argue that they are still, barely, COTS in the sense that matters because there are lots of competing manufacturers, lots of units sold, and multiple markets that use them (HPC clusters and server clusters, very different markets at that).

The other is the network, where as we know the cluster market HAS sustained the development of a handful of "speciality" networks, e.g. myrinet, sci.
These "cluster networks" are really the one place where I see the definition of COTS bent to the breaking point (if one wants to be picky -- I personally don't think there is any reason to be religious about it especially for this particular component). Yes, the network interfaces have to plug into a standard bus. Yes, they are "mass marketed" (to all the cluster builders in the universe). Yes, there is even competition -- between the few, totally incompatible and proprietary alternatives. Where gigabit ethernet is clearly COTS, gigabit myrinet is clearly not. Yet who amongst us would argue against implementing the latter (or its more recent faster descendants and cousins) in a cluster design that required it? Not me, that's for sure.

One day this may change. Perhaps a new network will emerge that is used (as is ethernet today) in a wide range of systems and not just clusters, that has high bandwidth and low latency, and that is built to "open" standards at least in the sense that (like ethernet) anybody can pay a modest fee and design an interoperable interface on the basis of published specifications.

In the meantime, COTS is an important element of beowulf or cluster design, and it is the overwhelming cost-benefit of COTS vs non-COTS that has caused the explosive growth in the number of cluster nodes in recent years. But it shouldn't be carried to a fault -- if non-COTS components end up being part of the most cost-effective solution to a particular problem, obviously a sane buyer will use them.

Now, regarding turnkey clusters -- what the cluster buyer gets from the deal is BOTH a pile of COTS hardware (possibly "corrupted" with a COTS-grey custom 1U case and openly non-COTS network) AND the human expertise required to install it to the requirements of the customer. This latter step has nothing whatsoever to do with the beowulf or general cluster definition. Most cluster users hire somebody to install their cluster these days, I'll bet.
Folks like myself who both build the cluster and use it are an archaic holdover from the early days of the beowulf list, although I have no doubt that the list itself overrepresents this part of the cluster-operating population for obvious reasons. At Duke, MOST of the clusters on campus are built and maintained by systems people and actually used by research faculty that never touch a screwdriver or wire and that never have root privileges on a node. With that, how could it matter who does the assembly and installation? It is "hired out" (relative to the actual user of the hardware) either way.

Obviously one should (again) go with the most cost-effective solution. In some cases this will be turnkey solutions, for example if a cluster is needed by a group with little local expertise or opportunity cost sysadmin labor available. In others, it will be a locally engineered, installed, maintained cluster, typically where there is a lot of local expertise or a surplus of opportunity cost sysadmin labor so that economy of scale can be exploited. Universities and certain government labs fit the latter pattern; corporations and other government labs more often fit the other. Both are using COTS clusters and seeking to maximize utility and minimize cost.

> It would also be interesting to know how many of those turnkey clusters are
> being delivered to total cluster newbies who will use them with minimal
> cluster specific training (obviously, they need to know where the power
> switch is, etc.). I'd guess that most of the turnkey clusters are going to
> either someone who has used a cluster before, possibly having built one
> themselves and recognizing they have better things to do with their time, or
> to someone who will take a class or specific training on cluster use.

I'm not at all sure about this, but I'm sure that some of the turnkey vendors will respond.
From what Joe has told me, for example, I think that many turnkey clusters are custom engineered for specific (software) applications and sold along with training and support to a group that has minimal local skill or experience with clustering. Somebody that really knows what they are doing knows that the marginal cost of turnkey is far greater than the cost of the week or so (tops) that it takes to set up and install most cluster configurations, less in an environment already running multiple clusters. The additional cost per node means fewer nodes, and cluster users tend to be node-hungry. I think that they usually spend nodes for turnkey systems when they have little choice (or when they have very deep pockets behind them).

> I see the COTS model is more consumerish, in that the seller doesn't expect
> to have to provide much customization and support. Not many people take a
> class on operating their TV or VCR or cellphone. Some people take classes on
> PCs, but most sort of get on the job training from someone else who knows
> more about what to do. I don't think there's enough cluster folk about to
> go for that model, though.

There are now numerous vendors selling "cluster nodes" as more or less commodity items. Penguin, for example. Penguin owns Scyld, and will sell you a turnkey cluster with Scyld preinstalled on it, I'm certain. It will sell you a cluster with SuSE preinstalled on it (their default, I believe) where with PVM or MPI you are left needing to install accounts, set up NFS, and go "poof, you're a cluster". It will sell you a cluster with SuSE preinstalled on it (for free, why not) that you can subsequently PXE-start into your own cluster configuration. They charge according to what you get, they use OTC parts in custom cases (important as noted to get a reliable dual anything in a 1U form factor), they have plenty of competition e.g. Appro as noted by Gerry, IBM, Dell, etc... who ALSO use OTC parts in custom cases, will in some cases preinstall an OS for you, and so on.

Then there are groups that JUST assemble nodes or buy them from these vendors and put together a serious cluster for you and deliver it all racked up to your very door, and will provide training or custom software installs or even site management -- for a fee.

I think these examples make it clear that the marketplace has clearly differentiated the (mostly) COTS systems themselves that make up "cluster nodes" in all sorts of clusters and the installation, maintenance, and operation of those nodes. A beowulf need not be a DIY enterprise, and if it isn't it is just a matter of how and who does the work you don't do yourself INDEPENDENT of the COTS issue.

> And, thinking of things where the complexity is between toaster oven and PC,
> there was some sort of self-instructional video built into my new HD-DVR
> cable box, and given the "rev zero" ness of the operating software, maybe I
> should have watched it. (totally off the subject, but it's supposedly a
> Linux based system running on a 733 MHz x86, with an integrated cable modem
> and ethernet interface, etc..... There's a vehicle for "grid computing"....
> The cable company can sell spare cycles on my box, and I'm paying for the
> electricity AND the box too. Maybe that's why they gripe so much when I
> unplug it all the time (it draws about 100W, 24/7, so it costs more for the
> electricity to run it than I pay for the HD cable service))
>
> And, I have one of those $200 WalMart cluster nodes at home..it's OK, but I
> wouldn't buy another one, for a variety of reasons.

Precisely. Cluster builders have >>always<< specified node configuration and only >>rarely<< actually used systems purchased off of some shelf the way they come out of the box. That doesn't make their nodes any less of a COTS item; it only recognizes that the COTS part refers to the components, not the particular assembly.
So I reiterate -- it will be interesting indeed to see if the cluster market is finally large enough to support "cluster specific" CPUs and supporting chipsets that are not used at all in non-cluster applications the way it supports cluster specific NICs. COTS or not, if they are price/performance winners they are likely to succeed, if not they are likely to fail (or at least fail to grow beyond a specialty market-within-a-market scale).

   rgb

>
> ----- Original Message -----
> From: "Robert G. Brown" <rgb at phy.duke.edu>
> To: "Greg Lindahl" <lindahl at pathscale.com>
> Cc: <beowulf at beowulf.org>
> Sent: Wednesday, September 01, 2004 2:16 PM
> Subject: Re: COTS was Re: [Beowulf] 96 Processors Under Your Desktop
>
>
> > On Wed, 1 Sep 2004, Greg Lindahl wrote:
> >
> > > On Wed, Sep 01, 2004 at 12:41:24PM -0700, Jim Lux wrote:
> > >
> > > > "requires time and expertise to set up" is of course what makes clusters (as
> > > > a completed system) not COTS, even though the components or subassemblies
> > > > may be COTS.
> > >
> > > I learn a new definition of COTS every day. I hadn't seen this one
> > > before. I suppose all the parents struggling to assemble toys on Xmas
> > > eve can console themselves that the mass-market item they bought at
> > > Wal-Mart isn't COTS...
> >
> > And turnkey beowulf systems (built of COTS components) have been around
> > for many years now. In fact, some list members (ahem;-) have built and
> > sold them.
> >
> > So the new computer cluster (orien?), with a new CPU, is actually
> > FARTHER from COTS -- especially if the new CPU is designed only for use
> > in the cluster market. Hopefully it isn't -- it's questionable as to
> > whether the cluster market can sustain a specialty CPU with so many COTS
> > alternatives that stay cheap because they are mass marketed.
> >
> > Wal Mart sells compute nodes, too, if you want to use their cheap
> > systems for that purpose.
> >

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
