[Beowulf] hpl size problems
Robert G. Brown rgb at phy.duke.edu
Wed Sep 28 14:54:01 PDT 2005
In re: the xfs example:

  xfs 2648 0.0 0.1 11804 1336 ? Ss Jun20 0:00 xfs -droppriv -daemon

or the accumulation of less than a second of CPU time since June 20 when the node was rebooted after its FC4 upgrade. In fact, on this kickstart-installed X86_64 (dual opteron) node all the processes but cluster-specific processes (applications or monitoring tools) put together have used less than 30 minutes of CPU time over three months. Almost all of THAT is by one daemon (hald) that is, as it turns out, probably not necessary. So in our next cluster kickstart we leave it off (or turn it off) or we use one of our drop-in tools to have it turned off/removed overnight.

xfs is indeed a perfect example -- of something that it isn't worth worrying about. It's not clear that it's worth the effort to even turn it off.

In re: the comment on e.g. installation time:

Most of the clusters I've run have spent anywhere from 100% to 70-80% running a handful of applications. Sometimes only one application (the primary research tool of the cluster owner). The efficiency of that application has generally not been a strong function of operating system -- we put a reasonably but not pathologically stripped image on the nodes and then just plain forget them (aside from yum-driven automagical updates of their installed base) for as long as their entire service lifetime, although more often we'll update them once or twice, when convenient.

So while very short install times, instant dynamical provisioning and perfectly minimal application support sounds very reasonable (and is, for the most part, what the kernel already does, right?) in practice it would save us pretty much zero time and very likely wouldn't make our CPU-bound code run significantly faster. YMMV, of course -- I'm not claiming that our operation is typical or ideal.

Then, to continue...

Donald Becker writes:

>> So I agree, I agree -- thin is good, thin is good. Just avoid needless
>> anorexia in the name of being thin -- thin to the point where it saps
>> your nodes' strength.
>
> You've got the wrong perspective: you don't build a thin compute node from
> a fat body. You develop a system that dynamically provisions only the
> needed elements for the applications actually run. That takes more than a
> single mechanism to do correctly, but you end up with a design that has
> many advantages. (Sub-second provisioning, automatic update consistency,
> no version skew, high security, simplicity...)
>
> --
> Donald Becker                          becker at scyld.com
> Scyld Software                         Scyld Beowulf cluster systems
> 914 Bay Ridge Road, Suite 220          www.scyld.com
> Annapolis MD 21403                     410-990-9993

Ah! THAT'S the problem. I have the wrong perspective. I knew it was something like that...;-)

Just to be clear about my mistake, if one builds a cluster that is a single computer as you describe, either it is on an open network or it is inside a head node that is a de facto firewall. Since the motivation for such a cluster is often to enable it to scale effectively with tightly coupled code, one usually chooses the latter. In such a cluster (as you've pointed out a number of times over the years) one doesn't think of the cluster as having "nodes" that you can login to, one thinks of it being a single computer you can login to with lots of CPUs and a message passing paradigm for IPCs across the CPUs.

In such a cluster internal security is usually all but nonexistent, because security costs performance and because it is a "computer", all real security lives on the head node. One doesn't provide internal interCPU security in an MP operating system as processor 3 isn't generally trying to take over processor 4 -- it is more a question of who owns the kernel and root process structure that controls what runs on ANY processor.

You work very hard (usually) to create such a system that scales well and maintain it on top of or on the side of an update stream connected back to the kernel, to the libraries, to hardware drivers, to essential tools and utilities, and to provide users of the system with specialized tools and utilities to replace standard tools and utilities to provide the illusion (or rather the layered, network-implemented reality) of a single MP computer. This work doesn't end -- you are basically supporting a separate distribution that shares memes with linux/gnu and you MUST co-develop with the base code set or in a couple of years you're screwed as the code base you rely on to not have to reinvent or coinvent numberless wheels diverges.

Very labor intensive, very expertise dependent -- not a lot of people CAN do the work. Very, if you like, "expensive" in every sense of the word, but justified by the absolute benefits it provides to users of the appropriate kind of code and maybe (less certainly) by the relative benefits it can confer in terms of ease of use or installation.

Where "less certainly" is because there you have to do a CBA against all the various competing paradigms, and there are a lot of them. Not an easy assessment given wildly varying real costs, opportunity costs, and cost scalings per environment per paradigm. Given also a rather wide range of applications that one might run on a cluster where the marginal benefits (in terms of speedup, time to completion of a project, etc.) are likely to be very slender or even (in for example the case of an application that doesn't easily build and run because of divergence from or a lack of support for some tool or library available in one of the competing distributions) negative benefits, at least pending the investment of still more labor. Which economically can easily turn out to be more labor than it is worth for a small enough class of tasks, which leads to yet ANOTHER economic problem -- do you invest your limited resources working to support a small class of applications that cannot in some sense "pay their own way" in terms of overall contribution to cost-benefit...

In competition with this there is (primarily) the "use an existing operating system distribution for your cluster" paradigm, which can be implemented on top of pretty much any Unixoid operating system distribution or even (shudder) on WinXX. This approach deliberately minimizes the utilization of cluster-specific modifications of the actual core operating system -- kernel, libraries, binaries, documentation -- because in this way it avoids the immense cost of that divergence. It seeks instead to add "portable" cluster tools on TOP of the distribution -- ideally tools that will build and run on ANY distribution with minimal hacking and #ifdef'ing and autoconf'ing (a fantasy that of course sometimes works and sometimes doesn't:-).

This general paradigm is broad enough to encompass fat nodes and thin nodes, diskful nodes and diskless nodes, and even internal protected synchronous clusters with no internal security and clusters that span Internet-wide authentication and administrative domains where security is a sine qua non. BECAUSE the "cost" of the approach is deliberately held to a minimum and there is a reasonable degree of portability in the cluster toolset, there is much competition between fat and thin, between diskless and diskful, between "beowulf" and NOW/COW and Grid.

With this degree of competition one gets many benefits.

First, a would-be cluster builder can choose what makes sense to them in their particular environment for their particular application space.
There is a high probability that they'll be able to build a cluster that uses any hardware supported by the general distribution(s) that might form a base, with any library(s) that it might require, with installation, administration, maintenance, and authentication mechanisms that are familiar to them because of their EXISTING investment in e.g. supporting a LAN on top of some distribution. This in turn may minimize retraining costs or permit them to leverage existing resources or may permit them to comply with some externally imposed constraint: the use of some particular authentication mechanism, integration with some organization-wide resource, the ability to run particular applications with strong constraints on their support libraries.

Where the latter is no joke -- I'm not KIDDING about the RH 7.1 in the ATLAS gridware, even though it SEEMS like a joke. Maybe they've finally bitten the bullet and started the arduous process of porting their code to a modern linux. Maybe not. Really, this is just another example of the danger of forking something off into a separately maintained branch without the resources to properly codevelop it with the main branch. If you don't CONSTANTLY feed the source base money and human time, you ultimately pay even more to realign them.

Second, the competition in an open source environment with free user choice (where they vote by using or not using anything that is developed and offered up by developers) is a near-optimal genetic optimization process. Code (memes) are shared. Successful code fragments survive and are written into more complex applications and spawn new species of application; unsuccessful ones or obsolete ones are gradually pruned from the tree. Warewulf is but one example although there are other emerging diskless linuces and it may well be that in a year or two diskless linux will emerge as the paradigm of choice for LAN workstations. Cross-fertilization works both ways.
CaOSity, Scientific Linux, Mandriva, FC, RHEL, SuSE, Debian -- you can build a cluster on top of any or all of these, using (in many cases) familiar tools and installation/maintenance paradigms AND being certain that any advance or improvement or bugfix or security patch that makes it into your base distro will make it into your cluster as well.

Third, the alternatives have very different up-front costs so that at least some of the alternatives are not de facto exclusionary -- too expensive for students, for cluster builders in the third world, for hobbyists. Exclusionary expense is one of the biggest PROBLEMS with the high cost of divergence -- only the "rich" can afford it, so it ends up being useful only to those with high-benefit problems in real cash terms.

Distributions that serve as the base for clusters are such that at any given time a few of them cost money and have sufficiently large installed bases and associated revenue streams that they attract commercial developers and can get certified for various kinds of government restricted usage. Others are free and more suited for hacker/hobbyists. Some are in between. The key thing to remember is that hacker/hobbyists with limited disposable resources for software BUILT linux and most of gnu and a lot of the existing cluster support software -- in part because they didn't want to or weren't able to afford the cost of paying for the commercial alternatives of the day -- and continue to be a rich source of new ideas and applications today. So this isn't a bad thing.

So finally -- my perspective is, in a nutshell, this last statement. This isn't a bad thing.

In my opinion, when one considers the range of applications to be run on ANY kind of cluster, the range of administrative expertise likely to be available to a would-be cluster builder, the costs and benefits of the two primary cluster paradigms (on top of a minimally diverged distro mostly in userspace or as a significantly deviated and customized codeveloped branch with significant alterations in the kernel and rootspace), most builders of and users of clusters will benefit from sticking close to a distro and THEN making their choices -- diskless or not, network isolated or not, rsh or ssh, and so on. Building a cluster on "top" of a standard distro, as fat or thin as they like, with minimal divergence (most clusterware on TOP of the distro, as in warewulf).

Whether they run the cluster as a "beowulf" with a batch queueing system and a head node or as an undifferentiated NOW, MOST of their operating environment will update from the primary tree/mirrors of their distribution and will hence evolve along with that distribution (which rate may be fast or may be slow according to what they choose) and remain as patch-current as its maintainers can keep it with a much larger user base participating.

Heck, I think that this is a GOOD thing, and that most, but not all, would-be cluster builders are well served by it.

   rgb
