[Beowulf] A Good Linux Distribution to Start with?
Robert G. Brown rgb at phy.duke.edu
Thu Sep 9 07:55:01 PDT 2004
On Wed, 8 Sep 2004, Dragovich, Jeff wrote:

> I am at the point in my beowulf cluster construction where I need to
> pick the Linux distribution to use. I have a small cluster (10 nodes,
> CISCO switch, a single control machine). The cluster will support
> parallelizing/benchmarking a finite element program using MPI. I am
> currently the only prospective user, and don't need sendmail and a bunch
> of that stuff. Just dev tools.
>
> Any comments on which Linux flavor to start with? I've read some jabs at
> Fedora. Can't find a FAQ (after about 4 hours of searching) that really
> discusses the pros and cons of each Linux variant related to Beowulf
> clusters. I know religion is a hot topic, but please don't flame the
> agnostic. :)
>
> Jeff

I think this goes unaddressed not because it gets religious, but because you can do perfectly satisfactory clustering with any distribution. Many or even most distributions support at least basic clustering (with PVM and MPI) right in the distribution itself, so it is just a matter of selecting the packages for installation and then learning to use them. Of course all of them support raw network programming at the socket level. Higher end cluster tools are often also available for many of the distributions or are at worst a rebuild away.

Fedora Core 1 had issues, but FC2 is working pretty well for us here, both on desktops and (so far) on cluster nodes, including Opterons. CentOS (a logo-free RHEL rebuild that stays within hours to days of RHEL at the logoless package level) should work as well as RHEL, obviously, and Red Hat itself (if you don't mind paying them on a per-node basis at fairly absurd rates) has always been a decent package to use for clustering. SuSE ditto -- lots of turnkey vendors use SuSE as a basis. Mandrake ditto -- it has its (IIRC) "CLIC" cluster-specific packaging. In both of these latter two cases there are again issues of licensing and charges on a per-node or per-cluster basis.
On the non-RPM-based front, Debian is totally open and free and is certainly used in clusters. On the non-Linux (but still totally open source) front, FreeBSD is used successfully in clusters.

There are a number of so-called "cluster distributions" to choose from as well. OSCAR is an older one that I'm not sure is still being loved by anyone. ROCKS is a newer one, built on top of (IIRC) a RH 9 (?) release and maybe moving towards CentOS or FC? CLIC I mentioned. Scyld is a commercial but very powerful and well-supported "beowulf in a box" distribution, I believe derived from a RH variant these days, though I am not sure. Scyld can cost a lot for full support and everything, but for somebody doing what you are doing (basically learning/playing more or less out of pocket) they might give you a significant break. Clustermatic/bproc is a way of getting a lot of what Scyld offers in a fully open source DIY way.

I'm probably leaving a bunch out -- nice diskless cluster projects, smaller and lesser known Linux variants (which would still work), Caosity...

So you have a plethora of choices, and I'm not about to tell you which one is "best", as the answer is none of them -- they are mostly pretty good, with various constellations of advantages and disadvantages.

Not to let any good opportunity for editorializing pass, though...

...my major beef with most of the cluster distributions is that they really require one to bend the simplicity and scalability and customizability of repository-based, package-based installation and maintenance schema. In my opinion, the "best" way to install a cluster is from a repository via PXE and something like kickstart if not kickstart itself, where the only thing that differentiates a cluster node from a desktop workstation is the selection of packages installed and some post-install configuration.
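To make that last point concrete, here is a minimal sketch (in Python, purely for illustration) of one shared template from which a "desktop" and a "node" kickstart fragment are rendered, differing only in package selection and post-install commands. Everything specific in it is an assumption: the package and group names, the post-install commands, the `/etc/cluster-role` file, and the function names are hypothetical placeholders, and only the `%packages` and `%post` sections are rendered. A real kickstart file also needs an installation source, partitioning and so on, which would be shared by both roles. Newer kickstart versions close each section with `%end`; older ones did not.

```python
"""Sketch: one shared template; only packages and %post differ by role."""

# All package names, group names and post-install commands below are
# hypothetical placeholders chosen for illustration only.
SHARED_PACKAGES = ["@core", "openssh-server", "ntp"]

ROLES = {
    "desktop": {
        "packages": ["@x-window-system", "emacs"],
        "post": ["echo desktop > /etc/cluster-role"],
    },
    "node": {
        "packages": ["gcc", "make", "lam", "pvm"],
        "post": [
            "echo node > /etc/cluster-role",
            "chkconfig sendmail off",
        ],
    },
}


def render_sections(role):
    """Return the %packages and %post sections for the given role."""
    spec = ROLES[role]
    lines = ["%packages"]
    lines += SHARED_PACKAGES + spec["packages"]
    lines += ["%end", "", "%post"]
    lines += spec["post"]
    lines += ["%end"]
    return "\n".join(lines) + "\n"


def differing_lines(role_a, role_b):
    """Return the lines that appear in one role's fragment but not the other."""
    a = set(render_sections(role_a).splitlines())
    b = set(render_sections(role_b).splitlines())
    return a ^ b


if __name__ == "__main__":
    print(render_sections("node"))
```

Calling `differing_lines("node", "desktop")` returns only the role-specific package and post-install lines, which is the whole idea: the rest of the install is identical and comes from the same repository.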
An acceptable variant of this would be the newer diskless cluster approaches, provided that the exported/cloned node image is package-level controllable and can be kept up to date relative to a well-maintained mirror tree of repositories with a tool like yum or apt.

This opinion extends down to some of the best known cluster packages, many of which are still distributed via tarball and #ifdef'd to hell and back or, worse, built on top of evils such as aimk so they'll build on every single variant of Unixoid or non-Unixoid operating system known to mankind.

Tarball distribution (except to hackers or people working on the code) is Evil. Heavy code instrumentation to cope with non-(e.g. POSIX)-compliant operating systems, ancient operating systems, and commercial operating systems with non-open or non-compliant libraries is Evil.

Proper packaging is Good. Compliance with standards (to the point where one has a clean build on an ANSI/POSIX system) is Very Good. These things make it >>easy<< to move a package between Linux distributions and permit Linux distributions to be built and rebuilt without breaking like hell all over the place.

RPM isn't perfect, but it isn't bad, it is in wide use, and it has smart people actively working to improve it further. APT is similarly strongly driven by smart people, and the religious and technical differences between the two simply serve to keep both on their mettle in a competitive and evolutionary world (where in open source they can easily steal the best ideas of their competitors until one day maybe they converge -- or not). Both are adequate as a basis for source-level distribution of entire packages that can be easily rebuilt for specific distributions and repositories and purposes.
Alas, some very useful cluster tools continue to eschew packaging (which would SIMPLIFY their build process and help them tremendously to debug their code and make it fully functional) and continue to waste energy getting their stuff not only to build, but to build after each little debugging change, from tarball, on thirty distinct systems, twenty of which are FUBAR at the library level and remain broken anyway.

So (editorializing done) -- take a look at some of the stuff mentioned above or linked to the various main Linux clustering sites. If you want the "simplest" approach and are already experienced with some Linux distro, just use that distro as a base and install clustering packages and get started that way fairly painlessly. If you want full automation (and to devote a fair bit of time learning to use it) look hard at stuff like ROCKS.

Hope this helps,

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
